From patchwork Wed Apr 4 23:11:01 2018
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 895188
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Wed, 4 Apr 2018 19:11:01 -0400
Message-Id: <1522883475-27858-2-git-send-email-cota@braap.org>
In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org>
References: <1522883475-27858-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v3 01/15] tests: add fp-test, a floating point test suite
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier,
 Paolo Bonzini, Alex Bennée, Aurelien Jarno

This will allow us to run correctness tests against our
FP implementation. The test can be run in two modes (called
"testers"): host and soft. With the former we check the results
and FP flags on the host machine against the model.
With the latter we check QEMU's fpu primitives against the
model. Note that in soft mode we are not instantiating any
particular CPU (hence the HW_POISON_H hack to avoid macro poisoning);
for that we need to run the test in host mode under QEMU.

The input files are taken from IBM's FPGen test suite:
https://www.research.ibm.com/haifa/projects/verification/fpgen/

I see no license file in there so I am just downloading them
with wget. We might want to keep a copy on a qemu server though,
in case IBM takes those files down in the future.

The "IBM" syntax of those files (for now the only syntax supported
in fp-test) is documented here:
https://www.research.ibm.com/haifa/projects/verification/fpgen/papers/ieee-test-suite-v2.pdf
Note that the syntax document has some inaccuracies; the appended
parsing code works around some of those.
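
As an illustration of that syntax, here is a minimal standalone sketch
(not part of the patch) of how a normal single-precision operand such as
"+1.000000P0" maps to its IEEE-754 bit pattern. The helper name
ibm_b32_to_bits is hypothetical; it follows the same steps as the
PREC_FLOAT branch of ibm_fp_hex below (sign, hex fraction, 'P', decimal
unbiased exponent), and it assumes that denormals and the special
operands (Q, S, +Zero, +Inf, ...) are handled elsewhere.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, for illustration only: decode a normal b32 operand
 * written as <sign>1.<hex fraction>P<decimal exponent> into its IEEE-754
 * single-precision bit pattern. Returns 0 on success, 1 otherwise.
 */
static int ibm_b32_to_bits(const char *p, uint32_t *bits)
{
    const char *frac_str, *exp_str;
    char *end;
    unsigned long frac;
    long exp;

    /* shortest accepted form is sign, '1', '.', one hex digit, 'P', digit */
    if (strlen(p) < 5 || (p[0] != '+' && p[0] != '-') ||
        p[1] != '1' || p[2] != '.') {
        return 1;   /* denormals and special values are not handled here */
    }

    /* the hex digits after the point are the 23-bit fraction field */
    frac_str = &p[3];
    frac = strtoul(frac_str, &end, 16);
    if (end == frac_str || *end != 'P' || frac > 0x7fffff) {
        return 1;
    }

    /* the exponent is decimal and unbiased; normal range is [-126, 127] */
    exp_str = end + 1;
    exp = strtol(exp_str, &end, 10);
    if (end == exp_str || *end != '\0' || exp < -126 || exp > 127) {
        return 1;
    }

    *bits = (p[0] == '-' ? UINT32_C(1) << 31 : 0) |
            ((uint32_t)(exp + 127) << 23) |
            (uint32_t)frac;
    return 0;
}
```

With this sketch, "+1.000000P0" decodes to 0x3f800000 (1.0f) and
"-1.7FFFFFP127" to 0xff7fffff (the most negative finite float).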

The exception flag (-e) is important: many of the optimizations
included in the following commits assume that the inexact flag
is set, so "-e x" is necessary in order to test those code paths.

The whitelist flag (-w) points to a file with test cases to be ignored.
I have put some whitelist files online, but we should have them
on a QEMU-related server.

Thus, a typical invocation of fp-test is as follows:

  $ cd qemu/build/tests/fp
  $ make -j && \
	./fp-test -t soft ibm/*.fptest \
	-w whitelist.txt \
	-e x

If we want to test after-rounding tininess detection, then we need to
pass "-a -w whitelist-tininess-after.txt" in addition to the above.
(NB. we can pass "-w" as many times as we want.)

The patch immediately after this one fixes a mismatch against the model
in softfloat, but after that is applied the above should finish with
a 0 return code, and print something like:

  All tests OK.
  Tests passed: 76572. Not handled: 51237, whitelisted: 2662

The tests pass in "host" mode on x86_64 and aarch64 machines, although
note that on x86_64 you need to pass -w whitelist-tininess-after.txt.

Running in host mode under QEMU reports flag mismatches (e.g. for
x86_64-linux-user), but that isn't too surprising given how little
love the i386 frontend gets. Host mode under aarch64-linux-user
passes OK.

Flush-to-zero and flush-inputs-to-zero modes can be tested with the
-Z and -z flags, respectively. Note however that the IBM input files
are only IEEE-compliant, so for now I've tested these modes by diff'ing
the reported errors against the model files. We should look into
generating files for these non-standard modes to make testing them
less painful.

Signed-off-by: Emilio G.
Cota
Signed-off-by: Alex Bennée
---
 configure              |    2 +
 tests/fp/fp-test.c     | 1159 ++++++++++++++++++++++++++++++++++++++++++++++++
 tests/Makefile.include |    3 +
 tests/fp/.gitignore    |    3 +
 tests/fp/Makefile      |   34 ++
 5 files changed, 1201 insertions(+)
 create mode 100644 tests/fp/fp-test.c
 create mode 100644 tests/fp/.gitignore
 create mode 100644 tests/fp/Makefile

diff --git a/configure b/configure
index f156805..07dc5da 100755
--- a/configure
+++ b/configure
@@ -7106,12 +7106,14 @@ fi
 # build tree in object directory in case the source is not in the current directory
 DIRS="tests tests/tcg tests/tcg/cris tests/tcg/lm32 tests/libqos tests/qapi-schema tests/tcg/xtensa tests/qemu-iotests tests/vm"
+DIRS="$DIRS tests/fp"
 DIRS="$DIRS docs docs/interop fsdev scsi"
 DIRS="$DIRS pc-bios/optionrom pc-bios/spapr-rtas pc-bios/s390-ccw"
 DIRS="$DIRS roms/seabios roms/vgabios"
 FILES="Makefile tests/tcg/Makefile qdict-test-data.txt"
 FILES="$FILES tests/tcg/cris/Makefile tests/tcg/cris/.gdbinit"
 FILES="$FILES tests/tcg/lm32/Makefile tests/tcg/xtensa/Makefile po/Makefile"
+FILES="$FILES tests/fp/Makefile"
 FILES="$FILES pc-bios/optionrom/Makefile pc-bios/keymaps"
 FILES="$FILES pc-bios/spapr-rtas/Makefile"
 FILES="$FILES pc-bios/s390-ccw/Makefile"
diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
new file mode 100644
index 0000000..27637c4
--- /dev/null
+++ b/tests/fp/fp-test.c
@@ -0,0 +1,1159 @@
+/*
+ * fp-test.c - Floating point test suite.
+ *
+ * Copyright (C) 2018, Emilio G. Cota
+ *
+ * License: GNU GPL, version 2 or later.
+ *   See the COPYING file in the top-level directory.
+ */
+#ifndef HW_POISON_H
+#error Must define HW_POISON_H to work around TARGET_* poisoning
+#endif
+
+#include "qemu/osdep.h"
+#include "fpu/softfloat.h"
+
+#include <fenv.h>
+#include <math.h>
+
+enum error {
+    ERROR_NONE,
+    ERROR_NOT_HANDLED,
+    ERROR_WHITELISTED,
+    ERROR_COMMENT,
+    ERROR_INPUT,
+    ERROR_RESULT,
+    ERROR_EXCEPTIONS,
+    ERROR_MAX,
+};
+
+enum input_fmt {
+    INPUT_FMT_IBM,
+};
+
+struct input {
+    const char * const name;
+    enum error (*test_line)(const char *line);
+};
+
+enum precision {
+    PREC_FLOAT,
+    PREC_DOUBLE,
+    PREC_QUAD,
+    PREC_FLOAT_TO_DOUBLE,
+};
+
+struct op_desc {
+    const char * const name;
+    int n_operands;
+};
+
+enum op {
+    OP_ADD,
+    OP_SUB,
+    OP_MUL,
+    OP_MULADD,
+    OP_DIV,
+    OP_SQRT,
+    OP_MINNUM,
+    OP_MAXNUM,
+    OP_MAXNUMMAG,
+    OP_ABS,
+    OP_IS_NAN,
+    OP_IS_INF,
+    OP_FLOAT_TO_DOUBLE,
+};
+
+static const struct op_desc ops[] = {
+    [OP_ADD] = { "+", 2 },
+    [OP_SUB] = { "-", 2 },
+    [OP_MUL] = { "*", 2 },
+    [OP_MULADD] = { "*+", 3 },
+    [OP_DIV] = { "/", 2 },
+    [OP_SQRT] = { "V", 1 },
+    [OP_MINNUM] = { "<C", 2 },
+    [OP_MAXNUM] = { ">C", 2 },
+    [OP_MAXNUMMAG] = { ">A", 2 },
+    [OP_ABS] = { "A", 1 },
+    [OP_IS_NAN] = { "?N", 1 },
+    [OP_IS_INF] = { "?i", 1 },
+    [OP_FLOAT_TO_DOUBLE] = { "cff", 1 },
+};
+
+/*
+ * We could enumerate all the types here. But really we only care about
+ * QNaN and SNaN since only those can vary across ISAs.
+ */ +enum op_type { + OP_TYPE_NUMBER, + OP_TYPE_QNAN, + OP_TYPE_SNAN, +}; + +struct operand { + uint64_t val; + enum op_type type; +}; + +struct test_op { + struct operand operands[3]; + struct operand expected_result; + enum precision prec; + enum op op; + signed char round; + uint8_t trapped_exceptions; + uint8_t exceptions; + bool expected_result_is_valid; +}; + +typedef enum error (*tester_func_t)(struct test_op *); + +struct tester { + tester_func_t func; + const char *name; +}; + +struct whitelist { + char **lines; + size_t n; + GHashTable *ht; +}; + +static uint64_t test_stats[ERROR_MAX]; +static struct whitelist whitelist; +static uint8_t default_exceptions; +static bool die_on_error = true; +static struct float_status soft_status = { + .float_detect_tininess = float_tininess_before_rounding, +}; + +static inline float u64_to_float(uint64_t v) +{ + uint32_t v32 = v; + uint32_t *v32p = &v32; + + return *(float *)v32p; +} + +static inline double u64_to_double(uint64_t v) +{ + uint64_t *vp = &v; + + return *(double *)vp; +} + +static inline uint64_t float_to_u64(float f) +{ + float *fp = &f; + + return *(uint32_t *)fp; +} + +static inline uint64_t double_to_u64(double d) +{ + double *dp = &d; + + return *(uint64_t *)dp; +} + +static inline bool is_err(enum error err) +{ + return err != ERROR_NONE && + err != ERROR_NOT_HANDLED && + err != ERROR_WHITELISTED && + err != ERROR_COMMENT; +} + +static int host_exceptions_translate(int host_flags) +{ + int flags = 0; + + if (host_flags & FE_INEXACT) { + flags |= float_flag_inexact; + } + if (host_flags & FE_UNDERFLOW) { + flags |= float_flag_underflow; + } + if (host_flags & FE_OVERFLOW) { + flags |= float_flag_overflow; + } + if (host_flags & FE_DIVBYZERO) { + flags |= float_flag_divbyzero; + } + if (host_flags & FE_INVALID) { + flags |= float_flag_invalid; + } + return flags; +} + +static inline uint8_t host_get_exceptions(void) +{ + return host_exceptions_translate(fetestexcept(FE_ALL_EXCEPT)); +} + +static void 
host_set_exceptions(uint8_t flags) +{ + int host_flags = 0; + + if (flags & float_flag_inexact) { + host_flags |= FE_INEXACT; + } + if (flags & float_flag_underflow) { + host_flags |= FE_UNDERFLOW; + } + if (flags & float_flag_overflow) { + host_flags |= FE_OVERFLOW; + } + if (flags & float_flag_divbyzero) { + host_flags |= FE_DIVBYZERO; + } + if (flags & float_flag_invalid) { + host_flags |= FE_INVALID; + } + feraiseexcept(host_flags); +} + +#define STANDARD_EXCEPTIONS \ + (float_flag_inexact | float_flag_underflow | \ + float_flag_overflow | float_flag_divbyzero | float_flag_invalid) +#define FMT_EXCEPTIONS "%s%s%s%s%s%s" +#define PR_EXCEPTIONS(x) \ + ((x) & STANDARD_EXCEPTIONS ? "" : "none"), \ + (((x) & float_flag_inexact) ? "x" : ""), \ + (((x) & float_flag_underflow) ? "u" : ""), \ + (((x) & float_flag_overflow) ? "o" : ""), \ + (((x) & float_flag_divbyzero) ? "z" : ""), \ + (((x) & float_flag_invalid) ? "i" : "") + +static enum error tester_check(const struct test_op *t, uint64_t res64, + bool res_is_nan, uint8_t flags) +{ + enum error err = ERROR_NONE; + + if (t->expected_result_is_valid) { + if (t->expected_result.type == OP_TYPE_QNAN || + t->expected_result.type == OP_TYPE_SNAN) { + if (!res_is_nan) { + err = ERROR_RESULT; + goto out; + } + } else if (res64 != t->expected_result.val) { + err = ERROR_RESULT; + goto out; + } + } + if (t->exceptions && flags != (t->exceptions | default_exceptions)) { + err = ERROR_EXCEPTIONS; + goto out; + } + + out: + if (is_err(err)) { + int i; + + fprintf(stderr, "%s ", ops[t->op].name); + for (i = 0; i < ops[t->op].n_operands; i++) { + fprintf(stderr, "0x%" PRIx64 "%s", t->operands[i].val, + i < ops[t->op].n_operands - 1 ? 
" " : ""); + } + fprintf(stderr, ", expected: 0x%" PRIx64 ", returned: 0x%" PRIx64, + t->expected_result.val, res64); + if (err == ERROR_EXCEPTIONS) { + fprintf(stderr, ", expected exceptions: " FMT_EXCEPTIONS + ", returned: " FMT_EXCEPTIONS, + PR_EXCEPTIONS(t->exceptions), PR_EXCEPTIONS(flags)); + } + fprintf(stderr, "\n"); + } + return err; +} + +static enum error host_tester(struct test_op *t) +{ + uint64_t res64; + bool result_is_nan; + uint8_t flags = 0; + + feclearexcept(FE_ALL_EXCEPT); + if (default_exceptions) { + host_set_exceptions(default_exceptions); + } + + if (t->prec == PREC_FLOAT) { + float a, b, c; + float *in[] = { &a, &b, &c }; + float res; + int i; + + g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in)); + for (i = 0; i < ops[t->op].n_operands; i++) { + /* use the host's QNaN/SNaN patterns */ + if (t->operands[i].type == OP_TYPE_QNAN) { + *in[i] = __builtin_nanf(""); + } else if (t->operands[i].type == OP_TYPE_SNAN) { + *in[i] = __builtin_nansf(""); + } else { + *in[i] = u64_to_float(t->operands[i].val); + } + } + + if (t->expected_result.type == OP_TYPE_QNAN) { + t->expected_result.val = float_to_u64(__builtin_nanf("")); + } else if (t->expected_result.type == OP_TYPE_SNAN) { + t->expected_result.val = float_to_u64(__builtin_nansf("")); + } + + switch (t->op) { + case OP_ADD: + res = a + b; + break; + case OP_SUB: + res = a - b; + break; + case OP_MUL: + res = a * b; + break; + case OP_MULADD: + res = fmaf(a, b, c); + break; + case OP_DIV: + res = a / b; + break; + case OP_SQRT: + res = sqrtf(a); + break; + case OP_ABS: + res = fabsf(a); + break; + case OP_IS_NAN: + res = !!isnan(a); + break; + case OP_IS_INF: + res = !!isinf(a); + break; + default: + return ERROR_NOT_HANDLED; + } + flags = host_get_exceptions(); + res64 = float_to_u64(res); + result_is_nan = isnan(res); + } else if (t->prec == PREC_DOUBLE) { + double a, b, c; + double *in[] = { &a, &b, &c }; + double res; + int i; + + g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in)); + for (i = 
0; i < ops[t->op].n_operands; i++) { + /* use the host's QNaN/SNaN patterns */ + if (t->operands[i].type == OP_TYPE_QNAN) { + *in[i] = __builtin_nan(""); + } else if (t->operands[i].type == OP_TYPE_SNAN) { + *in[i] = __builtin_nans(""); + } else { + *in[i] = u64_to_double(t->operands[i].val); + } + } + + if (t->expected_result.type == OP_TYPE_QNAN) { + t->expected_result.val = double_to_u64(__builtin_nan("")); + } else if (t->expected_result.type == OP_TYPE_SNAN) { + t->expected_result.val = double_to_u64(__builtin_nans("")); + } + + switch (t->op) { + case OP_ADD: + res = a + b; + break; + case OP_SUB: + res = a - b; + break; + case OP_MUL: + res = a * b; + break; + case OP_MULADD: + res = fma(a, b, c); + break; + case OP_DIV: + res = a / b; + break; + case OP_SQRT: + res = sqrt(a); + break; + case OP_ABS: + res = fabs(a); + break; + case OP_IS_NAN: + res = !!isnan(a); + break; + case OP_IS_INF: + res = !!isinf(a); + break; + default: + return ERROR_NOT_HANDLED; + } + flags = host_get_exceptions(); + res64 = double_to_u64(res); + result_is_nan = isnan(res); + } else if (t->prec == PREC_FLOAT_TO_DOUBLE) { + float a; + double res; + + if (t->operands[0].type == OP_TYPE_QNAN) { + a = __builtin_nanf(""); + } else if (t->operands[0].type == OP_TYPE_SNAN) { + a = __builtin_nansf(""); + } else { + a = u64_to_float(t->operands[0].val); + } + + if (t->expected_result.type == OP_TYPE_QNAN) { + t->expected_result.val = double_to_u64(__builtin_nan("")); + } else if (t->expected_result.type == OP_TYPE_SNAN) { + t->expected_result.val = double_to_u64(__builtin_nans("")); + } + + switch (t->op) { + case OP_FLOAT_TO_DOUBLE: + res = a; + break; + default: + return ERROR_NOT_HANDLED; + } + flags = host_get_exceptions(); + res64 = double_to_u64(res); + result_is_nan = isnan(res); + } else { + return ERROR_NOT_HANDLED; /* XXX */ + } + return tester_check(t, res64, result_is_nan, flags); +} + +static enum error soft_tester(struct test_op *t) +{ + float_status *s = &soft_status; + 
uint64_t res64; + enum error err = ERROR_NONE; + bool result_is_nan; + + s->float_rounding_mode = t->round; + s->float_exception_flags = default_exceptions; + + if (t->prec == PREC_FLOAT) { + float32 a, b, c; + float32 *in[] = { &a, &b, &c }; + float32 res; + int i; + + g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in)); + for (i = 0; i < ops[t->op].n_operands; i++) { + *in[i] = t->operands[i].val; + } + + switch (t->op) { + case OP_ADD: + res = float32_add(a, b, s); + break; + case OP_SUB: + res = float32_sub(a, b, s); + break; + case OP_MUL: + res = float32_mul(a, b, s); + break; + case OP_MULADD: + res = float32_muladd(a, b, c, 0, s); + break; + case OP_DIV: + res = float32_div(a, b, s); + break; + case OP_SQRT: + res = float32_sqrt(a, s); + break; + case OP_MINNUM: + res = float32_minnum(a, b, s); + break; + case OP_MAXNUM: + res = float32_maxnum(a, b, s); + break; + case OP_MAXNUMMAG: + res = float32_maxnummag(a, b, s); + break; + case OP_IS_NAN: + { + float f = !!float32_is_any_nan(a); + + res = float_to_u64(f); + break; + } + case OP_IS_INF: + { + float f = !!float32_is_infinity(a); + + res = float_to_u64(f); + break; + } + case OP_ABS: + /* Fall-through: float32_abs does not handle NaN's */ + default: + return ERROR_NOT_HANDLED; + } + res64 = res; + result_is_nan = isnan(*(float *)&res); + } else if (t->prec == PREC_DOUBLE) { + float64 a, b, c; + float64 *in[] = { &a, &b, &c }; + int i; + + g_assert(ops[t->op].n_operands <= ARRAY_SIZE(in)); + for (i = 0; i < ops[t->op].n_operands; i++) { + *in[i] = t->operands[i].val; + } + + switch (t->op) { + case OP_ADD: + res64 = float64_add(a, b, s); + break; + case OP_SUB: + res64 = float64_sub(a, b, s); + break; + case OP_MUL: + res64 = float64_mul(a, b, s); + break; + case OP_MULADD: + res64 = float64_muladd(a, b, c, 0, s); + break; + case OP_DIV: + res64 = float64_div(a, b, s); + break; + case OP_SQRT: + res64 = float64_sqrt(a, s); + break; + case OP_MINNUM: + res64 = float64_minnum(a, b, s); + break; + case 
OP_MAXNUM: + res64 = float64_maxnum(a, b, s); + break; + case OP_MAXNUMMAG: + res64 = float64_maxnummag(a, b, s); + break; + case OP_IS_NAN: + { + double d = !!float64_is_any_nan(a); + + res64 = double_to_u64(d); + break; + } + case OP_IS_INF: + { + double d = !!float64_is_infinity(a); + + res64 = double_to_u64(d); + break; + } + case OP_ABS: + /* Fall-through: float64_abs does not handle NaN's */ + default: + return ERROR_NOT_HANDLED; + } + result_is_nan = isnan(*(double *)&res64); + } else if (t->prec == PREC_FLOAT_TO_DOUBLE) { + float32 a = t->operands[0].val; + + switch (t->op) { + case OP_FLOAT_TO_DOUBLE: + res64 = float32_to_float64(a, s); + break; + default: + return ERROR_NOT_HANDLED; + } + result_is_nan = isnan(*(double *)&res64); + } else { + return ERROR_NOT_HANDLED; /* XXX */ + } + return tester_check(t, res64, result_is_nan, s->float_exception_flags); + return err; +} + +static const struct tester valid_testers[] = { + [0] = { + .name = "soft", + .func = soft_tester, + }, + [1] = { + .name = "host", + .func = host_tester, + }, +}; +static const struct tester *tester = &valid_testers[0]; + +static int ibm_get_exceptions(const char *p, uint8_t *excp) +{ + while (*p) { + switch (*p) { + case 'x': + *excp |= float_flag_inexact; + break; + case 'u': + *excp |= float_flag_underflow; + break; + case 'o': + *excp |= float_flag_overflow; + break; + case 'z': + *excp |= float_flag_divbyzero; + break; + case 'i': + *excp |= float_flag_invalid; + break; + default: + return 1; + } + p++; + } + return 0; +} + +static uint64_t fp_choose(enum precision prec, uint64_t f, uint64_t d) +{ + switch (prec) { + case PREC_FLOAT: + return f; + case PREC_DOUBLE: + return d; + default: + g_assert_not_reached(); + } +} + +static int +ibm_fp_hex(const char *p, enum precision prec, struct operand *ret) +{ + int len; + + ret->type = OP_TYPE_NUMBER; + + /* QNaN */ + if (unlikely(!strcmp("Q", p))) { + ret->val = fp_choose(prec, 0xffc00000, 0xfff8000000000000); + ret->type = 
OP_TYPE_QNAN; + return 0; + } + /* SNaN */ + if (unlikely(!strcmp("S", p))) { + ret->val = fp_choose(prec, 0xffb00000, 0xfff7000000000000); + ret->type = OP_TYPE_SNAN; + return 0; + } + if (unlikely(!strcmp("+Zero", p))) { + ret->val = fp_choose(prec, 0x00000000, 0x0000000000000000); + return 0; + } + if (unlikely(!strcmp("-Zero", p))) { + ret->val = fp_choose(prec, 0x80000000, 0x8000000000000000); + return 0; + } + if (unlikely(!strcmp("+inf", p) || !strcmp("+Inf", p))) { + ret->val = fp_choose(prec, 0x7f800000, 0x7ff0000000000000); + return 0; + } + if (unlikely(!strcmp("-inf", p) || !strcmp("-Inf", p))) { + ret->val = fp_choose(prec, 0xff800000, 0xfff0000000000000); + return 0; + } + + len = strlen(p); + + if (strchr(p, 'P')) { + bool negative = p[0] == '-'; + char *pos; + bool denormal; + + if (len <= 4) { + return 1; + } + denormal = p[1] == '0'; + if (prec == PREC_FLOAT) { + uint32_t exponent; + uint32_t significand; + uint32_t h; + + significand = strtoul(&p[3], &pos, 16); + if (*pos != 'P') { + return 1; + } + pos++; + exponent = strtol(pos, &pos, 10) + 127; + if (pos != p + len) { + return 1; + } + /* + * When there's a leading zero, we have a denormal number. We'd + * expect the input (unbiased) exponent to be -127, but for some + * reason -126 is used. Correct that here. + */ + if (denormal) { + if (exponent != 1) { + return 1; + } + exponent = 0; + } + h = negative ? (1 << 31) : 0; + h |= exponent << 23; + h |= significand; + ret->val = h; + return 0; + } else if (prec == PREC_DOUBLE) { + uint64_t exponent; + uint64_t significand; + uint64_t h; + + significand = strtoul(&p[3], &pos, 16); + if (*pos != 'P') { + return 1; + } + pos++; + exponent = strtol(pos, &pos, 10) + 1023; + if (pos != p + len) { + return 1; + } + if (denormal) { + return 1; /* XXX */ + } + h = negative ? 
(1ULL << 63) : 0; + h |= exponent << 52; + h |= significand; + ret->val = h; + return 0; + } else { /* XXX */ + return 1; + } + } else if (strchr(p, 'e')) { + char *pos; + + if (prec == PREC_FLOAT) { + float f = strtof(p, &pos); + + if (*pos) { + return 1; + } + ret->val = float_to_u64(f); + return 0; + } + if (prec == PREC_DOUBLE) { + double d = strtod(p, &pos); + + if (*pos) { + return 1; + } + ret->val = double_to_u64(d); + return 0; + } + return 0; + } else if (!strcmp(p, "0x0")) { + if (prec == PREC_FLOAT) { + ret->val = float_to_u64(0.0); + } else if (prec == PREC_DOUBLE) { + ret->val = double_to_u64(0.0); + } else { + g_assert_not_reached(); + } + return 0; + } else if (!strcmp(p, "0x1")) { + if (prec == PREC_FLOAT) { + ret->val = float_to_u64(1.0); + } else if (prec == PREC_DOUBLE) { + ret->val = double_to_u64(1.0); + } else { + g_assert_not_reached(); + } + return 0; + } + return 1; +} + +static int find_op(const char *name, enum op *op) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(ops); i++) { + if (strcmp(ops[i].name, name) == 0) { + *op = i; + return 0; + } + } + return 1; +} + +/* Syntax of IBM FP test cases: + * https://www.research.ibm.com/haifa/projects/verification/fpgen/syntax.txt + */ +static enum error ibm_test_line(const char *line) +{ + struct test_op t; + /* at most nine fields; this should be more than enough for each field */ + char s[9][64]; + char *p; + int n, field; + int i; + + /* data lines start with either b32 or d(64|128) */ + if (unlikely(line[0] != 'b' && line[0] != 'd')) { + return ERROR_COMMENT; + } + n = sscanf(line, "%63s %63s %63s %63s %63s %63s %63s %63s %63s", + s[0], s[1], s[2], s[3], s[4], s[5], s[6], s[7], s[8]); + if (unlikely(n < 5 || n > 9)) { + return ERROR_INPUT; + } + + field = 0; + p = s[field]; + if (unlikely(strlen(p) < 4)) { + return ERROR_INPUT; + } + if (strcmp("b32b64cff", p) == 0) { + t.prec = PREC_FLOAT_TO_DOUBLE; + if (find_op(&p[6], &t.op)) { + return ERROR_NOT_HANDLED; + } + } else { + if (strncmp("b32", 
p, 3) == 0) { + t.prec = PREC_FLOAT; + } else if (strncmp("d64", p, 3) == 0) { + t.prec = PREC_DOUBLE; + } else if (strncmp("d128", p, 4) == 0) { + return ERROR_NOT_HANDLED; /* XXX */ + } else { + return ERROR_INPUT; + } + if (find_op(&p[3], &t.op)) { + return ERROR_NOT_HANDLED; + } + } + + field = 1; + p = s[field]; + if (!strncmp("=0", p, 2)) { + t.round = float_round_nearest_even; + } else { + return ERROR_NOT_HANDLED; /* XXX */ + } + + /* The trapped exceptions field is optional */ + t.trapped_exceptions = 0; + field = 2; + p = s[field]; + if (ibm_get_exceptions(p, &t.trapped_exceptions)) { + if (unlikely(n == 9)) { + return ERROR_INPUT; + } + } else { + field++; + } + + for (i = 0; i < ops[t.op].n_operands; i++) { + enum precision prec = t.prec == PREC_FLOAT_TO_DOUBLE ? + PREC_FLOAT : t.prec; + + p = s[field++]; + if (ibm_fp_hex(p, prec, &t.operands[i])) { + return ERROR_INPUT; + } + } + + p = s[field++]; + if (strcmp("->", p)) { + return ERROR_INPUT; + } + + p = s[field++]; + if (unlikely(strcmp("#", p) == 0)) { + t.expected_result_is_valid = false; + } else { + enum precision prec = t.prec == PREC_FLOAT_TO_DOUBLE ? + PREC_DOUBLE : t.prec; + + if (ibm_fp_hex(p, prec, &t.expected_result)) { + return ERROR_INPUT; + } + t.expected_result_is_valid = true; + } + + /* + * A 0 here means "do not check the exceptions", i.e. it does NOT mean + * "there should be no exceptions raised". + */ + t.exceptions = 0; + /* the expected exceptions field is optional */ + if (field == n - 1) { + p = s[field++]; + if (ibm_get_exceptions(p, &t.exceptions)) { + return ERROR_INPUT; + } + } + + /* + * We ignore "trapped exceptions" because we're not testing the trapping + * mechanism of the host CPU. + * We test though that the exception bits are correctly set. 
+ */ + if (t.trapped_exceptions) { + return ERROR_NOT_HANDLED; + } + return tester->func(&t); +} + +static const struct input valid_input_types[] = { + [INPUT_FMT_IBM] = { + .name = "ibm", + .test_line = ibm_test_line, + }, +}; + +static const struct input *input_type = &valid_input_types[INPUT_FMT_IBM]; + +static bool line_is_whitelisted(const char *line) +{ + if (whitelist.ht == NULL) { + return false; + } + return !!g_hash_table_lookup(whitelist.ht, line); +} + +static void test_file(const char *filename) +{ + static char line[256]; + unsigned int i; + FILE *fp; + + fp = fopen(filename, "r"); + if (fp == NULL) { + fprintf(stderr, "cannot open file '%s': %s\n", + filename, strerror(errno)); + exit(EXIT_FAILURE); + } + i = 0; + while (fgets(line, sizeof(line), fp)) { + enum error err; + + i++; + if (unlikely(line_is_whitelisted(line))) { + test_stats[ERROR_WHITELISTED]++; + continue; + } + err = input_type->test_line(line); + if (unlikely(is_err(err))) { + switch (err) { + case ERROR_INPUT: + fprintf(stderr, "error: malformed input @ %s:%d:\n", + filename, i); + break; + case ERROR_RESULT: + fprintf(stderr, "error: result mismatch for input @ %s:%d:\n", + filename, i); + break; + case ERROR_EXCEPTIONS: + fprintf(stderr, "error: flags mismatch for input @ %s:%d:\n", + filename, i); + break; + default: + g_assert_not_reached(); + } + fprintf(stderr, "%s", line); + if (die_on_error) { + exit(EXIT_FAILURE); + } + } + test_stats[err]++; + } + if (fclose(fp)) { + fprintf(stderr, "warning: cannot close file '%s': %s\n", + filename, strerror(errno)); + } +} + +static void set_input_fmt(const char *optarg) +{ + int i; + + for (i = 0; i < ARRAY_SIZE(valid_input_types); i++) { + const struct input *type = &valid_input_types[i]; + + if (strcmp(optarg, type->name) == 0) { + input_type = type; + return; + } + } + fprintf(stderr, "Unknown input format '%s'", optarg); + exit(EXIT_FAILURE); +} + +static void set_tester(const char *optarg) +{ + int i; + + for (i = 0; i < 
ARRAY_SIZE(valid_testers); i++) {
+        const struct tester *t = &valid_testers[i];
+
+        if (strcmp(optarg, t->name) == 0) {
+            tester = t;
+            return;
+        }
+    }
+    fprintf(stderr, "Unknown tester '%s'\n", optarg);
+    exit(EXIT_FAILURE);
+}
+
+static void whitelist_add_line(const char *orig_line)
+{
+    char *line;
+    bool inserted;
+
+    if (whitelist.ht == NULL) {
+        whitelist.ht = g_hash_table_new(g_str_hash, g_str_equal);
+    }
+    line = g_hash_table_lookup(whitelist.ht, orig_line);
+    if (unlikely(line != NULL)) {
+        return;
+    }
+    whitelist.n++;
+    whitelist.lines = g_realloc_n(whitelist.lines, whitelist.n, sizeof(line));
+    line = strdup(orig_line);
+    whitelist.lines[whitelist.n - 1] = line;
+    /* if we pass key == val GLib will not reserve space for the value */
+    inserted = g_hash_table_insert(whitelist.ht, line, line);
+    g_assert(inserted);
+}
+
+static void set_whitelist(const char *filename)
+{
+    FILE *fp;
+    static char line[256];
+
+    fp = fopen(filename, "r");
+    if (fp == NULL) {
+        fprintf(stderr, "warning: cannot open white list file '%s': %s\n",
+                filename, strerror(errno));
+        return;
+    }
+    while (fgets(line, sizeof(line), fp)) {
+        if (isspace(line[0]) || line[0] == '#') {
+            continue;
+        }
+        whitelist_add_line(line);
+    }
+    if (fclose(fp)) {
+        fprintf(stderr, "warning: cannot close file '%s': %s\n",
+                filename, strerror(errno));
+    }
+}
+
+static void set_default_exceptions(const char *str)
+{
+    if (ibm_get_exceptions(str, &default_exceptions)) {
+        fprintf(stderr, "Invalid exception '%s'\n", str);
+        exit(EXIT_FAILURE);
+    }
+}
+
+static void usage_complete(int argc, char *argv[])
+{
+    fprintf(stderr, "Usage: %s [options] file1 [file2 ...]\n", argv[0]);
+    fprintf(stderr, "options:\n");
+    fprintf(stderr, " -a = Perform tininess detection after rounding "
+            "(soft tester only). Default: before\n");
+    fprintf(stderr, " -n = do not die on error. Default: dies on error\n");
+    fprintf(stderr, " -e = default exception flags (xiozu). Default: none\n");
+    fprintf(stderr, " -f = format of the input file(s). Default: %s\n",
+            valid_input_types[0].name);
+    fprintf(stderr, " -t = tester. Default: %s\n", valid_testers[0].name);
+    fprintf(stderr, " -w = path to file with test cases to be whitelisted\n");
+    fprintf(stderr, " -z = flush inputs to zero (soft tester only). "
+            "Default: disabled\n");
+    fprintf(stderr, " -Z = flush output to zero (soft tester only). "
+            "Default: disabled\n");
+}
+
+static void parse_opts(int argc, char *argv[])
+{
+    int c;
+
+    for (;;) {
+        c = getopt(argc, argv, "ae:f:hnt:w:zZ");
+        if (c < 0) {
+            return;
+        }
+        switch (c) {
+        case 'a':
+            soft_status.float_detect_tininess = float_tininess_after_rounding;
+            break;
+        case 'e':
+            set_default_exceptions(optarg);
+            break;
+        case 'f':
+            set_input_fmt(optarg);
+            break;
+        case 'h':
+            usage_complete(argc, argv);
+            exit(EXIT_SUCCESS);
+        case 'n':
+            die_on_error = false;
+            break;
+        case 't':
+            set_tester(optarg);
+            break;
+        case 'w':
+            set_whitelist(optarg);
+            break;
+        case 'z':
+            soft_status.flush_inputs_to_zero = 1;
+            break;
+        case 'Z':
+            soft_status.flush_to_zero = 1;
+            break;
+        }
+    }
+    g_assert_not_reached();
+}
+
+static uint64_t count_errors(void)
+{
+    uint64_t ret = 0;
+    int i;
+
+    for (i = ERROR_INPUT; i < ERROR_MAX; i++) {
+        ret += test_stats[i];
+    }
+    return ret;
+}
+
+int main(int argc, char *argv[])
+{
+    uint64_t n_errors;
+    int i;
+
+    if (argc == 1) {
+        usage_complete(argc, argv);
+        exit(EXIT_FAILURE);
+    }
+    parse_opts(argc, argv);
+    for (i = optind; i < argc; i++) {
+        test_file(argv[i]);
+    }
+
+    n_errors = count_errors();
+    if (n_errors) {
+        printf("Tests failed: %"PRIu64". Parsing: %"PRIu64
+               ", result:%"PRIu64", flags:%"PRIu64"\n",
+               n_errors, test_stats[ERROR_INPUT], test_stats[ERROR_RESULT],
+               test_stats[ERROR_EXCEPTIONS]);
+    } else {
+        printf("All tests OK.\n");
+    }
+    printf("Tests passed: %" PRIu64 ". Not handled: %" PRIu64
+           ", whitelisted: %"PRIu64 "\n",
+           test_stats[ERROR_NONE], test_stats[ERROR_NOT_HANDLED],
+           test_stats[ERROR_WHITELISTED]);
+    return !!n_errors;
+}
diff --git a/tests/Makefile.include b/tests/Makefile.include
index 0b27703..77d7353 100644
--- a/tests/Makefile.include
+++ b/tests/Makefile.include
@@ -642,6 +642,9 @@ tests/qht-bench$(EXESUF): tests/qht-bench.o $(test-util-obj-y)
 tests/test-bufferiszero$(EXESUF): tests/test-bufferiszero.o $(test-util-obj-y)
 tests/atomic_add-bench$(EXESUF): tests/atomic_add-bench.o $(test-util-obj-y)
 
+tests/fp/%:
+	$(MAKE) -C $(dir $@) $(notdir $@)
+
 tests/test-qdev-global-props$(EXESUF): tests/test-qdev-global-props.o \
 	hw/core/qdev.o hw/core/qdev-properties.o hw/core/hotplug.o\
 	hw/core/bus.o \
diff --git a/tests/fp/.gitignore b/tests/fp/.gitignore
new file mode 100644
index 0000000..0a9fef4
--- /dev/null
+++ b/tests/fp/.gitignore
@@ -0,0 +1,3 @@
+ibm
+*.txt
+fp-test
diff --git a/tests/fp/Makefile b/tests/fp/Makefile
new file mode 100644
index 0000000..a208f4c
--- /dev/null
+++ b/tests/fp/Makefile
@@ -0,0 +1,34 @@
+BUILD_DIR=$(CURDIR)/../..
+
+include ../../config-host.mak
+include $(SRC_PATH)/rules.mak
+
+$(call set-vpath, $(SRC_PATH)/tests/fp $(SRC_PATH)/fpu)
+
+QEMU_INCLUDES += -I../..
+QEMU_INCLUDES += -I$(SRC_PATH)/fpu
+# work around TARGET_* poisoning
+QEMU_CFLAGS += -DHW_POISON_H
+
+IBMFP := ibm-fptests.zip
+
+OBJS := fp-test$(EXESUF)
+
+WHITELIST_FILES := whitelist.txt whitelist-tininess-after.txt
+
+all: $(OBJS) ibm $(WHITELIST_FILES)
+
+ibm:
+	wget -nv -O $(IBMFP) http://www.haifa.il.ibm.com/projects/verification/fpgen/download/test_suite.zip
+	mkdir -p $@
+	unzip $(IBMFP) -d $@
+	rm -rf $(IBMFP)
+
+# XXX: upload this to a qemu server, or just commit it.
+$(WHITELIST_FILES):
+	wget -nv -O $@ http://www.cs.columbia.edu/~cota/qemu/fpbench-$@
+
+fp-test$(EXESUF): fp-test.o softfloat.o
+
+clean:
+	rm -f *.o *.d $(OBJS)

From patchwork Wed Apr 4 23:11:02 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 895187
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Wed, 4 Apr 2018 19:11:02 -0400
Message-Id: <1522883475-27858-3-git-send-email-cota@braap.org>
In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org>
References: <1522883475-27858-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v3 02/15] softfloat: fix {min,max}nummag for
 same-abs-value inputs
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier,
 Paolo Bonzini, Alex Bennée, Aurelien Jarno

Before 8936006 ("fpu/softfloat: re-factor minmax", 2018-02-21),
we used to return +Zero for maxnummag(-Zero,+Zero); after that
commit, we return -Zero.

Fix it by making {min,max}nummag consistent with {min,max}num,
deferring to the latter when the absolute value of the operands
is the same.

With this fix we now pass fp-test.

Reviewed-by: Alex Bennée
Signed-off-by: Emilio G.
Cota
---
 fpu/softfloat.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 6e16284..6803279 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1704,7 +1704,6 @@ static FloatParts minmax_floats(FloatParts a, FloatParts b, bool ismin,
         return pick_nan(a, b, s);
     } else {
         int a_exp, b_exp;
-        bool a_sign, b_sign;
 
         switch (a.cls) {
         case float_class_normal:
@@ -1735,20 +1734,22 @@ static FloatParts minmax_floats(FloatParts a, FloatParts b, bool ismin,
             break;
         }
 
-        a_sign = a.sign;
-        b_sign = b.sign;
-        if (ismag) {
-            a_sign = b_sign = 0;
+        if (ismag && (a_exp != b_exp || a.frac != b.frac)) {
+            bool a_less = a_exp < b_exp;
+            if (a_exp == b_exp) {
+                a_less = a.frac < b.frac;
+            }
+            return a_less ^ ismin ? b : a;
         }
 
-        if (a_sign == b_sign) {
+        if (a.sign == b.sign) {
             bool a_less = a_exp < b_exp;
             if (a_exp == b_exp) {
                 a_less = a.frac < b.frac;
             }
-            return a_sign ^ a_less ^ ismin ? b : a;
+            return a.sign ^ a_less ^ ismin ? b : a;
         } else {
-            return a_sign ^ ismin ? b : a;
+            return a.sign ^ ismin ? b : a;
         }
     }
 }

From patchwork Wed Apr 4 23:11:03 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 895193
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Wed, 4 Apr 2018 19:11:03 -0400
Message-Id: <1522883475-27858-4-git-send-email-cota@braap.org>
In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org>
References: <1522883475-27858-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v3 03/15] fp-test: add muladd variants
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier,
 Paolo Bonzini, Alex Bennée, Aurelien Jarno

These are a few muladd-related operations that the original IBM syntax
does not specify; model files for these are in muladd.fptest.

Signed-off-by: Emilio G.
Cota
---
 tests/fp/fp-test.c     | 24 ++++++++++++++++++++++++
 tests/fp/muladd.fptest | 51 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 75 insertions(+)
 create mode 100644 tests/fp/muladd.fptest

diff --git a/tests/fp/fp-test.c b/tests/fp/fp-test.c
index 27637c4..2200d40 100644
--- a/tests/fp/fp-test.c
+++ b/tests/fp/fp-test.c
@@ -53,6 +53,9 @@ enum op {
     OP_SUB,
     OP_MUL,
     OP_MULADD,
+    OP_MULADD_NEG_ADDEND,
+    OP_MULADD_NEG_PRODUCT,
+    OP_MULADD_NEG_RESULT,
     OP_DIV,
     OP_SQRT,
     OP_MINNUM,
@@ -69,6 +72,9 @@ static const struct op_desc ops[] = {
     [OP_SUB] = { "-", 2 },
     [OP_MUL] = { "*", 2 },
     [OP_MULADD] = { "*+", 3 },
+    [OP_MULADD_NEG_ADDEND] = { "*+nc", 3 },
+    [OP_MULADD_NEG_PRODUCT] = { "*+np", 3 },
+    [OP_MULADD_NEG_RESULT] = { "*+nr", 3 },
     [OP_DIV] = { "/", 2 },
     [OP_SQRT] = { "V", 1 },
     [OP_MINNUM] = { " Q i
+b32*+nc =0 -1.7FFFFFP127 -Inf +Inf -> Q i
+b32*+nc =0 -1.6C9AE7P113 -Inf +Inf -> Q i
+b32*+nc =0 -1.000000P-126 -Inf +Inf -> Q i
+b32*+nc =0 -0.7FFFFFP-126 -Inf +Inf -> Q i
+b32*+nc =0 -0.1B977AP-126 -Inf +Inf -> Q i
+b32*+nc =0 -0.000001P-126 -Inf +Inf -> Q i
+b32*+nc =0 -1.000000P0 -Inf +Inf -> Q i
+b32*+nc =0 -Zero -Inf +Inf -> Q i
+b32*+nc =0 +Zero -Inf +Inf -> Q i
+b32*+nc =0 -Zero -1.000000P-126 +1.7FFFFFP127 -> -1.7FFFFFP127
+b32*+nc =0 +Zero -1.000000P-126 +1.7FFFFFP127 -> -1.7FFFFFP127
+b32*+nc =0 -1.000000P-126 -1.7FFFFFP127 -1.4B9156P109 -> +1.4B9156P109 x
+b32*+nc =0 -0.7FFFFFP-126 -1.7FFFFFP127 -1.51BA59P-113 -> +1.7FFFFDP1 x
+b32*+nc =0 -0.3D6B57P-126 -1.7FFFFFP127 -1.265398P-67 -> +1.75AD5BP0 x
+b32*+nc =0 -0.000001P-126 -1.7FFFFFP127 -1.677330P-113 -> +1.7FFFFFP-22 x
+
+# np == negate product
+b32*+np =0 +Inf -Inf -Inf -> Q i
+b32*+np =0 +1.7FFFFFP127 -Inf -Inf -> Q i
+b32*+np =0 +1.6C9AE7P113 -Inf -Inf -> Q i
+b32*+np =0 +1.000000P-126 -Inf -Inf -> Q i
+b32*+np =0 +0.7FFFFFP-126 -Inf -Inf -> Q i
+b32*+np =0 +0.1B977AP-126 -Inf -Inf -> Q i
+b32*+np =0 +0.000001P-126 -Inf -Inf -> Q i
+b32*+np =0 +1.000000P0 -Inf -Inf -> Q i
+b32*+np =0 +Zero -Inf -Inf -> Q i
+b32*+np =0 +Zero -Inf -Inf -> Q i
+b32*+np =0 -Zero -1.000000P-126 -1.7FFFFFP127 -> -1.7FFFFFP127
+b32*+np =0 +Zero -1.000000P-126 -1.7FFFFFP127 -> -1.7FFFFFP127
+b32*+np =0 -1.3A6A89P-18 +1.24E7AEP9 -0.7FFFFFP-126 -> +1.7029E9P-9 x
+
+# nr == negate result
+b32*+nr =0 -Inf -Inf -Inf -> Q i
+b32*+nr =0 -1.7FFFFFP127 -Inf -Inf -> Q i
+b32*+nr =0 -1.6C9AE7P113 -Inf -Inf -> Q i
+b32*+nr =0 -1.000000P-126 -Inf -Inf -> Q i
+b32*+nr =0 -0.7FFFFFP-126 -Inf -Inf -> Q i
+b32*+nr =0 -0.1B977AP-126 -Inf -Inf -> Q i
+b32*+nr =0 -0.000001P-126 -Inf -Inf -> Q i
+b32*+nr =0 -1.000000P0 -Inf -Inf -> Q i
+b32*+nr =0 -Zero -Inf -Inf -> Q i
+b32*+nr =0 -Zero -Inf -Inf -> Q i
+b32*+nr =0 +Zero -1.000000P-126 -1.7FFFFFP127 -> +1.7FFFFFP127
+b32*+nr =0 -Zero -1.000000P-126 -1.7FFFFFP127 -> +1.7FFFFFP127
+b32*+nr =0 -1.000000P-126 -1.7FFFFFP127 -1.4B9156P109 -> +1.4B9156P109 x
+b32*+nr =0 -0.7FFFFFP-126 -1.7FFFFFP127 -1.51BA59P-113 -> -1.7FFFFDP1 x
+b32*+nr =0 -0.3D6B57P-126 -1.7FFFFFP127 -1.265398P-67 -> -1.75AD5BP0 x
+b32*+nr =0 -0.000001P-126 -1.7FFFFFP127 -1.677330P-113 -> -1.7FFFFFP-22 x
+b32*+nr =0 +1.72E53AP-33 -1.7FFFFFP127 -1.5AA684P-2 -> +1.72E539P95 x

From patchwork Wed Apr 4 23:11:04 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 895192
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Wed, 4 Apr 2018 19:11:04 -0400
Message-Id: <1522883475-27858-5-git-send-email-cota@braap.org>
In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org>
References: <1522883475-27858-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v3 04/15] softfloat: add
 float{32,64}_is_{de,}normal
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier,
 Bastian Koppelmann, Paolo Bonzini, Alex Bennée, Aurelien Jarno

This paves the way for upcoming work.

Cc: Bastian Koppelmann
Reviewed-by: Alex Bennée
Signed-off-by: Emilio G.
Cota
Reviewed-by: Bastian Koppelmann
---
 include/fpu/softfloat.h | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h
index 36626a5..a8512fb 100644
--- a/include/fpu/softfloat.h
+++ b/include/fpu/softfloat.h
@@ -412,6 +412,16 @@ static inline int float32_is_zero_or_denormal(float32 a)
     return (float32_val(a) & 0x7f800000) == 0;
 }
 
+static inline bool float32_is_normal(float32 a)
+{
+    return ((float32_val(a) + 0x00800000) & 0x7fffffff) >= 0x01000000;
+}
+
+static inline bool float32_is_denormal(float32 a)
+{
+    return float32_is_zero_or_denormal(a) && !float32_is_zero(a);
+}
+
 static inline float32 float32_set_sign(float32 a, int sign)
 {
     return make_float32((float32_val(a) & 0x7fffffff) | (sign << 31));
@@ -541,6 +551,16 @@ static inline int float64_is_zero_or_denormal(float64 a)
     return (float64_val(a) & 0x7ff0000000000000LL) == 0;
 }
 
+static inline bool float64_is_normal(float64 a)
+{
+    return ((float64_val(a) + (1ULL << 52)) & -1ULL >> 1) >= 1ULL << 53;
+}
+
+static inline bool float64_is_denormal(float64 a)
+{
+    return float64_is_zero_or_denormal(a) && !float64_is_zero(a);
+}
+
 static inline float64 float64_set_sign(float64 a, int sign)
 {
     return make_float64((float64_val(a) & 0x7fffffffffffffffULL)

From patchwork Wed Apr 4 23:11:05 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 895196
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Wed, 4 Apr 2018 19:11:05 -0400
Message-Id: <1522883475-27858-6-git-send-email-cota@braap.org>
In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org>
References: <1522883475-27858-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v3 05/15] target/tricore: use float32_is_denormal
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier,
 Bastian Koppelmann, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Cc: Bastian Koppelmann
Signed-off-by: Emilio G.
Cota
Reviewed-by: Bastian Koppelmann
---
 target/tricore/fpu_helper.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/target/tricore/fpu_helper.c b/target/tricore/fpu_helper.c
index df16290..31df462 100644
--- a/target/tricore/fpu_helper.c
+++ b/target/tricore/fpu_helper.c
@@ -44,11 +44,6 @@ static inline uint8_t f_get_excp_flags(CPUTriCoreState *env)
            | float_flag_inexact);
 }
 
-static inline bool f_is_denormal(float32 arg)
-{
-    return float32_is_zero_or_denormal(arg) && !float32_is_zero(arg);
-}
-
 static inline float32 f_maddsub_nan_result(float32 arg1, float32 arg2,
                                            float32 arg3, float32 result,
                                            uint32_t muladd_negate_c)
@@ -260,8 +255,8 @@ uint32_t helper_fcmp(CPUTriCoreState *env, uint32_t r1, uint32_t r2)
     set_flush_inputs_to_zero(0, &env->fp_status);
 
     result = 1 << (float32_compare_quiet(arg1, arg2, &env->fp_status) + 1);
-    result |= f_is_denormal(arg1) << 4;
-    result |= f_is_denormal(arg2) << 5;
+    result |= float32_is_denormal(arg1) << 4;
+    result |= float32_is_denormal(arg2) << 5;
 
     flags = f_get_excp_flags(env);
     if (flags) {

From patchwork Wed Apr 4 23:11:06 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 895200
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Wed, 4 Apr 2018 19:11:06 -0400
Message-Id: <1522883475-27858-7-git-send-email-cota@braap.org>
In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org>
References: <1522883475-27858-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v3 06/15] tests/fp: add fp-bench, a collection
 of simple floating point microbenchmarks
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier,
 Paolo Bonzini, Alex Bennée, Aurelien Jarno

This will allow us to measure the performance impact of FP
emulation optimizations. Note that we can measure either the
direct impact on the softfloat functions (with "-t soft") or
the impact on an emulated workload (call with "-t host" and
run under qemu user-mode).

Signed-off-by: Emilio G.
Cota --- tests/fp/fp-bench.c | 528 ++++++++++++++++++++++++++++++++++++++++++++++++++++ tests/fp/.gitignore | 1 + tests/fp/Makefile | 4 +- 3 files changed, 532 insertions(+), 1 deletion(-) create mode 100644 tests/fp/fp-bench.c diff --git a/tests/fp/fp-bench.c b/tests/fp/fp-bench.c new file mode 100644 index 0000000..a012b78 --- /dev/null +++ b/tests/fp/fp-bench.c @@ -0,0 +1,528 @@ +/* + * fp-bench.c - A collection of simple floating point microbenchmarks. + * + * Copyright (C) 2018, Emilio G. Cota + * + * License: GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ +#ifndef HW_POISON_H +#error Must define HW_POISON_H to work around TARGET_* poisoning +#endif + +#include "qemu/osdep.h" +#include "qemu/timer.h" + +#include "fpu/softfloat.h" + +#include + +/* amortize the computation of random inputs */ +#define OPS_PER_ITER 50000 + +#define MAX_OPERANDS 3 + +#define SEED_A 0xdeadfacedeadface +#define SEED_B 0xbadc0feebadc0fee +#define SEED_C 0xbeefdeadbeefdead + +enum op { + OP_ADD, + OP_SUB, + OP_MUL, + OP_DIV, + OP_FMA, + OP_SQRT, + OP_CMP, + OP_MAX_NR, +}; + +static const char * const op_names[] = { + [OP_ADD] = "add", + [OP_SUB] = "sub", + [OP_MUL] = "mul", + [OP_DIV] = "div", + [OP_FMA] = "fma", + [OP_SQRT] = "sqrt", + [OP_CMP] = "cmp", + [OP_MAX_NR] = NULL, +}; + +enum precision { + PREC_SINGLE, + PREC_DOUBLE, + PREC_FLOAT32, + PREC_FLOAT64, + PREC_MAX_NR, +}; + +enum tester { + TESTER_SOFT, + TESTER_HOST, + TESTER_MAX_NR, +}; + +static const char * const tester_names[] = { + [TESTER_SOFT] = "soft", + [TESTER_HOST] = "host", + [TESTER_MAX_NR] = NULL, +}; + +union fp { + float f; + double d; + float32 f32; + float64 f64; + uint64_t u64; +}; + +struct op_state; + +typedef float (*float_func_t)(const struct op_state *s); +typedef double (*double_func_t)(const struct op_state *s); + +union fp_func { + float_func_t float_func; + double_func_t double_func; +}; + +typedef void (*bench_func_t)(void); + +struct op_desc { + const char 
* const name; +}; + +#define DEFAULT_DURATION_SECS 1 + +static uint64_t random_ops[MAX_OPERANDS] = { + SEED_A, SEED_B, SEED_C, +}; +static float_status soft_status; +static enum precision precision; +static enum op operation; +static enum tester tester; +static uint64_t n_completed_ops; +static unsigned int duration = DEFAULT_DURATION_SECS; +static int64_t ns_elapsed; +/* disable optimizations with volatile */ +static volatile union fp res; + +/* + * From: https://en.wikipedia.org/wiki/Xorshift + * This is faster than rand_r(), and gives us a wider range (RAND_MAX is only + * guaranteed to be >= 32767). + */ +static uint64_t xorshift64star(uint64_t x) +{ + x ^= x >> 12; /* a */ + x ^= x << 25; /* b */ + x ^= x >> 27; /* c */ + return x * UINT64_C(2685821657736338717); +} + +static void update_random_ops(int n_ops, enum precision prec) +{ + int i; + + for (i = 0; i < n_ops; i++) { + uint64_t r = random_ops[i]; + + if (prec == PREC_SINGLE || prec == PREC_FLOAT32) { + do { + r = xorshift64star(r); + } while (!float32_is_normal(r)); + } else if (prec == PREC_DOUBLE || prec == PREC_FLOAT64) { + do { + r = xorshift64star(r); + } while (!float64_is_normal(r)); + } else { + g_assert_not_reached(); + } + random_ops[i] = r; + } +} + +static void fill_random(union fp *ops, int n_ops, enum precision prec, + bool no_neg) +{ + int i; + + for (i = 0; i < n_ops; i++) { + switch (prec) { + case PREC_SINGLE: + case PREC_FLOAT32: + ops[i].f32 = make_float32(random_ops[i]); + if (no_neg && float32_is_neg(ops[i].f32)) { + ops[i].f32 = float32_chs(ops[i].f32); + } + /* raise the exponent to limit the frequency of denormal results */ + ops[i].f32 |= 0x40000000; + break; + case PREC_DOUBLE: + case PREC_FLOAT64: + ops[i].f64 = make_float64(random_ops[i]); + if (no_neg && float64_is_neg(ops[i].f64)) { + ops[i].f64 = float64_chs(ops[i].f64); + } + /* raise the exponent to limit the frequency of denormal results */ + ops[i].f64 |= LIT64(0x4000000000000000); + break; + default: + g_assert_not_reached(); + }
+ } +} + +/* + * The main benchmark function. Instead of (ab)using macros, we rely + * on the compiler to unfold this at compile-time. + */ +static void bench(enum precision prec, enum op op, int n_ops, bool no_neg) +{ + int64_t tf = get_clock_realtime() + duration * 1000000000LL; + + while (get_clock_realtime() < tf) { + union fp ops[MAX_OPERANDS]; + int64_t t0; + int i; + + update_random_ops(n_ops, prec); + switch (prec) { + case PREC_SINGLE: + fill_random(ops, n_ops, prec, no_neg); + t0 = get_clock_realtime(); + for (i = 0; i < OPS_PER_ITER; i++) { + float a = ops[0].f; + float b = ops[1].f; + float c = ops[2].f; + + switch (op) { + case OP_ADD: + res.f = a + b; + break; + case OP_SUB: + res.f = a - b; + break; + case OP_MUL: + res.f = a * b; + break; + case OP_DIV: + res.f = a / b; + break; + case OP_FMA: + res.f = fmaf(a, b, c); + break; + case OP_SQRT: + res.f = sqrtf(a); + break; + case OP_CMP: + res.u64 = isgreater(a, b); + break; + default: + g_assert_not_reached(); + } + } + break; + case PREC_DOUBLE: + fill_random(ops, n_ops, prec, no_neg); + t0 = get_clock_realtime(); + for (i = 0; i < OPS_PER_ITER; i++) { + double a = ops[0].d; + double b = ops[1].d; + double c = ops[2].d; + + switch (op) { + case OP_ADD: + res.d = a + b; + break; + case OP_SUB: + res.d = a - b; + break; + case OP_MUL: + res.d = a * b; + break; + case OP_DIV: + res.d = a / b; + break; + case OP_FMA: + res.d = fma(a, b, c); + break; + case OP_SQRT: + res.d = sqrt(a); + break; + case OP_CMP: + res.u64 = isgreater(a, b); + break; + default: + g_assert_not_reached(); + } + } + break; + case PREC_FLOAT32: + fill_random(ops, n_ops, prec, no_neg); + t0 = get_clock_realtime(); + for (i = 0; i < OPS_PER_ITER; i++) { + float32 a = ops[0].f32; + float32 b = ops[1].f32; + float32 c = ops[2].f32; + + switch (op) { + case OP_ADD: + res.f32 = float32_add(a, b, &soft_status); + break; + case OP_SUB: + res.f32 = float32_sub(a, b, &soft_status); + break; + case OP_MUL: + res.f32 = float32_mul(a, b,
&soft_status); + break; + case OP_DIV: + res.f32 = float32_div(a, b, &soft_status); + break; + case OP_FMA: + res.f32 = float32_muladd(a, b, c, 0, &soft_status); + break; + case OP_SQRT: + res.f32 = float32_sqrt(a, &soft_status); + break; + case OP_CMP: + res.u64 = float32_compare_quiet(a, b, &soft_status); + break; + default: + g_assert_not_reached(); + } + } + break; + case PREC_FLOAT64: + fill_random(ops, n_ops, prec, no_neg); + t0 = get_clock_realtime(); + for (i = 0; i < OPS_PER_ITER; i++) { + float64 a = ops[0].f64; + float64 b = ops[1].f64; + float64 c = ops[2].f64; + + switch (op) { + case OP_ADD: + res.f64 = float64_add(a, b, &soft_status); + break; + case OP_SUB: + res.f64 = float64_sub(a, b, &soft_status); + break; + case OP_MUL: + res.f64 = float64_mul(a, b, &soft_status); + break; + case OP_DIV: + res.f64 = float64_div(a, b, &soft_status); + break; + case OP_FMA: + res.f64 = float64_muladd(a, b, c, 0, &soft_status); + break; + case OP_SQRT: + res.f64 = float64_sqrt(a, &soft_status); + break; + case OP_CMP: + res.u64 = float64_compare_quiet(a, b, &soft_status); + break; + default: + g_assert_not_reached(); + } + } + break; + default: + g_assert_not_reached(); + } + ns_elapsed += get_clock_realtime() - t0; + n_completed_ops += OPS_PER_ITER; + } +} + +#define GEN_BENCH(name, type, prec, op, n_ops) \ + static void __attribute__((flatten)) name(void) \ + { \ + bench(prec, op, n_ops, false); \ + } + +#define GEN_BENCH_NO_NEG(name, type, prec, op, n_ops) \ + static void __attribute__((flatten)) name(void) \ + { \ + bench(prec, op, n_ops, true); \ + } + +#define GEN_BENCH_ALL_TYPES(opname, op, n_ops) \ + GEN_BENCH(bench_ ## opname ## _float, float, PREC_SINGLE, op, n_ops) \ + GEN_BENCH(bench_ ## opname ## _double, double, PREC_DOUBLE, op, n_ops) \ + GEN_BENCH(bench_ ## opname ## _float32, float32, PREC_FLOAT32, op, n_ops) \ + GEN_BENCH(bench_ ## opname ## _float64, float64, PREC_FLOAT64, op, n_ops) + +GEN_BENCH_ALL_TYPES(add, OP_ADD, 2) +GEN_BENCH_ALL_TYPES(sub,
OP_SUB, 2) +GEN_BENCH_ALL_TYPES(mul, OP_MUL, 2) +GEN_BENCH_ALL_TYPES(div, OP_DIV, 2) +GEN_BENCH_ALL_TYPES(fma, OP_FMA, 3) +GEN_BENCH_ALL_TYPES(cmp, OP_CMP, 2) +#undef GEN_BENCH_ALL_TYPES + +#define GEN_BENCH_ALL_TYPES_NO_NEG(name, op, n) \ + GEN_BENCH_NO_NEG(bench_ ## name ## _float, float, PREC_SINGLE, op, n) \ + GEN_BENCH_NO_NEG(bench_ ## name ## _double, double, PREC_DOUBLE, op, n) \ + GEN_BENCH_NO_NEG(bench_ ## name ## _float32, float32, PREC_FLOAT32, op, n) \ + GEN_BENCH_NO_NEG(bench_ ## name ## _float64, float64, PREC_FLOAT64, op, n) + +GEN_BENCH_ALL_TYPES_NO_NEG(sqrt, OP_SQRT, 1) +#undef GEN_BENCH_ALL_TYPES_NO_NEG + +#undef GEN_BENCH_NO_NEG +#undef GEN_BENCH + +#define GEN_BENCH_FUNCS(opname, op) \ + [op] = { \ + [PREC_SINGLE] = bench_ ## opname ## _float, \ + [PREC_DOUBLE] = bench_ ## opname ## _double, \ + [PREC_FLOAT32] = bench_ ## opname ## _float32, \ + [PREC_FLOAT64] = bench_ ## opname ## _float64, \ + } + +static const bench_func_t bench_funcs[OP_MAX_NR][PREC_MAX_NR] = { + GEN_BENCH_FUNCS(add, OP_ADD), + GEN_BENCH_FUNCS(sub, OP_SUB), + GEN_BENCH_FUNCS(mul, OP_MUL), + GEN_BENCH_FUNCS(div, OP_DIV), + GEN_BENCH_FUNCS(fma, OP_FMA), + GEN_BENCH_FUNCS(sqrt, OP_SQRT), + GEN_BENCH_FUNCS(cmp, OP_CMP), +}; + +#undef GEN_BENCH_FUNCS + +static void run_bench(void) +{ + bench_func_t f; + + f = bench_funcs[operation][precision]; + g_assert(f); + f(); +} + +/* @arr must be NULL-terminated */ +static int find_name(const char * const *arr, const char *name) +{ + int i; + + for (i = 0; arr[i] != NULL; i++) { + if (strcmp(name, arr[i]) == 0) { + return i; + } + } + return -1; +} + +static void usage_complete(int argc, char *argv[]) +{ + gchar *op_list = g_strjoinv(", ", (gchar **)op_names); + gchar *tester_list = g_strjoinv(", ", (gchar **)tester_names); + + fprintf(stderr, "Usage: %s [options]\n", argv[0]); + fprintf(stderr, "options:\n"); + fprintf(stderr, " -d = duration, in seconds. 
Default: %d\n", + DEFAULT_DURATION_SECS); + fprintf(stderr, " -h = show this help message.\n"); + fprintf(stderr, " -o = floating point operation (%s). Default: %s\n", + op_list, op_names[0]); + fprintf(stderr, " -p = floating point precision (single, double). " + "Default: single\n"); + fprintf(stderr, " -t = tester (%s). Default: %s\n", + tester_list, tester_names[0]); + fprintf(stderr, " -z = flush inputs to zero (soft tester only). " + "Default: disabled\n"); + fprintf(stderr, " -Z = flush output to zero (soft tester only). " + "Default: disabled\n"); + + g_free(tester_list); + g_free(op_list); +} + +static void parse_args(int argc, char *argv[]) +{ + int c; + int val; + + for (;;) { + c = getopt(argc, argv, "d:ho:p:t:zZ"); + if (c < 0) { + break; + } + switch (c) { + case 'd': + duration = atoi(optarg); + break; + case 'h': + usage_complete(argc, argv); + exit(EXIT_SUCCESS); + case 'o': + val = find_name(op_names, optarg); + if (val < 0) { + fprintf(stderr, "Unsupported op '%s'\n", optarg); + exit(EXIT_FAILURE); + } + operation = val; + break; + case 'p': + if (!strcmp(optarg, "single")) { + precision = PREC_SINGLE; + } else if (!strcmp(optarg, "double")) { + precision = PREC_DOUBLE; + } else { + fprintf(stderr, "Unsupported precision '%s'\n", optarg); + exit(EXIT_FAILURE); + } + break; + case 't': + val = find_name(tester_names, optarg); + if (val < 0) { + fprintf(stderr, "Unsupported tester '%s'\n", optarg); + exit(EXIT_FAILURE); + } + tester = val; + break; + case 'z': + soft_status.flush_inputs_to_zero = 1; + break; + case 'Z': + soft_status.flush_to_zero = 1; + break; + } + } + + /* set precision based on the tester */ + switch (tester) { + case TESTER_HOST: + break; + case TESTER_SOFT: + switch (precision) { + case PREC_SINGLE: + precision = PREC_FLOAT32; + break; + case PREC_DOUBLE: + precision = PREC_FLOAT64; + break; + default: + g_assert_not_reached(); + } + break; + default: + g_assert_not_reached(); + } +} + +static void pr_stats(void) +{ + 
printf("%.2f MFlops\n", (double)n_completed_ops / ns_elapsed * 1e3); +} + +int main(int argc, char *argv[]) +{ + parse_args(argc, argv); + run_bench(); + pr_stats(); + return 0; +} diff --git a/tests/fp/.gitignore b/tests/fp/.gitignore index 0a9fef4..a4e59d7 100644 --- a/tests/fp/.gitignore +++ b/tests/fp/.gitignore @@ -1,3 +1,4 @@ ibm *.txt fp-test +fp-bench diff --git a/tests/fp/Makefile b/tests/fp/Makefile index a208f4c..7c88ab0 100644 --- a/tests/fp/Makefile +++ b/tests/fp/Makefile @@ -12,7 +12,7 @@ QEMU_CFLAGS += -DHW_POISON_H IBMFP := ibm-fptests.zip -OBJS := fp-test$(EXESUF) +OBJS := fp-test$(EXESUF) fp-bench$(EXESUF) WHITELIST_FILES := whitelist.txt whitelist-tininess-after.txt @@ -30,5 +30,7 @@ $(WHITELIST_FILES): fp-test$(EXESUF): fp-test.o softfloat.o +fp-bench$(EXESUF): fp-bench.o softfloat.o + clean: rm -f *.o *.d $(OBJS)
From patchwork Wed Apr 4 23:11:07 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emilio Cota X-Patchwork-Id: 895195 From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Wed, 4 Apr 2018 19:11:07 -0400 Message-Id: <1522883475-27858-8-git-send-email-cota@braap.org> In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org> References: <1522883475-27858-1-git-send-email-cota@braap.org> Subject: [Qemu-devel] [PATCH v3 07/15] softfloat: rename canonicalize to sf_canonicalize Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Bastian Koppelmann, Paolo Bonzini, Alex Bennée, Aurelien Jarno
glibc >= 2.25 defines canonicalize in commit eaf5ad0 (Add canonicalize, canonicalizef, canonicalizel., 2016-10-26). Given that we'll be including soon, prepare for this by prefixing our canonicalize() with sf_ to avoid clashing with the libc's canonicalize(). Reported-by: Bastian Koppelmann Cc: Bastian Koppelmann Signed-off-by: Emilio G. Cota Tested-by: Bastian Koppelmann --- fpu/softfloat.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 6803279..c3b9d07 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -323,8 +323,8 @@ static inline float64 float64_pack_raw(FloatParts p) } /* Canonicalize EXP and FRAC, setting CLS.
*/ -static FloatParts canonicalize(FloatParts part, const FloatFmt *parm, - float_status *status) +static FloatParts sf_canonicalize(FloatParts part, const FloatFmt *parm, + float_status *status) { if (part.exp == parm->exp_max) { if (part.frac == 0) { @@ -494,7 +494,7 @@ static FloatParts round_canonical(FloatParts p, float_status *s, static FloatParts float16_unpack_canonical(float16 f, float_status *s) { - return canonicalize(float16_unpack_raw(f), &float16_params, s); + return sf_canonicalize(float16_unpack_raw(f), &float16_params, s); } static float16 float16_round_pack_canonical(FloatParts p, float_status *s) @@ -512,7 +512,7 @@ static float16 float16_round_pack_canonical(FloatParts p, float_status *s) static FloatParts float32_unpack_canonical(float32 f, float_status *s) { - return canonicalize(float32_unpack_raw(f), &float32_params, s); + return sf_canonicalize(float32_unpack_raw(f), &float32_params, s); } static float32 float32_round_pack_canonical(FloatParts p, float_status *s) @@ -530,7 +530,7 @@ static float32 float32_round_pack_canonical(FloatParts p, float_status *s) static FloatParts float64_unpack_canonical(float64 f, float_status *s) { - return canonicalize(float64_unpack_raw(f), &float64_params, s); + return sf_canonicalize(float64_unpack_raw(f), &float64_params, s); } static float64 float64_round_pack_canonical(FloatParts p, float_status *s)
From patchwork Wed Apr 4 23:11:08 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emilio Cota X-Patchwork-Id: 895204 From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Wed, 4 Apr 2018 19:11:08 -0400 Message-Id: <1522883475-27858-9-git-send-email-cota@braap.org> In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org> References: <1522883475-27858-1-git-send-email-cota@braap.org> Subject: [Qemu-devel] [PATCH v3 08/15] softfloat: add float{32, 64}_is_zero_or_normal Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno
These will gain some users very soon. Signed-off-by: Emilio G.
Cota --- include/fpu/softfloat.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/include/fpu/softfloat.h b/include/fpu/softfloat.h index a8512fb..66985e1 100644 --- a/include/fpu/softfloat.h +++ b/include/fpu/softfloat.h @@ -422,6 +422,11 @@ static inline bool float32_is_denormal(float32 a) return float32_is_zero_or_denormal(a) && !float32_is_zero(a); } +static inline bool float32_is_zero_or_normal(float32 a) +{ + return float32_is_normal(a) || float32_is_zero(a); +} + static inline float32 float32_set_sign(float32 a, int sign) { return make_float32((float32_val(a) & 0x7fffffff) | (sign << 31)); @@ -561,6 +566,11 @@ static inline bool float64_is_denormal(float64 a) return float64_is_zero_or_denormal(a) && !float64_is_zero(a); } +static inline bool float64_is_zero_or_normal(float64 a) +{ + return float64_is_normal(a) || float64_is_zero(a); +} + static inline float64 float64_set_sign(float64 a, int sign) { return make_float64((float64_val(a) & 0x7fffffffffffffffULL)
From patchwork Wed Apr 4 23:11:09 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emilio Cota X-Patchwork-Id: 895205 From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Wed, 4 Apr 2018 19:11:09 -0400 Message-Id: <1522883475-27858-10-git-send-email-cota@braap.org> In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org> References: <1522883475-27858-1-git-send-email-cota@braap.org> Subject: [Qemu-devel] [PATCH v3 09/15] fpu: introduce hardfloat Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno
The appended patch paves the way for leveraging the host FPU for a subset of guest FP operations. For most guest workloads (e.g. FP flags aren't ever cleared, inexact occurs often and rounding is set to the default [to nearest]) this will yield sizable performance speedups. The approach followed here avoids checking the FP exception flags register. See the added comment for details. This assumes that QEMU is running on an IEEE754-compliant FPU and that the rounding is set to the default (to nearest).
The implementation-dependent specifics of the FPU should not matter; things like tininess detection and snan representation are still dealt with in soft-fp. However, this approach will break on most hosts if we compile QEMU with flags such as -ffast-math. We control the flags so this should be easy to enforce though. This patch just adds common code. Some operations will be migrated to hardfloat in subsequent patches to ease bisection. Note: some architectures (at least PPC, there might be others) clear the status flags passed to softfloat before most FP operations. This precludes the use of hardfloat, so to avoid introducing a performance regression for those targets, we add a flag to disable hardfloat. In the long run though it would be good to fix the targets so that at least the inexact flag passed to softfloat is indeed sticky. Signed-off-by: Emilio G. Cota --- fpu/softfloat.c | 342 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 342 insertions(+) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index c3b9d07..956b938 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -82,6 +82,8 @@ this code that are retained. /* softfloat (and in particular the code in softfloat-specialize.h) is * target-dependent and needs the TARGET_* macros. */ +#include + #include "qemu/osdep.h" #include "qemu/bitops.h" #include "fpu/softfloat.h" @@ -105,6 +107,346 @@ this code that are retained. *----------------------------------------------------------------------------*/ #include "softfloat-specialize.h" +/* + * Hardfloat + * + * Fast emulation of guest FP instructions is challenging for two reasons. + * First, FP instruction semantics are similar but not identical, particularly + * when handling NaNs. 
Second, emulating at reasonable speed the guest FP + * exception flags is not trivial: reading the host's flags register with a + * feclearexcept & fetestexcept pair is slow [slightly slower than soft-fp], + * and trapping on every FP exception is not fast nor pleasant to work with. + * + * We address these challenges by leveraging the host FPU for a subset of the + * operations. To do this we follow the main idea presented in this paper: + * + * Guo, Yu-Chuan, et al. "Translating the ARM Neon and VFP instructions in a + * binary translator." Software: Practice and Experience 46.12 (2016):1591-1615. + * + * The idea is thus to leverage the host FPU to (1) compute FP operations + * and (2) identify whether FP exceptions occurred while avoiding + * expensive exception flag register accesses. + * + * An important optimization shown in the paper is that given that exception + * flags are rarely cleared by the guest, we can avoid recomputing some flags. + * This is particularly useful for the inexact flag, which is very frequently + * raised in floating-point workloads. + * + * We optimize the code further by deferring to soft-fp whenever FP exception + * detection might get hairy. Two examples: (1) when at least one operand is + * denormal/inf/NaN; (2) when operands are not guaranteed to lead to a 0 result + * and the result is < the minimum normal.
+ */
+#define GEN_TYPE_CONV(name, to_t, from_t) \
+    static inline to_t name(from_t a) \
+    { \
+        to_t r = *(to_t *)&a; \
+        return r; \
+    }
+
+GEN_TYPE_CONV(float32_to_float, float, float32)
+GEN_TYPE_CONV(float64_to_double, double, float64)
+GEN_TYPE_CONV(float_to_float32, float32, float)
+GEN_TYPE_CONV(double_to_float64, float64, double)
+#undef GEN_TYPE_CONV
+
+#define GEN_INPUT_FLUSH__NOCHECK(name, soft_t) \
+    static inline void name(soft_t *a, float_status *s) \
+    { \
+        if (unlikely(soft_t ## _is_denormal(*a))) { \
+            *a = soft_t ## _set_sign(soft_t ## _zero, \
+                                     soft_t ## _is_neg(*a)); \
+            s->float_exception_flags |= float_flag_input_denormal; \
+        } \
+    }
+
+GEN_INPUT_FLUSH__NOCHECK(float32_input_flush__nocheck, float32)
+GEN_INPUT_FLUSH__NOCHECK(float64_input_flush__nocheck, float64)
+#undef GEN_INPUT_FLUSH__NOCHECK
+
+#define GEN_INPUT_FLUSH1(name, soft_t) \
+    static inline void name(soft_t *a, float_status *s) \
+    { \
+        if (likely(!s->flush_inputs_to_zero)) { \
+            return; \
+        } \
+        soft_t ## _input_flush__nocheck(a, s); \
+    }
+
+GEN_INPUT_FLUSH1(float32_input_flush1, float32)
+GEN_INPUT_FLUSH1(float64_input_flush1, float64)
+#undef GEN_INPUT_FLUSH1
+
+#define GEN_INPUT_FLUSH2(name, soft_t) \
+    static inline void name(soft_t *a, soft_t *b, float_status *s) \
+    { \
+        if (likely(!s->flush_inputs_to_zero)) { \
+            return; \
+        } \
+        soft_t ## _input_flush__nocheck(a, s); \
+        soft_t ## _input_flush__nocheck(b, s); \
+    }
+
+GEN_INPUT_FLUSH2(float32_input_flush2, float32)
+GEN_INPUT_FLUSH2(float64_input_flush2, float64)
+#undef GEN_INPUT_FLUSH2
+
+#define GEN_INPUT_FLUSH3(name, soft_t) \
+    static inline void name(soft_t *a, soft_t *b, soft_t *c, float_status *s) \
+    { \
+        if (likely(!s->flush_inputs_to_zero)) { \
+            return; \
+        } \
+        soft_t ## _input_flush__nocheck(a, s); \
+        soft_t ## _input_flush__nocheck(b, s); \
+        soft_t ## _input_flush__nocheck(c, s); \
+    }
+
+GEN_INPUT_FLUSH3(float32_input_flush3, float32)
+GEN_INPUT_FLUSH3(float64_input_flush3, float64)
+#undef GEN_INPUT_FLUSH3
+
+static inline bool can_use_fpu(const float_status *s)
+{
+    return likely(s->float_exception_flags & float_flag_inexact &&
+                  s->float_rounding_mode == float_round_nearest_even);
+}
+
+/*
+ * Choose whether to use fpclassify or float32/64_* primitives in the generated
+ * hardfloat functions. Each combination of number of inputs and float size
+ * gets its own value.
+ */
+#if defined(__x86_64__)
+# define QEMU_HARDFLOAT_1F32_USE_FP 0
+# define QEMU_HARDFLOAT_1F64_USE_FP 0
+# define QEMU_HARDFLOAT_2F32_USE_FP 0
+# define QEMU_HARDFLOAT_2F64_USE_FP 1
+# define QEMU_HARDFLOAT_3F32_USE_FP 0
+# define QEMU_HARDFLOAT_3F64_USE_FP 1
+#else
+# define QEMU_HARDFLOAT_1F32_USE_FP 0
+# define QEMU_HARDFLOAT_1F64_USE_FP 0
+# define QEMU_HARDFLOAT_2F32_USE_FP 0
+# define QEMU_HARDFLOAT_2F64_USE_FP 0
+# define QEMU_HARDFLOAT_3F32_USE_FP 0
+# define QEMU_HARDFLOAT_3F64_USE_FP 0
+#endif
+
+/*
+ * QEMU_HARDFLOAT_USE_ISINF chooses whether to use isinf() over
+ * float{32,64}_is_infinity when !USE_FP.
+ * On x86_64/aarch64, using the former over the latter can yield a ~6% speedup.
+ * On power64 however, using isinf() reduces fp-bench performance by up to 50%.
+ */
+#if defined(__x86_64__) || defined(__aarch64__)
+# define QEMU_HARDFLOAT_USE_ISINF 1
+#else
+# define QEMU_HARDFLOAT_USE_ISINF 0
+#endif
+
+/*
+ * Some targets clear the FP flags before most FP operations. This prevents
+ * the use of hardfloat, since hardfloat relies on the inexact flag being
+ * already set.
+ */
+#if defined(TARGET_PPC)
+# define QEMU_NO_HARDFLOAT 1
+# define QEMU_SOFTFLOAT_ATTR __attribute__((flatten))
+#else
+# define QEMU_NO_HARDFLOAT 0
+# define QEMU_SOFTFLOAT_ATTR __attribute__((noinline))
+#endif
+
+/*
+ * Hardfloat generation functions. Each operation can have two flavors:
+ * either using softfloat primitives (e.g. float32_is_zero_or_normal) for
+ * most condition checks, or native ones (e.g. fpclassify).
+ *
+ * The flavor is chosen by the callers.
+ * Instead of using macros, we rely on the compiler to propagate constants
+ * and inline everything into the callers.
+ *
+ * We only generate functions for operations with two inputs, since only
+ * these are common enough to justify consolidating them into common code.
+ */
+typedef bool (*f32_check_func_t)(float32 a, float32 b, const float_status *s);
+typedef bool (*f64_check_func_t)(float64 a, float64 b, const float_status *s);
+typedef bool (*float_check_func_t)(float a, float b, const float_status *s);
+typedef bool (*double_check_func_t)(double a, double b, const float_status *s);
+
+typedef float32 (*f32_op2_func_t)(float32 a, float32 b, float_status *s);
+typedef float64 (*f64_op2_func_t)(float64 a, float64 b, float_status *s);
+typedef float (*float_op2_func_t)(float a, float b);
+typedef double (*double_op2_func_t)(double a, double b);
+
+/* 2-input is-zero-or-normal */
+static inline bool
+f32_is_zon2(float32 a, float32 b, const struct float_status *s)
+{
+    return likely(float32_is_zero_or_normal(a) &&
+                  float32_is_zero_or_normal(b) &&
+                  can_use_fpu(s));
+}
+
+static inline bool
+float_is_zon2(float a, float b, const struct float_status *s)
+{
+    return likely((fpclassify(a) == FP_NORMAL || fpclassify(a) == FP_ZERO) &&
+                  (fpclassify(b) == FP_NORMAL || fpclassify(b) == FP_ZERO) &&
+                  can_use_fpu(s));
+}
+
+static inline bool
+f64_is_zon2(float64 a, float64 b, const struct float_status *s)
+{
+    return likely(float64_is_zero_or_normal(a) &&
+                  float64_is_zero_or_normal(b) &&
+                  can_use_fpu(s));
+}
+
+static inline bool
+double_is_zon2(double a, double b, const struct float_status *s)
+{
+    return likely((fpclassify(a) == FP_NORMAL || fpclassify(a) == FP_ZERO) &&
+                  (fpclassify(b) == FP_NORMAL || fpclassify(b) == FP_ZERO) &&
+                  can_use_fpu(s));
+}
+
+/*
+ * Note: @fast and @post can be NULL.
+ * Note: @fast and @fast_op always use softfloat types.
+ */
+static inline float32
+f32_gen2(float32 a, float32 b, float_status *s, float_op2_func_t hard,
+         f32_op2_func_t soft, f32_check_func_t pre, f32_check_func_t post,
+         f32_check_func_t fast, f32_op2_func_t fast_op)
+{
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float32_input_flush2(&a, &b, s);
+    if (likely(pre(a, b, s))) {
+        if (fast != NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            float ha = float32_to_float(a);
+            float hb = float32_to_float(b);
+            float hr = hard(ha, hb);
+            float32 r = float_to_float32(hr);
+
+            if (unlikely(QEMU_HARDFLOAT_USE_ISINF ?
+                         isinf(hr) : float32_is_infinity(r))) {
+                s->float_exception_flags |= float_flag_overflow;
+            } else if (unlikely(fabsf(hr) <= FLT_MIN &&
+                                (post == NULL || post(a, b, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
+static inline float32
+float_gen2(float32 a, float32 b, float_status *s, float_op2_func_t hard,
+           f32_op2_func_t soft, float_check_func_t pre, float_check_func_t post,
+           f32_check_func_t fast, f32_op2_func_t fast_op)
+{
+    float ha, hb;
+
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float32_input_flush2(&a, &b, s);
+    ha = float32_to_float(a);
+    hb = float32_to_float(b);
+    if (likely(pre(ha, hb, s))) {
+        if (fast != NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            float hr = hard(ha, hb);
+            float32 r = float_to_float32(hr);
+
+            if (unlikely(isinf(hr))) {
+                s->float_exception_flags |= float_flag_overflow;
+            } else if (unlikely(fabsf(hr) <= FLT_MIN &&
+                                (post == NULL || post(ha, hb, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
+static inline float64
+f64_gen2(float64 a, float64 b, float_status *s, double_op2_func_t hard,
+         f64_op2_func_t soft, f64_check_func_t pre, f64_check_func_t post,
+         f64_check_func_t fast, f64_op2_func_t fast_op)
+{
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float64_input_flush2(&a, &b, s);
+    if (likely(pre(a, b, s))) {
+        if (fast != NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            double ha = float64_to_double(a);
+            double hb = float64_to_double(b);
+            double hr = hard(ha, hb);
+            float64 r = double_to_float64(hr);
+
+            if (unlikely(QEMU_HARDFLOAT_USE_ISINF ?
+                         isinf(hr) : float64_is_infinity(r))) {
+                s->float_exception_flags |= float_flag_overflow;
+            } else if (unlikely(fabs(hr) <= DBL_MIN &&
+                                (post == NULL || post(a, b, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
+static inline float64
+double_gen2(float64 a, float64 b, float_status *s, double_op2_func_t hard,
+            f64_op2_func_t soft, double_check_func_t pre,
+            double_check_func_t post, f64_check_func_t fast,
+            f64_op2_func_t fast_op)
+{
+    double ha, hb;
+
+    if (QEMU_NO_HARDFLOAT) {
+        goto soft;
+    }
+    float64_input_flush2(&a, &b, s);
+    ha = float64_to_double(a);
+    hb = float64_to_double(b);
+    if (likely(pre(ha, hb, s))) {
+        if (fast != NULL && fast(a, b, s)) {
+            return fast_op(a, b, s);
+        } else {
+            double hr = hard(ha, hb);
+            float64 r = double_to_float64(hr);
+
+            if (unlikely(isinf(hr))) {
+                s->float_exception_flags |= float_flag_overflow;
+            } else if (unlikely(fabs(hr) <= DBL_MIN &&
+                                (post == NULL || post(ha, hb, s)))) {
+                goto soft;
+            }
+            return r;
+        }
+    }
+ soft:
+    return soft(a, b, s);
+}
+
 /*----------------------------------------------------------------------------
 | Returns the fraction bits of the half-precision floating-point value `a'.
 *----------------------------------------------------------------------------*/

From patchwork Wed Apr 4 23:11:10 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 895190
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Wed, 4 Apr 2018 19:11:10 -0400
Message-Id: <1522883475-27858-11-git-send-email-cota@braap.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org>
References: <1522883475-27858-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v3 10/15] hardfloat: support float32/64 addition and subtraction
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Performance results (single and double precision) for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
add-single: 135.07 MFlops
add-double: 131.60 MFlops
sub-single: 130.04 MFlops
sub-double: 133.01 MFlops
- after:
add-single: 443.04 MFlops
add-double: 301.95 MFlops
sub-single: 411.36 MFlops
sub-double: 293.15 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
add-single: 44.79 MFlops
add-double: 49.20 MFlops
sub-single: 44.55 MFlops
sub-double: 49.06 MFlops
- after:
add-single: 93.28 MFlops
add-double: 88.27 MFlops
sub-single: 91.47 MFlops
sub-double: 88.27 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
add-single: 72.59 MFlops
add-double: 72.27 MFlops
sub-single: 75.33 MFlops
sub-double: 70.54 MFlops
- after:
add-single: 112.95 MFlops
add-double: 201.11 MFlops
sub-single: 116.80 MFlops
sub-double: 188.72 MFlops

Note that the IBM and ARM machines benefit from having
HARDFLOAT_2F{32,64}_USE_FP set to 0.
Otherwise their performance can suffer significantly:
- IBM Power8:
add-single: [1] 54.94 vs [0] 116.37 MFlops
add-double: [1] 58.92 vs [0] 201.44 MFlops
- Aarch64 A57:
add-single: [1] 80.72 vs [0] 93.24 MFlops
add-double: [1] 82.10 vs [0] 88.18 MFlops

On the Intel machine, having 2F64 set to 1 pays off, but it doesn't
for 2F32:
- Intel i7-6700K:
add-single: [1] 285.79 vs [0] 426.70 MFlops
add-double: [1] 302.15 vs [0] 278.82 MFlops

Signed-off-by: Emilio G. Cota
---
 fpu/softfloat.c | 106 +++++++++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 98 insertions(+), 8 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 956b938..ca0b8ab 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1080,8 +1080,8 @@ float16 __attribute__((flatten)) float16_add(float16 a, float16 b,
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 __attribute__((flatten)) float32_add(float32 a, float32 b,
-                                             float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_add(float32 a, float32 b, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pb = float32_unpack_canonical(b, status);
@@ -1090,8 +1090,8 @@ float32 __attribute__((flatten)) float32_add(float32 a, float32 b,
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 __attribute__((flatten)) float64_add(float64 a, float64 b,
-                                             float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_add(float64 a, float64 b, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pb = float64_unpack_canonical(b, status);
@@ -1110,8 +1110,8 @@ float16 __attribute__((flatten)) float16_sub(float16 a, float16 b,
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 __attribute__((flatten)) float32_sub(float32 a, float32 b,
-                                             float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_sub(float32 a, float32 b, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pb = float32_unpack_canonical(b, status);
@@ -1120,8 +1120,8 @@ float32 __attribute__((flatten)) float32_sub(float32 a, float32 b,
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 __attribute__((flatten)) float64_sub(float64 a, float64 b,
-                                             float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_sub(float64 a, float64 b, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pb = float64_unpack_canonical(b, status);
@@ -1130,6 +1130,96 @@ float64 __attribute__((flatten)) float64_sub(float64 a, float64 b,
     return float64_round_pack_canonical(pr, status);
 }
 
+static float float_add(float a, float b)
+{
+    return a + b;
+}
+
+static float float_sub(float a, float b)
+{
+    return a - b;
+}
+
+static double double_add(double a, double b)
+{
+    return a + b;
+}
+
+static double double_sub(double a, double b)
+{
+    return a - b;
+}
+
+static bool f32_addsub_post(float32 a, float32 b, const struct float_status *s)
+{
+    return !(float32_is_zero(a) && float32_is_zero(b));
+}
+
+static bool
+float_addsub_post(float a, float b, const struct float_status *s)
+{
+    return !(fpclassify(a) == FP_ZERO && fpclassify(b) == FP_ZERO);
+}
+
+static bool f64_addsub_post(float64 a, float64 b, const struct float_status *s)
+{
+    return !(float64_is_zero(a) && float64_is_zero(b));
+}
+
+static bool
+double_addsub_post(double a, double b, const struct float_status *s)
+{
+    return !(fpclassify(a) == FP_ZERO && fpclassify(b) == FP_ZERO);
+}
+
+static float32 float32_addsub(float32 a, float32 b, float_status *s,
+                              float_op2_func_t hard, f32_op2_func_t soft)
+{
+    if (QEMU_HARDFLOAT_2F32_USE_FP) {
+        return float_gen2(a, b, s, hard, soft, float_is_zon2, float_addsub_post,
+                          NULL, NULL);
+    } else {
+        return f32_gen2(a, b, s, hard, soft, f32_is_zon2, f32_addsub_post,
+                        NULL, NULL);
+    }
+}
+
+static float64 float64_addsub(float64 a, float64 b, float_status *s,
+                              double_op2_func_t hard, f64_op2_func_t soft)
+{
+    if (QEMU_HARDFLOAT_2F64_USE_FP) {
+        return
+            double_gen2(a, b, s, hard, soft, double_is_zon2,
+                        double_addsub_post, NULL, NULL);
+    } else {
+        return f64_gen2(a, b, s, hard, soft, f64_is_zon2, f64_addsub_post,
+                        NULL, NULL);
+    }
+}
+
+float32 __attribute__((flatten))
+float32_add(float32 a, float32 b, float_status *s)
+{
+    return float32_addsub(a, b, s, float_add, soft_float32_add);
+}
+
+float32 __attribute__((flatten))
+float32_sub(float32 a, float32 b, float_status *s)
+{
+    return float32_addsub(a, b, s, float_sub, soft_float32_sub);
+}
+
+float64 __attribute__((flatten))
+float64_add(float64 a, float64 b, float_status *s)
+{
+    return float64_addsub(a, b, s, double_add, soft_float64_add);
+}
+
+float64 __attribute__((flatten))
+float64_sub(float64 a, float64 b, float_status *s)
+{
+    return float64_addsub(a, b, s, double_sub, soft_float64_sub);
+}
+
 /*
  * Returns the result of multiplying the floating-point values `a' and
  * `b'. The operation is performed according to the IEC/IEEE Standard

From patchwork Wed Apr 4 23:11:11 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 895199
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Wed, 4 Apr 2018 19:11:11 -0400
Message-Id: <1522883475-27858-12-git-send-email-cota@braap.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org>
References: <1522883475-27858-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v3 11/15] hardfloat: support float32/64 multiplication
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
mul-single: 126.91 MFlops
mul-double: 118.28 MFlops
- after:
mul-single: 258.02 MFlops
mul-double: 197.96 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
mul-single: 37.42 MFlops
mul-double: 38.77 MFlops
- after:
mul-single: 73.41 MFlops
mul-double: 76.93 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
mul-single: 58.40 MFlops
mul-double: 59.33 MFlops
- after:
mul-single: 60.25 MFlops
mul-double: 94.79 MFlops

Signed-off-by: Emilio G. Cota
---
 fpu/softfloat.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 62 insertions(+), 4 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index ca0b8ab..2c68b9d 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1281,8 +1281,8 @@ float16 __attribute__((flatten)) float16_mul(float16 a, float16 b,
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 __attribute__((flatten)) float32_mul(float32 a, float32 b,
-                                             float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_mul(float32 a, float32 b, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pb = float32_unpack_canonical(b, status);
@@ -1291,8 +1291,8 @@ float32 __attribute__((flatten)) float32_mul(float32 a, float32 b,
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 __attribute__((flatten)) float64_mul(float64 a, float64 b,
-                                             float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_mul(float64 a, float64 b, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pb = float64_unpack_canonical(b, status);
@@ -1301,6 +1301,64 @@ float64 __attribute__((flatten)) float64_mul(float64 a, float64 b,
     return float64_round_pack_canonical(pr, status);
 }
 
+static float float_mul(float a, float b)
+{
+    return a * b;
+}
+
+static double double_mul(double a, double b)
+{
+    return a * b;
+}
+
+static bool f32_mul_fast(float32 a, float32 b, const struct float_status *s)
+{
+    return float32_is_zero(a) || float32_is_zero(b);
+}
+
+static bool f64_mul_fast(float64 a, float64 b, const struct float_status *s)
+{
+    return float64_is_zero(a) || float64_is_zero(b);
+}
+
+static float32 f32_mul_fast_op(float32 a, float32 b, float_status *s)
+{
+    bool signbit = float32_is_neg(a) ^ float32_is_neg(b);
+
+    return float32_set_sign(float32_zero, signbit);
+}
+
+static float64 f64_mul_fast_op(float64 a, float64 b, float_status *s)
+{
+    bool signbit = float64_is_neg(a) ^
+                   float64_is_neg(b);
+
+    return float64_set_sign(float64_zero, signbit);
+}
+
+float32 __attribute__((flatten))
+float32_mul(float32 a, float32 b, float_status *s)
+{
+    if (QEMU_HARDFLOAT_2F32_USE_FP) {
+        return float_gen2(a, b, s, float_mul, soft_float32_mul, float_is_zon2,
+                          NULL, f32_mul_fast, f32_mul_fast_op);
+    } else {
+        return f32_gen2(a, b, s, float_mul, soft_float32_mul, f32_is_zon2, NULL,
+                        f32_mul_fast, f32_mul_fast_op);
+    }
+}
+
+float64 __attribute__((flatten))
+float64_mul(float64 a, float64 b, float_status *s)
+{
+    if (QEMU_HARDFLOAT_2F64_USE_FP) {
+        return double_gen2(a, b, s, double_mul, soft_float64_mul,
+                           double_is_zon2, NULL, f64_mul_fast, f64_mul_fast_op);
+    } else {
+        return f64_gen2(a, b, s, double_mul, soft_float64_mul, f64_is_zon2,
+                        NULL, f64_mul_fast, f64_mul_fast_op);
+    }
+}
+
 /*
  * Returns the result of multiplying the floating-point values `a' and
  * `b' then adding 'c', with no intermediate rounding step after the

From patchwork Wed Apr 4 23:11:12 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Emilio Cota
X-Patchwork-Id: 895202
From: "Emilio G. Cota"
To: qemu-devel@nongnu.org
Date: Wed, 4 Apr 2018 19:11:12 -0400
Message-Id: <1522883475-27858-13-git-send-email-cota@braap.org>
X-Mailer: git-send-email 2.7.4
In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org>
References: <1522883475-27858-1-git-send-email-cota@braap.org>
Subject: [Qemu-devel] [PATCH v3 12/15] hardfloat: support float32/64 division
Cc: Peter Maydell, Mark Cave-Ayland, Richard Henderson, Laurent Vivier, Paolo Bonzini, Alex Bennée, Aurelien Jarno

Performance results for fp-bench:

1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz
- before:
div-single: 34.84 MFlops
div-double: 34.04 MFlops
- after:
div-single: 275.23 MFlops
div-double: 216.38 MFlops

2. ARM Aarch64 A57 @ 2.4GHz
- before:
div-single: 9.33 MFlops
div-double: 9.30 MFlops
- after:
div-single: 51.55 MFlops
div-double: 15.09 MFlops

3. IBM POWER8E @ 2.1 GHz
- before:
div-single: 25.65 MFlops
div-double: 24.91 MFlops
- after:
div-single: 96.83 MFlops
div-double: 31.01 MFlops

Here setting 2F64_USE_FP to 1 pays off for x86_64:
[1] 215.97 vs [0] 62.15 MFlops

Signed-off-by: Emilio G. Cota
---
 fpu/softfloat.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 86 insertions(+), 2 deletions(-)

diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 2c68b9d..4323dc2 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -1666,7 +1666,8 @@ float16 float16_div(float16 a, float16 b, float_status *status)
     return float16_round_pack_canonical(pr, status);
 }
 
-float32 float32_div(float32 a, float32 b, float_status *status)
+static float32 QEMU_SOFTFLOAT_ATTR
+soft_float32_div(float32 a, float32 b, float_status *status)
 {
     FloatParts pa = float32_unpack_canonical(a, status);
     FloatParts pb = float32_unpack_canonical(b, status);
@@ -1675,7 +1676,8 @@ float32 float32_div(float32 a, float32 b, float_status *status)
     return float32_round_pack_canonical(pr, status);
 }
 
-float64 float64_div(float64 a, float64 b, float_status *status)
+static float64 QEMU_SOFTFLOAT_ATTR
+soft_float64_div(float64 a, float64 b, float_status *status)
 {
     FloatParts pa = float64_unpack_canonical(a, status);
     FloatParts pb = float64_unpack_canonical(b, status);
@@ -1684,6 +1686,88 @@ float64 float64_div(float64 a, float64 b, float_status *status)
     return float64_round_pack_canonical(pr, status);
 }
 
+static float float_div(float a, float b)
+{
+    return a / b;
+}
+
+static double double_div(double a, double b)
+{
+    return a / b;
+}
+
+static bool f32_div_pre(float32 a, float32 b, const struct float_status *s)
+{
+    return likely(float32_is_zero_or_normal(a) &&
+                  float32_is_normal(b) &&
+                  can_use_fpu(s));
+}
+
+static bool f64_div_pre(float64 a, float64 b, const struct float_status *s)
+{
+    return likely(float64_is_zero_or_normal(a) &&
+                  float64_is_normal(b) &&
+                  can_use_fpu(s));
+}
+
+static bool float_div_pre(float a, float b, const struct float_status *s)
+{
+    return likely((fpclassify(a) == FP_NORMAL || fpclassify(a) == FP_ZERO) &&
+                  fpclassify(b) == FP_NORMAL &&
+                  can_use_fpu(s));
+}
+
+static bool double_div_pre(double a, double b, const struct float_status *s)
+{
return likely((fpclassify(a) == FP_NORMAL || fpclassify(a) == FP_ZERO) && + fpclassify(b) == FP_NORMAL && + can_use_fpu(s)); +} + +static bool f32_div_post(float32 a, float32 b, const struct float_status *s) +{ + return !float32_is_zero(a); +} + +static bool f64_div_post(float64 a, float64 b, const struct float_status *s) +{ + return !float64_is_zero(a); +} + +static bool float_div_post(float a, float b, const struct float_status *s) +{ + return fpclassify(a) != FP_ZERO; +} + +static bool double_div_post(double a, double b, const struct float_status *s) +{ + return fpclassify(a) != FP_ZERO; +} + +float32 __attribute__((flatten)) +float32_div(float32 a, float32 b, float_status *s) +{ + if (QEMU_HARDFLOAT_2F32_USE_FP) { + return float_gen2(a, b, s, float_div, soft_float32_div, float_div_pre, + float_div_post, NULL, NULL); + } else { + return f32_gen2(a, b, s, float_div, soft_float32_div, f32_div_pre, + f32_div_post, NULL, NULL); + } +} + +float64 __attribute__((flatten)) +float64_div(float64 a, float64 b, float_status *s) +{ + if (QEMU_HARDFLOAT_2F64_USE_FP) { + return double_gen2(a, b, s, double_div, soft_float64_div, + double_div_pre, double_div_post, NULL, NULL); + } else { + return f64_gen2(a, b, s, double_div, soft_float64_div, f64_div_pre, + f64_div_post, NULL, NULL); + } +} + /* * Rounds the floating-point value `a' to an integer, and returns the * result as a floating-point value. 
The operation is performed From patchwork Wed Apr 4 23:11:13 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emilio Cota X-Patchwork-Id: 895189 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=braap.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=braap.org header.i=@braap.org header.b="HCbSMbXe"; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.b="RZkMChsR"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40GhZr1ldSz9rxs for ; Thu, 5 Apr 2018 09:13:08 +1000 (AEST) Received: from localhost ([::1]:37776 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f3raU-0004TT-6G for incoming@patchwork.ozlabs.org; Wed, 04 Apr 2018 19:13:06 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:54580) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f3rZh-0004Qm-Nu for qemu-devel@nongnu.org; Wed, 04 Apr 2018 19:12:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1f3rZg-00048M-8e for qemu-devel@nongnu.org; Wed, 04 Apr 2018 19:12:17 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:44687) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1f3rZg-00047g-3T for 
qemu-devel@nongnu.org; Wed, 04 Apr 2018 19:12:16 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id C587D21B6D; Wed, 4 Apr 2018 19:11:18 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Wed, 04 Apr 2018 19:11:18 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=7cOVQIGpNqSEIO h+M1rVhj0yoehwA4kwYTXYU8bkJUs=; b=HCbSMbXeDeZj06b+gWXTdmNka5QW0S XFiRiiQybshKuOssF+riELwOKMrXF0ABQ3TByCPX9SqFIlpSBqMj6qIc5Jtn/s9d 5o+mvpc0d46drRHxbxFhpD4lMNSyHQVvLpd9HfheUsWrYN8u10E1njq73aAH96DY cv1bOIAL08Zss= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=7cOVQIGpNqSEIOh+M1rVhj0yoehwA4kwYTXYU8bkJUs=; b=RZkMChsR AZWsaw+Zxgm0V/vxMiQ8wrwK3jUtsuWYYBnc+3exkWrYVn1FPKbakXeEVwmjz1DM 2w5drGW85zptM99lg2CUbycgxALXF8gtkaB+5aM7lBpfVyVHmhAAT/ej9iLLlfi9 urLxazR07z+wC1oPRbhlz5I8pq8zGC++ABFprswT/a4jPOZAiNENYO9a2D87P7v+ UZ3M/stpQzP3zHSUQt/seRY/4vb7FDjVbfKnb+uNln1d8u0SbCrkz2Q9Bd2sCEJG yvXrjMYpR2FY1cgLurnSQMYPtgCP+bvo4gFh6tDqgH2LPB5AMol9hqNUPrOK8QCZ MnZ+OEuOdhFxJA== X-ME-Sender: Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id 81439E43C8; Wed, 4 Apr 2018 19:11:18 -0400 (EDT) From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Wed, 4 Apr 2018 19:11:13 -0400 Message-Id: <1522883475-27858-14-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org> References: <1522883475-27858-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.29 Subject: [Qemu-devel] [PATCH v3 13/15] hardfloat: support float32/64 fused multiply-add X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?utf-8?q?Alex_Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: fma-single: 74.73 MFlops fma-double: 74.54 MFlops - after: fma-single: 203.37 MFlops fma-double: 169.37 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: fma-single: 23.24 MFlops fma-double: 23.70 MFlops - after: fma-single: 66.14 MFlops fma-double: 63.10 MFlops 3. IBM POWER8E @ 2.1 GHz - before: fma-single: 37.26 MFlops fma-double: 37.29 MFlops - after: fma-single: 48.90 MFlops fma-double: 59.51 MFlops Here having QEMU_HARDFLOAT_3F64_USE_FP set to 1 pays off for x86_64: [1] 170.15 vs [0] 153.12 MFlops Signed-off-by: Emilio G. 
Cota --- fpu/softfloat.c | 169 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 165 insertions(+), 4 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 4323dc2..ce14c87 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -1574,8 +1574,9 @@ float16 __attribute__((flatten)) float16_muladd(float16 a, float16 b, float16 c, return float16_round_pack_canonical(pr, status); } -float32 __attribute__((flatten)) float32_muladd(float32 a, float32 b, float32 c, - int flags, float_status *status) +static float32 QEMU_SOFTFLOAT_ATTR +soft_float32_muladd(float32 a, float32 b, float32 c, int flags, + float_status *status) { FloatParts pa = float32_unpack_canonical(a, status); FloatParts pb = float32_unpack_canonical(b, status); @@ -1585,8 +1586,9 @@ float32 __attribute__((flatten)) float32_muladd(float32 a, float32 b, float32 c, return float32_round_pack_canonical(pr, status); } -float64 __attribute__((flatten)) float64_muladd(float64 a, float64 b, float64 c, - int flags, float_status *status) +static float64 QEMU_SOFTFLOAT_ATTR +soft_float64_muladd(float64 a, float64 b, float64 c, int flags, + float_status *status) { FloatParts pa = float64_unpack_canonical(a, status); FloatParts pb = float64_unpack_canonical(b, status); @@ -1597,6 +1599,165 @@ float64 __attribute__((flatten)) float64_muladd(float64 a, float64 b, float64 c, } /* + * FMA generator for softfloat-based condition checks. + * + * When (a || b) == 0, there's no need to check for under/over flow, + * since we know the addend is (normal || 0) and the product is 0. 
+ */ +#define GEN_FMA_SF(name, soft_t, host_t, host_fma_f, host_abs_f, min_normal) \ + static soft_t \ + name(soft_t a, soft_t b, soft_t c, int flags, float_status *s) \ + { \ + if (QEMU_NO_HARDFLOAT) { \ + goto soft; \ + } \ + soft_t ## _input_flush3(&a, &b, &c, s); \ + if (likely((soft_t ## _is_normal(a) || soft_t ## _is_zero(a)) && \ + (soft_t ## _is_normal(b) || soft_t ## _is_zero(b)) && \ + (soft_t ## _is_normal(c) || soft_t ## _is_zero(c)) && \ + !(flags & float_muladd_halve_result) && \ + can_use_fpu(s))) { \ + if (soft_t ## _is_zero(a) || soft_t ## _is_zero(b)) { \ + soft_t p, r; \ + host_t hp, hc, hr; \ + bool prod_sign; \ + \ + prod_sign = soft_t ## _is_neg(a) ^ soft_t ## _is_neg(b); \ + prod_sign ^= !!(flags & float_muladd_negate_product); \ + p = soft_t ## _set_sign(soft_t ## _zero, prod_sign); \ + \ + if (flags & float_muladd_negate_c) { \ + c = soft_t ## _chs(c); \ + } \ + \ + hp = soft_t ## _to_ ## host_t(p); \ + hc = soft_t ## _to_ ## host_t(c); \ + hr = hp + hc; \ + r = host_t ## _to_ ## soft_t(hr); \ + return flags & float_muladd_negate_result ? \ + soft_t ## _chs(r) : r; \ + } else { \ + host_t ha, hb, hc, hr; \ + soft_t r; \ + soft_t sa = flags & float_muladd_negate_product ? \ + soft_t ## _chs(a) : a; \ + soft_t sc = flags & float_muladd_negate_c ? \ + soft_t ## _chs(c) : c; \ + \ + ha = soft_t ## _to_ ## host_t(sa); \ + hb = soft_t ## _to_ ## host_t(b); \ + hc = soft_t ## _to_ ## host_t(sc); \ + hr = host_fma_f(ha, hb, hc); \ + r = host_t ## _to_ ## soft_t(hr); \ + \ + if (unlikely(isinf(hr))) { \ + s->float_exception_flags |= float_flag_overflow; \ + } else if (unlikely(host_abs_f(hr) <= min_normal)) { \ + goto soft; \ + } \ + return flags & float_muladd_negate_result ? 
\ + soft_t ## _chs(r) : r; \ + } \ + } \ + soft: \ + return soft_ ## soft_t ## _muladd(a, b, c, flags, s); \ + } + +/* FMA generator for native floating point condition checks */ +#define GEN_FMA_FP(name, soft_t, host_t, host_fma_f, host_abs_f, min_normal) \ + static soft_t \ + name(soft_t a, soft_t b, soft_t c, int flags, float_status *s) \ + { \ + host_t ha, hb, hc; \ + \ + if (QEMU_NO_HARDFLOAT) { \ + goto soft; \ + } \ + soft_t ## _input_flush3(&a, &b, &c, s); \ + ha = soft_t ## _to_ ## host_t(a); \ + hb = soft_t ## _to_ ## host_t(b); \ + hc = soft_t ## _to_ ## host_t(c); \ + if (likely((fpclassify(ha) == FP_NORMAL || \ + fpclassify(ha) == FP_ZERO) && \ + (fpclassify(hb) == FP_NORMAL || \ + fpclassify(hb) == FP_ZERO) && \ + (fpclassify(hc) == FP_NORMAL || \ + fpclassify(hc) == FP_ZERO) && \ + !(flags & float_muladd_halve_result) && \ + can_use_fpu(s))) { \ + if (soft_t ## _is_zero(a) || soft_t ## _is_zero(b)) { \ + soft_t p, r; \ + host_t hp, hc, hr; \ + bool prod_sign; \ + \ + prod_sign = soft_t ## _is_neg(a) ^ soft_t ## _is_neg(b); \ + prod_sign ^= !!(flags & float_muladd_negate_product); \ + p = soft_t ## _set_sign(soft_t ## _zero, prod_sign); \ + \ + if (flags & float_muladd_negate_c) { \ + c = soft_t ## _chs(c); \ + } \ + \ + hp = soft_t ## _to_ ## host_t(p); \ + hc = soft_t ## _to_ ## host_t(c); \ + hr = hp + hc; \ + r = host_t ## _to_ ## soft_t(hr); \ + return flags & float_muladd_negate_result ? 
\ + soft_t ## _chs(r) : r; \ + } else { \ + host_t hr; \ + \ + if (flags & float_muladd_negate_product) { \ + ha = -ha; \ + } \ + if (flags & float_muladd_negate_c) { \ + hc = -hc; \ + } \ + hr = host_fma_f(ha, hb, hc); \ + if (unlikely(isinf(hr))) { \ + s->float_exception_flags |= float_flag_overflow; \ + } else if (unlikely(host_abs_f(hr) <= min_normal)) { \ + goto soft; \ + } \ + if (flags & float_muladd_negate_result) { \ + hr = -hr; \ + } \ + return host_t ## _to_ ## soft_t(hr); \ + } \ + } \ + soft: \ + return soft_ ## soft_t ## _muladd(a, b, c, flags, s); \ + } + +GEN_FMA_SF(f32_muladd, float32, float, fmaf, fabsf, FLT_MIN) +GEN_FMA_SF(f64_muladd, float64, double, fma, fabs, DBL_MIN) +#undef GEN_FMA_SF + +GEN_FMA_FP(float_muladd, float32, float, fmaf, fabsf, FLT_MIN) +GEN_FMA_FP(double_muladd, float64, double, fma, fabs, DBL_MIN) +#undef GEN_FMA_FP + +float32 __attribute__((flatten)) +float32_muladd(float32 a, float32 b, float32 c, int flags, float_status *s) +{ + if (QEMU_HARDFLOAT_3F32_USE_FP) { + return float_muladd(a, b, c, flags, s); + } else { + return f32_muladd(a, b, c, flags, s); + } +} + +float64 __attribute__((flatten)) +float64_muladd(float64 a, float64 b, float64 c, int flags, float_status *s) +{ + if (QEMU_HARDFLOAT_3F64_USE_FP) { + return double_muladd(a, b, c, flags, s); + } else { + return f64_muladd(a, b, c, flags, s); + } +} + +/* * Returns the result of dividing the floating-point value `a' by the * corresponding value `b'. The operation is performed according to * the IEC/IEEE Standard for Binary Floating-Point Arithmetic. 
From patchwork Wed Apr 4 23:11:14 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emilio Cota X-Patchwork-Id: 895194 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=braap.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=braap.org header.i=@braap.org header.b="WXKEwp9y"; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.b="S1QXq8dW"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40GhjQ3JFJz9ry1 for ; Thu, 5 Apr 2018 09:18:50 +1000 (AEST) Received: from localhost ([::1]:37816 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f3rg0-0000tV-Ge for incoming@patchwork.ozlabs.org; Wed, 04 Apr 2018 19:18:48 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:54572) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f3rZh-0004QK-A5 for qemu-devel@nongnu.org; Wed, 04 Apr 2018 19:12:18 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1f3rZg-00048G-8S for qemu-devel@nongnu.org; Wed, 04 Apr 2018 19:12:17 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:53577) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1f3rZg-00047i-3I for qemu-devel@nongnu.org; Wed, 04 Apr 2018 
19:12:16 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 10F2E21C37; Wed, 4 Apr 2018 19:11:19 -0400 (EDT) Received: from mailfrontend2 ([10.202.2.163]) by compute4.internal (MEProxy); Wed, 04 Apr 2018 19:11:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=D/+o3PcjY89q1s aAN8nT0V8HNnX20CHGYA6WCFhXi6I=; b=WXKEwp9yyqNVqcPcFgs01BN15vICib yKZrEwwmMox3QT/mx5bJ13EK9+j1Sz1U069UFLR963HrJPENhuULws5epYQVFtUY dFZISNOOk/TgJjzsAS4CCwEQ7zHU1PM8f+dmzUYB9jAqCtA9CsQm6IhtV0Gey8PV Z6/VgAI5E+U5Q= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=D/+o3PcjY89q1saAN8nT0V8HNnX20CHGYA6WCFhXi6I=; b=S1QXq8dW Fa58zppvPF6+auPGsZmJc7Zp9gy/7YgnxMSvgVwmwhqkEcnOjGS2+BXYsmWSSx2V Q4eZZCccUbmjptfOERv66Fb6+V2O/D++TLIE5sv6DZt0OMbluQGLeJKEjTQZfWni 0GcqlQXrq2bhGoMyBM+vLimkp0+MHURmT1/wGhC8KKWaUKVTtxCC4vv7s2qNLjPG 9qoZ4wP+FiTG4Ko6mpuBH7G3GpmDn88VxzTzHGzSzlkLoeqyw2alfTfl/Ob4xPlp 18KwQqJcIP3/mGGPHGRHvSqwFi0go2AlZT50vJdPTblfZOOjF0Ebcc++FTFkbrE7 ShNusYgrszC8xQ== X-ME-Sender: Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id B45DE10251; Wed, 4 Apr 2018 19:11:18 -0400 (EDT) From: "Emilio G. 
Cota" To: qemu-devel@nongnu.org Date: Wed, 4 Apr 2018 19:11:14 -0400 Message-Id: <1522883475-27858-15-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org> References: <1522883475-27858-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.29 Subject: [Qemu-devel] [PATCH v3 14/15] hardfloat: support float32/64 square root X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?utf-8?q?Alex_Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: sqrt-single: 43.27 MFlops sqrt-double: 24.81 MFlops - after: sqrt-single: 297.94 MFlops sqrt-double: 210.46 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: sqrt-single: 12.41 MFlops sqrt-double: 6.22 MFlops - after: sqrt-single: 55.58 MFlops sqrt-double: 40.63 MFlops 3. IBM POWER8E @ 2.1 GHz - before: sqrt-single: 17.01 MFlops sqrt-double: 9.61 MFlops - after: sqrt-single: 104.17 MFlops sqrt-double: 133.32 MFlops Here none of the machines got faster from enabling USE_FP. For instance, with it enabled on x86_64, sqrt is 23% slower for single precision and 17% slower for double precision. Signed-off-by: Emilio G. 
Cota --- fpu/softfloat.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 71 insertions(+), 2 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index ce14c87..5434d29 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -2717,20 +2717,89 @@ float16 __attribute__((flatten)) float16_sqrt(float16 a, float_status *status) return float16_round_pack_canonical(pr, status); } -float32 __attribute__((flatten)) float32_sqrt(float32 a, float_status *status) +static float32 QEMU_SOFTFLOAT_ATTR +soft_float32_sqrt(float32 a, float_status *status) { FloatParts pa = float32_unpack_canonical(a, status); FloatParts pr = sqrt_float(pa, status, &float32_params); return float32_round_pack_canonical(pr, status); } -float64 __attribute__((flatten)) float64_sqrt(float64 a, float_status *status) +static float64 QEMU_SOFTFLOAT_ATTR +soft_float64_sqrt(float64 a, float_status *status) { FloatParts pa = float64_unpack_canonical(a, status); FloatParts pr = sqrt_float(pa, status, &float64_params); return float64_round_pack_canonical(pr, status); } +#define GEN_SQRT_SF(name, soft_t, host_t, host_sqrt_func) \ + static soft_t name(soft_t a, float_status *s) \ + { \ + if (QEMU_NO_HARDFLOAT) { \ + goto soft; \ + } \ + soft_t ## _input_flush1(&a, s); \ + if (likely((soft_t ## _is_normal(a) || soft_t ## _is_zero(a)) && \ + !soft_t ## _is_neg(a) && \ + can_use_fpu(s))) { \ + host_t ha = soft_t ## _to_ ## host_t(a); \ + host_t hr = host_sqrt_func(ha); \ + \ + return host_t ## _to_ ## soft_t(hr); \ + } \ + soft: \ + return soft_ ## soft_t ## _sqrt(a, s); \ + } + +#define GEN_SQRT_FP(name, soft_t, host_t, host_sqrt_func) \ + static soft_t name(soft_t a, float_status *s) \ + { \ + host_t ha; \ + \ + if (QEMU_NO_HARDFLOAT) { \ + goto soft; \ + } \ + soft_t ## _input_flush1(&a, s); \ + ha = soft_t ## _to_ ## host_t(a); \ + if (likely((fpclassify(ha) == FP_NORMAL || \ + fpclassify(ha) == FP_ZERO) && \ + !signbit(ha) && \ + can_use_fpu(s))) { \ + host_t hr = 
host_sqrt_func(ha); \ + \ + return host_t ## _to_ ## soft_t(hr); \ + } \ + soft: \ + return soft_ ## soft_t ## _sqrt(a, s); \ + } + +GEN_SQRT_SF(f32_sqrt, float32, float, sqrtf) +GEN_SQRT_SF(f64_sqrt, float64, double, sqrt) +#undef GEN_SQRT_SF + +GEN_SQRT_FP(float_sqrt, float32, float, sqrtf) +GEN_SQRT_FP(double_sqrt, float64, double, sqrt) +#undef GEN_SQRT_FP + +float32 __attribute__((flatten)) float32_sqrt(float32 a, float_status *s) +{ + if (QEMU_HARDFLOAT_1F32_USE_FP) { + return float_sqrt(a, s); + } else { + return f32_sqrt(a, s); + } +} + +float64 __attribute__((flatten)) float64_sqrt(float64 a, float_status *s) +{ + if (QEMU_HARDFLOAT_1F64_USE_FP) { + return double_sqrt(a, s); + } else { + return f64_sqrt(a, s); + } +} + /*---------------------------------------------------------------------------- | Takes a 64-bit fixed-point value `absZ' with binary point between bits 6 From patchwork Wed Apr 4 23:11:15 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emilio Cota X-Patchwork-Id: 895198 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=braap.org Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=braap.org header.i=@braap.org header.b="pi4UzZNz"; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=messagingengine.com header.i=@messagingengine.com header.b="gVUaQM0c"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate 
requested) by ozlabs.org (Postfix) with ESMTPS id 40GhmQ2DYlz9s0W for ; Thu, 5 Apr 2018 09:21:25 +1000 (AEST) Received: from localhost ([::1]:37956 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f3riV-0003Hz-5y for incoming@patchwork.ozlabs.org; Wed, 04 Apr 2018 19:21:23 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:54583) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f3rZh-0004Qw-Tk for qemu-devel@nongnu.org; Wed, 04 Apr 2018 19:12:19 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1f3rZg-00048P-92 for qemu-devel@nongnu.org; Wed, 04 Apr 2018 19:12:17 -0400 Received: from out5-smtp.messagingengine.com ([66.111.4.29]:48859) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1f3rZg-00047k-3K for qemu-devel@nongnu.org; Wed, 04 Apr 2018 19:12:16 -0400 Received: from compute4.internal (compute4.nyi.internal [10.202.2.44]) by mailout.nyi.internal (Postfix) with ESMTP id 3C42721C39; Wed, 4 Apr 2018 19:11:19 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute4.internal (MEProxy); Wed, 04 Apr 2018 19:11:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=braap.org; h=cc :date:from:in-reply-to:message-id:references:subject:to :x-me-sender:x-me-sender:x-sasl-enc; s=mesmtp; bh=P4n82GbD5+17ip lgcsS5ZXqcbNkp38SWIpi9jRbrqQM=; b=pi4UzZNzLHBXc2VtfgCAvt0+WPJS+R 6G18ixgSMowv5kRr5MPsYqirvDWBhL6hIpXjra449n7Gp40O3yisajI9BLgIkyhO cTWVgzv2ZMDRwclTCYSh4B9laUmRAXcixB+8s2YEWPuRvV2MFvMbVhUAj6xDz8Te t5XI1rgSrdlIs= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:date:from:in-reply-to:message-id :references:subject:to:x-me-sender:x-me-sender:x-sasl-enc; s= fm2; bh=P4n82GbD5+17iplgcsS5ZXqcbNkp38SWIpi9jRbrqQM=; b=gVUaQM0c pV7CWvFcMFUfVMFUT6xmMnYpTwi5vbWnoCHeyu1bqSO/9k9olrRKzIZFitGslnFZ rOceto1PkBn0kfx6voGcugouvhfN5vXG7UCD1i1BPCTaCn6wHwEfvNc+4XOYYKF4 
yV1TdXaswGB9oHCUHhen27faou7bpFb+pc6B/EU2zdXFq4kHUbxRUIeB/O4YO7eQ o42Hu7RZWqU2TV3EsuaEREdfw0sMnd1m+kxlj4nTF4Z+WT4UA1PJGE3GoTSCF9e6 HmoHeNi5tRIgpE03UQC3TmhkJEzVcdJvhkePut+K/Ok83t8YHocJQKmeD2er7ZrY wsumLPmfTqSYRQ== X-ME-Sender: Received: from localhost (flamenco.cs.columbia.edu [128.59.20.216]) by mail.messagingengine.com (Postfix) with ESMTPA id EE327E43C8; Wed, 4 Apr 2018 19:11:18 -0400 (EDT) From: "Emilio G. Cota" To: qemu-devel@nongnu.org Date: Wed, 4 Apr 2018 19:11:15 -0400 Message-Id: <1522883475-27858-16-git-send-email-cota@braap.org> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1522883475-27858-1-git-send-email-cota@braap.org> References: <1522883475-27858-1-git-send-email-cota@braap.org> X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 66.111.4.29 Subject: [Qemu-devel] [PATCH v3 15/15] hardfloat: support float32/64 comparison X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Peter Maydell , Mark Cave-Ayland , Richard Henderson , Laurent Vivier , Paolo Bonzini , =?utf-8?q?Alex_Benn=C3=A9e?= , Aurelien Jarno Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Performance results for fp-bench: 1. Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz - before: cmp-single: 113.01 MFlops cmp-double: 115.54 MFlops - after: cmp-single: 527.83 MFlops cmp-double: 457.21 MFlops 2. ARM Aarch64 A57 @ 2.4GHz - before: cmp-single: 39.32 MFlops cmp-double: 39.80 MFlops - after: cmp-single: 162.74 MFlops cmp-double: 167.08 MFlops 3. IBM POWER8E @ 2.1 GHz - before: cmp-single: 60.81 MFlops cmp-double: 62.76 MFlops - after: cmp-single: 235.39 MFlops cmp-double: 283.44 MFlops Here using float{32,64}_is_any_nan is faster than using isnan for all machines. 
On x86_64 the perf difference is just a few percentage points, but on aarch64 we go from 117/119 to 164/169 MFlops for single/double precision, respectively.

Aggregate performance improvement for the last few patches:
[ all charts in png: https://imgur.com/a/4yV8p ]

1. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz

[ASCII chart: qemu-aarch64 NBench score; higher is better. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz. x axis: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean]

[ASCII chart: qemu-aarch64 SPEC06fp (test set) speedup over QEMU 4c2c1015905. Host: Intel(R) Core(TM) i7-6700K CPU @ 4.00GHz. error bars: 95% confidence interval. x axis: SPEC06fp benchmarks, geomean]

2. Host: ARM Aarch64 A57 @ 2.4GHz

[ASCII chart: qemu-aarch64 NBench score; higher is better. Host: Applied Micro X-Gene, Aarch64 A57 @ 2.4 GHz. x axis: FOURIER, NEURAL NET, LU DECOMPOSITION, gmean]

Note that by not inlining the soft-fp primitives we end up with a smaller softfloat.o--in particular, see the difference for the softfloat.o built for fp-bench:

- before this series:
   text    data     bss     dec     hex filename
 103235       0       0  103235   19343 softfloat.o
- after:
   text    data     bss     dec     hex filename
  93369       0       0   93369   16cb9 softfloat.o

Signed-off-by: Emilio G. 
Cota --- fpu/softfloat.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++----------- 1 file changed, 60 insertions(+), 14 deletions(-) diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 5434d29..459dd87 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -2581,28 +2581,74 @@ static int compare_floats(FloatParts a, FloatParts b, bool is_quiet, } } -#define COMPARE(sz) \ -int float ## sz ## _compare(float ## sz a, float ## sz b, \ - float_status *s) \ +#define COMPARE(name, attr, sz) \ +static int attr \ +name(float ## sz a, float ## sz b, bool is_quiet, float_status *s) \ { \ FloatParts pa = float ## sz ## _unpack_canonical(a, s); \ FloatParts pb = float ## sz ## _unpack_canonical(b, s); \ - return compare_floats(pa, pb, false, s); \ -} \ -int float ## sz ## _compare_quiet(float ## sz a, float ## sz b, \ - float_status *s) \ -{ \ - FloatParts pa = float ## sz ## _unpack_canonical(a, s); \ - FloatParts pb = float ## sz ## _unpack_canonical(b, s); \ - return compare_floats(pa, pb, true, s); \ + return compare_floats(pa, pb, is_quiet, s); \ } -COMPARE(16) -COMPARE(32) -COMPARE(64) +COMPARE(soft_float16_compare, , 16) +COMPARE(soft_float32_compare, QEMU_SOFTFLOAT_ATTR, 32) +COMPARE(soft_float64_compare, QEMU_SOFTFLOAT_ATTR, 64) #undef COMPARE +int __attribute__((flatten)) +float16_compare(float16 a, float16 b, float_status *s) +{ + return soft_float16_compare(a, b, false, s); +} + +int __attribute__((flatten)) +float16_compare_quiet(float16 a, float16 b, float_status *s) +{ + return soft_float16_compare(a, b, true, s); +} + +#define GEN_FPU_COMPARE(name, quiet_name, soft_t, host_t) \ + static int \ + fpu_ ## name(soft_t a, soft_t b, bool is_quiet, float_status *s) \ + { \ + host_t ha, hb; \ + \ + if (QEMU_NO_HARDFLOAT) { \ + return soft_ ## name(a, b, is_quiet, s); \ + } \ + soft_t ## _input_flush2(&a, &b, s); \ + ha = soft_t ## _to_ ## host_t(a); \ + hb = soft_t ## _to_ ## host_t(b); \ + if (unlikely(soft_t ## _is_any_nan(a) || \ + soft_t ## _is_any_nan(b))) { \ 
+ return soft_ ## name(a, b, is_quiet, s); \ + } \ + if (isgreater(ha, hb)) { \ + return float_relation_greater; \ + } \ + if (isless(ha, hb)) { \ + return float_relation_less; \ + } \ + return float_relation_equal; \ + } \ + \ + int __attribute__((flatten)) \ + name(soft_t a, soft_t b, float_status *s) \ + { \ + return fpu_ ## name(a, b, false, s); \ + } \ + \ + int __attribute__((flatten)) \ + quiet_name(soft_t a, soft_t b, float_status *s) \ + { \ + return fpu_ ## name(a, b, true, s); \ + } + +GEN_FPU_COMPARE(float32_compare, float32_compare_quiet, float32, float) +GEN_FPU_COMPARE(float64_compare, float64_compare_quiet, float64, double) +#undef GEN_FPU_COMPARE + /* Multiply A by 2 raised to the power N. */ static FloatParts scalbn_decomposed(FloatParts a, int n, float_status *s) {
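For readers skimming the series, the fast path that patch 15/15 adds to float32_compare can be sketched outside QEMU roughly as follows. This is only an illustration under stated assumptions: the sketch_* names and the enum values are hypothetical and are not part of QEMU, the NaN test is a bit-pattern check in the spirit of float32_is_any_nan, and the unordered case simply returns a value, whereas the patch defers to soft_float32_compare there so that exception flags are raised.

```c
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for QEMU's float_relation_* values. */
enum sketch_relation {
    SKETCH_LESS = -1,
    SKETCH_EQUAL = 0,
    SKETCH_GREATER = 1,
    SKETCH_UNORDERED = 2,
};

/*
 * NaN check on the raw IEEE-754 single-precision bits: clear the sign
 * bit, then anything above the infinity encoding is a NaN.
 */
static bool sketch_is_any_nan(uint32_t bits)
{
    return (bits & 0x7fffffffu) > 0x7f800000u;
}

/* Reinterpret the guest bit pattern as a host float without aliasing UB. */
static float sketch_bits_to_float(uint32_t bits)
{
    float f;

    memcpy(&f, &bits, sizeof(f));
    return f;
}

/*
 * Compare two guest floats given as bit patterns. NaN inputs take the
 * slow path (assumption: here it only reports "unordered"; the patch
 * calls the soft-float implementation instead). Everything else is
 * compared on the host FPU.
 */
static int sketch_compare(uint32_t a, uint32_t b)
{
    float ha, hb;

    if (sketch_is_any_nan(a) || sketch_is_any_nan(b)) {
        return SKETCH_UNORDERED;
    }
    ha = sketch_bits_to_float(a);
    hb = sketch_bits_to_float(b);
    if (isgreater(ha, hb)) {
        return SKETCH_GREATER;
    }
    if (isless(ha, hb)) {
        return SKETCH_LESS;
    }
    return SKETCH_EQUAL;
}
```

Filtering NaNs on the raw bits first means the host comparison only ever sees ordered operands, which is presumably why the host path in the patch does not need to update the exception flags itself.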