From patchwork Mon Oct 7 18:43:01 2019
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 1172944
From: Uros Bizjak
Date: Mon, 7 Oct 2019 20:43:01 +0200
Subject: [PATCH, i386]: Reorder a couple of rounding functions
To: "gcc-patches@gcc.gnu.org"

Move ix86_expand_floorceildf_32 and ix86_expand_rounddf_32 so that each
32-bit variant directly follows its generic counterpart
(ix86_expand_floorceil and ix86_expand_round, respectively). The diff only
moves code and adjusts the leading comments; the prototypes in
i386-protos.h are reordered to match.

2019-10-07  Uroš Bizjak

	* config/i386/i386-expand.c (ix86_expand_floorceildf_32,
	ix86_expand_rounddf_32): Reorder functions.
	* config/i386/i386-protos.h: Update.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
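For readers unfamiliar with the TWO52 trick, the "C code for the stuff we expand below" comments in the two moved functions can be written out as plain scalar C. The following is only a sketch of that pseudocode, not the RTL expansion itself. The names floorceil_model and round_model are hypothetical, and the sketch assumes IEEE double arithmetic in the default round-to-nearest mode, compiled without -ffast-math.

```c
#include <math.h>

/* 2^52: any double with magnitude at or above this is already an integer. */
static const double TWO52 = 4503599627370496.0;

/* Hypothetical scalar model of the floor/ceil pseudocode.
   do_floor != 0 selects floor, do_floor == 0 selects ceil.  */
static double
floorceil_model (double x, int do_floor)
{
  double xa = fabs (x);
  if (!isless (xa, TWO52))
    return x;                     /* large values, infinities and NaN pass through */

  /* Adding and subtracting 2^52 rounds xa to an integer.  The volatile
     temporary only discourages the compiler from folding the pair away.  */
  volatile double t = xa + TWO52;
  xa = t - TWO52;

  double x2 = copysign (xa, x);
  if (do_floor)
    {
      if (x2 > x)                 /* rounded up past x: step back down */
        x2 -= 1.0;
    }
  else
    {
      if (x2 < x)                 /* rounded down past x: step back up */
        x2 += 1.0;
    }
  /* Restore the sign so that, e.g., ceil (-0.7) yields -0.0.  This model
     always honors signed zeros.  */
  return copysign (x2, x);
}

/* Hypothetical scalar model of the round pseudocode
   (halfway cases round away from zero).  */
static double
round_model (double x)
{
  double xa = fabs (x);
  if (!isless (xa, TWO52))
    return x;

  volatile double t = xa + TWO52;
  double xa2 = t - TWO52;         /* nearest integer, ties to even */

  double dxa = xa2 - xa;          /* how far the rounding moved xa */
  if (dxa <= -0.5)
    xa2 += 1.0;
  else if (dxa > 0.5)
    xa2 -= 1.0;

  return copysign (xa2, x);       /* working on |x| keeps -0.0 -> -0.0 */
}
```

Neither model converts through an integer type, which is the point of the _32 variants: they avoid the DImode truncation via cvttsd2siq that is only available on 64bit targets.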
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c index 22c2823c549f..3635de597d0b 100644 --- a/gcc/config/i386/i386-expand.c +++ b/gcc/config/i386/i386-expand.c @@ -15903,71 +15903,8 @@ ix86_expand_rint (rtx operand0, rtx operand1) emit_move_insn (operand0, res); } -/* Expand SSE2 sequence for computing floor or ceil from OPERAND1 storing - into OPERAND0. */ -void -ix86_expand_floorceildf_32 (rtx operand0, rtx operand1, bool do_floor) -{ - /* C code for the stuff we expand below. - double xa = fabs (x), x2; - if (!isless (xa, TWO52)) - return x; - xa = xa + TWO52 - TWO52; - x2 = copysign (xa, x); - Compensate. Floor: - if (x2 > x) - x2 -= 1; - Compensate. Ceil: - if (x2 < x) - x2 += 1; - if (HONOR_SIGNED_ZEROS (mode)) - x2 = copysign (x2, x); - return x2; - */ - machine_mode mode = GET_MODE (operand0); - rtx xa, TWO52, tmp, one, res, mask; - rtx_code_label *label; - - TWO52 = ix86_gen_TWO52 (mode); - - /* Temporary for holding the result, initialized to the input - operand to ease control flow. */ - res = gen_reg_rtx (mode); - emit_move_insn (res, operand1); - - /* xa = abs (operand1) */ - xa = ix86_expand_sse_fabs (res, &mask); - - /* if (!isless (xa, TWO52)) goto label; */ - label = ix86_expand_sse_compare_and_jump (UNLE, TWO52, xa, false); - - /* xa = xa + TWO52 - TWO52; */ - xa = expand_simple_binop (mode, PLUS, xa, TWO52, NULL_RTX, 0, OPTAB_DIRECT); - xa = expand_simple_binop (mode, MINUS, xa, TWO52, xa, 0, OPTAB_DIRECT); - - /* xa = copysign (xa, operand1) */ - ix86_sse_copysign_to_positive (xa, xa, res, mask); - - /* generate 1.0 */ - one = force_reg (mode, const_double_from_real_value (dconst1, mode)); - - /* Compensate: xa = xa - (xa > operand1 ? 1 : 0) */ - tmp = ix86_expand_sse_compare_mask (UNGT, xa, res, !do_floor); - emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, one, tmp))); - tmp = expand_simple_binop (mode, do_floor ? 
MINUS : PLUS, - xa, tmp, NULL_RTX, 0, OPTAB_DIRECT); - if (!do_floor && HONOR_SIGNED_ZEROS (mode)) - ix86_sse_copysign_to_positive (tmp, tmp, res, mask); - emit_move_insn (res, tmp); - - emit_label (label); - LABEL_NUSES (label) = 1; - - emit_move_insn (operand0, res); -} - -/* Expand SSE2 sequence for computing floor or ceil from OPERAND1 storing - into OPERAND0. */ +/* Expand SSE2 sequence for computing floor or ceil + from OPERAND1 storing into OPERAND0. */ void ix86_expand_floorceil (rtx operand0, rtx operand1, bool do_floor) { @@ -16027,30 +15964,30 @@ ix86_expand_floorceil (rtx operand0, rtx operand1, bool do_floor) emit_move_insn (operand0, res); } -/* Expand SSE sequence for computing round from OPERAND1 storing - into OPERAND0. Sequence that works without relying on DImode truncation - via cvttsd2siq that is only available on 64bit targets. */ +/* Expand SSE2 sequence for computing floor or ceil from OPERAND1 storing + into OPERAND0 without relying on DImode truncation via cvttsd2siq + that is only available on 64bit targets. */ void -ix86_expand_rounddf_32 (rtx operand0, rtx operand1) +ix86_expand_floorceildf_32 (rtx operand0, rtx operand1, bool do_floor) { /* C code for the stuff we expand below. - double xa = fabs (x), xa2, x2; + double xa = fabs (x), x2; if (!isless (xa, TWO52)) return x; - Using the absolute value and copying back sign makes - -0.0 -> -0.0 correct. - xa2 = xa + TWO52 - TWO52; - Compensate. - dxa = xa2 - xa; - if (dxa <= -0.5) - xa2 += 1; - else if (dxa > 0.5) - xa2 -= 1; - x2 = copysign (xa2, x); - return x2; + xa = xa + TWO52 - TWO52; + x2 = copysign (xa, x); + Compensate. Floor: + if (x2 > x) + x2 -= 1; + Compensate. 
Ceil: + if (x2 < x) + x2 += 1; + if (HONOR_SIGNED_ZEROS (mode)) + x2 = copysign (x2, x); + return x2; */ machine_mode mode = GET_MODE (operand0); - rtx xa, xa2, dxa, TWO52, tmp, half, mhalf, one, res, mask; + rtx xa, TWO52, tmp, one, res, mask; rtx_code_label *label; TWO52 = ix86_gen_TWO52 (mode); @@ -16066,31 +16003,24 @@ ix86_expand_rounddf_32 (rtx operand0, rtx operand1) /* if (!isless (xa, TWO52)) goto label; */ label = ix86_expand_sse_compare_and_jump (UNLE, TWO52, xa, false); - /* xa2 = xa + TWO52 - TWO52; */ - xa2 = expand_simple_binop (mode, PLUS, xa, TWO52, NULL_RTX, 0, OPTAB_DIRECT); - xa2 = expand_simple_binop (mode, MINUS, xa2, TWO52, xa2, 0, OPTAB_DIRECT); - - /* dxa = xa2 - xa; */ - dxa = expand_simple_binop (mode, MINUS, xa2, xa, NULL_RTX, 0, OPTAB_DIRECT); + /* xa = xa + TWO52 - TWO52; */ + xa = expand_simple_binop (mode, PLUS, xa, TWO52, NULL_RTX, 0, OPTAB_DIRECT); + xa = expand_simple_binop (mode, MINUS, xa, TWO52, xa, 0, OPTAB_DIRECT); - /* generate 0.5, 1.0 and -0.5 */ - half = force_reg (mode, const_double_from_real_value (dconsthalf, mode)); - one = expand_simple_binop (mode, PLUS, half, half, NULL_RTX, 0, OPTAB_DIRECT); - mhalf = expand_simple_binop (mode, MINUS, half, one, NULL_RTX, - 0, OPTAB_DIRECT); + /* xa = copysign (xa, operand1) */ + ix86_sse_copysign_to_positive (xa, xa, res, mask); - /* Compensate. */ - /* xa2 = xa2 - (dxa > 0.5 ? 1 : 0) */ - tmp = ix86_expand_sse_compare_mask (UNGT, dxa, half, false); - emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, tmp, one))); - xa2 = expand_simple_binop (mode, MINUS, xa2, tmp, NULL_RTX, 0, OPTAB_DIRECT); - /* xa2 = xa2 + (dxa <= -0.5 ? 
1 : 0) */ - tmp = ix86_expand_sse_compare_mask (UNGE, mhalf, dxa, false); - emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, tmp, one))); - xa2 = expand_simple_binop (mode, PLUS, xa2, tmp, NULL_RTX, 0, OPTAB_DIRECT); + /* generate 1.0 */ + one = force_reg (mode, const_double_from_real_value (dconst1, mode)); - /* res = copysign (xa2, operand1) */ - ix86_sse_copysign_to_positive (res, xa2, force_reg (mode, operand1), mask); + /* Compensate: xa = xa - (xa > operand1 ? 1 : 0) */ + tmp = ix86_expand_sse_compare_mask (UNGT, xa, res, !do_floor); + emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, one, tmp))); + tmp = expand_simple_binop (mode, do_floor ? MINUS : PLUS, + xa, tmp, NULL_RTX, 0, OPTAB_DIRECT); + if (!do_floor && HONOR_SIGNED_ZEROS (mode)) + ix86_sse_copysign_to_positive (tmp, tmp, res, mask); + emit_move_insn (res, tmp); emit_label (label); LABEL_NUSES (label) = 1; @@ -16098,8 +16028,8 @@ ix86_expand_rounddf_32 (rtx operand0, rtx operand1) emit_move_insn (operand0, res); } -/* Expand SSE sequence for computing trunc from OPERAND1 storing - into OPERAND0. */ +/* Expand SSE sequence for computing trunc + from OPERAND1 storing into OPERAND0. */ void ix86_expand_trunc (rtx operand0, rtx operand1) { @@ -16144,7 +16074,8 @@ ix86_expand_trunc (rtx operand0, rtx operand1) } /* Expand SSE sequence for computing trunc from OPERAND1 storing - into OPERAND0. */ + into OPERAND0 without relying on DImode truncation via cvttsd2siq + that is only available on 64bit targets. */ void ix86_expand_truncdf_32 (rtx operand0, rtx operand1) { @@ -16201,8 +16132,8 @@ ix86_expand_truncdf_32 (rtx operand0, rtx operand1) emit_move_insn (operand0, res); } -/* Expand SSE sequence for computing round from OPERAND1 storing - into OPERAND0. */ +/* Expand SSE sequence for computing round + from OPERAND1 storing into OPERAND0. 
*/ void ix86_expand_round (rtx operand0, rtx operand1) { @@ -16251,6 +16182,77 @@ ix86_expand_round (rtx operand0, rtx operand1) emit_move_insn (operand0, res); } +/* Expand SSE sequence for computing round from OPERAND1 storing + into OPERAND0 without relying on DImode truncation via cvttsd2siq + that is only available on 64bit targets. */ +void +ix86_expand_rounddf_32 (rtx operand0, rtx operand1) +{ + /* C code for the stuff we expand below. + double xa = fabs (x), xa2, x2; + if (!isless (xa, TWO52)) + return x; + Using the absolute value and copying back sign makes + -0.0 -> -0.0 correct. + xa2 = xa + TWO52 - TWO52; + Compensate. + dxa = xa2 - xa; + if (dxa <= -0.5) + xa2 += 1; + else if (dxa > 0.5) + xa2 -= 1; + x2 = copysign (xa2, x); + return x2; + */ + machine_mode mode = GET_MODE (operand0); + rtx xa, xa2, dxa, TWO52, tmp, half, mhalf, one, res, mask; + rtx_code_label *label; + + TWO52 = ix86_gen_TWO52 (mode); + + /* Temporary for holding the result, initialized to the input + operand to ease control flow. */ + res = gen_reg_rtx (mode); + emit_move_insn (res, operand1); + + /* xa = abs (operand1) */ + xa = ix86_expand_sse_fabs (res, &mask); + + /* if (!isless (xa, TWO52)) goto label; */ + label = ix86_expand_sse_compare_and_jump (UNLE, TWO52, xa, false); + + /* xa2 = xa + TWO52 - TWO52; */ + xa2 = expand_simple_binop (mode, PLUS, xa, TWO52, NULL_RTX, 0, OPTAB_DIRECT); + xa2 = expand_simple_binop (mode, MINUS, xa2, TWO52, xa2, 0, OPTAB_DIRECT); + + /* dxa = xa2 - xa; */ + dxa = expand_simple_binop (mode, MINUS, xa2, xa, NULL_RTX, 0, OPTAB_DIRECT); + + /* generate 0.5, 1.0 and -0.5 */ + half = force_reg (mode, const_double_from_real_value (dconsthalf, mode)); + one = expand_simple_binop (mode, PLUS, half, half, NULL_RTX, 0, OPTAB_DIRECT); + mhalf = expand_simple_binop (mode, MINUS, half, one, NULL_RTX, + 0, OPTAB_DIRECT); + + /* Compensate. */ + /* xa2 = xa2 - (dxa > 0.5 ? 
1 : 0) */ + tmp = ix86_expand_sse_compare_mask (UNGT, dxa, half, false); + emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, tmp, one))); + xa2 = expand_simple_binop (mode, MINUS, xa2, tmp, NULL_RTX, 0, OPTAB_DIRECT); + /* xa2 = xa2 + (dxa <= -0.5 ? 1 : 0) */ + tmp = ix86_expand_sse_compare_mask (UNGE, mhalf, dxa, false); + emit_insn (gen_rtx_SET (tmp, gen_rtx_AND (mode, tmp, one))); + xa2 = expand_simple_binop (mode, PLUS, xa2, tmp, NULL_RTX, 0, OPTAB_DIRECT); + + /* res = copysign (xa2, operand1) */ + ix86_sse_copysign_to_positive (res, xa2, force_reg (mode, operand1), mask); + + emit_label (label); + LABEL_NUSES (label) = 1; + + emit_move_insn (operand0, res); +} + /* Expand SSE sequence for computing round from OP1 storing into OP0 using sse4 round insn. */ void diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h index 4d6e76d55800..c07dfe508557 100644 --- a/gcc/config/i386/i386-protos.h +++ b/gcc/config/i386/i386-protos.h @@ -193,11 +193,11 @@ extern void ix86_expand_lfloorceil (rtx, rtx, bool); extern void ix86_expand_rint (rtx, rtx); extern void ix86_expand_floorceil (rtx, rtx, bool); extern void ix86_expand_floorceildf_32 (rtx, rtx, bool); -extern void ix86_expand_round_sse4 (rtx, rtx); -extern void ix86_expand_round (rtx, rtx); -extern void ix86_expand_rounddf_32 (rtx, rtx); extern void ix86_expand_trunc (rtx, rtx); extern void ix86_expand_truncdf_32 (rtx, rtx); +extern void ix86_expand_round (rtx, rtx); +extern void ix86_expand_rounddf_32 (rtx, rtx); +extern void ix86_expand_round_sse4 (rtx, rtx); extern void ix86_expand_vecop_qihi (enum rtx_code, rtx, rtx, rtx);