From patchwork Sat Jul 13 14:05:22 2019
X-Patchwork-Submitter: Tejas Joshi
X-Patchwork-Id: 1131622
From: Tejas Joshi
Date: Sat, 13 Jul 2019 19:35:22 +0530
Subject: [PATCH] i386: Expand roundeven for SSE4.1+
To: gcc-patches@gcc.gnu.org

Hi.
This patch is for expanding roundeven inline for SSE4.1 and later.
Note that this patch is to be applied on top of .
The patch is bootstrapped and regression tested on x86_64-linux-gnu.
Thanks,
Tejas

gcc/ChangeLog:

2019-07-13  Tejas Joshi

	* builtins.c (mathfn_built_in_2): Change CASE_MATHFN to
	CASE_MATHFN_FLOATN for ROUNDEVEN.
	* config/i386/i386.md: Define UNSPEC_ROUNDEVEN.
	(define_constants): Define ROUND_ROUNDEVEN rounding mode.
	(roundeven<mode>2): New define_expand.
	* internal-fn.def (ROUNDEVEN): New internal function.
	* optabs.def (roundeven_optab): New optab.

gcc/testsuite/ChangeLog:

2019-07-13  Tejas Joshi

	* gcc.target/i386/avx-vround-roundeven-1.c: New test.
	* gcc.target/i386/avx-vround-roundeven-2.c: New test.

diff --git a/gcc/builtins.c b/gcc/builtins.c
index 8ceb077b0bf..f61f10422fd 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -2056,7 +2056,7 @@ mathfn_built_in_2 (tree type, combined_fn fn)
     CASE_MATHFN (REMQUO)
     CASE_MATHFN_FLOATN (RINT)
     CASE_MATHFN_FLOATN (ROUND)
-    CASE_MATHFN (ROUNDEVEN)
+    CASE_MATHFN_FLOATN (ROUNDEVEN)
     CASE_MATHFN (SCALB)
     CASE_MATHFN (SCALBLN)
     CASE_MATHFN (SCALBN)
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index db5fa9ae3ca..bd5d6335f2b 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -169,6 +169,9 @@
   ;; For ROUND support
   UNSPEC_ROUND
 
+  ;;for SSE 4.1+ rounding
+  UNSPEC_ROUNDEVEN
+
   ;; For CRC32 support
   UNSPEC_CRC32
 
@@ -303,7 +306,8 @@
 
 ;; Constants to represent rounding modes in the ROUND instruction
 (define_constants
-  [(ROUND_FLOOR                 0x1)
+  [(ROUND_ROUNDEVEN             0x0)
+   (ROUND_FLOOR                 0x1)
    (ROUND_CEIL                  0x2)
    (ROUND_TRUNC                 0x3)
    (ROUND_MXCSR                 0x4)
@@ -16328,6 +16332,20 @@
   "TARGET_USE_FANCY_MATH_387
    && (flag_fp_int_builtin_inexact || !flag_trapping_math)")
 
+(define_expand "roundeven<mode>2"
+  [(parallel [(set (match_operand:MODEF 0 "register_operand")
+                   (unspec:MODEF [(match_operand:MODEF 1 "register_operand")]
+                                 UNSPEC_ROUNDEVEN))
+              (clobber (reg:CC FLAGS_REG))])]
+  "(SSE_FLOAT_MODE_P (<MODE>mode) && TARGET_SSE_MATH && TARGET_SSE4_1)"
+{
+  gcc_assert (TARGET_SSE4_1);
+  emit_insn (gen_sse4_1_round<mode>2
+             (operands[0], operands[1], GEN_INT (ROUND_ROUNDEVEN
+                                                 | ROUND_NO_EXC)));
+  DONE;
+})
+
 (define_expand "<rounding_insn><mode>2"
   [(parallel [(set (match_operand:MODEF 0 "register_operand")
                    (unspec:MODEF [(match_operand:MODEF 1 "register_operand")]
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 906d74b1d08..15f019b9b49 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -234,6 +234,7 @@ DEF_INTERNAL_FLT_FLOATN_FN (FLOOR, ECF_CONST, floor, unary)
 DEF_INTERNAL_FLT_FLOATN_FN (NEARBYINT, ECF_CONST, nearbyint, unary)
 DEF_INTERNAL_FLT_FLOATN_FN (RINT, ECF_CONST, rint, unary)
 DEF_INTERNAL_FLT_FLOATN_FN (ROUND, ECF_CONST, round, unary)
+DEF_INTERNAL_FLT_FLOATN_FN (ROUNDEVEN, ECF_CONST, roundeven, unary)
 DEF_INTERNAL_FLT_FLOATN_FN (TRUNC, ECF_CONST, btrunc, unary)
 
 /* Binary math functions.  */
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 4ffd0f35a40..065e3f64dda 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -268,6 +268,7 @@ OPTAB_D (fnms_optab, "fnms$a4")
 
 OPTAB_D (rint_optab, "rint$a2")
 OPTAB_D (round_optab, "round$a2")
+OPTAB_D (roundeven_optab, "roundeven$a2")
 OPTAB_D (floor_optab, "floor$a2")
 OPTAB_D (ceil_optab, "ceil$a2")
 OPTAB_D (btrunc_optab, "btrunc$a2")
diff --git a/gcc/testsuite/gcc.target/i386/avx-vround-roundeven-1.c b/gcc/testsuite/gcc.target/i386/avx-vround-roundeven-1.c
new file mode 100644
index 00000000000..072d0f0e73a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx-vround-roundeven-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx" } */
+
+__attribute__((noinline, noclone)) double
+f1 (double x)
+{
+  return __builtin_roundeven (x);
+}
+
+__attribute__((noinline, noclone)) float
+f2 (float x)
+{
+  return __builtin_roundevenf (x);
+}
+
+/* { dg-final { scan-assembler-times "vroundsd\[^\n\r\]*xmm" 1 } } */
+/* { dg-final { scan-assembler-times "vroundss\[^\n\r\]*xmm" 1 } } */
diff --git a/gcc/testsuite/gcc.target/i386/avx-vround-roundeven-2.c b/gcc/testsuite/gcc.target/i386/avx-vround-roundeven-2.c
new file mode 100644
index 00000000000..211758d026a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/avx-vround-roundeven-2.c
@@ -0,0 +1,21 @@
+/* { dg-do run } */
+/* { dg-require-effective-target avx } */
+/* { dg-options "-mavx" } */
+
+#ifndef CHECK_H
+#define CHECK_H "avx-check.h"
+#define TEST avx_test
+#define SRC "avx-vround-roundeven-1.c"
+#endif
+
+#include CHECK_H
+#include SRC
+
+static void
+TEST (void)
+{
+  if (f1 (0.5) != 0.0 || f1 (1.5) != 2.0 || f1 (-0.5) != 0.0 || f1 (-1.5) != -2.0)
+    abort ();
+  if (f2 (0.5f) != 0.0f || f2 (1.5f) != 2.0f || f2 (-0.5f) != 0.0f || f2 (-1.5f) != -2.0f)
+    abort ();
+}