From patchwork Wed Oct 18 20:25:19 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 827821
From: Uros Bizjak
Date: Wed, 18 Oct 2017 22:25:19 +0200
Subject: [PATCH, i386]: Fix PR82580, Optimize double-word comparisons
To: "gcc-patches@gcc.gnu.org"

Hello!

The attached patch emulates double-word comparisons with a double-word
subtraction (a compare of the low words followed by a subtract-with-borrow
of the high words). Note that only the Carry, Sign and Overflow flags are
valid after this sequence, because the Zero flag reflects the high-word
result only, so we have to avoid comparisons that test the Zero flag.

2017-10-18  Uros Bizjak

    PR target/82580
    * config/i386/i386-modes.def (CCGZ): New CC mode.
    * config/i386/i386.md (sub<mode>3_carry_ccgz): New insn pattern.
    * config/i386/predicates.md (ix86_comparison_operator): Handle
    CCGZmode.
    * config/i386/i386.c (ix86_expand_branch) : Emulate LE, LEU,
    GT, GTU, LT, LTU, GE and GEU double-word comparisons with
    double-word subtraction.
    (put_condition_code): Handle CCGZmode.

testsuite/ChangeLog:

2017-10-18  Uros Bizjak
            Jakub Jelinek

    PR target/82580
    * gcc.target/i386/pr82580.c: New test.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

Index: config/i386/i386-modes.def
===================================================================
--- config/i386/i386-modes.def	(revision 253855)
+++ config/i386/i386-modes.def	(working copy)
@@ -39,19 +39,22 @@ ADJUST_ALIGNMENT (XF, TARGET_128BIT_LONG_DOUBLE ?
    For the i386, we need separate modes when floating-point
    equality comparisons are being done.
 
-   Add CCNO to indicate comparisons against zero that requires
+   Add CCNO to indicate comparisons against zero that require
    Overflow flag to be unset.  Sign bit test is used instead and
    thus can be used to form "a&b>0" type of tests.
 
-   Add CCGC to indicate comparisons against zero that allows
+   Add CCGC to indicate comparisons against zero that allow
    unspecified garbage in the Carry flag.  This mode is used
    by inc/dec instructions.
 
-   Add CCGOC to indicate comparisons against zero that allows
+   Add CCGOC to indicate comparisons against zero that allow
    unspecified garbage in the Carry and Overflow flag.  This
    mode is used to simulate comparisons of (a-b) and (a+b)
    against zero using sub/cmp/add operations.
 
+   Add CCGZ to indicate comparisons that allow unspecified garbage
+   in the Zero flag.  This mode is used in double-word comparisons.
+
    Add CCA to indicate that only the Above flag is valid.
    Add CCC to indicate that only the Carry flag is valid.
    Add CCO to indicate that only the Overflow flag is valid.
@@ -62,6 +65,7 @@ ADJUST_ALIGNMENT (XF, TARGET_128BIT_LONG_DOUBLE ?
 CC_MODE (CCGC);
 CC_MODE (CCGOC);
 CC_MODE (CCNO);
+CC_MODE (CCGZ);
 CC_MODE (CCA);
 CC_MODE (CCC);
 CC_MODE (CCO);
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 253855)
+++ config/i386/i386.c	(working copy)
@@ -16732,6 +16732,7 @@ put_condition_code (enum rtx_code code, machine_mo
   switch (code)
     {
     case EQ:
+      gcc_assert (mode != CCGZmode);
       switch (mode)
         {
         case E_CCAmode:
@@ -16755,6 +16756,7 @@ put_condition_code (enum rtx_code code, machine_mo
         }
       break;
     case NE:
+      gcc_assert (mode != CCGZmode);
       switch (mode)
         {
         case E_CCAmode:
@@ -16799,6 +16801,7 @@ put_condition_code (enum rtx_code code, machine_mo
 
         case E_CCmode:
         case E_CCGCmode:
+        case E_CCGZmode:
           suffix = "l";
           break;
 
@@ -16807,7 +16810,7 @@ put_condition_code (enum rtx_code code, machine_mo
         }
       break;
     case LTU:
-      if (mode == CCmode)
+      if (mode == CCmode || mode == CCGZmode)
         suffix = "b";
       else if (mode == CCCmode)
         suffix = fp ? "b" : "c";
@@ -16824,6 +16827,7 @@ put_condition_code (enum rtx_code code, machine_mo
 
         case E_CCmode:
         case E_CCGCmode:
+        case E_CCGZmode:
           suffix = "ge";
           break;
 
@@ -16832,7 +16836,7 @@ put_condition_code (enum rtx_code code, machine_mo
         }
       break;
     case GEU:
-      if (mode == CCmode)
+      if (mode == CCmode || mode == CCGZmode)
         suffix = "nb";
       else if (mode == CCCmode)
         suffix = fp ? "nb" : "nc";
@@ -21469,6 +21473,8 @@ ix86_match_ccmode (rtx insn, machine_mode req_mode
     case E_CCZmode:
       break;
 
+    case E_CCGZmode:
+
     case E_CCAmode:
     case E_CCCmode:
     case E_CCOmode:
@@ -22177,6 +22183,52 @@ ix86_expand_branch (enum rtx_code code, rtx op0, r
             break;
           }
 
+        /* Emulate comparisons that do not depend on Zero flag with
+           double-word subtraction.  Note that only Overflow, Sign
+           and Carry flags are valid, so swap arguments and condition
+           of comparisons that would otherwise test Zero flag.  */
+
+        switch (code)
+          {
+          case LE: case LEU: case GT: case GTU:
+            std::swap (lo[0], lo[1]);
+            std::swap (hi[0], hi[1]);
+            code = swap_condition (code);
+            /* FALLTHRU */
+
+          case LT: case LTU: case GE: case GEU:
+            {
+              rtx (*cmp_insn) (rtx, rtx);
+              rtx (*sbb_insn) (rtx, rtx, rtx);
+
+              if (TARGET_64BIT)
+                cmp_insn = gen_cmpdi_1, sbb_insn = gen_subdi3_carry_ccgz;
+              else
+                cmp_insn = gen_cmpsi_1, sbb_insn = gen_subsi3_carry_ccgz;
+
+              if (!nonimmediate_operand (lo[0], submode))
+                lo[0] = force_reg (submode, lo[0]);
+              if (!x86_64_general_operand (lo[1], submode))
+                lo[1] = force_reg (submode, lo[1]);
+
+              if (!register_operand (hi[0], submode))
+                hi[0] = force_reg (submode, hi[0]);
+              if (!x86_64_general_operand (hi[1], submode))
+                hi[1] = force_reg (submode, hi[1]);
+
+              emit_insn (cmp_insn (lo[0], lo[1]));
+              emit_insn (sbb_insn (gen_rtx_SCRATCH (submode), hi[0], hi[1]));
+
+              tmp = gen_rtx_REG (CCGZmode, FLAGS_REG);
+
+              ix86_expand_branch (code, tmp, const0_rtx, label);
+              return;
+            }
+
+          default:
+            break;
+          }
+
         /* Otherwise, we need two or three jumps.  */
 
         label2 = gen_label_rtx ();
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 253855)
+++ config/i386/i386.md	(working copy)
@@ -6871,6 +6871,19 @@
    (set_attr "pent_pair" "pu")
    (set_attr "mode" "SI")])
 
+(define_insn "sub<mode>3_carry_ccgz"
+  [(set (reg:CCGZ FLAGS_REG)
+        (compare:CCGZ
+          (match_operand:DWIH 1 "register_operand" "0")
+          (plus:DWIH
+            (ltu:DWIH (reg:CC FLAGS_REG) (const_int 0))
+            (match_operand:DWIH 2 "x86_64_general_operand" "rme"))))
+   (clobber (match_scratch:DWIH 0 "=r"))]
+  ""
+  "sbb{<imodesuffix>}\t{%2, %0|%0, %2}"
+  [(set_attr "type" "alu")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "subborrow<mode>"
   [(set (reg:CCC FLAGS_REG)
         (compare:CCC
Index: config/i386/predicates.md
===================================================================
--- config/i386/predicates.md	(revision 253855)
+++ config/i386/predicates.md	(working copy)
@@ -1329,15 +1329,21 @@
   switch (code)
     {
     case EQ: case NE:
+      if (inmode == CCGZmode)
+        return false;
       return true;
-    case LT: case GE:
+    case GE: case LT:
       if (inmode == CCmode || inmode == CCGCmode
-          || inmode == CCGOCmode || inmode == CCNOmode)
+          || inmode == CCGOCmode || inmode == CCNOmode || inmode == CCGZmode)
         return true;
       return false;
-    case LTU: case GTU: case LEU: case GEU:
-      if (inmode == CCmode || inmode == CCCmode)
+    case GEU: case LTU:
+      if (inmode == CCGZmode)
         return true;
+      /* FALLTHRU */
+    case GTU: case LEU:
+      if (inmode == CCmode || inmode == CCCmode || inmode == CCGZmode)
+        return true;
       return false;
     case ORDERED: case UNORDERED:
       if (inmode == CCmode)
Index: testsuite/gcc.target/i386/pr82580.c
===================================================================
--- testsuite/gcc.target/i386/pr82580.c	(nonexistent)
+++ testsuite/gcc.target/i386/pr82580.c	(working copy)
@@ -0,0 +1,38 @@
+/* PR target/82580 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+#ifdef __SIZEOF_INT128__
+typedef unsigned __int128 U;
+typedef signed __int128 S;
+#else
+typedef unsigned long long U;
+typedef signed long long S;
+#endif
+void bar (void);
+int f0 (U x, U y) { return x == y; }
+int f1 (U x, U y) { return x != y; }
+int f2 (U x, U y) { return x > y; }
+int f3 (U x, U y) { return x >= y; }
+int f4 (U x, U y) { return x < y; }
+int f5 (U x, U y) { return x <= y; }
+int f6 (S x, S y) { return x == y; }
+int f7 (S x, S y) { return x != y; }
+int f8 (S x, S y) { return x > y; }
+int f9 (S x, S y) { return x >= y; }
+int f10 (S x, S y) { return x < y; }
+int f11 (S x, S y) { return x <= y; }
+void f12 (U x, U y) { if (x == y) bar (); }
+void f13 (U x, U y) { if (x != y) bar (); }
+void f14 (U x, U y) { if (x > y) bar (); }
+void f15 (U x, U y) { if (x >= y) bar (); }
+void f16 (U x, U y) { if (x < y) bar (); }
+void f17 (U x, U y) { if (x <= y) bar (); }
+void f18 (S x, S y) { if (x == y) bar (); }
+void f19 (S x, S y) { if (x != y) bar (); }
+void f20 (S x, S y) { if (x > y) bar (); }
+void f21 (S x, S y) { if (x >= y) bar (); }
+void f22 (S x, S y) { if (x < y) bar (); }
+void f23 (S x, S y) { if (x <= y) bar (); }
+
+/* { dg-final { scan-assembler-times "sbb" 16 } } */
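
For readers who want to check the flag reasoning behind CCGZmode, here is a minimal C sketch that is not part of the patch. It models a 64-bit value as two 32-bit words, computes the Carry, Sign and Overflow flags that a cmp on the low words followed by an sbb on the high words would leave behind, and derives the eight comparisons from them the same way the patch does (LT/LTU/GE/GEU directly, the other four by swapping the operands). All names (dw_flags, dw_cmp_sbb, dw_ltu, ...) are hypothetical and chosen for illustration only; the model assumes two's complement conversion between uint64_t and int64_t.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the flags left by "cmp lo0, lo1; sbb hi0, hi1".
   ZF is deliberately absent: it would describe the high word only.  */
struct dw_flags { bool cf, sf, of; };

struct dw_flags
dw_cmp_sbb (uint64_t a, uint64_t b)
{
  uint32_t lo0 = (uint32_t) a, hi0 = (uint32_t) (a >> 32);
  uint32_t lo1 = (uint32_t) b, hi1 = (uint32_t) (b >> 32);

  bool borrow = lo0 < lo1;             /* CF after cmp of the low words */
  uint32_t res = hi0 - hi1 - borrow;   /* sbb result, wraps modulo 2^32 */

  struct dw_flags f;
  f.cf = hi0 < hi1 || (hi0 == hi1 && borrow);  /* borrow out of sbb */
  f.sf = (res >> 31) != 0;                     /* sign of sbb result */
  f.of = (((hi0 ^ hi1) & (hi0 ^ res)) >> 31) != 0;  /* signed overflow */
  return f;
}

/* Conditions that need no swap; the suffix used by put_condition_code
   is given in the comment.  */
bool dw_ltu (uint64_t a, uint64_t b) { return dw_cmp_sbb (a, b).cf; }   /* "b" */
bool dw_geu (uint64_t a, uint64_t b) { return !dw_cmp_sbb (a, b).cf; }  /* "nb" */

bool
dw_lt (uint64_t a, uint64_t b)   /* "l": SF != OF */
{
  struct dw_flags f = dw_cmp_sbb (a, b);
  return f.sf != f.of;
}

bool
dw_ge (uint64_t a, uint64_t b)   /* "ge": SF == OF */
{
  struct dw_flags f = dw_cmp_sbb (a, b);
  return f.sf == f.of;
}

/* Conditions that would otherwise need ZF: swap operands and condition.  */
bool dw_gtu (uint64_t a, uint64_t b) { return dw_ltu (b, a); }
bool dw_leu (uint64_t a, uint64_t b) { return dw_geu (b, a); }
bool dw_gt (uint64_t a, uint64_t b) { return dw_lt (b, a); }
bool dw_le (uint64_t a, uint64_t b) { return dw_ge (b, a); }

/* Compare the model against native 64-bit comparisons on corner cases.  */
void
dw_self_check (void)
{
  static const uint64_t v[] = {
    0, 1, 0xffffffffULL, 0x100000000ULL, 0x1ffffffffULL,
    0x7fffffffffffffffULL, 0x8000000000000000ULL,
    0xffffffff00000000ULL, 0xffffffffffffffffULL
  };
  size_t n = sizeof v / sizeof v[0];

  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      {
        uint64_t a = v[i], b = v[j];
        int64_t sa = (int64_t) a, sb = (int64_t) b;

        assert (dw_ltu (a, b) == (a < b));
        assert (dw_geu (a, b) == (a >= b));
        assert (dw_gtu (a, b) == (a > b));
        assert (dw_leu (a, b) == (a <= b));
        assert (dw_lt (a, b) == (sa < sb));
        assert (dw_ge (a, b) == (sa >= sb));
        assert (dw_gt (a, b) == (sa > sb));
        assert (dw_le (a, b) == (sa <= sb));
      }
}
```

Calling dw_self_check () from a test driver exercises all eight conditions. The pair a = 0, b = 0xffffffffffffffff shows why the Zero flag cannot be trusted: the sbb result on the high words is 0 although the operands differ, which is the reason EQ and NE are rejected for CCGZmode.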