From patchwork Sat Oct 19 05:53:55 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 1179734
Date: Sat, 19 Oct 2019 07:53:55 +0200
From: Jakub Jelinek
To: Uros Bizjak
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Improve code generation of v += (c == 0) etc. on x86 (PR target/92140)
Message-ID: <20191019055355.GK2116@tucnak>
Reply-To: Jakub Jelinek
Content-Disposition: inline
User-Agent: Mutt/1.11.3 (2019-02-01)

Hi!

As mentioned in the PR, x == 0 can be equivalently tested as x < 1U. The
latter form has the advantage that it sets the carry flag, so it is a win
whenever the result is consumed by an instruction that can use the carry
flag directly.

The following patch adds a couple of (pre-reload only)
define_insn_and_split patterns to handle the most common cases.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-10-18  Jakub Jelinek
            Uroš Bizjak

        PR target/92140
        * config/i386/predicates.md (int_nonimmediate_operand): New special
        predicate.
        * config/i386/i386.md (*add3_eq, *add3_ne, *add3_eq_0, *add3_ne_0,
        *sub3_eq, *sub3_ne, *sub3_eq_1, *sub3_eq_0, *sub3_ne_0): New
        define_insn_and_split patterns.

        * gcc.target/i386/pr92140.c: New test.
        * gcc.c-torture/execute/pr92140.c: New test.

        Jakub

--- gcc/config/i386/predicates.md.jj 2019-10-07 13:09:06.486261815 +0200
+++ gcc/config/i386/predicates.md 2019-10-18 15:47:50.781855838 +0200
@@ -100,6 +100,15 @@ (define_special_predicate "ext_register_
            (match_test "GET_MODE (op) == SImode")
            (match_test "GET_MODE (op) == HImode"))))
 
+;; Match a DI, SI, HI or QImode nonimmediate_operand.
+(define_special_predicate "int_nonimmediate_operand"
+  (and (match_operand 0 "nonimmediate_operand")
+       (ior (and (match_test "TARGET_64BIT")
+                 (match_test "GET_MODE (op) == DImode"))
+            (match_test "GET_MODE (op) == SImode")
+            (match_test "GET_MODE (op) == HImode")
+            (match_test "GET_MODE (op) == QImode"))))
+
 ;; Match register operands, but include memory operands for TARGET_SSE_MATH.
 (define_predicate "register_ssemem_operand"
   (if_then_else
--- gcc/config/i386/i386.md.jj 2019-09-20 12:25:48.000000000 +0200
+++ gcc/config/i386/i386.md 2019-10-18 15:52:22.697717013 +0200
@@ -6843,6 +6843,228 @@ (define_insn "*addsi3_zext_cc_overflow_2
   [(set_attr "type" "alu")
    (set_attr "mode" "SI")])
 
+;; x == 0 with zero flag test can be done also as x < 1U with carry flag
+;; test, where the latter is preferable if we have some carry consuming
+;; instruction.
+;; For x != 0, we need to use x < 1U with negation of carry, i.e.
+;; + (1 - CF).
+(define_insn_and_split "*add3_eq"
+  [(set (match_operand:SWI 0 "nonimmediate_operand")
+        (plus:SWI
+          (plus:SWI
+            (eq:SWI (match_operand 3 "int_nonimmediate_operand") (const_int 0))
+            (match_operand:SWI 1 "nonimmediate_operand"))
+          (match_operand:SWI 2 "")))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_binary_operator_ok (PLUS, mode, operands)
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (reg:CC FLAGS_REG)
+        (compare:CC (match_dup 3) (const_int 1)))
+   (parallel [(set (match_dup 0)
+                   (plus:SWI
+                     (plus:SWI (ltu:SWI (reg:CC FLAGS_REG) (const_int 0))
+                               (match_dup 1))
+                     (match_dup 2)))
+              (clobber (reg:CC FLAGS_REG))])])
+
+(define_insn_and_split "*add3_ne"
+  [(set (match_operand:SWI 0 "nonimmediate_operand")
+        (plus:SWI
+          (plus:SWI
+            (ne:SWI (match_operand 3 "int_nonimmediate_operand") (const_int 0))
+            (match_operand:SWI 1 "nonimmediate_operand"))
+          (match_operand:SWI 2 "")))
+   (clobber (reg:CC FLAGS_REG))]
+  "CONST_INT_P (operands[2])
+   && (mode != DImode
+       || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000))
+   && ix86_binary_operator_ok (PLUS, mode, operands)
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (reg:CC FLAGS_REG)
+        (compare:CC (match_dup 3) (const_int 1)))
+   (parallel [(set (match_dup 0)
+                   (minus:SWI
+                     (minus:SWI (match_dup 1)
+                                (ltu:SWI (reg:CC FLAGS_REG) (const_int 0)))
+                     (match_dup 2)))
+              (clobber (reg:CC FLAGS_REG))])]
+{
+  operands[2] = gen_int_mode (~INTVAL (operands[2]),
+                              mode == DImode ? SImode : mode);
+})
+
+(define_insn_and_split "*add3_eq_0"
+  [(set (match_operand:SWI 0 "nonimmediate_operand")
+        (plus:SWI
+          (eq:SWI (match_operand 2 "int_nonimmediate_operand") (const_int 0))
+          (match_operand:SWI 1 "")))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_unary_operator_ok (PLUS, mode, operands)
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (reg:CC FLAGS_REG)
+        (compare:CC (match_dup 2) (const_int 1)))
+   (parallel [(set (match_dup 0)
+                   (plus:SWI (ltu:SWI (reg:CC FLAGS_REG) (const_int 0))
+                             (match_dup 1)))
+              (clobber (reg:CC FLAGS_REG))])]
+{
+  if (!nonimmediate_operand (operands[1], mode))
+    operands[1] = force_reg (mode, operands[1]);
+})
+
+(define_insn_and_split "*add3_ne_0"
+  [(set (match_operand:SWI 0 "nonimmediate_operand")
+        (plus:SWI
+          (ne:SWI (match_operand 2 "int_nonimmediate_operand") (const_int 0))
+          (match_operand:SWI 1 "")))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_unary_operator_ok (PLUS, mode, operands)
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (reg:CC FLAGS_REG)
+        (compare:CC (match_dup 2) (const_int 1)))
+   (parallel [(set (match_dup 0)
+                   (minus:SWI (minus:SWI
+                                (match_dup 1)
+                                (ltu:SWI (reg:CC FLAGS_REG) (const_int 0)))
+                              (const_int -1)))
+              (clobber (reg:CC FLAGS_REG))])]
+{
+  if (!nonimmediate_operand (operands[1], mode))
+    operands[1] = force_reg (mode, operands[1]);
+})
+
+(define_insn_and_split "*sub3_eq"
+  [(set (match_operand:SWI 0 "nonimmediate_operand")
+        (minus:SWI
+          (minus:SWI
+            (match_operand:SWI 1 "nonimmediate_operand")
+            (eq:SWI (match_operand 3 "int_nonimmediate_operand")
+                    (const_int 0)))
+          (match_operand:SWI 2 "")))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_binary_operator_ok (MINUS, mode, operands)
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (reg:CC FLAGS_REG)
+        (compare:CC (match_dup 3) (const_int 1)))
+   (parallel [(set (match_dup 0)
+                   (minus:SWI
+                     (minus:SWI (match_dup 1)
+                                (ltu:SWI (reg:CC FLAGS_REG) (const_int 0)))
+                     (match_dup 2)))
+              (clobber (reg:CC FLAGS_REG))])])
+
+(define_insn_and_split "*sub3_ne"
+  [(set (match_operand:SWI 0 "nonimmediate_operand")
+        (plus:SWI
+          (minus:SWI
+            (match_operand:SWI 1 "nonimmediate_operand")
+            (ne:SWI (match_operand 3 "int_nonimmediate_operand")
+                    (const_int 0)))
+          (match_operand:SWI 2 "")))
+   (clobber (reg:CC FLAGS_REG))]
+  "CONST_INT_P (operands[2])
+   && (mode != DImode
+       || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000))
+   && ix86_binary_operator_ok (MINUS, mode, operands)
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (reg:CC FLAGS_REG)
+        (compare:CC (match_dup 3) (const_int 1)))
+   (parallel [(set (match_dup 0)
+                   (plus:SWI
+                     (plus:SWI (ltu:SWI (reg:CC FLAGS_REG) (const_int 0))
+                               (match_dup 1))
+                     (match_dup 2)))
+              (clobber (reg:CC FLAGS_REG))])]
+{
+  operands[2] = gen_int_mode (INTVAL (operands[2]) - 1,
+                              mode == DImode ? SImode : mode);
+})
+
+(define_insn_and_split "*sub3_eq_1"
+  [(set (match_operand:SWI 0 "nonimmediate_operand")
+        (plus:SWI
+          (minus:SWI
+            (match_operand:SWI 1 "nonimmediate_operand")
+            (eq:SWI (match_operand 3 "int_nonimmediate_operand")
+                    (const_int 0)))
+          (match_operand:SWI 2 "")))
+   (clobber (reg:CC FLAGS_REG))]
+  "CONST_INT_P (operands[2])
+   && (mode != DImode
+       || INTVAL (operands[2]) != HOST_WIDE_INT_C (-0x80000000))
+   && ix86_binary_operator_ok (MINUS, mode, operands)
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (reg:CC FLAGS_REG)
+        (compare:CC (match_dup 3) (const_int 1)))
+   (parallel [(set (match_dup 0)
+                   (minus:SWI
+                     (minus:SWI (match_dup 1)
+                                (ltu:SWI (reg:CC FLAGS_REG) (const_int 0)))
+                     (match_dup 2)))
+              (clobber (reg:CC FLAGS_REG))])]
+{
+  operands[2] = gen_int_mode (-INTVAL (operands[2]),
+                              mode == DImode ? SImode : mode);
+})
+
+(define_insn_and_split "*sub3_eq_0"
+  [(set (match_operand:SWI 0 "nonimmediate_operand")
+        (minus:SWI
+          (match_operand:SWI 1 "")
+          (eq:SWI (match_operand 2 "int_nonimmediate_operand") (const_int 0))))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_unary_operator_ok (MINUS, mode, operands)
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (reg:CC FLAGS_REG)
+        (compare:CC (match_dup 2) (const_int 1)))
+   (parallel [(set (match_dup 0)
+                   (minus:SWI (match_dup 1)
+                              (ltu:SWI (reg:CC FLAGS_REG) (const_int 0))))
+              (clobber (reg:CC FLAGS_REG))])]
+{
+  if (!nonimmediate_operand (operands[1], mode))
+    operands[1] = force_reg (mode, operands[1]);
+})
+
+(define_insn_and_split "*sub3_ne_0"
+  [(set (match_operand:SWI 0 "nonimmediate_operand")
+        (minus:SWI
+          (match_operand:SWI 1 "")
+          (ne:SWI (match_operand 2 "int_nonimmediate_operand") (const_int 0))))
+   (clobber (reg:CC FLAGS_REG))]
+  "ix86_unary_operator_ok (MINUS, mode, operands)
+   && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(set (reg:CC FLAGS_REG)
+        (compare:CC (match_dup 2) (const_int 1)))
+   (parallel [(set (match_dup 0)
+                   (plus:SWI (plus:SWI
+                               (ltu:SWI (reg:CC FLAGS_REG) (const_int 0))
+                               (match_dup 1))
+                             (const_int -1)))
+              (clobber (reg:CC FLAGS_REG))])]
+{
+  if (!nonimmediate_operand (operands[1], mode))
+    operands[1] = force_reg (mode, operands[1]);
+})
+
 ;; The patterns that match these are at the end of this file.
 
 (define_expand "xf3"
--- gcc/testsuite/gcc.target/i386/pr92140.c.jj 2019-10-18 15:21:26.347972472 +0200
+++ gcc/testsuite/gcc.target/i386/pr92140.c 2019-10-18 15:41:10.748944727 +0200
@@ -0,0 +1,38 @@
+/* PR target/92140 */
+/* { dg-do compile { target nonpic } } */
+/* { dg-options "-O2 -mtune=generic -masm=att" } */
+/* { dg-additional-options "-mregparm=1" { target ia32 } } */
+/* { dg-final { scan-assembler-times "\tsbbl\t\\\$-1, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tadcl\t\\\$-1, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tadcl\t\\\$0, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tsbbl\t\\\$0, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tsbbl\t\\\$25, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tadcl\t\\\$25, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tadcl\t\\\$-26, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tsbbl\t\\\$-26, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tsbbl\t\\\$-43, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tadcl\t\\\$-43, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tadcl\t\\\$42, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tsbbl\t\\\$42, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tadcl\t%\[a-z0-9]*, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tsbbl\t%\[a-z0-9]*, v" 1 } } */
+/* { dg-final { scan-assembler-times "\tsbbl\t\\\$-1, %" 1 } } */
+
+char c;
+int v;
+
+__attribute__((noipa)) void f1 (void) { v += c != 0; }
+__attribute__((noipa)) void f2 (void) { v -= c != 0; }
+__attribute__((noipa)) void f3 (void) { v += c == 0; }
+__attribute__((noipa)) void f4 (void) { v -= c == 0; }
+__attribute__((noipa)) void f5 (void) { v += (c != 0) - 26; }
+__attribute__((noipa)) void f6 (void) { v -= (c != 0) - 26; }
+__attribute__((noipa)) void f7 (void) { v += (c == 0) - 26; }
+__attribute__((noipa)) void f8 (void) { v -= (c == 0) - 26; }
+__attribute__((noipa)) void f9 (void) { v += (c != 0) + 42; }
+__attribute__((noipa)) void f10 (void) { v -= (c != 0) + 42; }
+__attribute__((noipa)) void f11 (void) { v += (c == 0) + 42; }
+__attribute__((noipa)) void f12 (void) { v -= (c == 0) + 42; }
+__attribute__((noipa)) void f13 (int z) { v += (c == 0) + z; }
+__attribute__((noipa)) void f14 (int z) { v -= (c == 0) + z; }
+__attribute__((noipa)) unsigned int f15 (unsigned int n) { return n ? 2 : 1; }
--- gcc/testsuite/gcc.c-torture/execute/pr92140.c.jj 2019-10-18 14:13:57.787580586 +0200
+++ gcc/testsuite/gcc.c-torture/execute/pr92140.c 2019-10-18 15:20:27.629866214 +0200
@@ -0,0 +1,83 @@
+/* PR target/92140 */
+
+char c;
+int v;
+
+__attribute__((noipa)) void f1 (void) { v += c != 0; }
+__attribute__((noipa)) void f2 (void) { v -= c != 0; }
+__attribute__((noipa)) void f3 (void) { v += c == 0; }
+__attribute__((noipa)) void f4 (void) { v -= c == 0; }
+__attribute__((noipa)) void f5 (void) { v += (c != 0) - 26; }
+__attribute__((noipa)) void f6 (void) { v -= (c != 0) - 26; }
+__attribute__((noipa)) void f7 (void) { v += (c == 0) - 26; }
+__attribute__((noipa)) void f8 (void) { v -= (c == 0) - 26; }
+__attribute__((noipa)) void f9 (void) { v += (c != 0) + 42; }
+__attribute__((noipa)) void f10 (void) { v -= (c != 0) + 42; }
+__attribute__((noipa)) void f11 (void) { v += (c == 0) + 42; }
+__attribute__((noipa)) void f12 (void) { v -= (c == 0) + 42; }
+__attribute__((noipa)) void f13 (int z) { v += (c == 0) + z; }
+__attribute__((noipa)) void f14 (int z) { v -= (c == 0) + z; }
+__attribute__((noipa)) unsigned int f15 (unsigned int n) { return n ? 2 : 1; }
+
+int
+main ()
+{
+  int i;
+  for (i = 0; i < 2; i++)
+    {
+      v = 15;
+      if (i == 1)
+        c = 37;
+      f1 ();
+      if (v != 15 + i)
+        __builtin_abort ();
+      f2 ();
+      if (v != 15)
+        __builtin_abort ();
+      f3 ();
+      if (v != 16 - i)
+        __builtin_abort ();
+      f4 ();
+      if (v != 15)
+        __builtin_abort ();
+      f5 ();
+      if (v != 15 + i - 26)
+        __builtin_abort ();
+      f6 ();
+      if (v != 15)
+        __builtin_abort ();
+      f7 ();
+      if (v != 16 - i - 26)
+        __builtin_abort ();
+      f8 ();
+      if (v != 15)
+        __builtin_abort ();
+      f9 ();
+      if (v != 15 + i + 42)
+        __builtin_abort ();
+      f10 ();
+      if (v != 15)
+        __builtin_abort ();
+      f11 ();
+      if (v != 16 - i + 42)
+        __builtin_abort ();
+      f12 ();
+      if (v != 15)
+        __builtin_abort ();
+      f13 (173);
+      if (v != 16 - i + 173)
+        __builtin_abort ();
+      f14 (173);
+      if (v != 15)
+        __builtin_abort ();
+      f13 (-35);
+      if (v != 16 - i - 35)
+        __builtin_abort ();
+      f14 (-35);
+      if (v != 15)
+        __builtin_abort ();
+    }
+  if (f15 (0) != 1 || f15 (1) != 2 || f15 (371) != 2)
+    __builtin_abort ();
+  return 0;
+}
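
For readers who want to check the arithmetic behind the splitters outside
of GCC, the following is a minimal sketch in plain C, not part of the patch.
It models the carry flag after comparing x with 1 as x < 1U and models
adc/sbb as a + b + CF and a - b - CF. All helper names (cf_after_cmp1, adc,
sbb, add_eq, add_ne, sub_eq, sub_ne, count_mismatches) are hypothetical and
exist only for illustration; the constant adjustments (~k, -k, k - 1) are
meant to mirror what the patterns above do to operands[2], using 32-bit
wrapping arithmetic as an assumption.

```c
#include <stdint.h>

/* Hypothetical model of the carry flag after comparing x with 1:
   CF is the unsigned borrow of x - 1, i.e. x < 1U, which holds
   exactly when x == 0.  */
static uint32_t cf_after_cmp1 (uint32_t x) { return x < 1U; }

/* Models of the carry consumers: adc computes a + b + CF,
   sbb computes a - b - CF (both wrap modulo 2^32 here).  */
static uint32_t adc (uint32_t a, uint32_t b, uint32_t cf) { return a + b + cf; }
static uint32_t sbb (uint32_t a, uint32_t b, uint32_t cf) { return a - b - cf; }

/* v + (x == 0) + k  as  adc with k.  */
static uint32_t add_eq (uint32_t v, uint32_t x, uint32_t k)
{ return adc (v, k, cf_after_cmp1 (x)); }

/* v + (x != 0) + k  as  sbb with ~k, since (x != 0) == 1 - CF.  */
static uint32_t add_ne (uint32_t v, uint32_t x, uint32_t k)
{ return sbb (v, ~k, cf_after_cmp1 (x)); }

/* v - (x == 0) + k  as  sbb with -k.  */
static uint32_t sub_eq (uint32_t v, uint32_t x, uint32_t k)
{ return sbb (v, -k, cf_after_cmp1 (x)); }

/* v - (x != 0) + k  as  adc with k - 1.  */
static uint32_t sub_ne (uint32_t v, uint32_t x, uint32_t k)
{ return adc (v, k - 1, cf_after_cmp1 (x)); }

/* Count the inputs on which a rewritten form disagrees with the
   direct C expression; the constants echo those used in the tests.  */
unsigned count_mismatches (void)
{
  static const uint32_t ks[] = { 0, 42, (uint32_t) -26, 173, (uint32_t) -35 };
  unsigned bad = 0;
  for (uint32_t x = 0; x < 256; x++)
    for (unsigned i = 0; i < sizeof ks / sizeof ks[0]; i++)
      {
        uint32_t v = 15, k = ks[i];
        bad += add_eq (v, x, k) != v + (x == 0) + k;
        bad += add_ne (v, x, k) != v + (x != 0) + k;
        bad += sub_eq (v, x, k) != v - (x == 0) + k;
        bad += sub_ne (v, x, k) != v - (x != 0) + k;
      }
  return bad;
}
```

Calling count_mismatches () from a small main and checking that it returns 0
is enough to exercise the sketch. With k == -26 the add_ne form subtracts
~(-26) == 25 and the sub_ne form adds -26 - 1 == -27; with k == 26 the
sub_ne form adds 25. This is only a model of the arithmetic, not of the RTL
matching or of the DImode immediate restrictions in the patterns.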