From patchwork Fri Dec  4 12:18:19 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Biener <richard.guenther@gmail.com>
X-Patchwork-Id: 552670
Return-Path: 
 <gcc-patches-return-416372-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 7AA9814029E
	for <incoming@patchwork.ozlabs.org>;
	Fri,  4 Dec 2015 23:18:37 +1100 (AEDT)
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key;
	unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org
	header.b=WY8NB6aC; dkim-atps=neutral
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:mime-version:in-reply-to:references:date:message-id:subject
	:from:to:cc:content-type; q=dns; s=default; b=HwSr3WvaGj1lGDOpLi
	sUAiiNdfODNOlW9XN2PpYU9Ze7q8M/Ex3jxP+qIj+7W5re8QUkqNTc3/+EM+EWh2
	pl455ZCqPxO1eFFeliP247LRcJuM8tUKlSqy1zVve+ws+D7UxwT30RfihDA/Bnup
	+S4n9bfZIl6TKcTwg9vKn67jQ=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:mime-version:in-reply-to:references:date:message-id:subject
	:from:to:cc:content-type; s=default; bh=5+3xK3Wt12DZZAGPyH8IbhsG
	lkU=; b=WY8NB6aC2j7TwtlD0vb9RkKHmCcxiLso+2+eJbA5IyC4GF4bv99kT9YZ
	DZsf/S1ECzlP5gNvWx4xCqy+FBrWusT4hPMfqBT6HJLSQSMLVtpSrvCi0nAGwxCq
	U3WEoNwFwrivGtbzGaPrjMazdh1a0cGwTcUeNNS259pp1JVCuKU=
Received: (qmail 89077 invoked by alias); 4 Dec 2015 12:18:25 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 88968 invoked by uid 89); 4 Dec 2015 12:18:25 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.0 required=5.0 tests=AWL, BAYES_00,
	FREEMAIL_FROM, RCVD_IN_DNSWL_LOW,
	SPF_PASS autolearn=ham version=3.3.2
X-HELO: mail-wm0-f45.google.com
Received: from mail-wm0-f45.google.com (HELO mail-wm0-f45.google.com)
	(74.125.82.45) by sourceware.org
	(qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-GCM-SHA256
	encrypted) ESMTPS; Fri, 04 Dec 2015 12:18:22 +0000
Received: by wmec201 with SMTP id c201so71467418wme.0 for
	<gcc-patches@gcc.gnu.org>; Fri, 04 Dec 2015 04:18:19 -0800 (PST)
MIME-Version: 1.0
X-Received: by 10.28.50.193 with SMTP id y184mr4994032wmy.26.1449231499624;
	Fri, 04 Dec 2015 04:18:19 -0800 (PST)
Received: by 10.28.221.196 with HTTP; Fri, 4 Dec 2015 04:18:19 -0800 (PST)
In-Reply-To: 
 <CAEoMCqQT9xxV-1sZPEQPfbuVrTvCVsCyWc4pEbiuph_tXMMqFw@mail.gmail.com>
References: 
 <CAEoMCqQT9xxV-1sZPEQPfbuVrTvCVsCyWc4pEbiuph_tXMMqFw@mail.gmail.com>
Date: Fri, 4 Dec 2015 13:18:19 +0100
Message-ID: 
 <CAFiYyc1BRb0-u5mmzT-M6PJk1JUVKLxJHMCJYoxi3f9ABjNhBw@mail.gmail.com>
Subject: Re: [PATCH PR68542]
From: Richard Biener <richard.guenther@gmail.com>
To: Yuri Rumyantsev <ysrumyan@gmail.com>
Cc: gcc-patches <gcc-patches@gcc.gnu.org>,
	Igor Zamyatin <izamyatin@gmail.com>,
	Kirill Yukhin <kirill.Yukhin@gmail.com>
X-IsSubscribed: yes

On Mon, Nov 30, 2015 at 2:11 PM, Yuri Rumyantsev <ysrumyan@gmail.com> wrote:
> Hi All,
>
> Here is a patch for 481.wrf preformance regression for avx2 which is
> sligthly modified mask store optimization. This transformation allows
> perform unpredication for semi-hammock containing masked stores, other
> words if we have a loop like
> for (i=0; i<n; i++)
>   if (c[i]) {
>     p1[i] += 1;
>     p2[i] = p3[i] +2;
>   }
>
> then it will be transformed to
>    if (!mask__ifc__42.18_165 == { 0, 0, 0, 0, 0, 0, 0, 0 }) {
>      vect__11.19_170 = MASK_LOAD (vectp_p1.20_168, 0B, mask__ifc__42.18_165);
>      vect__12.22_172 = vect__11.19_170 + vect_cst__171;
>      MASK_STORE (vectp_p1.23_175, 0B, mask__ifc__42.18_165, vect__12.22_172);
>      vect__18.25_182 = MASK_LOAD (vectp_p3.26_180, 0B, mask__ifc__42.18_165);
>      vect__19.28_184 = vect__18.25_182 + vect_cst__183;
>      MASK_STORE (vectp_p2.29_187, 0B, mask__ifc__42.18_165, vect__19.28_184);
>    }
> i.e. it will put all computations related to masked stores to semi-hammock.
>
> Bootstrapping and regression testing did not show any new failures.

Can you please split out the middle-end support for vector equality compares?

@@ -3448,10 +3448,17 @@ verify_gimple_comparison (tree type, tree op0, tree op1)
       if (TREE_CODE (op0_type) == VECTOR_TYPE
          || TREE_CODE (op1_type) == VECTOR_TYPE)
         {
-          error ("vector comparison returning a boolean");
-          debug_generic_expr (op0_type);
-          debug_generic_expr (op1_type);
-          return true;
+         /* Allow vector comparison returning boolean if operand types
+            are equal and CODE is EQ/NE.  */
+         if ((code != EQ_EXPR && code != NE_EXPR)
+             || !(VECTOR_BOOLEAN_TYPE_P (op0_type)
+                  || VECTOR_INTEGER_TYPE_P (op0_type)))
+           {
+             error ("type mismatch for vector comparison returning a boolean");
+             debug_generic_expr (op0_type);
+             debug_generic_expr (op1_type);
+             return true;
+           }
         }
     }

please merge the conditions with a &&

@@ -13888,6 +13888,25 @@ fold_relational_const (enum tree_code code,
tree type, tree op0, tree op1)

   if (TREE_CODE (op0) == VECTOR_CST && TREE_CODE (op1) == VECTOR_CST)
     {
+      if (INTEGRAL_TYPE_P (type)
+         && (TREE_CODE (type) == BOOLEAN_TYPE
+             || TYPE_PRECISION (type) == 1))
+       {
+         /* Have vector comparison with scalar boolean result.  */
+         bool result = true;
+         gcc_assert (code == EQ_EXPR || code == NE_EXPR);
+         gcc_assert (VECTOR_CST_NELTS (op0) == VECTOR_CST_NELTS (op1));
+         for (unsigned i = 0; i < VECTOR_CST_NELTS (op0); i++)
+           {
+             tree elem0 = VECTOR_CST_ELT (op0, i);
+             tree elem1 = VECTOR_CST_ELT (op1, i);
+             tree tmp = fold_relational_const (code, type, elem0, elem1);
+             result &= integer_onep (tmp);
+         if (code == NE_EXPR)
+           result = !result;
+         return constant_boolean_node (result, type);

... just assumes it is either EQ_EXPR or NE_EXPR.   I believe you want
to change the
guarding condition to just

   if (! VECTOR_TYPE_P (type))

and assert the boolean/precision.  Please also merge the asserts into
one with &&


I believe the comment should simply say that VRP doesn't track ranges for
vector types.

In the previous review I suggested you should make sure that RTL expansion
ends up using a well-defined optab for these compares.  To make sure
this happens across targets I suggest you make these comparisons available
via the GCC vector extension.  Thus allow

typedef int v4si __attribute__((vector_size(16)));

int foo (v4si a, v4si b)
{
  if (a == b)
    return 4;
}

and != and also using floating point vectors.

Otherwise it's hard to see the impact of this change.  Obvious choices
are the eq/ne optabs for FP compares and [u]cmp optabs for integer
compares.

A half-way implementation like your VRP comment suggests (only
==/!= zero against integer vectors is implemented?!) this doesn't sound
good without also limiting the feature this way in the verifier.

Btw, the regression with WRF is >50% on AMD Bulldozer (which only
has AVX, not AVX2).

Thanks,
Richard.

> ChangeLog:
> 2015-11-30  Yuri Rumyantsev  <ysrumyan@gmail.com>
>
> PR middle-end/68542
> * config/i386/i386.c (ix86_expand_branch): Implement integral vector
> comparison with boolean result.
> * config/i386/sse.md (define_expand "cbranch<mode>4): Add define-expand
> for vector comparion with eq/ne only.
> * fold-const.c (fold_relational_const): Add handling of vector
> comparison with boolean result.
> * tree-cfg.c (verify_gimple_comparison): Add argument CODE, allow
> comparison of vector operands with boolean result for EQ/NE only.
> (verify_gimple_assign_binary): Adjust call for verify_gimple_comparison.
> (verify_gimple_cond): Likewise.
> * tree-ssa-forwprop.c (combine_cond_expr_cond): Do not perform
> combining for non-compatible vector types.
> * tree-vect-loop.c (is_valid_sink): New function.
> (optimize_mask_stores): Likewise.
> * tree-vect-stmts.c (vectorizable_mask_load_store): Initialize
> has_mask_store field of vect_info.
> * tree-vectorizer.c (vectorize_loops): Invoke optimaze_mask_stores for
> vectorized loops having masked stores.
> * tree-vectorizer.h (loop_vec_info): Add new has_mask_store field and
> correspondent macros.
> (optimize_mask_stores): Add prototype.
> * tree-vrp.c (register_edge_assert_for): Do not handle NAME with vector
> type.
>
> gcc/testsuite/ChangeLog:
> * gcc.target/i386/avx2-vect-mask-store-move1.c: New test.

diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index b82ae3c..73ee3be 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -373,6 +373,11 @@ combine_cond_expr_cond (gimple *stmt, enum
tree_code code, tree type,

   gcc_assert (TREE_CODE_CLASS (code) == tcc_comparison);

+  /* Do not perform combining it types are not compatible.  */
+  if (TREE_CODE (TREE_TYPE (op0)) == VECTOR_TYPE
+      && !tree_int_cst_equal (TYPE_SIZE (type), TYPE_SIZE (TREE_TYPE (op0))))
+    return NULL_TREE;
+

again, how does this happen?

diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index e67048e..1605520c 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -5760,6 +5760,12 @@ register_edge_assert_for (tree name, edge e,
gimple_stmt_iterator si,
                                                &comp_code, &val))
     return;

+  /* Use of vector comparison in gcond is very restricted and used to check
+     that the mask in masked store is zero, so assert for such comparison
+     is not implemented yet.  */
+  if (TREE_CODE (TREE_TYPE (name)) == VECTOR_TYPE)
+    return;
+

VECTOR_TYPE_P