From patchwork Mon Oct 31 22:29:21 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek <jakub@redhat.com>
X-Patchwork-Id: 122987
Return-Path: 
 <gcc-patches-return-306105-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	by ozlabs.org (Postfix) with SMTP id 3F282B6F94
	for <incoming@patchwork.ozlabs.org>;
	Tue,  1 Nov 2011 09:29:47 +1100 (EST)
Received: (qmail 4820 invoked by alias); 31 Oct 2011 22:29:44 -0000
Received: (qmail 4811 invoked by uid 22791); 31 Oct 2011 22:29:43 -0000
X-SWARE-Spam-Status: No, hits=-7.2 required=5.0	tests=AWL, BAYES_00,
	RCVD_IN_DNSWL_HI, RP_MATCHES_RCVD, SPF_HELO_PASS
X-Spam-Check-By: sourceware.org
Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by
	sourceware.org (qpsmtpd/0.43rc1) with ESMTP;
	Mon, 31 Oct 2011 22:29:23 +0000
Received: from int-mx09.intmail.prod.int.phx2.redhat.com
	(int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22])	by
	mx1.redhat.com (8.14.4/8.14.4) with ESMTP id
	p9VMTMUb030104	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA
	bits=256 verify=OK); Mon, 31 Oct 2011 18:29:22 -0400
Received: from tyan-ft48-01.lab.bos.redhat.com
	(tyan-ft48-01.lab.bos.redhat.com [10.16.42.4])	by
	int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4)
	with ESMTP id p9VMTLwq017041	(version=TLSv1/SSLv3
	cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Mon, 31 Oct 2011 18:29:22 -0400
Received: from tyan-ft48-01.lab.bos.redhat.com
	(tyan-ft48-01.lab.bos.redhat.com [127.0.0.1])	by
	tyan-ft48-01.lab.bos.redhat.com (8.14.4/8.14.4) with ESMTP id
	p9VMTLXD029393; Mon, 31 Oct 2011 23:29:21 +0100
Received: (from jakub@localhost)	by tyan-ft48-01.lab.bos.redhat.com
	(8.14.4/8.14.4/Submit) id p9VMTLeR029391;
	Mon, 31 Oct 2011 23:29:21 +0100
Date: Mon, 31 Oct 2011 23:29:21 +0100
From: Jakub Jelinek <jakub@redhat.com>
To: Richard Henderson <rth@redhat.com>, Uros Bizjak <ubizjak@gmail.com>
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Add fixuns_trunc<mode><sseintvecmodelower>2
Message-ID: <20111031222920.GH1052@tyan-ft48-01.lab.bos.redhat.com>
Reply-To: Jakub Jelinek <jakub@redhat.com>
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-IsSubscribed: yes
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org

Hi!

This patch allows float -> uint conversions to be vectorized.
To convert V{4,8}SFmode op0 to V{4,8}SImode target, it emits:
  V{4,8}SFmode mask = op0 >= { INT_MAX + 1U + .0f, INT_MAX + 1U + .0f, ... }	// non-signalling GE
  V{4,8}SFmode tmp1 = mask & { 2.0f * INT_MIN, 2.0f * INT_MIN, ... }
  V{4,8}SFmode tmp2 = op0 + tmp1
  V{4,8}SImode target = (V{4,8}SImode) tmp2
TARGET_AVX is required because pre-AVX cmpps has no non-signalling GE
predicate, and we don't want to raise exceptions when op0 is a QNaN (the
scalar code uses vucomiss).
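
For illustration, the per-lane arithmetic of the sequence above can be
modelled in scalar C.  This is only a sketch and not part of the patch:
lane_fixuns_trunc is a hypothetical name, it assumes IEEE single precision
and an input in [0, 2^32), and it models only the resulting value, not the
exception behaviour of the vector compare.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not in the patch): scalar model of one lane.  */
static uint32_t
lane_fixuns_trunc (float op0)
{
  const float two31 = 2147483648.0f;	/* INT_MAX + 1U + .0f */
  const float mtwo32 = -4294967296.0f;	/* 2.0f * INT_MIN */
  uint32_t mask, bits;
  float tmp1, tmp2;

  /* mask = op0 >= 2^31: all-ones or all-zeros, like a cmpps result.  */
  mask = (op0 >= two31) ? 0xffffffffu : 0u;

  /* tmp1 = mask & -2^32, done on the bit pattern of the float.  */
  memcpy (&bits, &mtwo32, sizeof bits);
  bits &= mask;
  memcpy (&tmp1, &bits, sizeof tmp1);

  /* tmp2 = op0 + tmp1; exact, and within signed 32-bit range for
     inputs in [0, 2^32).  */
  tmp2 = op0 + tmp1;

  /* Signed truncating conversion; reinterpreting the result as
     unsigned yields the wanted value.  */
  return (uint32_t) (int32_t) tmp2;
}
```

For example, an input of 3000000000.0f is >= 2^31, so -2^32 is added,
giving -1294967296.0f; the signed conversion of that value has the same
bit pattern as the unsigned integer 3000000000.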

Ok for trunk?

2011-10-31  Jakub Jelinek  <jakub@redhat.com>

	* config/i386/sse.md (fixuns_trunc<mode><sseintvecmodelower>2): New
	expander.


	Jakub
--- gcc/config/i386/sse.md.jj	2011-10-31 21:05:21.000000000 +0100
+++ gcc/config/i386/sse.md	2011-10-31 22:53:13.000000000 +0100
@@ -2322,6 +2322,35 @@ (define_insn "fix_truncv4sfv4si2"
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "TI")])
 
+(define_expand "fixuns_trunc<mode><sseintvecmodelower>2"
+  [(set (match_dup 4)
+	(unspec:VF1
+	  [(match_operand:VF1 1 "register_operand" "")
+	   (match_dup 2)
+	   (const_int 29)] UNSPEC_PCMP))
+   (set (match_dup 5)
+	(and:VF1 (match_dup 4) (match_dup 3)))
+   (set (match_dup 6)
+	(plus:VF1 (match_dup 1) (match_dup 5)))
+   (set (match_operand:<sseintvecmode> 0 "register_operand" "")
+	(fix:<sseintvecmode> (match_dup 6)))]
+  "TARGET_AVX"
+{
+  REAL_VALUE_TYPE MTWO32r, TWO31r;
+  int i;
+
+  real_ldexp (&TWO31r, &dconst1, 31);
+  operands[2] = const_double_from_real_value (TWO31r, SFmode);
+  operands[2] = ix86_build_const_vector (<MODE>mode, 1, operands[2]);
+  operands[2] = force_reg (<MODE>mode, operands[2]);
+  real_ldexp (&MTWO32r, &dconstm1, 32);
+  operands[3] = const_double_from_real_value (MTWO32r, SFmode);
+  operands[3] = ix86_build_const_vector (<MODE>mode, 1, operands[3]);
+  operands[3] = force_reg (<MODE>mode, operands[3]);
+  for (i = 4; i < 7; i++)
+    operands[i] = gen_reg_rtx (<MODE>mode);
+})
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
 ;; Parallel double-precision floating point conversion operations