From: Marc Glisse
To: gcc-patches@gcc.gnu.org
Date: Wed, 2 May 2012 23:46:39 +0200 (CEST)
Subject: [i386] access subvectors
X-Patchwork-Submitter: Marc Glisse
X-Patchwork-Id: 156559

Hello,

I definitely don't expect the attached patch to be accepted, but I would like some advice on which direction to take, and a patch that passes the testsuite and performs the optimization I want on a couple of testcases seems like it may help start the conversation. This is the first time I have even looked at .md files...

The goal is to optimize:

  v8sf x;
  v4sf y=*(v4sf*)&x;

so that the compiler doesn't copy x to memory (yes, I know there is an intrinsic to do that). If I understood Richard Guenther's comment in the PR correctly, this can be optimized in the back-end. The only place I found to put this kind of transformation is a define_peephole2. I couldn't figure out how to test whether 2 memory operands correspond to the same address when they have different types (so match_dup is unhappy). For some reason the XEXP(*,0) comparison said yes on my test and no when using an unrelated piece of memory, but it looks like a nonsense test that just happens to work on a couple of trivial examples. Any help?

2012-05-02  Marc Glisse

	PR target/53101

gcc/
	* config/i386/sse.md: New peephole2 for subvectors.

gcc/testsuite/
	* gcc.target/i386/pr53101.c: New test.

Index: gcc/testsuite/gcc.target/i386/pr53101.c
===================================================================
--- gcc/testsuite/gcc.target/i386/pr53101.c	(revision 0)
+++ gcc/testsuite/gcc.target/i386/pr53101.c	(revision 0)
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx" } */
+
+typedef double v2df __attribute__ ((vector_size (16)));
+typedef double v4df __attribute__ ((vector_size (32)));
+typedef double v4si __attribute__ ((vector_size (16)));
+typedef double v8si __attribute__ ((vector_size (32)));
+
+v4si
+avx_extract_v4si (v8si x)
+{
+  return *(v4si*)&x;
+}
+
+v2df
+avx_extract_v2df (v4df x __attribute((unused)), v4df y)
+{
+  return *(v2df*)&y;
+}
+
+/* { dg-final { scan-assembler-not "movdq" } } */
+/* { dg-final { scan-assembler-times "movapd" 1 } } */

Property changes on: gcc/testsuite/gcc.target/i386/pr53101.c
___________________________________________________________________
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: gcc/config/i386/sse.md
===================================================================
--- gcc/config/i386/sse.md	(revision 187012)
+++ gcc/config/i386/sse.md	(working copy)
@@ -4104,10 +4104,34 @@
     emit_move_insn (operands[0],
 		    adjust_address (operands[1], SFmode, i*4));
   DONE;
 })
 
+;; This is how we receive accesses to the first half of a vector.
+(define_peephole2
+  [(set (match_operand:VI8F_256 3 "memory_operand")
+	(match_operand:VI8F_256 1 "register_operand"))
+   (set (match_operand: 0 "register_operand")
+	(match_operand: 2 "memory_operand"))]
+  "TARGET_AVX && rtx_equal_p (XEXP (operands[2], 0), XEXP (operands[3], 0))"
+  [(set (match_dup 0)
+	(vec_select: (match_dup 1)
+	  (parallel [(const_int 0) (const_int 1)])))]
+)
+
+(define_peephole2
+  [(set (match_operand:VI4F_256 3 "memory_operand")
+	(match_operand:VI4F_256 1 "register_operand"))
+   (set (match_operand: 0 "register_operand")
+	(match_operand: 2 "memory_operand"))]
+  "TARGET_AVX && rtx_equal_p (XEXP (operands[2], 0), XEXP (operands[3], 0))"
+  [(set (match_dup 0)
+	(vec_select: (match_dup 1)
+	  (parallel [(const_int 0) (const_int 1)
+		     (const_int 2) (const_int 3)])))]
+)
+
 (define_expand "avx_vextractf128"
   [(match_operand: 0 "nonimmediate_operand")
    (match_operand:V_256 1 "register_operand")
    (match_operand:SI 2 "const_0_to_1_operand")]
   "TARGET_AVX"
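
For readers who want to try the source-level pattern from the message outside the testsuite, here is a minimal C sketch. It rests on some assumptions: it uses the GCC/Clang generic vector extension rather than AVX intrinsics, it passes the vector by pointer to avoid ABI warnings when AVX is not enabled, and the names low_half_cast, low_half_memcpy and check_low_half are hypothetical helpers that are not part of the patch. It only shows that the pointer-cast form reads lanes 0..3 of the wide vector; it says nothing about the code the compiler generates, with or without the peephole.

```c
#include <assert.h>
#include <string.h>

/* Generic vector types (GCC/Clang extension), named as in the message.  */
typedef float v4sf __attribute__ ((vector_size (16)));
typedef float v8sf __attribute__ ((vector_size (32)));

/* The access pattern under discussion: read the first 16 bytes of a
   32-byte vector through a pointer cast.  Hypothetical helper name.  */
static v4sf
low_half_cast (const v8sf *x)
{
  return *(const v4sf *) x;
}

/* The same 16 bytes, copied explicitly.  Hypothetical helper name.  */
static v4sf
low_half_memcpy (const v8sf *x)
{
  v4sf y;
  memcpy (&y, x, sizeof y);
  return y;
}

/* Self-check: both forms must return lanes 0..3 of x.  */
static void
check_low_half (void)
{
  v8sf x = { 0, 1, 2, 3, 4, 5, 6, 7 };
  v4sf a = low_half_cast (&x);
  v4sf b = low_half_memcpy (&x);
  for (int i = 0; i < 4; i++)
    {
      assert (a[i] == x[i]);
      assert (b[i] == x[i]);
    }
}
```

Calling check_low_half () from a small driver is enough to exercise both forms. Comparing the assembly for low_half_cast at -O2 -mavx with and without the patch would be one way to see whether the store and reload of x disappear.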