From patchwork Wed Dec 18 15:11:52 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 302909 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 320C12C00AD for ; Thu, 19 Dec 2013 02:12:11 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; q=dns; s=default; b=URrMyzMYsqR+fsa5NZXiI3xRc1m8C 8bXFNngsBypCxRyPzillMUIFRorSGxf4BEpMo+PXL+SdFqDmWW9kERZY1bqN14Uu UDfHPIEU0VL4pNdM9tj3z6xT1Fcupq63166QYD4WvO0raJLRHx4/3YqDCwaGDvH1 zigirgKjCp6Nw0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:cc:subject:message-id:reply-to:mime-version :content-type; s=default; bh=9NI+unHYc05Qj/eLSQmBYHiVEGQ=; b=t+E 71gm+XZ4VyfJp3liihEV7nNJey+/iYRUE5hQGcOdC5Ct4AsNuS9CL5Wt7pd6dOf+ TqakoQkgPzvqOPDBINKOlSNRGo/b5AtUY+bLw2rrK/slYSzG2qJry5lY/ZtOiamL b1D3zXnwP2Bt1T7dupBC/JV7sPQYKNvgo2jYHJRM= Received: (qmail 19961 invoked by alias); 18 Dec 2013 15:12:04 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 19950 invoked by uid 89); 18 Dec 2013 15:12:03 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-4.0 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 18 Dec 2013 15:12:02 +0000 Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id rBIFC1aO010919 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 18 Dec 2013 10:12:01 -0500 Received: from tucnak.zalov.cz (vpn1-6-97.ams2.redhat.com [10.36.6.97]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id rBIFBxbe011527 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Wed, 18 Dec 2013 10:12:00 -0500 Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.14.7/8.14.7) with ESMTP id rBIFBvBc015334; Wed, 18 Dec 2013 16:11:58 +0100 Received: (from jakub@localhost) by tucnak.zalov.cz (8.14.7/8.14.7/Submit) id rBIFBqF3015331; Wed, 18 Dec 2013 16:11:52 +0100 Date: Wed, 18 Dec 2013 16:11:52 +0100 From: Jakub Jelinek To: Uros Bizjak Cc: gcc-patches@gcc.gnu.org Subject: [PATCH] Improve _mm*loadu* intrinsics handling (PR target/59539) Message-ID: <20131218151152.GD892@tucnak.redhat.com> Reply-To: Jakub Jelinek MIME-Version: 1.0 Content-Disposition: inline User-Agent: Mutt/1.5.21 (2010-09-15) X-IsSubscribed: yes Hi! As discussed in the PR, this patch similarly to the recent changes in movmisalign expansion for TARGET_AVX for unaligned loads from misaligned_operand just expands those as *mov_internal pattern, because that pattern emits vmovdqu/vmovup[sd] too, but doesn't contain UNSPECs and thus can be also merged into most other AVX insns that use the load target if those insns accept a memory operand. Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2013-12-18 Jakub Jelinek PR target/59539 * config/i386/sse.md (_loadu, _loaddqu): New expanders, prefix existing define_insn names with *. * gcc.target/i386/pr59539-1.c: New test. * gcc.target/i386/pr59539-2.c: New test. Jakub --- gcc/config/i386/sse.md.jj 2013-12-10 12:43:21.000000000 +0100 +++ gcc/config/i386/sse.md 2013-12-18 11:10:36.428643400 +0100 @@ -912,7 +912,27 @@ (define_expand "movmisalign" DONE; }) -(define_insn "_loadu" +(define_expand "_loadu" + [(set (match_operand:VF 0 "register_operand") + (unspec:VF [(match_operand:VF 1 "nonimmediate_operand")] + UNSPEC_LOADU))] + "TARGET_SSE && " +{ + /* For AVX, normal *mov_internal pattern will handle unaligned loads + just fine if misaligned_operand is true, and without the UNSPEC it can + be combined with arithmetic instructions. If misaligned_operand is + false, still emit UNSPEC_LOADU insn to honor user's request for + misaligned load. */ + if (TARGET_AVX + && misaligned_operand (operands[1], mode) + && !) + { + emit_insn (gen_rtx_SET (VOIDmode, operands[0], operands[1])); + DONE; + } +}) + +(define_insn "*_loadu" [(set (match_operand:VF 0 "register_operand" "=v") (unspec:VF [(match_operand:VF 1 "nonimmediate_operand" "vm")] @@ -999,7 +1019,28 @@ (define_insn "avx512f_storeu")]) -(define_insn "_loaddqu" +(define_expand "_loaddqu" + [(set (match_operand:VI_UNALIGNED_LOADSTORE 0 "register_operand") + (unspec:VI_UNALIGNED_LOADSTORE + [(match_operand:VI_UNALIGNED_LOADSTORE 1 "nonimmediate_operand")] + UNSPEC_LOADU))] + "TARGET_SSE2 && " +{ + /* For AVX, normal *mov_internal pattern will handle unaligned loads + just fine if misaligned_operand is true, and without the UNSPEC it can + be combined with arithmetic instructions. If misaligned_operand is + false, still emit UNSPEC_LOADU insn to honor user's request for + misaligned load. */ + if (TARGET_AVX + && misaligned_operand (operands[1], mode) + && !) + { + emit_insn (gen_rtx_SET (VOIDmode, operands[0], operands[1])); + DONE; + } +}) + +(define_insn "*_loaddqu" [(set (match_operand:VI_UNALIGNED_LOADSTORE 0 "register_operand" "=v") (unspec:VI_UNALIGNED_LOADSTORE [(match_operand:VI_UNALIGNED_LOADSTORE 1 "nonimmediate_operand" "vm")] --- gcc/testsuite/gcc.target/i386/pr59539-1.c.jj 2013-12-18 08:46:26.023864371 +0100 +++ gcc/testsuite/gcc.target/i386/pr59539-1.c 2013-12-18 08:53:12.304743270 +0100 @@ -0,0 +1,16 @@ +/* PR target/59539 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -mavx" } */ + +#include + +int +foo (void *p1, void *p2) +{ + __m128i d1 = _mm_loadu_si128 ((__m128i *) p1); + __m128i d2 = _mm_loadu_si128 ((__m128i *) p2); + __m128i result = _mm_cmpeq_epi16 (d1, d2); + return _mm_movemask_epi8 (result); +} + +/* { dg-final { scan-assembler-times "vmovdqu" 1 } } */ --- gcc/testsuite/gcc.target/i386/pr59539-2.c.jj 2013-12-18 08:46:33.130826198 +0100 +++ gcc/testsuite/gcc.target/i386/pr59539-2.c 2013-12-18 08:47:14.890608917 +0100 @@ -0,0 +1,16 @@ +/* PR target/59539 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -mavx2" } */ + +#include + +int +foo (void *p1, void *p2) +{ + __m256i d1 = _mm256_loadu_si256 ((__m256i *) p1); + __m256i d2 = _mm256_loadu_si256 ((__m256i *) p2); + __m256i result = _mm256_cmpeq_epi16 (d1, d2); + return _mm256_movemask_epi8 (result); +} + +/* { dg-final { scan-assembler-times "vmovdqu" 1 } } */