From patchwork Thu Dec 20 13:43:42 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Joern Wolfgang Rennecke
X-Patchwork-Id: 1016772
Message-ID: <5C1B9C8E.70108@amylaar.uk>
Date: Thu, 20 Dec 2018 13:43:42 +0000
From: Joern Wolfgang Rennecke
To: GCC Patches
Subject: RFA: Avoid versioning loop with unaligned step

eSi-RISC has vector permute functionality, but no unaligned loads.
We see execution failures on gcc.dg/vect/slp-perm-12.c: loop versioning
makes tptr aligned for the first loop iteration, but with a step that is
originally 11 and becomes 22 after vectorization, and a vector alignment
of 8 bytes, the second iteration raises an AlignmentError exception.

The attached patch to tree-vect-data-refs.c suppresses attempts to align
data accesses where the step alignment times the vectorization factor is
insufficient to sustain the alignment during the loop.

Bootstrapped and regression tested on x86_64-pc-linux-gnu.

I have also attached a matching testsuite patch to not expect SLP
vectorization for slp-perm-12 when no unaligned loads are available,
although in terms of testing, I can only say that it works for us.

2018-12-15  Joern Rennecke

	* tree-vect-data-refs.c (vect_enhance_data_refs_alignment):
	Don't do versioning for data accesses with misaligned step.

2018-12-15  Joern Rennecke

	* testsuite/gcc.dg/vect/slp-perm-12.c (dg-final): Only expect
	SLP vectorization for ! vect_no_align.

Index: testsuite/gcc.dg/vect/slp-perm-12.c
===================================================================
--- testsuite/gcc.dg/vect/slp-perm-12.c (revision 5616)
+++ testsuite/gcc.dg/vect/slp-perm-12.c (revision 5617)
@@ -49,4 +49,4 @@ int main()
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target vect_perm } } } */
+/* { dg-final { scan-tree-dump-times "vectorizing stmts using SLP" 1 "vect" { target { vect_perm && {! vect_no_align } } } } } */
Index: tree-vect-data-refs.c
===================================================================
--- tree-vect-data-refs.c (revision 267262)
+++ tree-vect-data-refs.c (working copy)
@@ -2160,6 +2160,20 @@ vect_enhance_data_refs_alignment (loop_v
                       break;
                     }
 
+                  /* Forcing alignment in the first iteration is no good if
+                     we don't keep it across iterations.  For now, just disable
+                     versioning in this case.
+                     ?? We could actually unroll the loop to achieve the required
+                     overall step alignment, and forcing the alignment could be
+                     done by doing some iterations of the non-vectorized loop.  */
+                  if (maybe_lt (LOOP_VINFO_VECT_FACTOR (loop_vinfo)
+                                * DR_STEP_ALIGNMENT (dr),
+                                TYPE_ALIGN_UNIT (vectype)))
+                    {
+                      do_versioning = false;
+                      break;
+                    }
+
                   /* The rightmost bits of an aligned address must be zeros.
                      Construct the mask needed for this test.  For example,
                      GET_MODE_SIZE for the vector mode V4SI is 16 bytes so the