From patchwork Wed Dec 19 19:53:05 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aaron Sawdey
X-Patchwork-Id: 1016306
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
To: GCC Patches, Segher Boessenkool, David Edelsohn, Bill Schmidt
From: Aaron Sawdey
Subject: [PATCH][rs6000] avoid using unaligned vsx or lxvd2x/stxvd2x for memcpy/memmove inline expansion
Date: Wed, 19 Dec 2018 13:53:05 -0600
Message-Id: <0a17416b-57a0-99e7-2e7e-90a63da66fe6@linux.ibm.com>

Because of POWER9 dd2.1 issues with certain unaligned VSX instructions
accessing cache-inhibited memory, here is a patch that keeps memmove (and
memcpy) inline expansion from doing unaligned vector accesses or using
vector load/store instructions other than lvx/stvx.
More description of the issue is here:

https://patchwork.ozlabs.org/patch/814059/

OK for trunk if bootstrap/regtest ok?

Thanks!
   Aaron

2018-12-19  Aaron Sawdey

        * config/rs6000/rs6000-string.c (expand_block_move): Don't use
        unaligned vsx and avoid lxvd2x/stxvd2x.
        (gen_lvx_v4si_move): New function.

Index: gcc/config/rs6000/rs6000-string.c
===================================================================
--- gcc/config/rs6000/rs6000-string.c (revision 267055)
+++ gcc/config/rs6000/rs6000-string.c (working copy)
@@ -2669,6 +2669,35 @@
   return true;
 }
 
+/* Generate loads and stores for a move of v4si mode using lvx/stvx.
+   This uses altivec_{l,st}vx_<mode>_internal which use unspecs to
+   keep combine from changing what instruction gets used.
+
+   DEST is the destination for the data.
+   SRC is the source of the data for the move.  */
+
+static rtx
+gen_lvx_v4si_move (rtx dest, rtx src)
+{
+  rtx rv = NULL;
+  if (MEM_P (dest))
+    {
+      gcc_assert (!MEM_P (src));
+      gcc_assert (GET_MODE (src) == V4SImode);
+      rv = gen_altivec_stvx_v4si_internal (dest, src);
+    }
+  else if (MEM_P (src))
+    {
+      gcc_assert (!MEM_P (dest));
+      gcc_assert (GET_MODE (dest) == V4SImode);
+      rv = gen_altivec_lvx_v4si_internal (dest, src);
+    }
+  else
+    gcc_unreachable ();
+
+  return rv;
+}
+
 /* Expand a block move operation, and return 1 if successful.  Return 0
    if we should let the compiler generate normal code.
 
@@ -2721,11 +2750,11 @@
 
       /* Altivec first, since it will be faster than a string move
          when it applies, and usually not significantly larger.  */
-      if (TARGET_ALTIVEC && bytes >= 16 && (TARGET_EFFICIENT_UNALIGNED_VSX || align >= 128))
+      if (TARGET_ALTIVEC && bytes >= 16 && align >= 128)
         {
           move_bytes = 16;
           mode = V4SImode;
-          gen_func.mov = gen_lvx_v4si_move;
+          gen_func.mov = gen_lvx_v4si_move;
         }
       else if (bytes >= 8 && TARGET_POWERPC64
                && (align >= 64 || !STRICT_ALIGNMENT))
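
For readers who want to see the effect of the changed condition outside the
compiler, here is a rough standalone sketch of the two selection branches
touched by the hunk above. It is not GCC code: struct target_flags and
pick_move_bytes are hypothetical names made up for illustration, the flags
only stand in for TARGET_ALTIVEC, TARGET_POWERPC64 and STRICT_ALIGNMENT, and
the smaller-mode branches of expand_block_move are not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the target flags tested in the patched
   condition; illustration only, not GCC macros.  */
struct target_flags
{
  bool altivec;          /* models TARGET_ALTIVEC */
  bool powerpc64;        /* models TARGET_POWERPC64 */
  bool strict_alignment; /* models STRICT_ALIGNMENT */
};

/* Return the chunk size in bytes picked for the next piece of a block
   move, following only the two branches visible in the diff.  ALIGN_BITS
   is the known alignment in bits.  Returns 0 when neither branch
   applies (the real function goes on to try smaller modes).  */
static size_t
pick_move_bytes (const struct target_flags *t, size_t bytes,
                 unsigned align_bits)
{
  assert (t != NULL);

  /* Patched condition: a 16-byte vector move is chosen only when the
     data is known to be 128-bit aligned, so lvx/stvx can be used.  */
  if (t->altivec && bytes >= 16 && align_bits >= 128)
    return 16;

  /* Otherwise fall through to the 8-byte move when it is allowed.  */
  if (bytes >= 8 && t->powerpc64
      && (align_bits >= 64 || !t->strict_alignment))
    return 8;

  return 0;
}
```

With altivec and powerpc64 set and strict_alignment clear, a 32-byte copy
with 128-bit alignment picks 16-byte chunks, while the same copy with only
64-bit alignment now picks 8-byte chunks instead of an unaligned vector move.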