From patchwork Thu Nov 8 17:52:20 2018
X-Patchwork-Submitter: Steve Ellcey
X-Patchwork-Id: 995078
From: Steve Ellcey
To: gcc-patches
Subject: [Patch 1/4][Aarch64] v2: Implement Aarch64 SIMD ABI
Date: Thu, 8 Nov 2018 17:52:20 +0000
Message-ID: <1541699539.12016.6.camel@cavium.com>

This is a resubmission of patch 1 to support the Aarch64 SIMD ABI [1] in
GCC. It has no functional changes from the previous submission.

The significant difference between the standard ARM ABI and the SIMD ABI
is that in the normal ABI a callee saves only the lower 64 bits of
registers V8-V15, whereas in the SIMD ABI the callee must save all 128
bits of registers V8-V23.

This patch checks for SIMD functions and saves the extra registers when
needed. It does not change the caller behavior, so with just this patch
there may be values saved by both the caller and the callee. This is not
efficient, but it is correct code. Patches 3 and 4 will remove the extra
saves from the caller.

Steve Ellcey
sellcey@cavium.com

2018-11-08  Steve Ellcey

	* config/aarch64/aarch64-protos.h (aarch64_use_simple_return_insn_p):
	New prototype.
	(aarch64_epilogue_uses): Ditto.
	* config/aarch64/aarch64.c (aarch64_attribute_table): New array.
	(aarch64_simd_decl_p): New function.
	(aarch64_reg_save_mode): New function.
	(aarch64_function_ok_for_sibcall): Check for simd calls.
	(aarch64_layout_frame): Check for simd function.
	(aarch64_gen_storewb_pair): Handle E_TFmode.
	(aarch64_push_regs): Use aarch64_reg_save_mode to get mode.
	(aarch64_gen_loadwb_pair): Handle E_TFmode.
	(aarch64_pop_regs): Use aarch64_reg_save_mode to get mode.
	(aarch64_gen_store_pair): Handle E_TFmode.
	(aarch64_gen_load_pair): Ditto.
	(aarch64_save_callee_saves): Handle different mode sizes.
	(aarch64_restore_callee_saves): Ditto.
	(aarch64_components_for_bb): Check for simd function.
	(aarch64_epilogue_uses): New function.
	(aarch64_process_components): Check for simd function.
	(aarch64_expand_prologue): Ditto.
	(aarch64_expand_epilogue): Ditto.
	(aarch64_expand_call): Ditto.
	(aarch64_use_simple_return_insn_p): New function.
	(TARGET_ATTRIBUTE_TABLE): New define.
	* config/aarch64/aarch64.h (EPILOGUE_USES): Redefine.
	(FP_SIMD_SAVED_REGNUM_P): New macro.
	* config/aarch64/aarch64.md (simple_return): New define_expand.
	(load_pair_dw_tftf): New instruction.
	(store_pair_dw_tftf): Ditto.
	(loadwb_pair_): Ditto.
	(storewb_pair_): Ditto.

2018-11-08  Steve Ellcey

	* gcc.target/aarch64/torture/aarch64-torture.exp: New file.
	* gcc.target/aarch64/torture/simd-abi-1.c: New test.
	* gcc.target/aarch64/torture/simd-abi-2.c: Ditto.
	* gcc.target/aarch64/torture/simd-abi-3.c: Ditto.
	* gcc.target/aarch64/torture/simd-abi-4.c: Ditto.
	* gcc.target/aarch64/torture/simd-abi-5.c: Ditto.

diff --git a/gcc/testsuite/gcc.target/aarch64/torture/aarch64-torture.exp b/gcc/testsuite/gcc.target/aarch64/torture/aarch64-torture.exp
index e69de29..22f08ff 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/aarch64-torture.exp
+++ b/gcc/testsuite/gcc.target/aarch64/torture/aarch64-torture.exp
@@ -0,0 +1,41 @@
+# Copyright (C) 2018 Free Software Foundation, Inc.
+
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3. If not see
+# .
+
+# GCC testsuite that uses the `gcc-dg.exp' driver, looping over
+# optimization options.
+
+# Exit immediately if this isn't a Aarch64 target.
+if { ![istarget aarch64*-*-*] } then {
+  return
+}
+
+# Load support procs.
+load_lib gcc-dg.exp
+
+# If a testcase doesn't have special options, use these.
+global DEFAULT_CFLAGS
+if ![info exists DEFAULT_CFLAGS] then {
+    set DEFAULT_CFLAGS " -ansi -pedantic-errors"
+}
+
+# Initialize `dg'.
+dg-init
+
+# Main loop.
+gcc-dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/*.\[cS\]]] "" $DEFAULT_CFLAGS
+
+# All done.
+dg-finish
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-1.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-1.c
index e69de29..249554e 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-1.c
+++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-1.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+
+void __attribute__ ((aarch64_vector_pcs))
+f (void)
+{
+  /* Clobber all fp/simd regs and verify that the correct ones are saved
+     and restored in the prologue and epilogue of a SIMD function. */
+  __asm__ __volatile__ ("" ::: "q0", "q1", "q2", "q3");
+  __asm__ __volatile__ ("" ::: "q4", "q5", "q6", "q7");
+  __asm__ __volatile__ ("" ::: "q8", "q9", "q10", "q11");
+  __asm__ __volatile__ ("" ::: "q12", "q13", "q14", "q15");
+  __asm__ __volatile__ ("" ::: "q16", "q17", "q18", "q19");
+  __asm__ __volatile__ ("" ::: "q20", "q21", "q22", "q23");
+  __asm__ __volatile__ ("" ::: "q24", "q25", "q26", "q27");
+  __asm__ __volatile__ ("" ::: "q28", "q29", "q30", "q31");
+}
+
+/* { dg-final { scan-assembler {\sstp\tq8, q9} } } */
+/* { dg-final { scan-assembler {\sstp\tq10, q11} } } */
+/* { dg-final { scan-assembler {\sstp\tq12, q13} } } */
+/* { dg-final { scan-assembler {\sstp\tq14, q15} } } */
+/* { dg-final { scan-assembler {\sstp\tq16, q17} } } */
+/* { dg-final { scan-assembler {\sstp\tq18, q19} } } */
+/* { dg-final { scan-assembler {\sstp\tq20, q21} } } */
+/* { dg-final { scan-assembler {\sstp\tq22, q23} } } */
+/* { dg-final { scan-assembler {\sldp\tq8, q9} } } */
+/* { dg-final { scan-assembler {\sldp\tq10, q11} } } */
+/* { dg-final { scan-assembler {\sldp\tq12, q13} } } */
+/* { dg-final { scan-assembler {\sldp\tq14, q15} } } */
+/* { dg-final { scan-assembler {\sldp\tq16, q17} } } */
+/* { dg-final { scan-assembler {\sldp\tq18, q19} } } */
+/* { dg-final { scan-assembler {\sldp\tq20, q21} } } */
+/* { dg-final { scan-assembler {\sldp\tq22, q23} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq[034567]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq[034567]} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq2[456789]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq2[456789]} } } */
+/* { dg-final { scan-assembler-not {\sstp\td} } } */
+/* { dg-final { scan-assembler-not {\sldp\td} } } */
+/* { dg-final { scan-assembler-not {\sstr\t} } } */
+/* { dg-final { scan-assembler-not {\sldr\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c
index e69de29..bf6e64a 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-2.c
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+
+void
+f (void)
+{
+  /* Clobber all fp/simd regs and verify that the correct ones are saved
+     and restored in the prologue and epilogue of a SIMD function. */
+  __asm__ __volatile__ ("" ::: "q0", "q1", "q2", "q3");
+  __asm__ __volatile__ ("" ::: "q4", "q5", "q6", "q7");
+  __asm__ __volatile__ ("" ::: "q8", "q9", "q10", "q11");
+  __asm__ __volatile__ ("" ::: "q12", "q13", "q14", "q15");
+  __asm__ __volatile__ ("" ::: "q16", "q17", "q18", "q19");
+  __asm__ __volatile__ ("" ::: "q20", "q21", "q22", "q23");
+  __asm__ __volatile__ ("" ::: "q24", "q25", "q26", "q27");
+  __asm__ __volatile__ ("" ::: "q28", "q29", "q30", "q31");
+}
+
+/* { dg-final { scan-assembler {\sstp\td8, d9} } } */
+/* { dg-final { scan-assembler {\sstp\td10, d11} } } */
+/* { dg-final { scan-assembler {\sstp\td12, d13} } } */
+/* { dg-final { scan-assembler {\sstp\td14, d15} } } */
+/* { dg-final { scan-assembler {\sldp\td8, d9} } } */
+/* { dg-final { scan-assembler {\sldp\td10, d11} } } */
+/* { dg-final { scan-assembler {\sldp\td12, d13} } } */
+/* { dg-final { scan-assembler {\sldp\td14, d15} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq[01234567]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq[01234567]} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq1[6789]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq1[6789]} } } */
+/* { dg-final { scan-assembler-not {\sstr\t} } } */
+/* { dg-final { scan-assembler-not {\sldr\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-3.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-3.c
index e69de29..7d4f54f 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-3.c
+++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-3.c
@@ -0,0 +1,34 @@
+/* { dg-do compile } */
+
+extern void g (void);
+
+void __attribute__ ((aarch64_vector_pcs))
+f (void)
+{
+  g();
+}
+
+/* { dg-final { scan-assembler {\sstp\tq8, q9} } } */
+/* { dg-final { scan-assembler {\sstp\tq10, q11} } } */
+/* { dg-final { scan-assembler {\sstp\tq12, q13} } } */
+/* { dg-final { scan-assembler {\sstp\tq14, q15} } } */
+/* { dg-final { scan-assembler {\sstp\tq16, q17} } } */
+/* { dg-final { scan-assembler {\sstp\tq18, q19} } } */
+/* { dg-final { scan-assembler {\sstp\tq20, q21} } } */
+/* { dg-final { scan-assembler {\sstp\tq22, q23} } } */
+/* { dg-final { scan-assembler {\sldp\tq8, q9} } } */
+/* { dg-final { scan-assembler {\sldp\tq10, q11} } } */
+/* { dg-final { scan-assembler {\sldp\tq12, q13} } } */
+/* { dg-final { scan-assembler {\sldp\tq14, q15} } } */
+/* { dg-final { scan-assembler {\sldp\tq16, q17} } } */
+/* { dg-final { scan-assembler {\sldp\tq18, q19} } } */
+/* { dg-final { scan-assembler {\sldp\tq20, q21} } } */
+/* { dg-final { scan-assembler {\sldp\tq22, q23} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq[034567]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq[034567]} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq2[456789]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq2[456789]} } } */
+/* { dg-final { scan-assembler-not {\sstp\td} } } */
+/* { dg-final { scan-assembler-not {\sldp\td} } } */
+/* { dg-final { scan-assembler-not {\sstr\t} } } */
+/* { dg-final { scan-assembler-not {\sldr\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c
index e69de29..e399690 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c
+++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-4.c
@@ -0,0 +1,34 @@
+/* dg-do run */
+/* { dg-additional-options "-std=c99" } */
+
+
+
+/* There is nothing special about the calculations here, this is just
+   a test that can be compiled and run. */
+
+extern void abort (void);
+
+__Float64x2_t __attribute__ ((noinline, aarch64_vector_pcs))
+foo(__Float64x2_t a, __Float64x2_t b, __Float64x2_t c,
+    __Float64x2_t d, __Float64x2_t e, __Float64x2_t f,
+    __Float64x2_t g, __Float64x2_t h, __Float64x2_t i)
+{
+  __Float64x2_t w, x, y, z;
+  w = a + b * c;
+  x = d + e * f;
+  y = g + h * i;
+  return w + x * y;
+}
+
+
+int main()
+{
+  __Float64x2_t a, b, c, d;
+  a = (__Float64x2_t) { 1.0, 2.0 };
+  b = (__Float64x2_t) { 3.0, 4.0 };
+  c = (__Float64x2_t) { 5.0, 6.0 };
+  d = foo (a, b, c, (a+b), (b+c), (a+c), (a-b), (b-c), (a-c)) + a + b + c;
+  if (d[0] != 337.0 || d[1] != 554.0)
+    abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-5.c b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-5.c
index e69de29..7d639a5e 100644
--- a/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-5.c
+++ b/gcc/testsuite/gcc.target/aarch64/torture/simd-abi-5.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+
+void __attribute__ ((aarch64_vector_pcs))
+f (void)
+{
+  /* Clobber some fp/simd regs and verify that only those are saved
+     and restored in the prologue and epilogue of a SIMD function. */
+  __asm__ __volatile__ ("" ::: "q8", "q9", "q10", "q11");
+}
+
+/* { dg-final { scan-assembler {\sstp\tq8, q9} } } */
+/* { dg-final { scan-assembler {\sstp\tq10, q11} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq[034567]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq[034567]} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq1[23456789]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq1[23456789]} } } */
+/* { dg-final { scan-assembler-not {\sstp\tq2[456789]} } } */
+/* { dg-final { scan-assembler-not {\sldp\tq2[456789]} } } */
+/* { dg-final { scan-assembler-not {\sstp\td} } } */
+/* { dg-final { scan-assembler-not {\sldp\td} } } */
+/* { dg-final { scan-assembler-not {\sstr\t} } } */
+/* { dg-final { scan-assembler-not {\sldr\t} } } */

From patchwork Thu Nov 8 17:53:35 2018
X-Patchwork-Submitter: Steve Ellcey
X-Patchwork-Id: 995082
From: Steve Ellcey
To: gcc-patches
Subject: [Patch 2/4][Aarch64] v2: Implement Aarch64 SIMD ABI
Date: Thu, 8 Nov 2018 17:53:35 +0000
Message-ID: <1541699615.12016.7.camel@cavium.com>

This is patch 2 to support the Aarch64 SIMD ABI [1] in GCC. It defines
the TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN,
TARGET_SIMD_CLONE_ADJUST, and TARGET_SIMD_CLONE_USABLE macros so that GCC
can generate SIMD clones on aarch64.

Steve Ellcey
sellcey@cavium.com

2018-11-08  Steve Ellcey

	* config/aarch64/aarch64.c (cgraph.h): New include.
	(aarch64_simd_clone_compute_vecsize_and_simdlen): New function.
	(aarch64_simd_clone_adjust): Ditto.
	(aarch64_simd_clone_usable): Ditto.
	(TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN): New macro.
	(TARGET_SIMD_CLONE_ADJUST): Ditto.
	(TARGET_SIMD_CLONE_USABLE): Ditto.
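
For illustration only (this is not part of the patch): a minimal sketch
of the kind of user code these hooks are meant to serve. The names
scale_add and scale_add_array are made up for the example. Assuming the
compiler is run with OpenMP SIMD support (for instance -fopenmp-simd),
"#pragma omp declare simd" asks for vector clones of the scalar function.
Going by the code in this patch, the default simdlen for a double
argument would be 128 / 64 = 2 lanes. Whether a clone is actually
emitted and used depends on the compiler version and options; without
OpenMP support the pragmas are ignored and this is plain scalar C.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar function.  With OpenMP SIMD support enabled the compiler may
   also emit vector clones of it; otherwise the pragma is ignored.  */
#pragma omp declare simd notinbranch
double
scale_add (double x, double y)
{
  return 2.0 * x + y;
}

/* Loop whose call to scale_add is a candidate for being replaced by a
   call to a vector clone when the loop is vectorized.  */
void
scale_add_array (double *out, const double *x, const double *y, size_t n)
{
  assert (out != NULL && x != NULL && y != NULL);
#pragma omp simd
  for (size_t i = 0; i < n; i++)
    out[i] = scale_add (x[i], y[i]);
}
```

The result is the same whether or not a clone is used; only the generated
code differs.
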
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c82c7b6..cccf961 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -40,6 +40,7 @@
 #include "regs.h"
 #include "emit-rtl.h"
 #include "recog.h"
+#include "cgraph.h"
 #include "diagnostic.h"
 #include "insn-attr.h"
 #include "alias.h"
@@ -17834,6 +17835,131 @@ aarch64_speculation_safe_value (machine_mode mode,
   return result;
 }
 
+/* Set CLONEI->vecsize_mangle, CLONEI->mask_mode, CLONEI->vecsize_int,
+   CLONEI->vecsize_float and if CLONEI->simdlen is 0, also
+   CLONEI->simdlen. Return 0 if SIMD clones shouldn't be emitted,
+   or number of vecsize_mangle variants that should be emitted. */
+
+static int
+aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
+                                        struct cgraph_simd_clone *clonei,
+                                        tree base_type,
+                                        int num ATTRIBUTE_UNUSED)
+{
+  int ret = 0;
+
+  if (clonei->simdlen
+      && (clonei->simdlen < 2
+          || clonei->simdlen > 1024
+          || (clonei->simdlen & (clonei->simdlen - 1)) != 0))
+    {
+      warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+                  "unsupported simdlen %d", clonei->simdlen);
+      return 0;
+    }
+
+  tree ret_type = TREE_TYPE (TREE_TYPE (node->decl));
+  if (TREE_CODE (ret_type) != VOID_TYPE)
+    switch (TYPE_MODE (ret_type))
+      {
+      case E_QImode:
+      case E_HImode:
+      case E_SImode:
+      case E_DImode:
+      case E_SFmode:
+      case E_DFmode:
+      /* case E_SCmode: */
+      /* case E_DCmode: */
+        break;
+      default:
+        warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+                    "unsupported return type %qT for simd\n", ret_type);
+        return 0;
+      }
+
+  tree t;
+  for (t = DECL_ARGUMENTS (node->decl); t; t = DECL_CHAIN (t))
+    /* FIXME: Shouldn't we allow such arguments if they are uniform? */
+    switch (TYPE_MODE (TREE_TYPE (t)))
+      {
+      case E_QImode:
+      case E_HImode:
+      case E_SImode:
+      case E_DImode:
+      case E_SFmode:
+      case E_DFmode:
+      /* case E_SCmode: */
+      /* case E_DCmode: */
+        break;
+      default:
+        warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+                    "unsupported argument type %qT for simd\n", TREE_TYPE (t));
+        return 0;
+      }
+
+  if (TARGET_SIMD)
+    {
+      clonei->vecsize_mangle = 'n';
+      clonei->mask_mode = VOIDmode;
+      clonei->vecsize_int = 128;
+      clonei->vecsize_float = 128;
+
+      if (clonei->simdlen == 0)
+        {
+          if (SCALAR_INT_MODE_P (TYPE_MODE (base_type)))
+            clonei->simdlen = clonei->vecsize_int;
+          else
+            clonei->simdlen = clonei->vecsize_float;
+          clonei->simdlen /= GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type));
+        }
+      else if (clonei->simdlen > 16)
+        {
+          /* If it is possible for given SIMDLEN to pass CTYPE value in
+             registers (v0-v7) accept that SIMDLEN, otherwise warn and don't
+             emit corresponding clone. */
+          int cnt = GET_MODE_BITSIZE (SCALAR_TYPE_MODE (base_type)) * clonei->simdlen;
+          if (SCALAR_INT_MODE_P (TYPE_MODE (base_type)))
+            cnt /= clonei->vecsize_int;
+          else
+            cnt /= clonei->vecsize_float;
+          if (cnt > 8)
+            {
+              warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
+                          "unsupported simdlen %d", clonei->simdlen);
+              return 0;
+            }
+        }
+      ret = 1;
+    }
+  return ret;
+}
+
+/* Add target attribute to SIMD clone NODE if needed. */
+
+static void
+aarch64_simd_clone_adjust (struct cgraph_node *node ATTRIBUTE_UNUSED)
+{
+}
+
+/* If SIMD clone NODE can't be used in a vectorized loop
+   in current function, return -1, otherwise return a badness of using it
+   (0 if it is most desirable from vecsize_mangle point of view, 1
+   slightly less desirable, etc.). */
+
+static int
+aarch64_simd_clone_usable (struct cgraph_node *node)
+{
+  switch (node->simdclone->vecsize_mangle)
+    {
+    case 'n':
+      if (!TARGET_SIMD)
+        return -1;
+      return 0;
+    default:
+      gcc_unreachable ();
+    }
+}
+
 /* Target-specific selftests. */
 
 #if CHECKING_P
@@ -18313,6 +18439,16 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_SPECULATION_SAFE_VALUE
 #define TARGET_SPECULATION_SAFE_VALUE aarch64_speculation_safe_value
 
+#undef TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN
+#define TARGET_SIMD_CLONE_COMPUTE_VECSIZE_AND_SIMDLEN \
+  aarch64_simd_clone_compute_vecsize_and_simdlen
+
+#undef TARGET_SIMD_CLONE_ADJUST
+#define TARGET_SIMD_CLONE_ADJUST aarch64_simd_clone_adjust
+
+#undef TARGET_SIMD_CLONE_USABLE
+#define TARGET_SIMD_CLONE_USABLE aarch64_simd_clone_usable
+
 #if CHECKING_P
 #undef TARGET_RUN_TARGET_SELFTESTS
 #define TARGET_RUN_TARGET_SELFTESTS selftest::aarch64_run_selftests

From patchwork Thu Nov 8 17:54:44 2018
X-Patchwork-Submitter: Steve Ellcey
X-Patchwork-Id: 995083
From: Steve Ellcey
To: gcc-patches
Subject: [Patch 3/4][Aarch64] v2: Implement Aarch64 SIMD ABI
Date: Thu, 8 Nov 2018 17:54:44 +0000
Message-ID: <1541699683.12016.8.camel@cavium.com>

This is patch 3 to support the Aarch64 SIMD ABI [1] in GCC. It defines a
new target hook, targetm.remove_extra_call_preserved_regs, that takes an
rtx_insn and removes registers from the register set passed in if we know
that the call preserves those registers. Aarch64 SIMD functions preserve
some registers that normal functions do not. The default version of this
hook does nothing.

Steve Ellcey
sellcey@cavium.com

2018-11-08  Steve Ellcey

	* config/aarch64/aarch64.c (aarch64_simd_call_p): New function.
	(aarch64_remove_extra_call_preserved_regs): New function.
	(TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): New macro.
	* doc/tm.texi.in (TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS): New hook.
	* final.c (get_call_reg_set_usage): Call new hook.
	* target.def (remove_extra_call_preserved_regs): New hook.
	* targhooks.c (default_remove_extra_call_preserved_regs): New function.
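
For illustration only (this is not part of the patch and does not use GCC
internals): a standalone model of what the hook does to a call-clobbered
register set. The type vreg_set and both function names are hypothetical,
a 32-bit mask stands in for HARD_REG_SET, and the V8-V23 range is taken
from the description in patch 1 rather than from the
FP_SIMD_SAVED_REGNUM_P definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model: bit N of the mask stands for vector register VN.  */
typedef uint32_t vreg_set;

/* Registers a SIMD-ABI callee preserves in full (V8-V23, per the
   description in patch 1 of this series).  */
static bool
simd_abi_preserved_p (unsigned regno)
{
  assert (regno < 32);
  return regno >= 8 && regno <= 23;
}

/* Model of the hook: leave the set alone for ordinary calls, and drop
   the registers a SIMD-ABI callee is known to preserve otherwise.  */
vreg_set
remove_extra_call_preserved (vreg_set clobbered, bool callee_is_simd)
{
  if (!callee_is_simd)
    return clobbered;
  for (unsigned regno = 0; regno < 32; regno++)
    if (simd_abi_preserved_p (regno))
      clobbered &= ~((vreg_set) 1 << regno);
  return clobbered;
}
```

Passing the set in and narrowing it, rather than building a new one,
mirrors how the hook is described above: the default does nothing, and a
target only removes registers it knows the callee preserves.
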
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c82c7b6..62112ac 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -1470,6 +1470,50 @@ aarch64_hard_regno_mode_ok (unsigned regno, machine_mode mode)
   return false;
 }
 
+/* Return true if the instruction is a call to a SIMD function, false
+   if it is not a SIMD function or if we do not know anything about
+   the function. */
+
+static bool
+aarch64_simd_call_p (rtx_insn *insn)
+{
+  rtx symbol;
+  rtx call;
+  tree fndecl;
+
+  if (!insn)
+    return false;
+  call = get_call_rtx_from (insn);
+  if (!call)
+    return false;
+  symbol = XEXP (XEXP (call, 0), 0);
+  if (GET_CODE (symbol) != SYMBOL_REF)
+    return false;
+  fndecl = SYMBOL_REF_DECL (symbol);
+  if (!fndecl)
+    return false;
+
+  return aarch64_simd_decl_p (fndecl);
+}
+
+/* Possibly remove some registers from register set if we know they
+   are preserved by this call, even though they are marked as not
+   being callee saved in CALL_USED_REGISTERS. */
+
+void
+aarch64_remove_extra_call_preserved_regs (rtx_insn *insn,
+                                          HARD_REG_SET *return_set)
+{
+  int regno;
+
+  if (aarch64_simd_call_p (insn))
+    {
+      for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
+        if (FP_SIMD_SAVED_REGNUM_P (regno))
+          CLEAR_HARD_REG_BIT (*return_set, regno);
+    }
+}
+
 /* Implement TARGET_HARD_REGNO_CALL_PART_CLOBBERED. The callee only saves
    the lower 64 bits of a 128-bit register. Tell the compiler the callee
    clobbers the top 64 bits when restoring the bottom 64 bits. */
@@ -18290,6 +18334,10 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_MODES_TIEABLE_P
 #define TARGET_MODES_TIEABLE_P aarch64_modes_tieable_p
 
+#undef TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS
+#define TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS \
+  aarch64_remove_extra_call_preserved_regs
+
 #undef TARGET_HARD_REGNO_CALL_PART_CLOBBERED
 #define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \
   aarch64_hard_regno_call_part_clobbered
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index e8af1bf..73febe9 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -1704,6 +1704,8 @@ of @code{CALL_USED_REGISTERS}.
 @cindex call-saved register
 @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED
 
+@hook TARGET_REMOVE_EXTRA_CALL_PRESERVED_REGS
+
 @findex fixed_regs
 @findex call_used_regs
 @findex global_regs
diff --git a/gcc/final.c b/gcc/final.c
index 6e61f1e..8df869e 100644
--- a/gcc/final.c
+++ b/gcc/final.c
@@ -5080,7 +5080,7 @@ get_call_reg_set_usage (rtx_insn *insn, HARD_REG_SET *reg_set,
           return true;
         }
     }
-
   COPY_HARD_REG_SET (*reg_set, default_set);
+  targetm.remove_extra_call_preserved_regs (insn, reg_set);
   return false;
 }
diff --git a/gcc/target.def b/gcc/target.def
index 4b166d1..25be927 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -5757,6 +5757,12 @@ for targets that don't have partly call-clobbered registers.",
  bool, (unsigned int regno, machine_mode mode),
  hook_bool_uint_mode_false)
 
+DEFHOOK
+(remove_extra_call_preserved_regs,
+ "This hook removes some registers from the callee used register set.",
+ void, (rtx_insn *insn, HARD_REG_SET *used_regs),
+ default_remove_extra_call_preserved_regs)
+
 /* Return the smallest number of different values for which it is best to
    use a jump-table instead of a tree of conditional branches. */
 DEFHOOK
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 3d8b3b9..a9fb101 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -2372,4 +2372,11 @@ default_speculation_safe_value (machine_mode mode ATTRIBUTE_UNUSED,
   return result;
 }
 
+void
+default_remove_extra_call_preserved_regs (rtx_insn *insn ATTRIBUTE_UNUSED,
+                                          HARD_REG_SET *used_regs
+                                            ATTRIBUTE_UNUSED)
+{
+}
+
 #include "gt-targhooks.h"

From patchwork Thu Nov 8 17:55:50 2018
X-Patchwork-Submitter: Steve Ellcey
X-Patchwork-Id: 995086
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:content-type:mime-version:reply-to; q=dns; s=default; b=TENOzmkwH36j+G+Bia3/tM9tLVamT8PXkWi1CD7KS+l UBgwmHTAMIsrldYX2mJM0HT/I42j9Qted8IJORxgH1RpCpBz6rUIzyB0g9SgmWCO
y0O+RbjYTfDYoHJgdv7AvyFRpgfTTuEohjmZbXym5/dxQyGv5jSsnTIM3X1eLI94 = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:subject:date:message-id:content-type:mime-version:reply-to; s=default; bh=+vb8OdKbxOvA12TrGI0nkAC2cxs=; b=R8h/N6iVOO9w/8o+J O6Nw+JUM/mATWAy+LQ8eMUC7c/7Ha2N9+VNlg/fNECR9ynLpc4pevi9BzxdrJtVm J+lzkS5uGec07lBb4VE8HDEtxxuCvMRG3eAWr062cbdSUbuvh8EUwEvU7zRZB7SZ abFwVXTV+rQtJ4SQamqhSIMJYI= Received: (qmail 72083 invoked by alias); 8 Nov 2018 17:55:55 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 72066 invoked by uid 89); 8 Nov 2018 17:55:55 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-25.3 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 spammy= X-HELO: NAM05-DM3-obe.outbound.protection.outlook.com Received: from mail-eopbgr730042.outbound.protection.outlook.com (HELO NAM05-DM3-obe.outbound.protection.outlook.com) (40.107.73.42) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 08 Nov 2018 17:55:53 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=CAVIUMNETWORKS.onmicrosoft.com; s=selector1-cavium-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=cI29TREo7qNffEeJ/dXYzTmwP4TIbaFLp0gEQiyDTbU=; b=hA9k2GX6dCwLdHbkj6JCqra9QghFTn3SUSe/suq4+2o+GVEi8zXq3/3dziG56fJmcDz6yB0dx9NMc9A+jhZ6z3wWptZ/uMamPEybQgAgdf24tRTGk52iAl+3pbWseMIcQBNR65esdd+k/A/5njO+t0rU4tYPQPY6wc0EwDPwB1g= Received: from BYAPR07MB5031.namprd07.prod.outlook.com (52.135.238.224) by BYAPR07MB4344.namprd07.prod.outlook.com (52.135.224.11) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.1294.26; Thu, 8 Nov 2018 17:55:50 +0000
Received: from BYAPR07MB5031.namprd07.prod.outlook.com ([fe80::50ef:7350:fb0f:1d41]) by BYAPR07MB5031.namprd07.prod.outlook.com ([fe80::50ef:7350:fb0f:1d41%4]) with mapi id 15.20.1294.034; Thu, 8 Nov 2018 17:55:50 +0000
From: Steve Ellcey
To: gcc-patches
Subject: [Patch 4/4][Aarch64] v2: Implement Aarch64 SIMD ABI
Date: Thu, 8 Nov 2018 17:55:50 +0000
Message-ID: <1541699749.12016.9.camel@cavium.com>
authentication-results: spf=none (sender IP is ) smtp.mailfrom=Steve.Ellcey@cavium.com;
received-spf: None (protection.outlook.com: cavium.com does not designate permitted sender hosts)
MIME-Version: 1.0
Reply-To:

This is patch 4 of 4 to support the Aarch64 SIMD ABI [1] in GCC.

It defines a new target hook, targetm.check_part_clobbered, that takes an rtx_insn and checks whether it is a call to a function that may clobber partial registers. It returns true by default, which results in the current behaviour, but if we can determine that the function will not do any partial clobbers (like the Aarch64 SIMD functions) then it returns false.

Steve Ellcey
sellcey@cavium.com

2018-11-08  Steve Ellcey

	* config/aarch64/aarch64.c (aarch64_check_part_clobbered): New function.
	(TARGET_CHECK_PART_CLOBBERED): New macro.
	* doc/tm.texi.in (TARGET_CHECK_PART_CLOBBERED): New hook.
	* lra-constraints.c (need_for_call_save_p): Use check_part_clobbered.
	* lra-int.h (check_part_clobbered): New field in lra_reg struct.
	* lra-lives.c (check_pseudos_live_through_calls): Pass in check_partial_clobber bool argument and use it.
	(process_bb_lives): Check basic block for functions that may do partial clobbers. Pass this to check_pseudos_live_through_calls.
	* lra.c (initialize_lra_reg_info_element): Initialize check_part_clobbered to false.
	* target.def (check_part_clobbered): New target hook.
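The per-basic-block decision described above can be sketched in isolation. This is a toy model and not part of the patch: struct toy_call and both function names are hypothetical, and a plain array of calls stands in for the instruction walk that process_bb_lives performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one call instruction in a basic block.  */
struct toy_call
{
  bool simd_p;	/* True if the callee is known to be a SIMD function.  */
};

/* Model of the hook: return true when the call must be assumed to
   partially clobber registers, false when the callee is known not to.  */
static bool
toy_check_part_clobbered (const struct toy_call *call)
{
  return !call->simd_p;
}

/* Model of the scan over a block: a single call that may partially
   clobber is enough to require the conservative handling for the
   whole block.  */
static bool
toy_partial_clobber_in_bb (const struct toy_call *calls, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (toy_check_part_clobbered (&calls[i]))
      return true;
  return false;
}
```

A block containing only SIMD calls (or no calls at all) yields false, so the partially clobbered registers need not be added to the conflict set. Any ordinary call in the block yields true and keeps the current behaviour.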
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index c82c7b6..c2de4111 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -1480,6 +1480,17 @@ aarch64_hard_regno_call_part_clobbered (unsigned int regno, machine_mode mode) return FP_REGNUM_P (regno) && maybe_gt (GET_MODE_SIZE (mode), 8); } +/* Implement TARGET_CHECK_PART_CLOBBERED. SIMD functions never save + partial registers, so they return false. */ + +static bool +aarch64_check_part_clobbered(rtx_insn *insn) +{ + if (aarch64_simd_call_p (insn)) + return false; + return true; +} + /* Implement REGMODE_NATURAL_SIZE. */ poly_uint64 aarch64_regmode_natural_size (machine_mode mode) @@ -18294,6 +18305,9 @@ aarch64_libgcc_floating_mode_supported_p #define TARGET_HARD_REGNO_CALL_PART_CLOBBERED \ aarch64_hard_regno_call_part_clobbered +#undef TARGET_CHECK_PART_CLOBBERED +#define TARGET_CHECK_PART_CLOBBERED aarch64_check_part_clobbered + #undef TARGET_CONSTANT_ALIGNMENT #define TARGET_CONSTANT_ALIGNMENT aarch64_constant_alignment diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index e8af1bf..7dd6c54 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -1704,6 +1704,8 @@ of @code{CALL_USED_REGISTERS}. @cindex call-saved register @hook TARGET_HARD_REGNO_CALL_PART_CLOBBERED +@hook TARGET_CHECK_PART_CLOBBERED + @findex fixed_regs @findex call_used_regs @findex global_regs diff --git a/gcc/lra-constraints.c b/gcc/lra-constraints.c index ab61989..89483d3 100644 --- a/gcc/lra-constraints.c +++ b/gcc/lra-constraints.c @@ -5325,16 +5325,23 @@ inherit_reload_reg (bool def_p, int original_regno, static inline bool need_for_call_save_p (int regno) { + machine_mode pmode = PSEUDO_REGNO_MODE (regno); + int new_regno = reg_renumber[regno]; + lra_assert (regno >= FIRST_PSEUDO_REGISTER && reg_renumber[regno] >= 0); - return (usage_insns[regno].calls_num < calls_num - && (overlaps_hard_reg_set_p - ((flag_ipa_ra && - ! 
hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set)) - ? lra_reg_info[regno].actual_call_used_reg_set - : call_used_reg_set, - PSEUDO_REGNO_MODE (regno), reg_renumber[regno]) - || (targetm.hard_regno_call_part_clobbered - (reg_renumber[regno], PSEUDO_REGNO_MODE (regno))))); + + if (usage_insns[regno].calls_num >= calls_num) + return false; + + if (flag_ipa_ra + && !hard_reg_set_empty_p (lra_reg_info[regno].actual_call_used_reg_set)) + return (overlaps_hard_reg_set_p + (lra_reg_info[regno].actual_call_used_reg_set, pmode, new_regno) + || (lra_reg_info[regno].check_part_clobbered + && targetm.hard_regno_call_part_clobbered (new_regno, pmode))); + else + return (overlaps_hard_reg_set_p (call_used_reg_set, pmode, new_regno) + || targetm.hard_regno_call_part_clobbered (new_regno, pmode)); } /* Global registers occurring in the current EBB. */ diff --git a/gcc/lra-int.h b/gcc/lra-int.h index 5267b53..e6aacd2 100644 --- a/gcc/lra-int.h +++ b/gcc/lra-int.h @@ -117,6 +117,8 @@ struct lra_reg /* This member is set up in lra-lives.c for subsequent assignments. */ lra_copy_t copies; + /* Whether or not the register is partially clobbered. */ + bool check_part_clobbered; }; /* References to the common info about each register. */ diff --git a/gcc/lra-lives.c b/gcc/lra-lives.c index 0bf8cd0..b2dfe0e 100644 --- a/gcc/lra-lives.c +++ b/gcc/lra-lives.c @@ -597,7 +597,8 @@ lra_setup_reload_pseudo_preferenced_hard_reg (int regno, PSEUDOS_LIVE_THROUGH_CALLS and PSEUDOS_LIVE_THROUGH_SETJUMPS. 
*/ static inline void check_pseudos_live_through_calls (int regno, - HARD_REG_SET last_call_used_reg_set) + HARD_REG_SET last_call_used_reg_set, + bool check_partial_clobber) { int hr; @@ -607,11 +608,12 @@ check_pseudos_live_through_calls (int regno, IOR_HARD_REG_SET (lra_reg_info[regno].conflict_hard_regs, last_call_used_reg_set); - for (hr = 0; hr < FIRST_PSEUDO_REGISTER; hr++) - if (targetm.hard_regno_call_part_clobbered (hr, - PSEUDO_REGNO_MODE (regno))) - add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs, - PSEUDO_REGNO_MODE (regno), hr); + if (check_partial_clobber) + for (hr = 0; hr < FIRST_PSEUDO_REGISTER; hr++) + if (targetm.hard_regno_call_part_clobbered (hr, + PSEUDO_REGNO_MODE (regno))) + add_to_hard_reg_set (&lra_reg_info[regno].conflict_hard_regs, + PSEUDO_REGNO_MODE (regno), hr); lra_reg_info[regno].call_p = true; if (! sparseset_bit_p (pseudos_live_through_setjumps, regno)) return; @@ -652,6 +654,7 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) rtx_insn *next; rtx link, *link_loc; bool need_curr_point_incr; + bool partial_clobber_in_bb; HARD_REG_SET last_call_used_reg_set; reg_live_out = df_get_live_out (bb); @@ -673,6 +676,18 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) if (lra_dump_file != NULL) fprintf (lra_dump_file, " BB %d\n", bb->index); + /* Check to see if any call might do a partial clobber. */ + partial_clobber_in_bb = false; + FOR_BB_INSNS_REVERSE_SAFE (bb, curr_insn, next) + { + if (CALL_P (curr_insn) + && targetm.check_part_clobbered (curr_insn)) + { + partial_clobber_in_bb = true; + break; + } + } + /* Scan the code of this basic block, noting which pseudos and hard regs are born or die. 
@@ -850,7 +865,8 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) |= mark_regno_live (reg->regno, reg->biggest_mode, curr_point); check_pseudos_live_through_calls (reg->regno, - last_call_used_reg_set); + last_call_used_reg_set, + partial_clobber_in_bb); } if (reg->regno >= FIRST_PSEUDO_REGISTER) @@ -913,9 +929,14 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) { IOR_HARD_REG_SET (lra_reg_info[j].actual_call_used_reg_set, this_call_used_reg_set); + + if (targetm.check_part_clobbered (curr_insn)) + lra_reg_info[j].check_part_clobbered = true; + if (flush) - check_pseudos_live_through_calls - (j, last_call_used_reg_set); + check_pseudos_live_through_calls (j, + last_call_used_reg_set, + partial_clobber_in_bb); } COPY_HARD_REG_SET(last_call_used_reg_set, this_call_used_reg_set); } @@ -946,7 +967,8 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) |= mark_regno_live (reg->regno, reg->biggest_mode, curr_point); check_pseudos_live_through_calls (reg->regno, - last_call_used_reg_set); + last_call_used_reg_set, + partial_clobber_in_bb); } for (reg = curr_static_id->hard_regs; reg != NULL; reg = reg->next) @@ -1102,7 +1124,9 @@ process_bb_lives (basic_block bb, int &curr_point, bool dead_insn_p) if (sparseset_cardinality (pseudos_live_through_calls) == 0) break; if (sparseset_bit_p (pseudos_live_through_calls, j)) - check_pseudos_live_through_calls (j, last_call_used_reg_set); + check_pseudos_live_through_calls (j, + last_call_used_reg_set, + partial_clobber_in_bb); } for (i = 0; i < FIRST_PSEUDO_REGISTER; ++i) diff --git a/gcc/lra.c b/gcc/lra.c index 5d58d90..8831286 100644 --- a/gcc/lra.c +++ b/gcc/lra.c @@ -1344,6 +1344,7 @@ initialize_lra_reg_info_element (int i) lra_reg_info[i].val = get_new_reg_value (); lra_reg_info[i].offset = 0; lra_reg_info[i].copies = NULL; + lra_reg_info[i].check_part_clobbered = false; } /* Initialize common reg info and copies. 
*/ diff --git a/gcc/target.def b/gcc/target.def index 4b166d1..b3c2c72 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -5757,6 +5757,15 @@ for targets that don't have partly call-clobbered registers.", bool, (unsigned int regno, machine_mode mode), hook_bool_uint_mode_false) +DEFHOOK +( + check_part_clobbered, + "This hook should return true if the call @var{insn} must obey the\n\ + hard_regno_call_part_clobbered target hook, and false if that hook can be\n\ + ignored because the callee is known not to partially clobber any registers.", + bool, (rtx_insn *insn), + hook_bool_rtx_insn_true) + /* Return the smallest number of different values for which it is best to use a jump-table instead of a tree of conditional branches. */ DEFHOOK