From patchwork Tue Jul 30 20:51:33 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Bernd Edlinger X-Patchwork-Id: 1139379 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-505864-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=hotmail.de Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="AZEd+53a"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 45ypdL1PQkz9sBF for ; Wed, 31 Jul 2019 06:51:47 +1000 (AEST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:references:in-reply-to :content-type:mime-version; q=dns; s=default; b=VyWi/6FbUIRAw3q2 NltoAW/iMOQpteFt6NqmSBoQPJWLsm/cmGxwpP2jqNEEKxMNY5AFz/GKfhbvUHha XMMc4JnfOE2NX2M6ySPuTR9o2jifY0GVTuZE6+1zzlEEleusPZz8cj4fgw7r0r4v kLbxBUO2opprNIrG+x5EIIcYfW8= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:references:in-reply-to :content-type:mime-version; s=default; bh=qFyBqh8L6feIlnyycNzxoS YI6FY=; b=AZEd+53a+eqKzY1Cz9WQHVTfPwJ59rvsHSlcOMf9S9G+eH4cYNDhSj AjroAspkARHvOFaJ6bKqpDBxSuyxrMAP6J1Uo6bUeVZl0zAHUfHGz2Ee2v3udS4A gTrN4b0e3HYxbKETYJU5ftpifGQORUJPHAMBeZQgBq29aBGS8yp8Q= Received: (qmail 18151 invoked by alias); 30 Jul 2019 20:51:39 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 17984 invoked by uid 89); 30 Jul 2019 20:51:39 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-8.1 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, RCVD_IN_DNSWL_NONE, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.1 spammy=H*Ad:U*ebotcazou, H*M:COM, hits X-HELO: EUR01-DB5-obe.outbound.protection.outlook.com Received: from mail-oln040092064048.outbound.protection.outlook.com (HELO EUR01-DB5-obe.outbound.protection.outlook.com) (40.92.64.48) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 30 Jul 2019 20:51:37 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=OyQaQ8/wSfhkT81/7UNKGzhd+3y2ctEnWvZDf4S36xTUMeVl3mhwxlVDeYuQ+QtNHJBIU+sUTAQwWF/0QqO5bFIeao7I6v1ws4s+Vor/RfVlKzwH1OD1TvvTBDyP8/6foJbdquPvAuSf8xs+re1BarH1pOkVxRhrZB95HuT5j4HQNLFbWOvHkq+X+9et7gHyHsIqojfrgPRF5suhBbUJncflmMyrK3BIpVThz8xbqdAFEBzAdHS0Gb3ZrTIr4fVj3Fnq+zbNkr0K975Ey2Ri9MIEXwst5L0G+iBqSEgKDURnkjy6AAutqKgqAzaQE4L8GCXKzzF6Z53AlzAg/GhkmQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=tZK5TgoCoNVsElob0UhZI2RTysWDFEfKqimOe8qB3h4=; b=HZo/TLtgI2WUwspcBP1DTe8XEP3LHRpADAsEjcfV0D5ZRNsAilNqlv9QYEmtrdbBICtjRjb9d4oKdG2XvcQuFEcXyYwm7AaXRHswxLvTR5gDvS5O3gM1x4LrMxsRGapAEnUOSA4S4ztCG4/TI2eH/ZGGhpvTWskvr7DKUIU8w6LRGxe91orLBEbVPO0iIosCwTL9UfUzxkCUsJPfC3TnPWdopENrtP/qgiNesfFGJfBVwKItSJ+9R9kDkEvtfGxLK4zw6jFwQimpyR5Ow0llQvmtVztyI05TVP3UoO7lRGLmCn/TN3Xwxmu8AryFZ4q2V0PDaA5URbr7a7oeWy8q8g== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=none; dmarc=none; dkim=none; arc=none Received: from VE1EUR01FT056.eop-EUR01.prod.protection.outlook.com (10.152.2.57) by VE1EUR01HT091.eop-EUR01.prod.protection.outlook.com (10.152.3.159) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2115.10; Tue, 30 Jul 2019 20:51:34 +0000 Received: from AM6PR10MB2566.EURPRD10.PROD.OUTLOOK.COM (10.152.2.54) by VE1EUR01FT056.mail.protection.outlook.com (10.152.3.115) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.2115.10 via Frontend Transport; Tue, 30 Jul 2019 20:51:34 +0000 Received: from AM6PR10MB2566.EURPRD10.PROD.OUTLOOK.COM ([fe80::c488:4d1b:6ada:37cc]) by AM6PR10MB2566.EURPRD10.PROD.OUTLOOK.COM ([fe80::c488:4d1b:6ada:37cc%3]) with mapi id 15.20.2136.010; Tue, 30 Jul 2019 20:51:34 +0000 From: Bernd Edlinger To: Richard Biener CC: "gcc-patches@gcc.gnu.org" , Richard Earnshaw , Ramana Radhakrishnan , Kyrill Tkachov , Eric Botcazou Subject: [PATCHv3] Fix not 8-byte aligned ldrd/strd on ARMv5 (PR 89544) Date: Tue, 30 Jul 2019 20:51:33 +0000 Message-ID: References: In-Reply-To: x-microsoft-original-message-id: <1cfebf90-6a73-d54f-deaf-270dc59b11d2@hotmail.de> MIME-Version: 1.0 Hi Richard, it is already a while ago, but I had not found time to continue with this patch until now. I think I have now a better solution, which properly addresses your comments below. On 3/25/19 9:41 AM, Richard Biener wrote: > On Fri, 22 Mar 2019, Bernd Edlinger wrote: > >> On 3/21/19 12:15 PM, Richard Biener wrote: >>> On Sun, 10 Mar 2019, Bernd Edlinger wrote: >>> Finally... >>> >>> Index: gcc/function.c >>> =================================================================== >>> --- gcc/function.c (revision 269264) >>> +++ gcc/function.c (working copy) >>> @@ -2210,6 +2210,12 @@ use_register_for_decl (const_tree decl) >>> if (DECL_MODE (decl) == BLKmode) >>> return false; >>> >>> + if (STRICT_ALIGNMENT && TREE_CODE (decl) == PARM_DECL >>> + && DECL_INCOMING_RTL (decl) && MEM_P (DECL_INCOMING_RTL (decl)) >>> + && GET_MODE_ALIGNMENT (DECL_MODE (decl)) >>> + > MEM_ALIGN (DECL_INCOMING_RTL (decl))) >>> + return false; >>> + >>> /* If -ffloat-store specified, don't put explicit float variables >>> into registers. */ >>> /* ??? This should be checked after DECL_ARTIFICIAL, but tree-ssa >>> >>> I wonder if it is necessary to look at DECL_INCOMING_RTL here >>> and why such RTL may not exist? That is, iff DECL_INCOMING_RTL >>> doesn't exist then shouldn't we return false for safety reasons? >>> You are right, it is not possbile to return different results from use_register_for_decl before vs. after incoming RTL is assigned. That hits an assertion in set_rtl. This hunk is gone now, instead I changed assign_parm_setup_reg to use movmisalign optab and/or extract_bit_field if misaligned entry_parm is to be assigned in a register. I have no test coverage for the movmisalign optab though, so I rely on your code review for that part. >>> Similarly the very same issue should exist on x86_64 which is >>> !STRICT_ALIGNMENT, it's just the ABI seems to provide the appropriate >>> alignment on the caller side. So the STRICT_ALIGNMENT check is >>> a wrong one. >>> >> >> I may be plain wrong here, but I thought that !STRICT_ALIGNMENT targets >> just use MEM_ALIGN to select the right instructions. MEM_ALIGN >> is always 32-bit align on the DImode memory. The x86_64 vector instructions >> would look at MEM_ALIGN and do the right thing, yes? > > No, they need to use the movmisalign optab and end up with UNSPECs > for example. Ah, thanks, now I see. >> It seems to be the definition of STRICT_ALIGNMENT targets that all RTL >> instructions need to have MEM_ALIGN >= GET_MODE_ALIGNMENT, so the target >> does not even have to look at MEM_ALIGN except in the mov_misalign_optab, >> right? > > Yes, I think we never losened that. Note that RTL expansion has to > fix this up for them. Note that strictly speaking SLOW_UNALIGNED_ACCESS > specifies that x86 is strict-align wrt vector modes. > Yes I agree, the code would be incorrect for x86 as well when the movmisalign_optab is not used. So I invoke the movmisalign optab if available and if not fall back to extract_bit_field. As in the assign_parm_setup_stack assign_parm_setup_reg assumes that data->promoted_mode != data->nominal_mode does not happen with misaligned stack slots. Attached is the v3 if my patch. Boot-strapped and reg-tested on x86_64-pc-linux-gnu and arm-linux-gnueabihf. Is it OK for trunk? Thanks Bernd. 2019-07-30 Bernd Edlinger PR middle-end/89544 * function.c (assign_param_data_one): Remove unused data members. (assign_parm_find_stack_rtl): Use larger alignment when possible. (assign_parm_adjust_stack_rtl): Revise STRICT_ALIGNMENT check. (assign_parm_setup_reg): Handle misaligned stack arguments. testsuite: 2019-07-30 Bernd Edlinger PR middle-end/89544 * gcc.target/arm/unaligned-argument-1.c: New test. * gcc.target/arm/unaligned-argument-2.c: New test. Index: gcc/function.c =================================================================== --- gcc/function.c (revision 273767) +++ gcc/function.c (working copy) @@ -2274,8 +2274,6 @@ struct assign_parm_data_one int partial; BOOL_BITFIELD named_arg : 1; BOOL_BITFIELD passed_pointer : 1; - BOOL_BITFIELD on_stack : 1; - BOOL_BITFIELD loaded_in_reg : 1; }; /* A subroutine of assign_parms. Initialize ALL. */ @@ -2699,8 +2697,23 @@ assign_parm_find_stack_rtl (tree parm, struct assi intentionally forcing upward padding. Otherwise we have to come up with a guess at the alignment based on OFFSET_RTX. */ poly_int64 offset; - if (data->locate.where_pad != PAD_DOWNWARD || data->entry_parm) + if (data->locate.where_pad == PAD_NONE || data->entry_parm) align = boundary; + else if (data->locate.where_pad == PAD_UPWARD) + { + align = boundary; + /* If the argument offset is actually more aligned than the nominal + stack slot boundary, take advantage of that excess alignment. + Don't make any assumptions if STACK_POINTER_OFFSET is in use. */ + if (poly_int_rtx_p (offset_rtx, &offset) + && STACK_POINTER_OFFSET == 0) + { + unsigned int offset_align = known_alignment (offset) * BITS_PER_UNIT; + if (offset_align == 0 || offset_align > STACK_BOUNDARY) + offset_align = STACK_BOUNDARY; + align = MAX (align, offset_align); + } + } else if (poly_int_rtx_p (offset_rtx, &offset)) { align = least_bit_hwi (boundary); @@ -2813,8 +2826,9 @@ assign_parm_adjust_stack_rtl (struct assign_parm_d ultimate type, don't use that slot after entry. We'll make another stack slot, if we need one. */ if (stack_parm - && ((STRICT_ALIGNMENT - && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm)) + && ((GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm) + && targetm.slow_unaligned_access (data->nominal_mode, + MEM_ALIGN (stack_parm))) || (data->nominal_type && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) @@ -3292,6 +3306,23 @@ assign_parm_setup_reg (struct assign_parm_data_all did_conversion = true; } + else if (MEM_P (data->entry_parm) + && GET_MODE_ALIGNMENT (promoted_nominal_mode) + > MEM_ALIGN (data->entry_parm) + && targetm.slow_unaligned_access (promoted_nominal_mode, + MEM_ALIGN (data->entry_parm))) + { + enum insn_code icode = optab_handler (movmisalign_optab, + promoted_nominal_mode); + + if (icode != CODE_FOR_nothing) + emit_insn (GEN_FCN (icode) (parmreg, validated_mem)); + else + rtl = parmreg = extract_bit_field (validated_mem, + GET_MODE_BITSIZE (promoted_nominal_mode), 0, + unsignedp, parmreg, + promoted_nominal_mode, VOIDmode, false, NULL); + } else emit_move_insn (parmreg, validated_mem); Index: gcc/testsuite/gcc.target/arm/unaligned-argument-1.c =================================================================== --- gcc/testsuite/gcc.target/arm/unaligned-argument-1.c (revision 0) +++ gcc/testsuite/gcc.target/arm/unaligned-argument-1.c (working copy) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-marm -march=armv6 -mno-unaligned-access -mfloat-abi=soft -mabi=aapcs -O3" } */ + +struct s { + int a, b; +} __attribute__((aligned(8))); + +struct s f0; + +void f(int a, int b, int c, int d, struct s f) +{ + f0 = f; +} + +/* { dg-final { scan-assembler-times "ldrd" 1 } } */ +/* { dg-final { scan-assembler-times "strd" 1 } } */ +/* { dg-final { scan-assembler-times "stm" 0 } } */ Index: gcc/testsuite/gcc.target/arm/unaligned-argument-2.c =================================================================== --- gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (revision 0) +++ gcc/testsuite/gcc.target/arm/unaligned-argument-2.c (working copy) @@ -0,0 +1,17 @@ +/* { dg-do compile } */ +/* { dg-options "-marm -march=armv6 -mno-unaligned-access -mfloat-abi=soft -mabi=aapcs -O3" } */ + +struct s { + int a, b; +} __attribute__((aligned(8))); + +struct s f0; + +void f(int a, int b, int c, int d, int e, struct s f) +{ + f0 = f; +} + +/* { dg-final { scan-assembler-times "ldrd" 0 } } */ +/* { dg-final { scan-assembler-times "strd" 0 } } */ +/* { dg-final { scan-assembler-times "stm" 1 } } */