From patchwork Mon Nov 18 21:34:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kwok Cheung Yeung X-Patchwork-Id: 1196982 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-513966-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="PSWUMfbV"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 47H2L86g7dz9sP4 for ; Tue, 19 Nov 2019 08:35:11 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:to:message-id:date:mime-version:content-type; q=dns; s= default; b=I9GMComLZ31OcLeMqzs1EseT0q2Acb3tWPLkjO6Zv0NXoFsKbi+b1 K7LpIq7zuYUzbgeA0ndae7SyZxMR9jRTsXHk9nLPQ+jkJT5GN1G6SGSwN58MPacw /r8K+hhwz4eZY8EWDQUJ8NvX4ji1DhcwZXKwLMWKRfbVSf5sD+UEUo= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:to:message-id:date:mime-version:content-type; s= default; bh=9r0mMMIG4W5JIPio1Rt8CBk0y20=; b=PSWUMfbVjD5657ZGtx/Y sb8XoPm6IVXHcVqntHIU77B9MntfCHZ+WaqmtqLLJCODBuhqT2dt7u/oU50fhgXf LGHF63GlemIfGTlCJqzvR//HVEPrjkaODPRGLgLJIiWYHZO3ARHx8UJny1USLgqf tZ6KpBlIn1iC8Y0JntYVPgc= Received: (qmail 65116 invoked by alias); 18 Nov 2019 21:35:02 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 65105 invoked by uid 89); 18 Nov 2019 21:35:02 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-20.0 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, SPF_PASS autolearn=ham version=3.3.1 spammy=Cheung, violates, Yeung, amdgcn X-HELO: esa3.mentor.iphmx.com Received: from esa3.mentor.iphmx.com (HELO esa3.mentor.iphmx.com) (68.232.137.180) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 18 Nov 2019 21:34:52 +0000 IronPort-SDR: pY+f465k9R7yJ0uiNCaHsOmZAQekQw8kIVnr1Sq/ZeahgIukR30IN96dKI+LoS/m71zFA4Rr45 IyCHgRjfy8i57QPbb0PbcHB0G+oKINMv7tbMoYRe6SEP7+LFQNW6eMguIj6zCySpxqA5phPf5Z c6JMFr4OAssbD9lrRZEBlf8XTS5Ud0IIbJPBAF1ZDUwKiv1X+w/81wL0a2LUinaRIQgYyb+sEM gg06NCmM2XYdKHi6i9uf/22OSwukzt68VE4DpN10bClEJ+hEgDmSWAmIYiGL9B3HbaPTYNhI8l I6c= Received: from orw-gwy-02-in.mentorg.com ([192.94.38.167]) by esa3.mentor.iphmx.com with ESMTP; 18 Nov 2019 13:34:50 -0800 IronPort-SDR: JBD0o+Z+kYcYDRbna9+VIVWakygu9ka9rqM4eMm5S1kXPQj5eoFeIvzbbfxInjsRwcOKUkgO/f MR1fTo7N+E5bOjzGxm70syJaH8Ff55ykC7LZ1dievOdd1MSZcVtAQq/VfWXfQSrxMstQ+7aI/6 E0ysOmteayAYu+Mwu+zgdHsoBoCeDfXT9fRP9hRKeXJeJtt14voKepAA/rvvtMXyVRWMNkTNzb O8mfI+L28NPQorCil9EgoNNJh8dybMQHgxx6hdlwD9iHLfS5Im+A4ffvsMrR4sTQpkfFD63T7K u6k= From: Kwok Cheung Yeung Subject: [og9] Backport AMD GCN backend improvements from mainline To: Message-ID: Date: Mon, 18 Nov 2019 21:34:36 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 Hello This patch backports to the OG9 branch the following patches that have already been committed to the GCC mainline: Support using multiple registers to hold the frame pointer [LRA] Do not use eliminable registers for spilling [amdgcn] Use first lane of v1 for zero offset [amdgcn] Reinitialize registers for every function [amdgcn] Restrict registers available to non-kernel functions [amdgcn] Update lower bounds for the number of registers in non-leaf kernels [amdgcn] Unfix registers for frame pointer [amdgcn] Fix handling of VCC_CONDITIONAL_REG Check suitability of spill register for mode I will commit this later. Thanks, Kwok 2019-11-07 Kwok Cheung Yeung gcc/ * ira.c (setup_alloc_regs): Setup no_unit_alloc_regs for frame pointer in multiple registers. (ira_setup_eliminable_regset): Setup eliminable_regset, ira_no_alloc_regs and regs_ever_live for frame pointer in multiple registers. 2019-11-10 Kwok Cheung Yeung gcc/ * lra-spills.c (assign_spill_hard_regs): Do not spill into registers in eliminable_regset. 2019-11-14 Kwok Cheung Yeung gcc/ * lra-spills.c (assign_spill_hard_regs): Check that the spill register is suitable for the mode. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (gcn_regno_reg_class): Return VCC_CONDITIONAL_REG register class for VCC_LO and VCC_HI. (gcn_spill_class): Use SGPR_REGS to spill registers in VCC_CONDITIONAL_REG. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (gcn_expand_prologue): Remove initialization and prologue use of v0. (print_operand_address): Use v1 for zero vector offset. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (gcn_init_cumulative_args): Call reinit_regs. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (default_requested_args): New. (gcn_parse_amdgpu_hsa_kernel_attribute): Initialize requested args set with default_requested_args. (gcn_conditional_register_usage): Limit register usage of non-kernel functions. Reassign fixed registers if a non-standard set of args is requested. * config/gcn/gcn.h (FIXED_REGISTERS): Fix registers according to ABI. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (MAX_NORMAL_SGPR_COUNT, MAX_NORMAL_VGPR_COUNT): New. (gcn_conditional_register_usage): Use constants in place of hard-coded values. (gcn_hsa_declare_function_name): Set lower bound for number of SGPRs/VGPRs in non-leaf kernels to MAX_NORMAL_SGPR_COUNT and MAX_NORMAL_VGPR_COUNT. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.h (FIXED_REGISTERS): Unfix frame pointer. (CALL_USED_REGISTERS): Make frame pointer callee-saved. --- gcc/config/gcn/gcn.c | 104 ++++++++++++++++++++++++++++----------------------- gcc/config/gcn/gcn.h | 8 ++-- gcc/ira.c | 33 +++++++++------- gcc/lra-spills.c | 3 ++ 4 files changed, 85 insertions(+), 63 deletions(-) From 20245bf2e0b8ac258f86e6495b3be3e09edd0181 Mon Sep 17 00:00:00 2001 From: Kwok Cheung Yeung Date: Mon, 18 Nov 2019 13:26:50 -0800 Subject: [PATCH] [og9] Backport AMD GCN backend improvements from mainline 2019-11-07 Kwok Cheung Yeung gcc/ * ira.c (setup_alloc_regs): Setup no_unit_alloc_regs for frame pointer in multiple registers. (ira_setup_eliminable_regset): Setup eliminable_regset, ira_no_alloc_regs and regs_ever_live for frame pointer in multiple registers. 2019-11-10 Kwok Cheung Yeung gcc/ * lra-spills.c (assign_spill_hard_regs): Do not spill into registers in eliminable_regset. 2019-11-14 Kwok Cheung Yeung gcc/ * lra-spills.c (assign_spill_hard_regs): Check that the spill register is suitable for the mode. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (gcn_regno_reg_class): Return VCC_CONDITIONAL_REG register class for VCC_LO and VCC_HI. (gcn_spill_class): Use SGPR_REGS to spill registers in VCC_CONDITIONAL_REG. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (gcn_expand_prologue): Remove initialization and prologue use of v0. (print_operand_address): Use v1 for zero vector offset. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (gcn_init_cumulative_args): Call reinit_regs. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (default_requested_args): New. (gcn_parse_amdgpu_hsa_kernel_attribute): Initialize requested args set with default_requested_args. (gcn_conditional_register_usage): Limit register usage of non-kernel functions. Reassign fixed registers if a non-standard set of args is requested. * config/gcn/gcn.h (FIXED_REGISTERS): Fix registers according to ABI. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.c (MAX_NORMAL_SGPR_COUNT, MAX_NORMAL_VGPR_COUNT): New. (gcn_conditional_register_usage): Use constants in place of hard-coded values. (gcn_hsa_declare_function_name): Set lower bound for number of SGPRs/VGPRs in non-leaf kernels to MAX_NORMAL_SGPR_COUNT and MAX_NORMAL_VGPR_COUNT. 2019-11-15 Kwok Cheung Yeung gcc/ * config/gcn/gcn.h (FIXED_REGISTERS): Unfix frame pointer. (CALL_USED_REGISTERS): Make frame pointer callee-saved. --- gcc/config/gcn/gcn.c | 104 ++++++++++++++++++++++++++++----------------------- gcc/config/gcn/gcn.h | 8 ++-- gcc/ira.c | 33 +++++++++------- gcc/lra-spills.c | 3 ++ 4 files changed, 85 insertions(+), 63 deletions(-) diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c index 2835a3d..f556ffe 100644 --- a/gcc/config/gcn/gcn.c +++ b/gcc/config/gcn/gcn.c @@ -75,6 +75,12 @@ int gcn_isa = 3; /* Default to GCN3. */ #define LDS_SIZE 65536 +/* The number of registers usable by normal non-kernel functions. + The SGPR count includes any special extra registers such as VCC. */ + +#define MAX_NORMAL_SGPR_COUNT 64 +#define MAX_NORMAL_VGPR_COUNT 24 + /* }}} */ /* {{{ Initialization and options. */ @@ -191,6 +197,17 @@ static const struct gcn_kernel_arg_type {"work_item_id_Z", NULL, V64SImode, FIRST_VGPR_REG + 2} }; +static const long default_requested_args + = (1 << PRIVATE_SEGMENT_BUFFER_ARG) + | (1 << DISPATCH_PTR_ARG) + | (1 << QUEUE_PTR_ARG) + | (1 << KERNARG_SEGMENT_PTR_ARG) + | (1 << PRIVATE_SEGMENT_WAVE_OFFSET_ARG) + | (1 << WORKGROUP_ID_X_ARG) + | (1 << WORK_ITEM_ID_X_ARG) + | (1 << WORK_ITEM_ID_Y_ARG) + | (1 << WORK_ITEM_ID_Z_ARG); + /* Extract parameter settings from __attribute__((amdgpu_hsa_kernel ())). This function also sets the default values for some arguments. @@ -201,10 +218,7 @@ gcn_parse_amdgpu_hsa_kernel_attribute (struct gcn_kernel_args *args, tree list) { bool err = false; - args->requested = ((1 << PRIVATE_SEGMENT_BUFFER_ARG) - | (1 << QUEUE_PTR_ARG) - | (1 << KERNARG_SEGMENT_PTR_ARG) - | (1 << PRIVATE_SEGMENT_WAVE_OFFSET_ARG)); + args->requested = default_requested_args; args->nargs = 0; for (int a = 0; a < GCN_KERNEL_ARG_TYPES; a++) @@ -242,8 +256,6 @@ gcn_parse_amdgpu_hsa_kernel_attribute (struct gcn_kernel_args *args, args->requested |= (1 << a); args->order[args->nargs++] = a; } - args->requested |= (1 << WORKGROUP_ID_X_ARG); - args->requested |= (1 << WORK_ITEM_ID_Z_ARG); /* Requesting WORK_ITEM_ID_Z_ARG implies requesting WORK_ITEM_ID_X_ARG and WORK_ITEM_ID_Y_ARG. Similarly, requesting WORK_ITEM_ID_Y_ARG implies @@ -253,10 +265,6 @@ gcn_parse_amdgpu_hsa_kernel_attribute (struct gcn_kernel_args *args, if (args->requested & (1 << WORK_ITEM_ID_Y_ARG)) args->requested |= (1 << WORK_ITEM_ID_X_ARG); - /* Always enable this so that kernargs is in a predictable place for - gomp_print, etc. */ - args->requested |= (1 << DISPATCH_PTR_ARG); - int sgpr_regno = FIRST_SGPR_REG; args->nsgprs = 0; for (int a = 0; a < GCN_KERNEL_ARG_TYPES; a++) @@ -462,6 +470,9 @@ gcn_regno_reg_class (int regno) { case SCC_REG: return SCC_CONDITIONAL_REG; + case VCC_LO_REG: + case VCC_HI_REG: + return VCC_CONDITIONAL_REG; case VCCZ_REG: return VCCZ_CONDITIONAL_REG; case EXECZ_REG: @@ -629,7 +640,8 @@ gcn_can_split_p (machine_mode, rtx op) static reg_class_t gcn_spill_class (reg_class_t c, machine_mode /*mode */ ) { - if (reg_classes_intersect_p (ALL_CONDITIONAL_REGS, c)) + if (reg_classes_intersect_p (ALL_CONDITIONAL_REGS, c) + || c == VCC_CONDITIONAL_REG) return SGPR_REGS; else return NO_REGS; @@ -2040,27 +2052,36 @@ gcn_secondary_reload (bool in_p, rtx x, reg_class_t rclass, static void gcn_conditional_register_usage (void) { - int i; + if (!cfun || !cfun->machine) + return; - /* FIXME: Do we need to reset fixed_regs? */ + if (cfun->machine->normal_function) + { + /* Restrict the set of SGPRs and VGPRs used by non-kernel functions. */ + for (int i = SGPR_REGNO (MAX_NORMAL_SGPR_COUNT - 2); + i <= LAST_SGPR_REG; i++) + fixed_regs[i] = 1, call_used_regs[i] = 1; -/* Limit ourselves to 1/16 the register file for maximimum sized workgroups. - There are enough SGPRs not to limit those. - TODO: Adjust this more dynamically. */ - for (i = FIRST_VGPR_REG + 64; i <= LAST_VGPR_REG; i++) - fixed_regs[i] = 1, call_used_regs[i] = 1; + for (int i = VGPR_REGNO (MAX_NORMAL_VGPR_COUNT); + i <= LAST_VGPR_REG; i++) + fixed_regs[i] = 1, call_used_regs[i] = 1; - if (!cfun || !cfun->machine || cfun->machine->normal_function) - { - /* Normal functions can't know what kernel argument registers are - live, so just fix the bottom 16 SGPRs, and bottom 3 VGPRs. */ - for (i = 0; i < 16; i++) - fixed_regs[FIRST_SGPR_REG + i] = 1; - for (i = 0; i < 3; i++) - fixed_regs[FIRST_VGPR_REG + i] = 1; return; } + /* If the set of requested args is the default set, nothing more needs to + be done. */ + if (cfun->machine->args.requested == default_requested_args) + return; + + /* Requesting a set of args different from the default violates the ABI. */ + if (!leaf_function_p ()) + warning (0, "A non-default set of initial values has been requested, " + "which violates the ABI!"); + + for (int i = SGPR_REGNO (0); i < SGPR_REGNO (14); i++) + fixed_regs[i] = 0; + /* Fix the runtime argument register containing values that may be needed later. DISPATCH_PTR_ARG and FLAT_SCRATCH_* should not be needed after the prologue so there's no need to fix them. */ @@ -2068,10 +2089,10 @@ gcn_conditional_register_usage (void) fixed_regs[cfun->machine->args.reg[PRIVATE_SEGMENT_WAVE_OFFSET_ARG]] = 1; if (cfun->machine->args.reg[PRIVATE_SEGMENT_BUFFER_ARG] >= 0) { + /* The upper 32-bits of the 64-bit descriptor are not used, so allow + the containing registers to be used for other purposes. */ fixed_regs[cfun->machine->args.reg[PRIVATE_SEGMENT_BUFFER_ARG]] = 1; fixed_regs[cfun->machine->args.reg[PRIVATE_SEGMENT_BUFFER_ARG] + 1] = 1; - fixed_regs[cfun->machine->args.reg[PRIVATE_SEGMENT_BUFFER_ARG] + 2] = 1; - fixed_regs[cfun->machine->args.reg[PRIVATE_SEGMENT_BUFFER_ARG] + 3] = 1; } if (cfun->machine->args.reg[KERNARG_SEGMENT_PTR_ARG] >= 0) { @@ -2441,6 +2462,8 @@ gcn_init_cumulative_args (CUMULATIVE_ARGS *cum /* Argument info to init */ , cfun->machine->args = cum->args; if (!caller && cfun->machine->normal_function) gcn_detect_incoming_pointer_arg (fndecl); + + reinit_regs (); } static bool @@ -2776,15 +2799,6 @@ gcn_expand_prologue () cfun->machine->args. reg[PRIVATE_SEGMENT_WAVE_OFFSET_ARG]); - if (TARGET_GCN5_PLUS) - { - /* v0 is reserved for constant zero so that "global" - memory instructions can have a nul-offset without - causing reloads. */ - emit_insn (gen_vec_duplicatev64si - (gen_rtx_REG (V64SImode, VGPR_REGNO (0)), const0_rtx)); - } - if (cfun->machine->args.requested & (1 << FLAT_SCRATCH_INIT_ARG)) { rtx fs_init_lo = @@ -2843,8 +2857,6 @@ gcn_expand_prologue () gen_int_mode (LDS_SIZE, SImode)); emit_insn (gen_prologue_use (gen_rtx_REG (SImode, M0_REG))); - if (TARGET_GCN5_PLUS) - emit_insn (gen_prologue_use (gen_rtx_REG (SImode, VGPR_REGNO (0)))); if (cfun && cfun->machine && !cfun->machine->normal_function && flag_openmp) { @@ -4876,10 +4888,10 @@ gcn_hsa_declare_function_name (FILE *file, const char *name, tree) if (!leaf_function_p ()) { /* We can't know how many registers function calls might use. */ - if (vgpr < 64) - vgpr = 64; - if (sgpr + extra_regs < 102) - sgpr = 102 - extra_regs; + if (vgpr < MAX_NORMAL_VGPR_COUNT) + vgpr = MAX_NORMAL_VGPR_COUNT; + if (sgpr + extra_regs < MAX_NORMAL_SGPR_COUNT) + sgpr = MAX_NORMAL_SGPR_COUNT - extra_regs; } /* GFX8 allocates SGPRs in blocks of 8. @@ -5303,9 +5315,9 @@ print_operand_address (FILE *file, rtx mem) /* The assembler requires a 64-bit VGPR pair here, even though the offset should be only 32-bit. */ if (vgpr_offset == NULL_RTX) - /* In this case, the vector offset is zero, so we use v0, - which is initialized by the kernel prologue to zero. */ - fprintf (file, "v[0:1]"); + /* In this case, the vector offset is zero, so we use the first + lane of v1, which is initialized to zero. */ + fprintf (file, "v[1:2]"); else if (REG_P (vgpr_offset) && VGPR_REGNO_P (REGNO (vgpr_offset))) { diff --git a/gcc/config/gcn/gcn.h b/gcc/config/gcn/gcn.h index b3b2d1a..e60b431 100644 --- a/gcc/config/gcn/gcn.h +++ b/gcc/config/gcn/gcn.h @@ -160,9 +160,9 @@ #define FIXED_REGISTERS { \ /* Scalars. */ \ - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ + 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, \ /* fp sp lr. */ \ - 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, \ + 1, 1, 0, 0, 0, 0, 1, 1, 0, 0, \ /* exec_save, cc_save */ \ 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, \ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ @@ -180,7 +180,7 @@ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ /* VGRPs */ \ - 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ + 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ @@ -203,7 +203,7 @@ #define CALL_USED_REGISTERS { \ /* Scalars. */ \ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ - 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ + 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, \ 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, \ 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, \ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, \ diff --git a/gcc/ira.c b/gcc/ira.c index fd481d6..60e0b9b 100644 --- a/gcc/ira.c +++ b/gcc/ira.c @@ -516,7 +516,8 @@ setup_alloc_regs (bool use_hard_frame_p) #endif COPY_HARD_REG_SET (no_unit_alloc_regs, fixed_nonglobal_reg_set); if (! use_hard_frame_p) - SET_HARD_REG_BIT (no_unit_alloc_regs, HARD_FRAME_POINTER_REGNUM); + add_to_hard_reg_set (&no_unit_alloc_regs, Pmode, + HARD_FRAME_POINTER_REGNUM); setup_class_hard_regs (); } @@ -2275,6 +2276,7 @@ ira_setup_eliminable_regset (void) { int i; static const struct {const int from, to; } eliminables[] = ELIMINABLE_REGS; + int fp_reg_count = hard_regno_nregs (HARD_FRAME_POINTER_REGNUM, Pmode); /* Setup is_leaf as frame_pointer_required may use it. This function is called by sched_init before ira if scheduling is enabled. */ @@ -2303,7 +2305,8 @@ ira_setup_eliminable_regset (void) frame pointer in LRA. */ if (frame_pointer_needed) - df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM, true); + for (i = 0; i < fp_reg_count; i++) + df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM + i, true); COPY_HARD_REG_SET (ira_no_alloc_regs, no_unit_alloc_regs); CLEAR_HARD_REG_SET (eliminable_regset); @@ -2333,17 +2336,21 @@ ira_setup_eliminable_regset (void) } if (!HARD_FRAME_POINTER_IS_FRAME_POINTER) { - if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, HARD_FRAME_POINTER_REGNUM)) - { - SET_HARD_REG_BIT (eliminable_regset, HARD_FRAME_POINTER_REGNUM); - if (frame_pointer_needed) - SET_HARD_REG_BIT (ira_no_alloc_regs, HARD_FRAME_POINTER_REGNUM); - } - else if (frame_pointer_needed) - error ("%s cannot be used in asm here", - reg_names[HARD_FRAME_POINTER_REGNUM]); - else - df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM, true); + for (i = 0; i < fp_reg_count; i++) + if (!TEST_HARD_REG_BIT (crtl->asm_clobbers, + HARD_FRAME_POINTER_REGNUM + i)) + { + SET_HARD_REG_BIT (eliminable_regset, + HARD_FRAME_POINTER_REGNUM + i); + if (frame_pointer_needed) + SET_HARD_REG_BIT (ira_no_alloc_regs, + HARD_FRAME_POINTER_REGNUM + i); + } + else if (frame_pointer_needed) + error ("%s cannot be used in asm here", + reg_names[HARD_FRAME_POINTER_REGNUM + i]); + else + df_set_regs_ever_live (HARD_FRAME_POINTER_REGNUM + i, true); } } diff --git a/gcc/lra-spills.c b/gcc/lra-spills.c index c19b76a..417d68c 100644 --- a/gcc/lra-spills.c +++ b/gcc/lra-spills.c @@ -283,6 +283,9 @@ assign_spill_hard_regs (int *pseudo_regnos, int n) for (k = 0; k < spill_class_size; k++) { hard_regno = ira_class_hard_regs[spill_class][k]; + if (TEST_HARD_REG_BIT (eliminable_regset, hard_regno) + || !targetm.hard_regno_mode_ok (hard_regno, mode)) + continue; if (! overlaps_hard_reg_set_p (conflict_hard_regs, mode, hard_regno)) break; }