From: Revital Eres
Date: Mon, 26 Sep 2011 07:31:04 +0300
Subject: [PATCH, SMS 1/2] Avoid generating redundant reg-moves
To: Ayal Zaks
Cc: gcc-patches@gcc.gnu.org, Patch Tracking
X-Patchwork-Id: 116351

Hello,

The attached patch contains a fix to the generate_reg_moves function.
Currently we can generate reg-moves for stores; such reg-moves are later
eliminated.
This happens when we have a mem dependency with distance 1; as a result
the number of reg-moves is at least 1, based on the following calculation
taken from generate_reg_moves ():

  if (e->distance == 1)
    nreg_moves4e = (SCHED_TIME (e->dest) - SCHED_TIME (e->src) + ii) / ii;

This is an example of a register move generated in such cases:

reg_move = (insn 152 119 75 4 (set (reg:SI 231)
        (mem:SI (pre_modify:DI (reg:DI 215)
                (plus:DI (reg:DI 215)
                    (reg:DI 171 [ ivtmp.42 ]))) [3 MEM[base: pretmp.27_65, index: ivtmp.42_9, offset: 0B]+0 S4 A32])) -1
     (nil))

When not handling REG_INC instructions this was not a problem, as these
reg-moves were removed by dead code elimination. For example:

insn 1) mem[x] = ...
insn 2) .. = mem[y]

When the reg-move reg1 = mem[x] was generated, mem[x] was not used in
insn 2, so reg1 was dead and the reg-move could be eliminated. With
REG_INC this is different, because the reg-move instruction remains and
leads to bad code generation.

The attached testcase captures this case.

Tested and bootstrapped with patch 2 on ppc64-redhat-linux, enabling SMS
on loops with SC 1. On arm-linux-gnueabi, bootstrapped C on top of the
set of patches that support the do-loop pattern
(http://gcc.gnu.org/ml/gcc-patches/2011-07/msg01807.html), which solves
the bootstrap failure on ARM with SMS flags.

OK for mainline?

Thanks,
Revital

gcc/
	* modulo-sched.c (generate_reg_moves): Skip instructions that
	do not set a register.

testsuite/
	* gcc.dg/sms-10.c: New file.
/* { dg-do run } */
/* { dg-options "-O2 -fmodulo-sched -fmodulo-sched-allow-regmoves -fdump-rtl-sms" } */

typedef __SIZE_TYPE__ size_t;
extern void *malloc (size_t);
extern void free (void *);
extern void abort (void);

struct regstat_n_sets_and_refs_t
{
  int sets;
  int refs;
};

struct regstat_n_sets_and_refs_t *regstat_n_sets_and_refs;

struct df_reg_info
{
  unsigned int n_refs;
};

struct df_d
{
  struct df_reg_info **def_regs;
  struct df_reg_info **use_regs;
};

struct df_d *df;

static inline int
REG_N_SETS (int regno)
{
  return regstat_n_sets_and_refs[regno].sets;
}

__attribute__ ((noinline))
int max_reg_num (void)
{
  return 100;
}

__attribute__ ((noinline))
void regstat_init_n_sets_and_refs (void)
{
  unsigned int i;
  unsigned int max_regno = max_reg_num ();

  for (i = 0; i < max_regno; i++)
    {
      (regstat_n_sets_and_refs[i].sets = (df->def_regs[(i)]->n_refs));
      (regstat_n_sets_and_refs[i].refs =
       (df->use_regs[(i)]->n_refs) + REG_N_SETS (i));
    }
}

int a_sets[100] =
  { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
  20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
  38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55,
  56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73,
  74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91,
  92, 93, 94, 95, 96, 97, 98, 99
};

int a_refs[100] =
  { 0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36,
  38, 40, 42, 44, 46, 48, 50, 52, 54, 56, 58, 60, 62, 64, 66, 68, 70, 72,
  74, 76, 78, 80, 82, 84, 86, 88, 90, 92, 94, 96, 98, 100, 102, 104, 106,
  108, 110, 112, 114, 116, 118, 120, 122, 124, 126, 128, 130, 132, 134, 136,
  138, 140, 142, 144, 146, 148, 150, 152, 154, 156, 158, 160, 162, 164, 166,
  168, 170, 172, 174, 176, 178, 180, 182, 184, 186, 188, 190, 192, 194, 196,
  198
};

int
main ()
{
  struct df_reg_info *b[100], *c[100];
  struct df_d df1;
  size_t s = sizeof (struct df_reg_info);
  struct regstat_n_sets_and_refs_t a[100];

  df = &df1;
  regstat_n_sets_and_refs = a;
  int i;

  for (i = 0; i < 100; i++)
    {
      b[i] = (struct df_reg_info *) malloc (s);
      b[i]->n_refs = i;
      c[i] = (struct df_reg_info *) malloc (s);
      c[i]->n_refs = i;
    }

  df1.def_regs = b;
  df1.use_regs = c;
  regstat_init_n_sets_and_refs ();

  for (i = 0; i < 100; i++)
    if ((a[i].sets != a_sets[i]) || (a[i].refs != a_refs[i]))
      abort ();

  for (i = 0; i < 100; i++)
    {
      free (b[i]);
      free (c[i]);
    }

  return 0;
}

/* { dg-final { scan-rtl-dump-times "SMS succeeded" 1 "sms" { target powerpc*-*-* } } } */
/* { dg-final { cleanup-rtl-dump "sms" } } */

Index: modulo-sched.c
===================================================================
--- modulo-sched.c	(revision 179138)
+++ modulo-sched.c	(working copy)
@@ -476,7 +476,12 @@ generate_reg_moves (partial_schedule_ptr
       sbitmap *uses_of_defs;
       rtx last_reg_move;
       rtx prev_reg, old_reg;
-
+      rtx set = single_set (u->insn);
+
+      /* Skip instructions that do not set a register.  */
+      if (set && !REG_P (SET_DEST (set)))
+        continue;
+
       /* Compute the number of reg_moves needed for u, by looking at life
          ranges started at u (excluding self-loops).  */
       for (e = u->out; e; e = e->next_out)