From patchwork Thu Apr 28 06:57:38 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Senthil Kumar Selvaraj X-Patchwork-Id: 616028 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3qwSMT16znz9t5w for ; Thu, 28 Apr 2016 16:57:55 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=wetmWaBh; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :references:from:to:cc:subject:in-reply-to:date:message-id :mime-version:content-type; q=dns; s=default; b=HSmAMhg9v/g7pU/3 eUE2z9tmNiuX6wz/TjZ0Jr/26sVn02dYv8loYHvijKUQmpzTqhuaVVK/YoAOLWdJ KuOwXIEAIAEy8B45l/uYZyhiqKqaIV6bPNsf++rB4OZ39ywdeCfCZyjatagJe27e 0PGyS/Cx8BZxCmgeHvbFFFxYsKY= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :references:from:to:cc:subject:in-reply-to:date:message-id :mime-version:content-type; s=default; bh=WVAiGNJSaI2rL1MONj+Gkr 0uw7k=; b=wetmWaBhdH1bv6AfvOmwI4UEcMH0fJZI4S/CaOz77X4XEzGg4sMDGZ 4SdZ+9N89u82bTpExwQZjJaOE2vQA/CxAa8dUNNgRye7CUO5My98/KZ2it685iiV uJHxBN8RUAlweiln11JkQNeQZKT174LrYjb6EVuiagpT91CA8QBGU= Received: (qmail 120668 invoked by alias); 28 Apr 2016 06:57:41 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 120651 invoked by uid 89); 28 Apr 2016 06:57:40 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-2.9 required=5.0 tests=BAYES_00, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=Makefileam, top_srcdir, Makefile.am, am_cppflags X-HELO: eusmtp01.atmel.com Received: from eusmtp01.atmel.com (HELO eusmtp01.atmel.com) (212.144.249.242) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Thu, 28 Apr 2016 06:57:30 +0000 Received: from HNOCHT01.corp.atmel.com (10.161.30.161) by eusmtp01.atmel.com (10.161.101.30) with Microsoft SMTP Server (TLS) id 14.3.235.1; Thu, 28 Apr 2016 08:57:21 +0200 Received: from jaguar.atmel.com (10.161.30.18) by HNOCHT01.corp.atmel.com (10.161.30.161) with Microsoft SMTP Server (TLS) id 14.3.235.1; Thu, 28 Apr 2016 08:57:24 +0200 References: <87h9fd1vqf.fsf@atmel.com> <5706B950.3090003@redhat.com> <8737qnrq4i.fsf@atmel.com> <571E30FF.4020907@redhat.com> User-agent: mu4e 0.9.17; emacs 24.5.1 From: Senthil Kumar Selvaraj To: Bernd Schmidt CC: Senthil Kumar Selvaraj , "gcc-patches@gcc.gnu.org" , , Subject: Re: [Patch] Fix PR 60040 In-Reply-To: <571E30FF.4020907@redhat.com> Date: Thu, 28 Apr 2016 12:27:38 +0530 Message-ID: <87potafcf1.fsf@atmel.com> MIME-Version: 1.0 X-IsSubscribed: yes Bernd Schmidt writes: > On 04/15/2016 02:52 PM, Senthil Kumar Selvaraj wrote: >> >> For both testcases in the PR, reload fails to take into account that >> FP-SP elimination can no longer be performed, and tries to find reload >> regs for an rtx generated when FP-SP elimination was valid. >> >> 1. reload initializes elim table with FP->SP elimination enabled. >> 2. alter_reg for a pseudo allocates a stack slot for the pseudo, and sets >> reg_equiv_memory_loc to frame_pointer_rtx plus offset. It also sets >> something_was_spilled to true. >> 3. The main reload loop starts, and it resets something_was_spilled to false. >> 4. reload calls eliminate_regs for the pseudo and sets reg_equiv_address to >> (mem(SP + offset)). >> 5. calculate_needs_all_insns pushes a reload for SP (for the AVR target, >> SP cannot be a pointer reg - it needs to be reloaded into X Y or Z regs). >> 6. update_eliminables_and_spill calls targetm.frame_pointer_required, >> which returns true. That causes can_eliminate for FP->SP to be reset >> to zero, and FP to be added to bad_spill_regs_global. For the AVR, >> FP is Y, one of the 3 pointer regs. reload also notes that something >> has changed, and that the loop needs to run again. >> 7. reload still calls select_reload_regs, and find_regs fails to find a >> pointer reg to reload SP, which is unnecessary as FP->SP elimination >> had been disabled anyway in (6). >> >> IOW, reload fails to find pointer regs for an RTL expression that was >> created when FP->SP elimination was true, even after it turns out that >> the elimination can't be done after all. The patch tries to detect that >> - if it knows the loop is going to run again, it silences the failure. >> >> Also note that at a different point in the loop, the reload loop starts >> over if something_was_spilled (line 982-986). If set outside the reload >> loop by alter_reg, it gets reset at (3) - not sure why. I'd think a >> "continue" after update_eliminables_and_spill (line 1019-1022) would >> also work - haven't tested it though. > > That's what I was going to ask next. I think if that works for you, I > think that's an approach that would make more sense. > > However, it looks like after this call, and the other one in the same > loop, should probably be calling finish_spills. It looks like an > oversight in the existing code that this doesn't happen for the existing > continue. Please try adding it in both places. Tried that out - that works too. Bootstrap and regression tests on x86_64 went fine, and there were no regressions for the avr target as well - the extra three passes that showed up with the initial patch show up now too. I couldn't put a continue before select_reload_regs though. save_call_clobbered_regs sets up a few things that are cleaned up in delete_caller_save_insns a little later in the loop. So I only exclude select_reload_regs - finish_spills and cleanup will happen in both cases. Does this patch look ok? If yes, could you commit it for me please? I don't have commit access. Regards Senthil gcc/ChangeLog 2016-04-28 Senthil Kumar Selvaraj PR target/60040 * reload1.c (reload): Call finish_spills before restarting reload loop. Skip select_reload_regs if update_eliminables_and_spill returns true. gcc/testsuite/ChangeLog 2016-04-28 Sebastian Huber Matthijs Kooijman Senthil Kumar Selvaraj PR target/60040 * gcc.target/avr/pr60040-1.c: New. * gcc.target/avr/pr60040-2.c: New. diff --git gcc/reload1.c gcc/reload1.c index c2800f8..d6fcece 100644 --- gcc/reload1.c +++ gcc/reload1.c @@ -981,7 +981,8 @@ reload (rtx_insn *first, int global) /* If we allocated another stack slot, redo elimination bookkeeping. */ if (something_was_spilled || starting_frame_size != get_frame_size ()) { - update_eliminables_and_spill (); + if (update_eliminables_and_spill ()) + finish_spills (global); continue; } @@ -1021,10 +1022,12 @@ reload (rtx_insn *first, int global) did_spill = 1; something_changed = 1; } - - select_reload_regs (); - if (failure) - goto failed; + else + { + select_reload_regs (); + if (failure) + goto failed; + } if (insns_need_reload != 0 || did_spill) something_changed |= finish_spills (global); diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog index 65fe0d0..bbaa3a2 100644 --- gcc/testsuite/ChangeLog +++ gcc/testsuite/ChangeLog @@ -1,7 +1,3 @@ -2016-04-26 Bernd Schmidt - - * gcc.target/i386/lzcnt-1.c: Allow a different lzcntw output register. - 2016-04-25 Richard Biener PR tree-optimization/70780 diff --git gcc/testsuite/gcc.target/avr/pr60040-1.c gcc/testsuite/gcc.target/avr/pr60040-1.c new file mode 100644 index 0000000..4fe296b --- /dev/null +++ gcc/testsuite/gcc.target/avr/pr60040-1.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-Os" } */ + +float dhistory[10]; +float test; + +float getSlope(float history[]) { + float sumx = 0; + float sumy = 0; + float sumxy = 0; + float sumxsq = 0; + float rate = 0; + int n = 10; + + int i; + for (i=1; i< 11; i++) { + sumx = sumx + i; + sumy = sumy + history[i-1]; + sumy = sumy + history[i-1]; + sumxsq = sumxsq + (i*i); + } + + rate = sumy+sumx+sumxsq; + return rate; +} + +void loop() { + test = getSlope(dhistory); +} diff --git gcc/testsuite/gcc.target/avr/pr60040-2.c gcc/testsuite/gcc.target/avr/pr60040-2.c new file mode 100644 index 0000000..c40d49f --- /dev/null +++ gcc/testsuite/gcc.target/avr/pr60040-2.c @@ -0,0 +1,112 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +typedef unsigned char __uint8_t; +typedef short unsigned int __uint16_t; +typedef long unsigned int __uint32_t; +typedef __uint8_t uint8_t ; +typedef __uint16_t uint16_t ; +typedef __uint32_t uint32_t ; +typedef __builtin_va_list __gnuc_va_list; +typedef __gnuc_va_list va_list; +typedef enum rtems_blkdev_request_op { + RTEMS_BLKDEV_REQ_READ, +} rtems_fdisk_segment_desc; +typedef struct rtems_fdisk_driver_handlers +{ + int (*blank) (const rtems_fdisk_segment_desc* sd, + uint32_t device, + uint32_t segment, + uint32_t offset, + uint32_t size); +} rtems_fdisk_driver_handlers; +typedef struct rtems_fdisk_page_desc +{ + uint16_t flags; + uint32_t block; +} rtems_fdisk_page_desc; +typedef struct rtems_fdisk_segment_ctl +{ + rtems_fdisk_page_desc* page_descriptors; + uint32_t pages_active; +} rtems_fdisk_segment_ctl; +typedef struct rtems_fdisk_segment_ctl_queue +{ +} rtems_fdisk_segment_ctl_queue; +typedef struct rtems_fdisk_device_ctl +{ + uint32_t flags; + uint8_t* copy_buffer; +} rtems_flashdisk; + +extern void rtems_fdisk_error (const char *, ...); +extern int rtems_fdisk_seg_write(const rtems_flashdisk*, + rtems_fdisk_segment_ctl*, + uint32_t, + const rtems_fdisk_page_desc* page_desc, + uint32_t); + +void rtems_fdisk_printf (const rtems_flashdisk* fd, const char *format, ...) +{ + { + va_list args; + __builtin_va_start(args,format); + } +} +static int +rtems_fdisk_seg_blank_check (const rtems_flashdisk* fd, + rtems_fdisk_segment_ctl* sc, + uint32_t offset, + uint32_t size) +{ + uint32_t device; + uint32_t segment; + const rtems_fdisk_segment_desc* sd; + const rtems_fdisk_driver_handlers* ops; + return ops->blank (sd, device, segment, offset, size); +} +static int +rtems_fdisk_seg_write_page_desc (const rtems_flashdisk* fd, + rtems_fdisk_segment_ctl* sc, + uint32_t page, + const rtems_fdisk_page_desc* page_desc) +{ + uint32_t offset = page * sizeof (rtems_fdisk_page_desc); + if ((fd->flags & (1 << 3))) + { + int ret = rtems_fdisk_seg_blank_check (fd, sc, + offset, + sizeof (rtems_fdisk_page_desc)); + } + return rtems_fdisk_seg_write (fd, sc, offset, + page_desc, sizeof (rtems_fdisk_page_desc)); +} +void +rtems_fdisk_recycle_segment (rtems_flashdisk* fd, + rtems_fdisk_segment_ctl* ssc, + rtems_fdisk_segment_ctl* dsc, + uint32_t *pages) +{ + int ret; + uint32_t spage; + uint32_t used = 0; + uint32_t active = 0; + { + rtems_fdisk_page_desc* spd = &ssc->page_descriptors[spage]; + { + rtems_fdisk_page_desc* dpd; + uint32_t dpage; + dpd = &dsc->page_descriptors[dpage]; + *dpd = *spd; + ret = rtems_fdisk_seg_write_page_desc (fd, + dsc, + dpage, dpd); + } + } + rtems_fdisk_printf (fd, "ssc end: %d-%d: p=%ld, a=%ld, u=%ld", + pages, active, used); + { + rtems_fdisk_error ("compacting: ssc pages not 0: %d", + ssc->pages_active); + } +} diff --git gcc/testsuite/gcc.target/i386/lzcnt-1.c gcc/testsuite/gcc.target/i386/lzcnt-1.c index c7054d1..f6240d1b 100644 --- gcc/testsuite/gcc.target/i386/lzcnt-1.c +++ gcc/testsuite/gcc.target/i386/lzcnt-1.c @@ -1,6 +1,6 @@ /* { dg-do compile } */ /* { dg-options "-O2 -mlzcnt " } */ -/* { dg-final { scan-assembler "lzcntw\[^\\n]*(%|)\[ad\]\[xi\]" } } */ +/* { dg-final { scan-assembler "lzcntw\[^\\n]*(%|)ax" } } */ #include diff --git libcilkrts/ChangeLog libcilkrts/ChangeLog index 8fada8a..ed26a3a 100644 --- libcilkrts/ChangeLog +++ libcilkrts/ChangeLog @@ -1,9 +1,3 @@ -2016-04-26 Rainer Orth - - PR target/60290 - * Makefile.am (GENERAL_FLAGS): Add -funwind-tables. - * Makefile.in: Regenerate. - 2015-11-09 Igor Zamyatin PR target/66326 diff --git libcilkrts/Makefile.am libcilkrts/Makefile.am index 4f944dd..70538a2 100644 --- libcilkrts/Makefile.am +++ libcilkrts/Makefile.am @@ -43,9 +43,6 @@ GENERAL_FLAGS = -I$(top_srcdir)/include -I$(top_srcdir)/runtime -I$(top_srcdir)/ # Enable Intel Cilk Plus extension GENERAL_FLAGS += -fcilkplus -# Always generate unwind tables -GENERAL_FLAGS += -funwind-tables - AM_CFLAGS = $(XCFLAGS) $(GENERAL_FLAGS) -std=c99 AM_CPPFLAGS = $(GENERAL_FLAGS) AM_LDFLAGS = $(XLDFLAGS) diff --git libcilkrts/Makefile.in libcilkrts/Makefile.in index a25d1c6..629aa6a 100644 --- libcilkrts/Makefile.in +++ libcilkrts/Makefile.in @@ -371,11 +371,9 @@ ACLOCAL_AMFLAGS = -I .. -I ../config # GENERAL_FLAGS += -D_Cilk_spawn="" -D_Cilk_sync="" -D_Cilk_for=for # Enable Intel Cilk Plus extension - -# Always generate unwind tables GENERAL_FLAGS = -I$(top_srcdir)/include -I$(top_srcdir)/runtime \ -I$(top_srcdir)/runtime/config/$(config_dir) \ - -DIN_CILK_RUNTIME=1 -fcilkplus -funwind-tables + -DIN_CILK_RUNTIME=1 -fcilkplus AM_CFLAGS = $(XCFLAGS) $(GENERAL_FLAGS) -std=c99 AM_CPPFLAGS = $(GENERAL_FLAGS) AM_LDFLAGS = $(XLDFLAGS)