From patchwork Mon Aug  8 17:11:37 2011
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 109010
Date: Mon, 8 Aug 2011 19:11:37 +0200
Subject: Re: [RFC PATCH, i386]: Allow zero_extended addresses (+ problems with reload and offsetable address, "o" constraint)
From: Uros Bizjak
To: Ulrich Weigand
Cc: gcc-patches@gcc.gnu.org, GCC Development, "H.J. Lu"
In-Reply-To: <201108081530.p78FUAgM029764@d06av02.portsmouth.uk.ibm.com>
References: <201108081530.p78FUAgM029764@d06av02.portsmouth.uk.ibm.com>

On Mon, Aug 8, 2011 at 5:30 PM, Ulrich Weigand wrote:
> Uros Bizjak wrote:
>
>> Although, it would be nice for reload to subsequently fix CSE'd
>> non-offsettable memory by copying the address to a temporary reg (*as
>> said in the documentation*), we could simply require an XMM temporary
>> for TImode reloads to/from integer registers, and this fixes the ICE
>> for x32.
>
> Moves are special as far as reload is concerned.  If there is already
> a move instruction present *before* reload, it will get fixed up
> according to its constraints like any other instruction.
>
> However, reload will *introduce* new moves as part of its operation,
> and those will *not* themselves get reloaded.  Instead, reload simply
> assumes that every plain move will just succeed without requiring
> any reload; if this is not true, the target *must* provide a
> secondary reload for this move.
>
> (Note that the secondary reload could also work by reloading the
> target address into a temporary; that's up to the target to
> implement.)

Whoa, indeed.
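
(For reference, the kind of testcase involved would look roughly like the
function below.  This is a hypothetical reconstruction guessed from the
assembly that follows, not the actual testcase; when compiled for the x32
ABI, the 128-bit accesses go through 32-bit, zero-extended and therefore
non-offsettable, addresses.)

/* Hypothetical reconstruction -- not the actual testcase.  */
typedef unsigned int u128 __attribute__ ((mode (TI)));

void
test (u128 *p, int i, int j)
{
  p[i] += 1;		/* TImode read-modify-write: the addq/adcq pair.  */
  p[i + j] = 0;		/* TImode store of zero: the pxor + movdqa.  */
}
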
Using the attached patch, which reloads the memory address instead of going
through an XMM register, the code for the testcase improves from:

test:
.LFB0:
	.cfi_startproc
	pushq	%rbx
	.cfi_def_cfa_offset 16
	.cfi_offset 3, -16
	sall	$4, %esi
	addl	%edi, %esi
	movdqa	(%esi), %xmm0
	movdqa	%xmm0, -16(%rsp)
	movq	-16(%rsp), %rcx
	movq	-8(%rsp), %rbx
	addq	$1, %rcx
	adcq	$0, %rbx
	movq	%rcx, -16(%rsp)
	sall	$4, %edx
	movq	%rbx, -8(%rsp)
	movdqa	-16(%rsp), %xmm0
	movdqa	%xmm0, (%esi)
	pxor	%xmm0, %xmm0
	movdqa	%xmm0, (%edx,%esi)
	popq	%rbx
	.cfi_def_cfa_offset 8
	ret

to:

test:
.LFB0:
	.cfi_startproc
	sall	$4, %esi
	pushq	%rbx
	.cfi_def_cfa_offset 16
	.cfi_offset 3, -16
	addl	%edi, %esi
	pxor	%xmm0, %xmm0
	mov	%esi, %eax
	movq	(%rax), %rcx
	movq	8(%rax), %rbx
	addq	$1, %rcx
	adcq	$0, %rbx
	sall	$4, %edx
	movq	%rcx, (%rax)
	movq	%rbx, 8(%rax)
	movdqa	%xmm0, (%edx,%esi)
	popq	%rbx
	.cfi_def_cfa_offset 8
	ret

H.J., can you please test the attached patch?  This optimization won't
trigger on x86_64 anymore.

Thanks,
Uros.

Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 177565)
+++ config/i386/i386.md	(working copy)
@@ -2073,6 +2073,40 @@
 	      (const_string "orig")))
    (set_attr "mode" "SI,DI,DI,DI,SI,DI,DI,DI,DI,DI,TI,DI,TI,DI,DI,DI,DI,DI")])
 
+;; Reload patterns to support multi-word load/store
+;; with non-offsetable address.
+(define_expand "reload_noff_store"
+  [(parallel [(match_operand 0 "memory_operand" "=m")
+	      (match_operand 1 "register_operand" "r")
+	      (match_operand:DI 2 "register_operand" "=&r")])]
+  "TARGET_64BIT"
+{
+  rtx mem = operands[0];
+  rtx addr = XEXP (mem, 0);
+
+  emit_move_insn (operands[2], addr);
+  mem = replace_equiv_address_nv (mem, operands[2]);
+
+  emit_insn (gen_rtx_SET (VOIDmode, mem, operands[1]));
+  DONE;
+})
+
+(define_expand "reload_noff_load"
+  [(parallel [(match_operand 0 "register_operand" "=r")
+	      (match_operand 1 "memory_operand" "m")
+	      (match_operand:DI 2 "register_operand" "=r")])]
+  "TARGET_64BIT"
+{
+  rtx mem = operands[1];
+  rtx addr = XEXP (mem, 0);
+
+  emit_move_insn (operands[2], addr);
+  mem = replace_equiv_address_nv (mem, operands[2]);
+
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], mem));
+  DONE;
+})
+
 ;; Convert impossible stores of immediate to existing instructions.
 ;; First try to get scratch register and go through it.  In case this
 ;; fails, move by 32bit parts.
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c	(revision 177566)
+++ config/i386/i386.c	(working copy)
@@ -28245,18 +28245,25 @@
 static reg_class_t
 ix86_secondary_reload (bool in_p, rtx x, reg_class_t rclass,
-		       enum machine_mode mode,
-		       secondary_reload_info *sri ATTRIBUTE_UNUSED)
+		       enum machine_mode mode, secondary_reload_info *sri)
 {
   /* Double-word spills from general registers to non-offsettable memory
-     references (zero-extended addresses) go through XMM register.  */
+     references (zero-extended addresses) require special handling.  */
   if (TARGET_64BIT
       && MEM_P (x)
       && GET_MODE_SIZE (mode) > UNITS_PER_WORD
       && rclass == GENERAL_REGS
       && !offsettable_memref_p (x))
-    return SSE_REGS;
+    {
+      sri->icode = (in_p
+		    ? CODE_FOR_reload_noff_load
+		    : CODE_FOR_reload_noff_store);
+      /* Add the cost of move to a temporary.  */
+      sri->extra_cost = 1;
+      return NO_REGS;
+    }
+
   /* QImode spills from non-QI registers require
      intermediate register on 32bit targets.  */
   if (!TARGET_64BIT
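
As an aside, the mechanism Ulrich describes boils down to a choice inside the
target's secondary_reload hook.  The function below is only an illustrative
sketch, not part of the patch: needs_special_move_p and the
USE_INTERMEDIATE_CLASS switch are made-up placeholders, and only the
NO_REGS/icode branch corresponds to what ix86_secondary_reload does with this
patch applied.

/* Illustrative sketch of a target secondary_reload hook; types and
   CODE_FOR_* values come from the GCC internals, the predicate and the
   preprocessor switch are hypothetical.  */
static reg_class_t
sketch_secondary_reload (bool in_p, rtx x, reg_class_t rclass,
			 enum machine_mode mode,
			 secondary_reload_info *sri)
{
  /* Hypothetical predicate: "this move cannot be done by a plain move
     insn and needs help from reload".  */
  if (!needs_special_move_p (in_p, x, rclass, mode))
    return NO_REGS;

#ifdef USE_INTERMEDIATE_CLASS	/* hypothetical switch, illustration only */
  /* Option A: route the value through a register of another class;
     this is what returning SSE_REGS used to do before the patch.  */
  return SSE_REGS;
#else
  /* Option B: return NO_REGS, but hand reload the insn code of a
     special reload pattern.  The reload_noff_* expanders above then
     copy the non-offsettable address into the scratch operand and
     perform the move through the rewritten, offsettable MEM.  */
  sri->icode = in_p ? CODE_FOR_reload_noff_load
		    : CODE_FOR_reload_noff_store;
  sri->extra_cost = 1;		/* account for the extra address move */
  return NO_REGS;
#endif
}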