From patchwork Fri Aug 5 18:51:23 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Uros Bizjak X-Patchwork-Id: 108712 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) by ozlabs.org (Postfix) with SMTP id 80D01B6F69 for ; Sat, 6 Aug 2011 04:51:41 +1000 (EST) Received: (qmail 32586 invoked by alias); 5 Aug 2011 18:51:39 -0000 Received: (qmail 32572 invoked by uid 22791); 5 Aug 2011 18:51:38 -0000 X-SWARE-Spam-Status: No, hits=-2.3 required=5.0 tests=AWL, BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, TW_ZJ, T_TO_NO_BRKTS_FREEMAIL X-Spam-Check-By: sourceware.org Received: from mail-pz0-f49.google.com (HELO mail-pz0-f49.google.com) (209.85.210.49) by sourceware.org (qpsmtpd/0.43rc1) with ESMTP; Fri, 05 Aug 2011 18:51:24 +0000 Received: by pzk6 with SMTP id 6so4308822pzk.8 for ; Fri, 05 Aug 2011 11:51:23 -0700 (PDT) MIME-Version: 1.0 Received: by 10.142.48.10 with SMTP id v10mr2365674wfv.185.1312570283531; Fri, 05 Aug 2011 11:51:23 -0700 (PDT) Received: by 10.143.34.2 with HTTP; Fri, 5 Aug 2011 11:51:23 -0700 (PDT) Date: Fri, 5 Aug 2011 20:51:23 +0200 Message-ID: Subject: [RFC PATCH, i386]: Allow zero_extended addresses (+ problems with reload and offsetable address, "o" constraint) From: Uros Bizjak To: gcc-patches@gcc.gnu.org Cc: GCC Development Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Hello! Attached patch introduces generation of addr32 prefixed addresses, mainly intended to merge ZERO_EXTRACTed LEA calculations into address. After fixing various inconsistencies with "o" constraints, the patch works surprisingly well (in its current form fixes all reported problems in the PR [1]), but one problem remains w.r.t. handling of "o" constraint. Patched gcc ICEs on gcc.dg/torture/pr47744-2.c with: $ ~/gcc-build-fast/gcc/cc1 -O2 -mx32 -std=gnu99 -quiet pr47744-2.c pr47744-2.c: In function ‘matmul_i16’: pr47744-2.c:40:1: error: insn does not satisfy its constraints: (insn 116 66 67 4 (set (reg:TI 0 ax) (mem:TI (zero_extend:DI (plus:SI (reg:SI 4 si [orig:114 ivtmp.26 ] [114]) (reg:SI 5 di [orig:101 dest_y ] [101]))) [6 MEM[base: dest_y_18, index: ivtmp.26_53, offset: 0B]+0 S16 A128])) pr47744-2.c:34 60 {*movti_internal_rex64} (nil)) pr47744-2.c:40:1: internal compiler error: in reload_cse_simplify_operands, at postreload.c:403 Please submit a full bug report, ... ... due to the fact that the address is not offsetable, and plus ((zero_extend (...)) (const_int ...)) gets rejected from ix86_legitimate_address_p. However, the section "16.8.1 Simple Constraints" of the documentation claims: --quote-- * A nonoffsettable memory reference can be reloaded by copying the address into a register. So if the constraint uses the letter `o', all memory references are taken care of. --/quote-- As I read this sentence, the RTX is forced into a temporary register, and reload tries to satisfy "o" constraint with plus ((reg ...) (const_int ...)), as said at the introduction of "o" constraint a couple of pages earlier. Unfortunately, this does not seem to be the case. Is there anything wrong with my approach, or is there something wrong in reload? 2011-08-05 Uros Bizjak PR target/49781 * config/i386/i386.c (ix86_decompose_address): Allow zero-extended SImode addresses. (ix86_print_operand_address): Handle zero-extended addresses. (memory_address_length): Add length of addr32 prefix for zero-extended addresses. * config/i386/predicates.md (lea_address_operand): Reject zero-extended operands. Patch is otherwise bootstrapped and tested on x86_64-pc-linux-gnu {,-m32} without regressions. [1] http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49781 Thanks, Uros. Index: config/i386/predicates.md =================================================================== --- config/i386/predicates.md (revision 177456) +++ config/i386/predicates.md (working copy) @@ -801,6 +801,10 @@ struct ix86_address parts; int ok; + /* LEA handles zero-extend by itself. */ + if (GET_CODE (op) == ZERO_EXTEND) + return false; + ok = ix86_decompose_address (op, &parts); gcc_assert (ok); return parts.seg == SEG_DEFAULT; Index: config/i386/i386.c =================================================================== --- config/i386/i386.c (revision 177456) +++ config/i386/i386.c (working copy) @@ -11146,6 +11146,14 @@ ix86_decompose_address (rtx addr, struct ix86_addr int retval = 1; enum ix86_address_seg seg = SEG_DEFAULT; + /* Allow zero-extended SImode addresses, + they will be emitted with addr32 prefix. */ + if (TARGET_64BIT + && GET_CODE (addr) == ZERO_EXTEND + && GET_MODE (addr) == DImode + && GET_MODE (XEXP (addr, 0)) == SImode) + addr = XEXP (addr, 0); + if (REG_P (addr)) base = addr; else if (GET_CODE (addr) == SUBREG) @@ -14163,9 +14171,13 @@ ix86_print_operand_address (FILE *file, rtx addr) } else { - /* Print DImode registers on 64bit targets to avoid addr32 prefixes. */ - int code = TARGET_64BIT ? 'q' : 0; + int code = 0; + /* Print SImode registers for zero-extended addresses to force + addr32 prefix. Otherwise print DImode registers to avoid it. */ + if (TARGET_64BIT) + code = (GET_CODE (addr) == ZERO_EXTEND) ? 'l' : 'q'; + if (ASSEMBLER_DIALECT == ASM_ATT) { if (disp) @@ -21776,7 +21788,8 @@ assign_386_stack_local (enum machine_mode mode, en } /* Calculate the length of the memory address in the instruction - encoding. Does not include the one-byte modrm, opcode, or prefix. */ + encoding. Includes addr32 prefix, does not include the one-byte modrm, + opcode, or other prefixes. */ int memory_address_length (rtx addr) @@ -21803,8 +21816,10 @@ memory_address_length (rtx addr) base = parts.base; index = parts.index; disp = parts.disp; - len = 0; + /* Add length of addr32 prefix. */ + len = (GET_CODE (addr) == ZERO_EXTEND); + /* Rule of thumb: - esp as the base always wants an index, - ebp as the base always wants a displacement,