From patchwork Wed Apr 17 14:21:00 2013
X-Patchwork-Submitter: Jan Hubicka
X-Patchwork-Id: 237235
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Date: Wed, 17 Apr 2013 16:21:00 +0200
From: Jan Hubicka
To: Michael Zolotukhin
Cc: Jan Hubicka, gcc-patches@gcc.gnu.org
Subject: Re: [PATCH, x86] Use vector moves in memmove expanding
Message-ID: <20130417142100.GA10525@kam.mff.cuni.cz>

>
> Bootstrap/make check/Specs2k are passing on i686 and x86_64.

Thanks for returning to this!

glibc has quite a comprehensive testsuite for stringops.  It may be useful to
test it with -minline-all-stringops -mstringop-strategy=vector_loop

I tested the patch on my Core notebook with my memcpy micro benchmark.
The vector loop is not a win, since apparently we do not produce any SSE code
for 64bit compilation.  What CPUs and block sizes is this intended for?

Also the internal loop with -march=native seems to come out as:

.L7:
        movq    (%rsi,%r8), %rax
        movq    8(%rsi,%r8), %rdx
        movq    48(%rsi,%r8), %r9
        movq    56(%rsi,%r8), %r10
        movdqu  16(%rsi,%r8), %xmm3
        movdqu  32(%rsi,%r8), %xmm1
        movq    %rax, (%rdi,%r8)
        movq    %rdx, 8(%rdi,%r8)
        movdqa  %xmm3, 16(%rdi,%r8)
        movdqa  %xmm1, 32(%rdi,%r8)
        movq    %r9, 48(%rdi,%r8)
        movq    %r10, 56(%rdi,%r8)
        addq    $64, %r8
        cmpq    %r11, %r8

That is not much of an SSE enablement: only 32 of the 64 bytes per iteration
go through %xmm registers, since RA seems to home the remaining vars in
integer regs.  Could you please look into it?

>
> Changelog entry:
>
> 2013-04-10  Michael Zolotukhin
>
>         * config/i386/i386-opts.h (enum stringop_alg): Add vector_loop.
>         * config/i386/i386.c (expand_set_or_movmem_via_loop): Use
>         adjust_address instead of change_address to keep info about alignment.
>         (emit_strmov): Remove.
>         (emit_memmov): New function.
>         (expand_movmem_epilogue): Refactor to properly handle bigger sizes.
>         (expand_movmem_epilogue): Likewise and return updated rtx for
>         destination.
>         (expand_constant_movmem_prologue): Likewise and return updated rtx for
>         destination and source.
>         (decide_alignment): Refactor, handle vector_loop.
>         (ix86_expand_movmem): Likewise.
>         (ix86_expand_setmem): Likewise.
>         * config/i386/i386.opt (Enum): Add vector_loop to option stringop_alg.
>         * emit-rtl.c (get_mem_align_offset): Compute alignment for MEM_REF.

+    }
   else
     return -1;

This change ought to go in independently.  I cannot review it.

I will take a first look over the patch shortly, but please send an updated
patch fixing the problem with integer regs.

Honza

diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c
index 73a59b5..edb59da 100644
--- a/gcc/emit-rtl.c
+++ b/gcc/emit-rtl.c
@@ -1565,6 +1565,18 @@ get_mem_align_offset (rtx mem, unsigned int align)
           expr = inner;
         }
     }
+  else if (TREE_CODE (expr) == MEM_REF)
+    {
+      tree base = TREE_OPERAND (expr, 0);
+      tree byte_offset = TREE_OPERAND (expr, 1);
+      if (TREE_CODE (base) != ADDR_EXPR
+          || TREE_CODE (byte_offset) != INTEGER_CST)
+        return -1;
+      if (!DECL_P (TREE_OPERAND (base, 0))
+          || DECL_ALIGN (TREE_OPERAND (base, 0)) < align)

Can you use TYPE_ALIGN here?  In general, can't we replace all the GIMPLE
handling by get_object_alignment?

+        return -1;
+      offset += tree_low_cst (byte_offset, 1);