From patchwork Thu Jul 9 14:01:24 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Simon Dardis
X-Patchwork-Id: 493441
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: Simon Dardis
To: "gcc-patches@gcc.gnu.org"
Subject: [PATCH] Mips: Inline memcpy for R6
Date: Thu, 9 Jul 2015 14:01:24 +0000

Hello,

This patch enables inline memcpy for R6, which was previously disabled, and
adds support for expansion when source and destination are at least
half-word aligned.

gcc/
        * config/mips/mips.c (mips_expand_block_move): Enable inline memcpy
        expansion when !ISA_HAS_LWL_LWR.
        (mips_block_move_straight): Update the size of elements copied to
        account for alignment when !ISA_HAS_LWL_LWR.
        * config/mips/mips.h (MIPS_MIN_MOVE_MEM_ALIGN): New macro.

gcc/testsuite/
        * inline-memcpy-1.c: Test for inline expansion of memcpy.
        * inline-memcpy-2.c: Ditto.
        * inline-memcpy-3.c: Ditto.
        * inline-memcpy-4.c: Ditto.
        * inline-memcpy-5.c: Ditto.

Thanks,
Simon

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 6f5421a..1f7c105 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -8187,12 +8187,22 @@ mips_block_move_straight (rtx dest, rtx src, HOST_WIDE_INT length)
      half-word alignment, it is usually better to move in half words.
      For instance, lh/lh/sh/sh is usually better than lwl/lwr/swl/swr
      and lw/lw/sw/sw is usually better than ldl/ldr/sdl/sdr.
-     Otherwise move word-sized chunks.  */
-  if (MEM_ALIGN (src) == BITS_PER_WORD / 2
-      && MEM_ALIGN (dest) == BITS_PER_WORD / 2)
-    bits = BITS_PER_WORD / 2;
+     Otherwise move word-sized chunks.
+
+     For ISA_HAS_LWL_LWR we rely on the lwl/lwr & swl/swr load. Otherwise
+     picking the minimum of alignment or BITS_PER_WORD gets us the
+     desired size for bits.  */
+
+  if (!ISA_HAS_LWL_LWR)
+    bits = MIN (BITS_PER_WORD, MIN (MEM_ALIGN (src), MEM_ALIGN (dest)));
   else
-    bits = BITS_PER_WORD;
+    {
+      if (MEM_ALIGN (src) == BITS_PER_WORD / 2
+          && MEM_ALIGN (dest) == BITS_PER_WORD / 2)
+        bits = BITS_PER_WORD / 2;
+      else
+        bits = BITS_PER_WORD;
+    }
 
   mode = mode_for_size (bits, MODE_INT, 0);
   delta = bits / BITS_PER_UNIT;
@@ -8311,8 +8321,8 @@ bool
 mips_expand_block_move (rtx dest, rtx src, rtx length)
 {
   if (!ISA_HAS_LWL_LWR
-      && (MEM_ALIGN (src) < BITS_PER_WORD
-          || MEM_ALIGN (dest) < BITS_PER_WORD))
+      && (MEM_ALIGN (src) < MIPS_MIN_MOVE_MEM_ALIGN
+          || MEM_ALIGN (dest) < MIPS_MIN_MOVE_MEM_ALIGN))
     return false;
 
   if (CONST_INT_P (length))
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index a2380e5..6578ae5 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -3041,6 +3041,9 @@ while (0)
 #undef PTRDIFF_TYPE
 #define PTRDIFF_TYPE (POINTER_SIZE == 64 ? "long int" : "int")
 
+/* The minimum alignment of any expanded block move.  */
+#define MIPS_MIN_MOVE_MEM_ALIGN 16
+
 /* The maximum number of bytes that can be copied by one iteration of
    a movmemsi loop; see mips_block_move_loop.  */
 #define MIPS_MAX_MOVE_BYTES_PER_LOOP_ITER \
diff --git a/gcc/testsuite/gcc.target/mips/inline-memcpy-1.c b/gcc/testsuite/gcc.target/mips/inline-memcpy-1.c
new file mode 100644
index 0000000..5a254b1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/inline-memcpy-1.c
@@ -0,0 +1,16 @@
+/* { dg-options "-fno-common isa_rev>=6" } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" "-Os" } { "" } } */
+/* { dg-final { scan-assembler-not "\tmemcpy" } } */
+
+/* Test that memcpy is inline for target hardware
+   without swl, swr.  */
+
+#include <string.h>
+
+char c[40] __attribute__ ((aligned(8)));
+
+void
+f1 ()
+{
+  memcpy (c, "1234567890QWERTYUIOPASDFGHJKLZXCVBNM", 32);
+}
diff --git a/gcc/testsuite/gcc.target/mips/inline-memcpy-2.c b/gcc/testsuite/gcc.target/mips/inline-memcpy-2.c
new file mode 100644
index 0000000..c06be15
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/inline-memcpy-2.c
@@ -0,0 +1,17 @@
+/* { dg-options "-fno-common isa_rev>=6" } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
+/* { dg-final { scan-assembler-not "\tmemcpy" } } */
+/* { dg-final { scan-assembler-times "\tsh\t" 16 } } */
+
+/* Test that inline memcpy is expanded for target hardware without
+   swl, swr when alignment is halfword and sufficient shs are produced.  */
+
+#include <string.h>
+
+char c[40] __attribute__ ((aligned(2)));
+
+void
+f1 ()
+{
+  memcpy (c, "1234567890QWERTYUIOPASDFGHJKLZXCVBNM", 32);
+}
diff --git a/gcc/testsuite/gcc.target/mips/inline-memcpy-3.c b/gcc/testsuite/gcc.target/mips/inline-memcpy-3.c
new file mode 100644
index 0000000..96a0387
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/inline-memcpy-3.c
@@ -0,0 +1,18 @@
+/* { dg-options "-fno-common isa_rev<=5" } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" "-Os"} { "" } } */
+/* { dg-final { scan-assembler-not "\tmemcpy" } } */
+/* { dg-final { scan-assembler-times "swl" 8 } } */
+/* { dg-final { scan-assembler-times "swr" 8 } } */
+
+/* Test that inline memcpy for hardware with swl, swr handles subword
+   alignment and produces enough swl/swrs for mips32.  */
+
+#include <string.h>
+
+char c[40] __attribute__ ((aligned(2)));
+
+void
+f1 ()
+{
+  memcpy (c, "1234567890QWERTYUIOPASDFGHJKLZXCVBNM", 32);
+}
diff --git a/gcc/testsuite/gcc.target/mips/inline-memcpy-4.c b/gcc/testsuite/gcc.target/mips/inline-memcpy-4.c
new file mode 100644
index 0000000..0e7a22e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/inline-memcpy-4.c
@@ -0,0 +1,18 @@
+/* { dg-options "-fno-common isa_rev<=5 -mabi=64" } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" "-Os"} { "" } } */
+/* { dg-final { scan-assembler-not "\tmemcpy" } } */
+/* { dg-final { scan-assembler-times "sdl" 4 } } */
+/* { dg-final { scan-assembler-times "sdr" 4 } } */
+
+/* Test that inline memcpy for hardware with sdl, sdr handles subword
+   alignment and produces enough sdl/sdrs on n64.  */
+
+#include <string.h>
+
+char c[40] __attribute__ ((aligned(2)));
+
+void
+f1 ()
+{
+  memcpy (c, "1234567890QWERTYUIOPASDFGHJKLZXCVBNM", 32);
+}
diff --git a/gcc/testsuite/gcc.target/mips/inline-memcpy-5.c b/gcc/testsuite/gcc.target/mips/inline-memcpy-5.c
new file mode 100644
index 0000000..1b9fa16
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/inline-memcpy-5.c
@@ -0,0 +1,18 @@
+/* { dg-options "-fno-common isa_rev<=5 -mabi=n32" } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" "-Os"} { "" } } */
+/* { dg-final { scan-assembler-not "\tmemcpy" } } */
+/* { dg-final { scan-assembler-times "sdl" 4 } } */
+/* { dg-final { scan-assembler-times "sdr" 4 } } */
+
+/* Test that inline memcpy for hardware with sdl, sdr handles subword
+   alignment and produces enough sdr/sdls on n32.  */
+
+#include <string.h>
+
+char c[40] __attribute__ ((aligned(2)));
+
+void
+f1 ()
+{
+  memcpy (c, "1234567890QWERTYUIOPASDFGHJKLZXCVBNM", 32);
+}
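
For reviewers who want to check the chunk-size selection outside of GCC, the
logic in the two mips.c hunks can be modelled roughly as below. This is only
a sketch: the function names, the has_lwl_lwr flag, and passing alignments
and word size as plain bit counts are stand-ins invented for illustration,
not GCC interfaces, and the length-based checks that
mips_expand_block_move also performs are not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for MIPS_MIN_MOVE_MEM_ALIGN from the patch (bits). */
enum { MIN_MOVE_MEM_ALIGN_BITS = 16 };

static unsigned
min_u (unsigned a, unsigned b)
{
  return a < b ? a : b;
}

/* Alignment gate only: without lwl/lwr, both operands must be at least
   half-word (16-bit) aligned for the move to be expanded inline.  */
static bool
can_expand_inline (bool has_lwl_lwr, unsigned src_align, unsigned dest_align)
{
  if (!has_lwl_lwr
      && (src_align < MIN_MOVE_MEM_ALIGN_BITS
          || dest_align < MIN_MOVE_MEM_ALIGN_BITS))
    return false;
  return true;
}

/* Number of bits moved per load/store pair.  All arguments are in bits.  */
static unsigned
chunk_bits (bool has_lwl_lwr, unsigned word_bits,
            unsigned src_align, unsigned dest_align)
{
  assert (word_bits == 32 || word_bits == 64);

  /* Without lwl/lwr, never move more than the weaker alignment allows.  */
  if (!has_lwl_lwr)
    return min_u (word_bits, min_u (src_align, dest_align));

  /* With lwl/lwr, use half words only when both sides are exactly
     half-word aligned; otherwise move whole words.  */
  if (src_align == word_bits / 2 && dest_align == word_bits / 2)
    return word_bits / 2;
  return word_bits;
}
```

With a 32-bit word and a 16-bit aligned destination, chunk_bits (false, 32,
src_align, 16) is 16 for any source alignment of at least 16 bits, so a
32-byte copy takes 32 / 2 = 16 half-word stores, which is the count
inline-memcpy-2.c scans for.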