From patchwork Fri Jan 22 09:52:46 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
X-Patchwork-Id: 571597
Return-Path: 
 <gcc-patches-return-419787-incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 5C0DB14031D
	for <incoming@patchwork.ozlabs.org>;
	Fri, 22 Jan 2016 20:53:28 +1100 (AEDT)
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key;
	unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org
	header.b=d+TApylJ; dkim-atps=neutral
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:message-id:date:from:mime-version:to:cc:subject:content-type;
	q=dns; s=default; b=fSCRvcen9YpcQ7mmAJKEdKLzim5jlSIO/xaoHjW1ANN
	9kHAQazGg2hfyjEAJyOXz04q9beQDaklz7giQ5xmeoZ/RwoAdFo4gtggytGY6hWT
	FuT89TW8EaWwlfC0WnNLBvsiuTv2rKdQE4Rw7elFjmUAS8SwkkWNTK3JRrYKgKGE
	=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
	:list-unsubscribe:list-archive:list-post:list-help:sender
	:message-id:date:from:mime-version:to:cc:subject:content-type;
	s=default; bh=jGjC7AkqtK3xFYWYlWmajoJsc2Y=; b=d+TApylJlzFiRo91m
	ZHtsyTNwTOogDI83OBdcw5OivvovCeI8hH5Fqbw4Qtj5LifM/1AEIP6MqrCIgDVw
	/721/yyrjAW2EwrC6Fjb5A4UvnyJv9gcDTQet2Vyti0AMiweWUKn4HFuAR9CIhLr
	g7Z/n7Ut1hAzM4/WW26FS9xfjs=
Received: (qmail 54542 invoked by alias); 22 Jan 2016 09:53:00 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id: <gcc-patches.gcc.gnu.org>
List-Unsubscribe: 
 <mailto:gcc-patches-unsubscribe-incoming=patchwork.ozlabs.org@gcc.gnu.org>
List-Archive: <http://gcc.gnu.org/ml/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-help@gcc.gnu.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 54366 invoked by uid 89); 22 Jan 2016 09:52:59 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=0.7 required=5.0 tests=BAYES_05,
	KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH,
	RP_MATCHES_RCVD autolearn=no version=3.3.2 spammy=preceding,
	wmul, wmul1c, UD:wmul-3.c
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by
	sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
	Fri, 22 Jan 2016 09:52:50 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])	by
	usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 91ADE49;
	Fri, 22 Jan 2016 01:52:09 -0800 (PST)
Received: from [10.2.206.200] (e100706-lin.cambridge.arm.com
	[10.2.206.200])	by usa-sjc-imap-foss1.foss.arm.com (Postfix)
	with ESMTPSA id F18673F529; Fri, 22 Jan 2016 01:52:47 -0800 (PST)
Message-ID: <56A1FBEE.5020905@foss.arm.com>
Date: Fri, 22 Jan 2016 09:52:46 +0000
From: Kyrill Tkachov <kyrylo.tkachov@foss.arm.com>
User-Agent: Mozilla/5.0 (X11; Linux x86_64;
	rv:31.0) Gecko/20100101 Thunderbird/31.2.0
MIME-Version: 1.0
To: GCC Patches <gcc-patches@gcc.gnu.org>
CC: Ramana Radhakrishnan <ramana.radhakrishnan@arm.com>,
	Richard Earnshaw <Richard.Earnshaw@arm.com>,
	Jim Wilson <jim.wilson@linaro.org>
Subject: [PATCH][ARM][4/4] Adjust gcc.target/arm/wmul-[123].c tests

Hi all,

In this final patch I adjust the troublesome gcc.target/arm/wmul-[123].c tests
to make them more helpful.
gcc.target/arm/wmul-[12].c may now generate either sign-extending multiplies
(+accumulate) or normal 32-bit multiplies since the arguments to the multiplies
are already sign-extended by preceding loads.
So for these tests the patch adds -mtune=cortex-a9, a tuning for which we know the
sign-extending form to be beneficial. This is, of course, reflected in the rtx costs
that guide the RTL optimisers (after the fixes in patches 2 and 3).
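
To illustrate why both forms are valid here, below is a rough C sketch (not part of
the patch). The helper names smulbb_model and mul_model are made up for this
illustration and only approximate the instruction semantics; the narrowing casts
assume GCC's wrap-around behaviour for out-of-range conversions. When both registers
already hold sign-extended 16-bit values, the two models agree:

```c
#include <stdint.h>

/* Hypothetical model of SMULBB: multiply the sign-extended bottom
   halfwords of the two source registers, giving a 32-bit result.  */
static int32_t smulbb_model (uint32_t rn, uint32_t rm)
{
  int16_t lo_n = (int16_t) (rn & 0xffff);
  int16_t lo_m = (int16_t) (rm & 0xffff);
  return (int32_t) lo_n * (int32_t) lo_m;
}

/* Hypothetical model of a plain 32-bit MUL: the low 32 bits of the
   product of the full registers.  */
static int32_t mul_model (uint32_t rn, uint32_t rm)
{
  return (int32_t) (rn * rm);
}

/* Model of what a sign-extending halfword load (ldrsh) leaves in a
   register: the 16-bit value sign-extended to 32 bits.  */
static uint32_t ldrsh_model (int16_t x)
{
  return (uint32_t) (int32_t) x;
}
```

The two models only differ when the upper halves of the registers are not a sign
extension of the lower halves, which is why the choice between them becomes purely
a matter of cost once the loads have done the extension.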

For wmul-3.c we now generate objectively better code.
For the loop we previously generated:
.L2:
     ldrh    r1, [lr, #2]!
     ldrh    ip, [r0, #2]!
     smulbb    ip, r1, ip
     sub    r4, r4, ip
     smulbb    r1, r1, r1
     sub    r2, r2, r1
     cmp    lr, r5
     bne    .L2

and now we generate:
.L2:
     ldrsh    r1, [ip, #2]!
     ldrsh    r4, [r0, #2]!
     mls    lr, r1, r4, lr
     mls    r2, r1, r1, r2
     cmp    ip, r5
     bne    .L2

AFAICT the new sequence is better than the old one even for -mtune=cortex-a9 since it
contains two fewer instructions.
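
As a rough sketch of why the two sequences compute the same thing (again not part
of the patch; mls_model, step_mul_sub and step_mls are names invented for this
illustration), MLS Rd, Rn, Rm, Ra computes Ra - Rn * Rm, so each multiply/subtract
pair of the old loop body collapses into a single instruction:

```c
#include <stdint.h>

/* Hypothetical model of MLS Rd, Rn, Rm, Ra: Rd = Ra - Rn * Rm,
   keeping the low 32 bits.  Unsigned arithmetic avoids signed
   overflow in the model itself.  */
static int32_t mls_model (int32_t rn, int32_t rm, int32_t ra)
{
  return (int32_t) ((uint32_t) ra - (uint32_t) rn * (uint32_t) rm);
}

/* One iteration of the original wmul-3.c loop body, written as
   separate multiply and subtract steps (the old sequence).  */
static void step_mul_sub (int16_t a, int16_t b, int32_t *dotp, int32_t *sqr)
{
  int32_t prod_ab = (int32_t) b * a;   /* smulbb */
  *dotp = *dotp - prod_ab;             /* sub */
  int32_t prod_bb = (int32_t) b * b;   /* smulbb */
  *sqr = *sqr - prod_bb;               /* sub */
}

/* The same iteration using the multiply-subtract model (the new
   sequence); a and b are already sign-extended by the loads.  */
static void step_mls (int16_t a, int16_t b, int32_t *dotp, int32_t *sqr)
{
  *dotp = mls_model (b, a, *dotp);     /* mls */
  *sqr = mls_model (b, b, *sqr);       /* mls */
}
```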

So this test is no longer a reliable way of getting smulbb instructions generated.
The proposed change in this patch is to greatly simplify it by replacing the loop with
a one-liner simple enough that we can always expect it to be compiled into a single
smulbb instruction.

Tested on arm-none-eabi.
Ok for trunk?

Thanks,
Kyrill

2016-01-22  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

     * gcc.target/arm/wmul-3.c: Simplify test to generate just
     a single smulbb instruction.
     * gcc.target/arm/wmul-1.c: Add -mtune=cortex-a9 to dg-options.
     * gcc.target/arm/wmul-2.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/arm/wmul-1.c b/gcc/testsuite/gcc.target/arm/wmul-1.c
index ddddd509fe645ea98877753773e7bcf9b6787897..c340f960fa444642fe18ae3bcac93d78fe9dc851 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-1.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-1.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_dsp } */
-/* { dg-options "-O1 -fexpensive-optimizations" } */
+/* { dg-options "-O1 -fexpensive-optimizations -mtune=cortex-a9" } */
 
 int mac(const short *a, const short *b, int sqr, int *sum)
 {
diff --git a/gcc/testsuite/gcc.target/arm/wmul-2.c b/gcc/testsuite/gcc.target/arm/wmul-2.c
index 2ea55f9fbe12f74f38754cb72be791fd6e9495f4..bd2435c9113a82d2e102b545b3141cbda9ba326d 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-2.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-2.c
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_dsp } */
-/* { dg-options "-O1 -fexpensive-optimizations" } */
+/* { dg-options "-O1 -fexpensive-optimizations -mtune=cortex-a9" } */
 
 void vec_mpy(int y[], const short x[], short scaler)
 {
diff --git a/gcc/testsuite/gcc.target/arm/wmul-3.c b/gcc/testsuite/gcc.target/arm/wmul-3.c
index 144b553082e6158701639f05929987de01e7125a..87eba740142a80a1dc1979b4e79d9272a839e7b2 100644
--- a/gcc/testsuite/gcc.target/arm/wmul-3.c
+++ b/gcc/testsuite/gcc.target/arm/wmul-3.c
@@ -1,19 +1,11 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target arm_dsp } */
-/* { dg-options "-O1 -fexpensive-optimizations" } */
+/* { dg-options "-O" } */
 
-int mac(const short *a, const short *b, int sqr, int *sum)
+int
+foo (int a, int b)
 {
-  int i;
-  int dotp = *sum;
-
-  for (i = 0; i < 150; i++) {
-    dotp -= b[i] * a[i];
-    sqr -= b[i] * b[i];
-  }
-
-  *sum = dotp;
-  return sqr;
+  return (short) a * (short) b;
 }
 
-/* { dg-final { scan-assembler-times "smulbb" 2 } } */
+/* { dg-final { scan-assembler-times "smulbb" 1 } } */