From patchwork Thu Jun 10 00:13:37 2010
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Fang, Changpeng"
X-Patchwork-Id: 55133
From: "Fang, Changpeng"
To: Zdenek Dvorak
CC: "gcc-patches@gcc.gnu.org", "sebpop@gmail.com"
Date: Wed, 9 Jun 2010 19:13:37 -0500
Subject: RE: [patch] PR44185 Fix new prefetch test failures - second
In-Reply-To: <20100609071435.GA13815@kam.mff.cuni.cz>
MIME-Version: 1.0

Hi,

>> Attached is the patch to fix PR 44185: new prefetch failures. After the previous fix
>> of gcc.dg/tree-ssa/prefetch-c, new failure occurs because the number of non-temporal
>> stores is different in the assembler and .optimized files for different architectures. The
>> reason is that the unroll_factor is different (and this is why in the original test case the unroll
>> factor is limited to 1).
>>
>> In this patch, we don't count the exact number of non-temporal stores in the assembler and
>> .optimized files. Instead, we just scan for the existence.
>>
>> Is it OK for the trunk?

> yes, overall; but, with this change, we would not recognize if some change
> breaks the optimization for just one of the loops. I guess the test needs to
> be split to several testcases, one for each loop,

Attached is a patch that splits the tests as follows:

prefetch-7.c: the loops that generate non-temporal stores are split out.
As a result, no non-temporal stores should now be generated by any loop
in prefetch-7.c.

prefetch-8.c: contains a loop that should generate one non-temporal store
(before unrolling).

prefetch-9.c: contains a loop that should generate one non-temporal store
and one non-temporal prefetch.

Is this OK?

Thanks,

Changpeng

From 5545109a5d8da47e9f71658a282623fb9e4e8a52 Mon Sep 17 00:00:00 2001
From: Changpeng Fang
Date: Wed, 9 Jun 2010 16:54:21 -0700
Subject: [PATCH] pr44185: fix prefetch test failures

	* gcc.dg/tree-ssa/prefetch-7.c: Move the loops that generate
	non-temporal stores out of this test into new test cases.  As a
	result, no non-temporal store should be generated in this test.

	* gcc.dg/tree-ssa/prefetch-8.c: New.  Test from the original
	prefetch-7.c that generates one non-temporal store.

	* gcc.dg/tree-ssa/prefetch-9.c: New.  Test from the original
	prefetch-7.c that generates one non-temporal store and one
	non-temporal prefetch.
---
 gcc/testsuite/gcc.dg/tree-ssa/prefetch-7.c |   22 ++++--------------
 gcc/testsuite/gcc.dg/tree-ssa/prefetch-8.c |   28 ++++++++++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/prefetch-9.c |   32 ++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/prefetch-8.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/prefetch-9.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/prefetch-7.c b/gcc/testsuite/gcc.dg/tree-ssa/prefetch-7.c
index 3b9e19f..9e453a7 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/prefetch-7.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/prefetch-7.c
@@ -5,20 +5,12 @@
 /* { dg-options "-O2 -fprefetch-loop-arrays -march=athlon -msse2 -mfpmath=sse --param simultaneous-prefetches=100 -fdump-tree-aprefetch-details -fdump-tree-optimized" } */

 #define K 1000000
-int a[K], b[K];
+int a[K];

 void test(int *p)
 {
   unsigned i;

-  /* Nontemporal store should be used for a.  */
-  for (i = 0; i < K; i++)
-    a[i] = 0;
-
-  /* Nontemporal store should be used for a, nontemporal prefetch for b.  */
-  for (i = 0; i < K; i++)
-    a[i] = b[i];
-
   /* Nontemporal store should not be used here (only write and read temporal
      prefetches).  */
   for (i = 0; i < K - 10000; i++)
@@ -44,18 +36,14 @@ void test(int *p)
 }

 /* { dg-final { scan-tree-dump-times "Issued prefetch" 5 "aprefetch" } } */
-/* { dg-final { scan-tree-dump-times "Issued nontemporal prefetch" 3 "aprefetch" } } */
-/* { dg-final { scan-tree-dump-times "a nontemporal store" 2 "aprefetch" } } */
+/* { dg-final { scan-tree-dump-times "Issued nontemporal prefetch" 2 "aprefetch" } } */
+/* { dg-final { scan-tree-dump-times "a nontemporal store" 0 "aprefetch" } } */

-/* { dg-final { scan-tree-dump-times "builtin_prefetch" 8 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "=\\{nt\\}" 18 "optimized" } } */
-/* { dg-final { scan-tree-dump-times "__builtin_ia32_mfence" 2 "optimized" } } */
+/* { dg-final { scan-tree-dump-times "builtin_prefetch" 7 "optimized" } } */

 /* { dg-final { scan-assembler-times "prefetchw" 5 } } */
 /* { dg-final { scan-assembler-times "prefetcht" 1 } } */
-/* { dg-final { scan-assembler-times "prefetchnta" 2 } } */
-/* { dg-final { scan-assembler-times "movnti" 18 } } */
-/* { dg-final { scan-assembler-times "mfence" 2 } } */
+/* { dg-final { scan-assembler-times "prefetchnta" 1 } } */

 /* { dg-final { cleanup-tree-dump "aprefetch" } } */
 /* { dg-final { cleanup-tree-dump "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/prefetch-8.c b/gcc/testsuite/gcc.dg/tree-ssa/prefetch-8.c
new file mode 100644
index 0000000..a05d552
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/prefetch-8.c
@@ -0,0 +1,28 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-require-effective-target sse2 } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=athlon" } } */
+/* { dg-options "-O2 -fprefetch-loop-arrays -march=athlon -msse2 -mfpmath=sse --param simultaneous-prefetches=100 -fdump-tree-aprefetch-details -fdump-tree-optimized" } */
+
+#define K 1000000
+int a[K];
+
+void test()
+{
+  unsigned i;
+
+  /* Nontemporal store should be used for a.  */
+  for (i = 0; i < K; i++)
+    a[i] = 0;
+}
+
+/* { dg-final { scan-tree-dump-times "a nontemporal store" 1 "aprefetch" } } */
+
+/* { dg-final { scan-tree-dump "=\\{nt\\}" "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_ia32_mfence" 1 "optimized" } } */
+
+/* { dg-final { scan-assembler "movnti" } } */
+/* { dg-final { scan-assembler-times "mfence" 1 } } */
+
+/* { dg-final { cleanup-tree-dump "aprefetch" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/prefetch-9.c b/gcc/testsuite/gcc.dg/tree-ssa/prefetch-9.c
new file mode 100644
index 0000000..eb22a66
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/prefetch-9.c
@@ -0,0 +1,32 @@
+/* { dg-do compile { target i?86-*-* x86_64-*-* } } */
+/* { dg-require-effective-target ilp32 } */
+/* { dg-require-effective-target sse2 } */
+/* { dg-skip-if "" { i?86-*-* x86_64-*-* } { "-march=*" } { "-march=athlon" } } */
+/* { dg-options "-O2 -fprefetch-loop-arrays -march=athlon -msse2 -mfpmath=sse --param simultaneous-prefetches=100 -fdump-tree-aprefetch-details -fdump-tree-optimized" } */
+
+#define K 1000000
+int a[K], b[K];
+
+void test()
+{
+  unsigned i;
+
+  /* Nontemporal store should be used for a, nontemporal prefetch for b.  */
+  for (i = 0; i < K; i++)
+    a[i] = b[i];
+
+}
+
+/* { dg-final { scan-tree-dump-times "Issued nontemporal prefetch" 1 "aprefetch" } } */
+/* { dg-final { scan-tree-dump-times "a nontemporal store" 1 "aprefetch" } } */
+
+/* { dg-final { scan-tree-dump-times "builtin_prefetch" 1 "optimized" } } */
+/* { dg-final { scan-tree-dump "=\\{nt\\}" "optimized" } } */
+/* { dg-final { scan-tree-dump-times "__builtin_ia32_mfence" 1 "optimized" } } */
+
+/* { dg-final { scan-assembler-times "prefetchnta" 1 } } */
+/* { dg-final { scan-assembler "movnti" } } */
+/* { dg-final { scan-assembler-times "mfence" 1 } } */
+
+/* { dg-final { cleanup-tree-dump "aprefetch" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */
-- 
1.6.3.3
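
For readers unfamiliar with the transformation these tests check, here is a hand-written sketch of roughly what a loop such as `a[i] = 0` amounts to once non-temporal stores are used. It is an illustration only and not part of the patch: `zero_nontemporal` is a hypothetical helper, and the SSE2 intrinsics stand in for the `movnti` and `mfence` instructions that the tests scan for. On targets without SSE2 the sketch falls back to ordinary stores.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32, _mm_mfence */
#endif

/* Hypothetical helper, not part of the patch: zero an int array,
   using non-temporal stores where SSE2 is available.  */
void
zero_nontemporal (int *dst, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
#if defined(__SSE2__)
      /* Non-temporal 32-bit store; normally emitted as movnti.  */
      _mm_stream_si32 (&dst[i], 0);
#else
      /* Plain store fallback for targets without SSE2.  */
      dst[i] = 0;
#endif
    }

#if defined(__SSE2__)
  /* A single fence after the loop orders the streaming stores.  */
  _mm_mfence ();
#endif
}
```

If the compiler unrolls such a loop, each copy of the body carries its own store, so the number of `movnti` instructions in the assembler output follows the unroll factor, while the fence after the loop still appears once. This matches the choice in the new tests: `scan-assembler` (existence only) for `movnti`, and `scan-assembler-times ... 1` for `mfence`.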