From patchwork Wed May 15 17:45:55 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Uros Bizjak X-Patchwork-Id: 244149 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified)) by ozlabs.org (Postfix) with ESMTPS id D179D2C00A9 for ; Thu, 16 May 2013 03:46:04 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:date:message-id:subject:from:to:cc:content-type; q=dns; s=default; b=V1fqYqz5I2xFaw9bYCn8rLkMp9Zb+Bgr2ErE+Apoorx 6eSjET+Yc+3aEGZzDOzE0R2Q2p14IqgJSbbYVN15XWTiRjPny+MUXYvMa/F6u3Fs mIlemgqeilA3TdOMTHEbTeGPIN9H3oK28t5tZQiVKjBeOcUd+5M1Lb8k5KF78tZE = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:date:message-id:subject:from:to:cc:content-type; s=default; bh=SdCpq3hjMxTf4qmUSnlV30k7Sak=; b=RawXK5agKb1zoRfVO J1IE1MepgUcpsrt0EUGBCx2dUg+6pcCUfpMUS6uI18QRFlQntI8U8en017wqjczg YJ0JCpLh1SLB195mO8rIaNtUE4PMSPlI1NlorIUbpYB30DI5mgD4yGwTCGGuutHk rDmdytfAB0NsozKCv+FEOrXJZE= Received: (qmail 19141 invoked by alias); 15 May 2013 17:45:58 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 19132 invoked by uid 89); 15 May 2013 17:45:58 -0000 X-Spam-SWARE-Status: No, score=-2.4 required=5.0 tests=AWL, BAYES_00, FREEMAIL_FROM, RCVD_IN_DNSWL_LOW, RCVD_IN_HOSTKARMA_YE, SPF_PASS, TW_LZ, TW_ZC, TW_ZJ, UPPERCASE_50_75 autolearn=no version=3.3.1 Received: from mail-da0-f48.google.com (HELO mail-da0-f48.google.com) (209.85.210.48) by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP; Wed, 15 May 2013 17:45:57 +0000 Received: by mail-da0-f48.google.com with SMTP id h32so1116220dak.35 for ; Wed, 15 May 2013 10:45:56 -0700 (PDT) MIME-Version: 1.0 X-Received: by 10.66.155.138 with SMTP id vw10mr15864552pab.91.1368639955943; Wed, 15 May 2013 10:45:55 -0700 (PDT) Received: by 10.70.10.34 with HTTP; Wed, 15 May 2013 10:45:55 -0700 (PDT) Date: Wed, 15 May 2013 19:45:55 +0200 Message-ID: Subject: [PATCH, i386]: Update processor_alias_table for missing PTA_PRFCHW and PTA_FXSR flags From: Uros Bizjak To: "gcc-patches@gcc.gnu.org" Cc: "Gopalasubramanian, Ganesh" X-Virus-Found: No Hello! Attached patch adds missing PTA_PRFCHW and PTA_FXSR flags to x86 processor alias table. PRFCHW CPUID flag is shared with 3dnow prefetch flag, so some additional logic is needed to avoid generating SSE prefetches for non-SSE 3dNow! targets, while still generating full set of 3dnow prefetches on 3dNow! targets. 2013-05-15 Uros Bizjak * config/i386/i386.c (iy86_option_override_internal): Update processor_alias_table for missing PTA_PRFCHW and PTA_FXSR flags. Add PTA_POPCNT to corei7 entry and remove PTA_SSE from athlon-4 entry. Do not enable SSE prefetch on non-SSE 3dNow! targets. Enable TARGET_PRFCHW for TARGET_3DNOW targets. * config/i386/i386.md (prefetch): Enable for TARGET_PRFCHW instead of TARGET_3DNOW. (*prefetch_3dnow): Enable for TARGET_PRFCHW only. Patch was bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32} and was committed to mainline SVN. The patch will be backported to 4.8 branch in a couple of days. Uros. Index: config/i386/i386.c =================================================================== --- config/i386/i386.c (revision 198933) +++ config/i386/i386.c (working copy) @@ -2892,9 +2892,10 @@ ix86_option_override_internal (bool main_args_p) {"pentium", PROCESSOR_PENTIUM, CPU_PENTIUM, 0}, {"pentium-mmx", PROCESSOR_PENTIUM, CPU_PENTIUM, PTA_MMX}, {"winchip-c6", PROCESSOR_I486, CPU_NONE, PTA_MMX}, - {"winchip2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW}, - {"c3", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW}, - {"c3-2", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, PTA_MMX | PTA_SSE}, + {"winchip2", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW | PTA_PRFCHW}, + {"c3", PROCESSOR_I486, CPU_NONE, PTA_MMX | PTA_3DNOW | PTA_PRFCHW}, + {"c3-2", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, + PTA_MMX | PTA_SSE | PTA_FXSR}, {"i686", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, 0}, {"pentiumpro", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, 0}, {"pentium2", PROCESSOR_PENTIUMPRO, CPU_PENTIUMPRO, PTA_MMX | PTA_FXSR}, @@ -2917,8 +2918,8 @@ ix86_option_override_internal (bool main_args_p) PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 | PTA_CX16 | PTA_FXSR}, {"corei7", PROCESSOR_COREI7, CPU_COREI7, - PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 - | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_FXSR}, + PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 + | PTA_SSE4_1 | PTA_SSE4_2 | PTA_CX16 | PTA_POPCNT | PTA_FXSR}, {"corei7-avx", PROCESSOR_COREI7, CPU_COREI7, PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 | PTA_SSE4_1 | PTA_SSE4_2 | PTA_AVX @@ -2940,49 +2941,49 @@ ix86_option_override_internal (bool main_args_p) PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSSE3 | PTA_CX16 | PTA_MOVBE | PTA_FXSR}, {"geode", PROCESSOR_GEODE, CPU_GEODE, - PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE}, + PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE | PTA_PRFCHW}, {"k6", PROCESSOR_K6, CPU_K6, PTA_MMX}, - {"k6-2", PROCESSOR_K6, CPU_K6, PTA_MMX | PTA_3DNOW}, - {"k6-3", PROCESSOR_K6, CPU_K6, PTA_MMX | PTA_3DNOW}, + {"k6-2", PROCESSOR_K6, CPU_K6, PTA_MMX | PTA_3DNOW | PTA_PRFCHW}, + {"k6-3", PROCESSOR_K6, CPU_K6, PTA_MMX | PTA_3DNOW | PTA_PRFCHW}, {"athlon", PROCESSOR_ATHLON, CPU_ATHLON, - PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE}, + PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE | PTA_PRFCHW}, {"athlon-tbird", PROCESSOR_ATHLON, CPU_ATHLON, - PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE}, + PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE | PTA_PRFCHW}, {"athlon-4", PROCESSOR_ATHLON, CPU_ATHLON, - PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE}, + PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_PREFETCH_SSE | PTA_PRFCHW}, {"athlon-xp", PROCESSOR_ATHLON, CPU_ATHLON, - PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE}, + PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_PRFCHW | PTA_FXSR}, {"athlon-mp", PROCESSOR_ATHLON, CPU_ATHLON, - PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE}, + PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_PRFCHW | PTA_FXSR}, {"x86-64", PROCESSOR_K8, CPU_K8, - PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF}, + PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_NO_SAHF | PTA_FXSR}, {"k8", PROCESSOR_K8, CPU_K8, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_NO_SAHF}, + | PTA_SSE2 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR}, {"k8-sse3", PROCESSOR_K8, CPU_K8, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF}, + | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR}, {"opteron", PROCESSOR_K8, CPU_K8, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_NO_SAHF}, + | PTA_SSE2 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR}, {"opteron-sse3", PROCESSOR_K8, CPU_K8, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF}, + | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR}, {"athlon64", PROCESSOR_K8, CPU_K8, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_NO_SAHF}, + | PTA_SSE2 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR}, {"athlon64-sse3", PROCESSOR_K8, CPU_K8, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF}, + | PTA_SSE2 | PTA_SSE3 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR}, {"athlon-fx", PROCESSOR_K8, CPU_K8, PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_NO_SAHF}, + | PTA_SSE2 | PTA_NO_SAHF | PTA_PRFCHW | PTA_FXSR}, {"amdfam10", PROCESSOR_AMDFAM10, CPU_AMDFAM10, - PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM}, + PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_SSE2 + | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_PRFCHW | PTA_FXSR}, {"barcelona", PROCESSOR_AMDFAM10, CPU_AMDFAM10, - PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE - | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM}, + PTA_64BIT | PTA_MMX | PTA_3DNOW | PTA_3DNOW_A | PTA_SSE | PTA_SSE2 + | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_PRFCHW | PTA_FXSR}, {"bdver1", PROCESSOR_BDVER1, CPU_BDVER1, PTA_64BIT | PTA_MMX | PTA_SSE | PTA_SSE2 | PTA_SSE3 | PTA_SSE4A | PTA_CX16 | PTA_ABM | PTA_SSSE3 | PTA_SSE4_1 @@ -3592,14 +3593,18 @@ ix86_option_override_internal (bool main_args_p) ix86_isa_flags |= OPTION_MASK_ISA_MMX & ~ix86_isa_flags_explicit; /* Enable SSE prefetch. */ - if (TARGET_SSE || TARGET_PRFCHW) + if (TARGET_SSE || (TARGET_PRFCHW && !TARGET_3DNOW)) x86_prefetch_sse = true; - /* Turn on popcnt instruction for -msse4.2 or -mabm. */ + /* Enable prefetch{,w} instructions for -m3dnow. */ + if (TARGET_3DNOW) + ix86_isa_flags |= OPTION_MASK_ISA_PRFCHW & ~ix86_isa_flags_explicit; + + /* Enable popcnt instruction for -msse4.2 or -mabm. */ if (TARGET_SSE4_2 || TARGET_ABM) ix86_isa_flags |= OPTION_MASK_ISA_POPCNT & ~ix86_isa_flags_explicit; - /* Turn on lzcnt instruction for -mabm. */ + /* Enable lzcnt instruction for -mabm. */ if (TARGET_ABM) ix86_isa_flags |= OPTION_MASK_ISA_LZCNT & ~ix86_isa_flags_explicit; Index: config/i386/i386.md =================================================================== --- config/i386/i386.md (revision 198933) +++ config/i386/i386.md (working copy) @@ -17041,21 +17041,18 @@ [(prefetch (match_operand 0 "address_operand") (match_operand:SI 1 "const_int_operand") (match_operand:SI 2 "const_int_operand"))] - "TARGET_PREFETCH_SSE || TARGET_3DNOW" + "TARGET_PREFETCH_SSE || TARGET_PRFCHW" { - int rw = INTVAL (operands[1]); + bool write = INTVAL (operands[1]) != 0; int locality = INTVAL (operands[2]); - gcc_assert (rw == 0 || rw == 1); gcc_assert (IN_RANGE (locality, 0, 3)); - if (TARGET_PRFCHW && rw) - operands[2] = GEN_INT (3); /* Use 3dNOW prefetch in case we are asking for write prefetch not supported by SSE counterpart or the SSE prefetch is not available (K6 machines). Otherwise use SSE prefetch as it allows specifying of locality. */ - else if (TARGET_3DNOW && (!TARGET_PREFETCH_SSE || rw)) + if (TARGET_PRFCHW && (write || !TARGET_PREFETCH_SSE)) operands[2] = GEN_INT (3); else operands[1] = const0_rtx; @@ -17086,7 +17083,7 @@ [(prefetch (match_operand 0 "address_operand" "p") (match_operand:SI 1 "const_int_operand" "n") (const_int 3))] - "TARGET_3DNOW || TARGET_PRFCHW" + "TARGET_PRFCHW" { if (INTVAL (operands[1]) == 0) return "prefetch\t%a0";