From patchwork Wed Apr 25 23:38:42 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Sriraman Tallam
X-Patchwork-Id: 155138
References: <20120330001021.9E30EB2086@azwildcat.mtv.corp.google.com>
Date: Wed, 25 Apr 2012 16:38:42 -0700
Subject: Re: Support for Runtime CPU type detection via builtins (issue5754058)
From: Sriraman Tallam
To: "H.J. Lu"
Cc: Uros Bizjak, Richard Guenther, Michael Matz,
 reply@codereview.appspotmail.com, gcc-patches@gcc.gnu.org,
 Richard Henderson, Jan Hubicka
Delivered-To: mailing list gcc-patches@gcc.gnu.org

Hi H.J,

Could you please review this patch for AVX2 check?

	* config/i386/i386-cpuinfo.c (FEATURE_AVX2): New enum value.
	(get_available_features): New argument. Check for AVX2.
	(__cpu_indicator_init): Modify call to get_available_features.
	* doc/extend.texi: Document avx2 support.
	* testsuite/gcc.target/i386/builtin_target.c: Check avx2.
	* config/i386/i386.c (fold_builtin_cpu): Add avx2.

Thanks,
-Sri.

On Wed, Apr 25, 2012 at 2:45 PM, Sriraman Tallam wrote:
> On Wed, Apr 25, 2012 at 2:28 PM, H.J. Lu wrote:
>> On Wed, Apr 25, 2012 at 2:25 PM, Sriraman Tallam wrote:
>>> On Tue, Apr 24, 2012 at 7:39 PM, H.J. Lu wrote:
>>>> On Tue, Apr 24, 2012 at 7:06 PM, Sriraman Tallam wrote:
>>>>> On Tue, Apr 24, 2012 at 5:24 PM, H.J. Lu wrote:
>>>>>> On Tue, Apr 24, 2012 at 5:10 PM, Sriraman Tallam wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>>   Thanks for all the comments.
>>>>>>> I have made all the changes as
>>>>>>> mentioned and submitted the patch. Summary of changes made:
>>>>>>>
>>>>>>> * Add support for AVX
>>>>>>> * Fix documentation in extend.texi
>>>>>>> * Make it thread-safe according to H.J.'s comments.
>>>>>>>
>>>>>>> I have attached the patch. Boot-strapped and checked for test parity
>>>>>>> with pristine build.
>>>>>>>
>>>>>>>       * config/i386/i386.c (build_processor_model_struct): New function.
>>>>>>>        (make_var_decl): New function.
>>>>>>>        (fold_builtin_cpu): New function.
>>>>>>>        (ix86_fold_builtin): New function.
>>>>>>>        (make_cpu_type_builtin): New function.
>>>>>>>        (ix86_init_platform_type_builtins): New function.
>>>>>>>        (ix86_expand_builtin): Expand new builtins by folding them.
>>>>>>>        (ix86_init_builtins): Make new builtins to detect CPU type.
>>>>>>>        (TARGET_FOLD_BUILTIN): New macro.
>>>>>>>        (IX86_BUILTIN_CPU_INIT): New enum value.
>>>>>>>        (IX86_BUILTIN_CPU_IS): New enum value.
>>>>>>>        (IX86_BUILTIN_CPU_SUPPORTS): New enum value.
>>>>>>>        * config/i386/i386-builtin-types.def: New function type.
>>>>>>>        * testsuite/gcc.target/builtin_target.c: New testcase.
>>>>>>>        * doc/extend.texi: Document builtins.
>>>>>>>
>>>>>>>        * libgcc/config/i386/i386-cpuinfo.c: New file.
>>>>>>>        * libgcc/config/i386/t-cpuinfo: New file.
>>>>>>>        * libgcc/config.host: Include t-cpuinfo.
>>>>>>>        * libgcc/config/i386/libgcc-glibc.ver: Version symbol __cpu_model.
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> +  /* This function needs to run just once.  */
>>>>>> +  if (__cpu_model.__cpu_vendor)
>>>>>> +    return 0;
>>>>>> +
>>>>>> +  /* Assume cpuid insn present. Run in level 0 to get vendor id. */
>>>>>> +  if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx))
>>>>>> +    return -1;
>>>>>>
>>>>>> If __get_cpuid_output returns non-zero, it will be called
>>>>>> repeatedly.
>>>>>> I think you should set __cpu_model.__cpu_vendor
>>>>>> to non-zero in this case.
>>>>>
>>>>> Done now.
>>>>>
>>>>> 2012-04-24  Sriraman Tallam
>>>>>
>>>>>        * libgcc/config/i386/i386-cpuinfo.c: Set __cpu_vendor always.
>>>>>
>>>>>
>>>>> Index: libgcc/config/i386/i386-cpuinfo.c
>>>>> ===================================================================
>>>>> --- libgcc/config/i386/i386-cpuinfo.c   (revision 186789)
>>>>> +++ libgcc/config/i386/i386-cpuinfo.c   (working copy)
>>>>> @@ -256,16 +256,25 @@ __cpu_indicator_init (void)
>>>>>
>>>>>   /* Assume cpuid insn present. Run in level 0 to get vendor id. */
>>>>>   if (!__get_cpuid_output (0, &eax, &ebx, &ecx, &edx))
>>>>> -    return -1;
>>>>> +    {
>>>>> +      __cpu_model.__cpu_vendor = VENDOR_OTHER;
>>>>> +      return -1;
>>>>> +    }
>>>>>
>>>>>   vendor = ebx;
>>>>>   max_level = eax;
>>>>>
>>>>>   if (max_level < 1)
>>>>> -    return -1;
>>>>> +    {
>>>>> +      __cpu_model.__cpu_vendor = VENDOR_OTHER;
>>>>> +      return -1;
>>>>> +    }
>>>>>
>>>>>   if (!__get_cpuid_output (1, &eax, &ebx, &ecx, &edx))
>>>>> -    return -1;
>>>>> +    {
>>>>> +      __cpu_model.__cpu_vendor = VENDOR_OTHER;
>>>>> +      return -1;
>>>>> +    }
>>>>>
>>>>>   model = (eax >> 4) & 0x0f;
>>>>>   family = (eax >> 8) & 0x0f;
>>>>>
>>>>>
>>>>> Thanks,
>>>>
>>>> Should you also handle AVX2?
>>>
>>>  I cannot test it and thought I would wait until I get access to a
>>> processor with AVX2.
>>>
>>
>> You can download an AVX2 emulator (SDE) from
>>
>> http://software.intel.com/en-us/avx/
>>
>> to test AVX2 binaries.
>
> Ok thanks, I will prepare a patch.
>
> -Sri.
>
>>
>> --
>> H.J.

	* config/i386/i386-cpuinfo.c (FEATURE_AVX2): New enum value.
	(get_available_features): New argument. Check for AVX2.
	(__cpu_indicator_init): Modify call to get_available_features.
	* doc/extend.texi: Document avx2 support.
	* testsuite/gcc.target/i386/builtin_target.c: Check avx2.
	* config/i386/i386.c (fold_builtin_cpu): Add avx2.

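The AVX2 check named in the ChangeLog above can be sketched outside libgcc
as follows. This is only an illustration, assuming GCC's (or clang's)
<cpuid.h>; has_avx2_cpuid is a hypothetical helper name and is not part of
the patch. Like the patch, it looks only at the CPUID feature bit and does
not verify that the OS saves YMM state.

```c
/* Sketch only: standalone version of the "CPUID level 7" AVX2 test.
   has_avx2_cpuid is a hypothetical name, not something the patch adds.  */
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>

#ifndef bit_AVX2
#define bit_AVX2 (1 << 5)	/* CPUID.(EAX=7,ECX=0):EBX bit 5.  */
#endif

static int
has_avx2_cpuid (void)
{
  unsigned int eax, ebx, ecx, edx;

  /* Highest supported basic CPUID level; 0 if CPUID is unavailable.  */
  if (__get_cpuid_max (0, 0) < 7)
    return 0;

  /* Level 7, sub-leaf 0 reports the extended feature bits in EBX.  */
  __cpuid_count (7, 0, eax, ebx, ecx, edx);
  return (ebx & bit_AVX2) != 0;
}
#else
/* Non-x86 targets: no CPUID, so report "not available".  */
static int
has_avx2_cpuid (void)
{
  return 0;
}
#endif
```

Checking the maximum level first matters because querying a level above
the maximum does not reliably return zeros on older processors.
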
Index: libgcc/config/i386/i386-cpuinfo.c
===================================================================
--- libgcc/config/i386/i386-cpuinfo.c	(revision 186848)
+++ libgcc/config/i386/i386-cpuinfo.c	(working copy)
@@ -75,7 +75,8 @@ enum processor_features
   FEATURE_SSSE3,
   FEATURE_SSE4_1,
   FEATURE_SSE4_2,
-  FEATURE_AVX
+  FEATURE_AVX,
+  FEATURE_AVX2
 };
 
 struct __processor_model
@@ -191,8 +192,11 @@ get_intel_cpu (unsigned int family, unsigned int m
     }
 }
 
+/* ECX and EDX are output of CPUID at level one. MAX_CPUID_LEVEL is
+   the max possible level of CPUID insn. */
 static void
-get_available_features (unsigned int ecx, unsigned int edx)
+get_available_features (unsigned int ecx, unsigned int edx,
+                        int max_cpuid_level)
 {
   unsigned int features = 0;
 
@@ -217,6 +221,15 @@ static void
   if (ecx & bit_AVX)
     features |= (1 << FEATURE_AVX);
 
+  /* Get Advanced Features at level 7 (eax = 7, ecx = 0). */
+  if (max_cpuid_level >= 7)
+    {
+      unsigned int eax, ebx, ecx, edx;
+      __cpuid_count (7, 0, eax, ebx, ecx, edx);
+      if (ebx & bit_AVX2)
+        features |= (1 << FEATURE_AVX2);
+    }
+
   __cpu_model.__cpu_features[0] = features;
 }
 
@@ -296,7 +309,7 @@ __cpu_indicator_init (void)
       /* Get CPU type. */
       get_intel_cpu (family, model, brand_id);
       /* Find available features. */
-      get_available_features (ecx, edx);
+      get_available_features (ecx, edx, max_level);
       __cpu_model.__cpu_vendor = VENDOR_INTEL;
     }
   else if (vendor == SIG_AMD)
@@ -311,7 +324,7 @@ __cpu_indicator_init (void)
       /* Get CPU type. */
       get_amd_cpu (family, model);
       /* Find available features. */
-      get_available_features (ecx, edx);
+      get_available_features (ecx, edx, max_level);
       __cpu_model.__cpu_vendor = VENDOR_AMD;
     }
   else
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 186848)
+++ gcc/doc/extend.texi	(working copy)
@@ -9541,6 +9541,8 @@ SSE4.1 instructions.
 SSE4.2 instructions.
 @item avx
 AVX instructions.
+@item avx2
+AVX2 instructions.
 @end table
 
 Here is an example:
Index: gcc/testsuite/gcc.target/i386/builtin_target.c
===================================================================
--- gcc/testsuite/gcc.target/i386/builtin_target.c	(revision 186848)
+++ gcc/testsuite/gcc.target/i386/builtin_target.c	(working copy)
@@ -29,6 +29,8 @@ fn1 ()
 
   assert (__builtin_cpu_supports ("avx") >= 0);
 
+  assert (__builtin_cpu_supports ("avx2") >= 0);
+
   /* Check CPU type. */
   assert (__builtin_cpu_is ("amd") >= 0);
 
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c	(revision 186848)
+++ gcc/config/i386/i386.c	(working copy)
@@ -27763,6 +27763,7 @@ fold_builtin_cpu (tree fndecl, tree *args)
     F_SSE4_1,
     F_SSE4_2,
     F_AVX,
+    F_AVX2,
     F_MAX
   };
 
@@ -27830,7 +27831,8 @@ fold_builtin_cpu (tree fndecl, tree *args)
       {"ssse3", F_SSSE3},
       {"sse4.1", F_SSE4_1},
       {"sse4.2", F_SSE4_2},
-      {"avx", F_AVX}
+      {"avx", F_AVX},
+      {"avx2", F_AVX2}
     };
 
   static tree __processor_model_type = NULL_TREE;
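
For reference, user code might consume the new "avx2" feature string
roughly as sketched below. This assumes a compiler that provides
__builtin_cpu_init and __builtin_cpu_supports as proposed in this thread;
sum_generic, sum_avx2_path and select_sum are hypothetical names made up
for the example, and the "AVX2" routine is only a placeholder that reuses
the generic loop.

```c
#include <stddef.h>

/* Hypothetical portable implementation.  */
static long
sum_generic (const int *v, size_t n)
{
  long total = 0;
  for (size_t i = 0; i < n; i++)
    total += v[i];
  return total;
}

/* Placeholder for an AVX2-tuned routine; a real one would be built
   with AVX2 enabled.  Here it just defers to the generic loop.  */
static long
sum_avx2_path (const int *v, size_t n)
{
  return sum_generic (v, n);
}

typedef long (*sum_fn) (const int *, size_t);

/* Pick an implementation once, based on the runtime feature check.  */
static sum_fn
select_sum (void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  __builtin_cpu_init ();
  if (__builtin_cpu_supports ("avx2"))
    return sum_avx2_path;
#endif
  return sum_generic;
}
```

A caller would typically store the result of select_sum () in a function
pointer at startup and call through it afterwards, so the feature test
runs once rather than on every call.
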