From patchwork Tue Apr 16 20:54:57 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sriraman Tallam X-Patchwork-Id: 237088 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "localhost", Issuer "www.qmailtoaster.com" (not verified)) by ozlabs.org (Postfix) with ESMTPS id 9ACC72C0142 for ; Wed, 17 Apr 2013 06:55:09 +1000 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; q=dns; s=default; b=roaBGARKRuPGCLWuAo DuJMhoyniPvNNgd5Kd45Wo3cRwK2tyqAsFo66x4oXaZq+j56k/xV8I5L6FInnK/r 6OzjXNo/S/ZKs+S1IH1+jnjXdWRbNE8BZ7W9ydBQRfic7sKaHnyIsPF8/qoQsUue iWjta8FI8qWjEnG/OXXUWMSfM= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :mime-version:in-reply-to:references:date:message-id:subject :from:to:cc:content-type; s=default; bh=VLCBH0fBsYsMv8zWKZ/DEROE xlg=; b=IdS28tfSf5vmxLhBMHv/24+ZL8jrlPKHDrLX8ll3kvPAH9v7QWbtlaIb JSq9qzlw6A3KojTSyOZtdFYX5ypxX3k1/qdPn7jaHcAzb3BfAg99guSKSd+z9qHV EtGU3hgspSL7FqVIqlU8M6T98AYXixjq4y3rJg33BvpuVCOVab8= Received: (qmail 16437 invoked by alias); 16 Apr 2013 20:55:02 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 16422 invoked by uid 89); 16 Apr 2013 20:55:01 -0000 X-Spam-SWARE-Status: No, score=-5.9 required=5.0 tests=AWL, BAYES_00, KHOP_RCVD_TRUST, KHOP_THREADED, RCVD_IN_DNSWL_LOW, RCVD_IN_HOSTKARMA_YE, RP_MATCHES_RCVD autolearn=ham version=3.3.1 Received: from mail-ob0-f169.google.com (HELO mail-ob0-f169.google.com) (209.85.214.169) by sourceware.org (qpsmtpd/0.84/v0.84-167-ge50287c) with ESMTP; Tue, 16 Apr 2013 20:54:59 +0000 Received: by mail-ob0-f169.google.com with SMTP id ta14so842444obb.28 for ; Tue, 16 Apr 2013 13:54:58 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=mime-version:x-received:in-reply-to:references:date:message-id :subject:from:to:cc:content-type:x-gm-message-state; bh=bV/enHo+foHNI2CoWrH2xRJVXpsRxzxtH7S+KahfU4Q=; b=N3CQER9CBxgWLu8wGc2MEP7jsY4Md7F9drVSyyfkDzcWQRPDt5mgHGJEMEeXz857oM 0TouGmpUnD1nB/5VQKpVD5Uw7Gl8ERu9vdRHWO+fx8/+wKA12ms/vD2uy0S+KgwT03lo ZGJz+w+HDiM4KRX+c7XnYh41RuAU5RShJk/OUt45UHQ9MMi9uhjuGt5S5mMvxArfacug maG8Hdp/8TZiiTEtoPuJaoQmqnOMKhdz/oSjRxKIOqRU443bRbckedFAOOAw/hrscXjo CZ2UTrox2jmxw8Quq7SpX6c9eSwXP5a3Naeg8bGxOPWjTNZBScYolAsJ17yAYLPp/Lxi CnGA== MIME-Version: 1.0 X-Received: by 10.182.158.70 with SMTP id ws6mr1396106obb.70.1366145698079; Tue, 16 Apr 2013 13:54:58 -0700 (PDT) Received: by 10.182.245.229 with HTTP; Tue, 16 Apr 2013 13:54:57 -0700 (PDT) In-Reply-To: <20130412085850.GZ16463@tucnak.redhat.com> References: <20130412085850.GZ16463@tucnak.redhat.com> Date: Tue, 16 Apr 2013 13:54:57 -0700 Message-ID: Subject: Re: GCC does not support *mmintrin.h with function specific opts From: Sriraman Tallam To: Jakub Jelinek Cc: GCC Patches , David Li X-Gm-Message-State: ALoCoQlxXa7DiJYag2sW20yPg7vTG4+DulhELXDT/L+DsddHiRywpxlwckK41Kc5lV9+EQPALimLgvafipTFMd1tO4WaH1M5bAbg1CxahI6TM/adfRQ5m23TOZLFjCIkm1LeaM+VZBq3wRXpWYgeTEtoHbeeMjBGBIX17vyW9qZCLV5AyNk9n+vpO5VOV9LCr33ajk0ie5cL X-Virus-Found: No X-IsSubscribed: yes Hi, I have attached an updated patch that addresses all the comments raised. On Fri, Apr 12, 2013 at 1:58 AM, Jakub Jelinek wrote: > On Thu, Apr 11, 2013 at 12:05:41PM -0700, Sriraman Tallam wrote: >> I have attached a patch that fixes this. I have added an option >> "-mgenerate-builtins" that will do two things. It will define a macro >> "__ALL_ISA__" which will expose the *intrin.h functions. It will also >> expose all the target specific builtins. -mgenerate-builtins will not >> affect code generation. > > 1) this shouldn't be an option, either it can be made to work reliably, > then it should be done always, or it can't, then it shouldn't be done Ok, it is on by default now. There is a way to turn it off, with -mno-generate-builtins. > 2) have you verified that if you always generate all builtins, that the > builtins not supported by the ISA selected from the command line are > created with the right vector modes? This issue does not arise. When the target builtin is expanded, it is checked if the ISA support is there, either via function specific target opts or global target opts. If not, an error is raised. Test case added for this, please see intrinsic_4.c in patch. > 3) the *intrin.h headers in the case where the guarding macro isn't defined > should be surrounded by something like > #ifndef __FMA4__ > #pragma GCC push options > #pragma GCC target("fma4") > #endif > ... > #ifndef __FMA4__ > #pragma GCC pop options > #endif > so that everything that is in the headers is compiled with the ISA > in question I do not think this should be done because it will break the inlining ability of the header function and cause issues if the caller does not specify the required ISA. The fact that the header functions are marked extern __inline, with gnu_inline guarantees that a body will not be generated and they will be inlined. If the caller does not have the required ISA, appropriate errors will be raised. Test cases added, see intrinsics_1.c, intrinsics_2.c > 4) what happens if you use the various vector types typedefed in the > *intrin.h headers in code that doesn't support those ISAs? As TYPE_MODE > for VECTOR_TYPE is a function call, perhaps it will just be handled as > generic BLKmode vectors, which is desirable I think I checked some tests here. With -mno-sse for instance, vector types are not permitted in function arguments and return values and gcc raises a warning/error in each case. With return values, gcc always gives an error if a SSE register is required in a return value. I even fixed this message to not do it for functions marked as extern inline, with "gnu_inline" keyword as a body for them will not be generated. > 5) what happens if you use a target builtin in a function not supporting > the corresponding ISA, do you get proper error explaining what you are > doing wrong? Yes, please sse intrinsic_4.c test in patch. > 6) what happens if you use some intrinsics in a function not supporting > the corresponding ISA? Dunno if the inliner chooses not to inline it > and error out because it is always_inline, or what exactly will happen > then Same deal here. The intrinsic function will, guaranteed, to be inlined into the caller which will be a corresponding builtin call. That builtin call will trigger an error if the ISA is not supported. Thanks Sri > > For all this you certainly need testcases. > > Jakub * config/i386/i386.c (construct_container): Do not issue SSE return error for extern gnu_inline functions. (def_builtin): Do not generate builtins when -mno-generate-builtins is used. * config/i386/i386.opt (mgenerate-builtins): New target option. * config/i386/i386-c.c (ix86_target_macros_internal): Define macro __ALL_ISA__ when generate_target_builtins is true. * testsuite/gcc.target/i386/intrinsics_1.c: New test. * testsuite/gcc.target/i386/intrinsics_2.c: Ditto. * testsuite/gcc.target/i386/intrinsics_3.c: Ditto. * testsuite/gcc.target/i386/intrinsics_4.c: Ditto. * testsuite/gcc.target/i386/intrinsics_5.c: Ditto. * config/i386/lzcntintrin.h: Expose header when __ALL_ISA__ is defined. * config/i386/lwpintrin.h: Ditto. * config/i386/xopintrin.h: Ditto. * config/i386/fmaintrin.h: Ditto. * config/i386/bmiintrin.h: Ditto. * config/i386/fma4intrin.h: Ditto. * config/i386/nmmintrin.h: Ditto. * config/i386/tbmintrin.h: Ditto. * config/i386/smmintrin.h: Ditto. * config/i386/wmmintrin.h: Ditto. * config/i386/popcntintrin.h: Ditto. * config/i386/f16cintrin.h: Ditto. * config/i386/pmmintrin.h: Ditto. * config/i386/bmi2intrin.h: Ditto. * config/i386/tmmintrin.h: Ditto. * config/i386/xmmintrin.h: Ditto. * config/i386/mmintrin.h: Ditto. * config/i386/ammintrin.h: Ditto. * config/i386/emmintrin.h: Ditto. Index: config/i386/i386.c =================================================================== --- config/i386/i386.c (revision 197691) +++ config/i386/i386.c (working copy) @@ -6370,8 +6370,13 @@ construct_container (enum machine_mode mode, enum return NULL; /* We allowed the user to turn off SSE for kernel mode. Don't crash if - some less clueful developer tries to use floating-point anyway. */ - if (needed_sseregs && !TARGET_SSE) + some less clueful developer tries to use floating-point anyway. It is + alright if this is in a extern "gnu_inline" function, as it is the + caller that matters in this case. */ + if (needed_sseregs && !TARGET_SSE + && !(DECL_EXTERNAL (current_function_decl) + && lookup_attribute ("gnu_inline", + DECL_ATTRIBUTES (current_function_decl)) != NULL)) { if (in_return) { @@ -26813,7 +26818,8 @@ def_builtin (HOST_WIDE_INT mask, const char *name, ix86_builtins_isa[(int) code].isa = mask; mask &= ~OPTION_MASK_ISA_64BIT; - if (mask == 0 + if (generate_target_builtins + || mask == 0 || (mask & ix86_isa_flags) != 0 || (lang_hooks.builtin_function == lang_hooks.builtin_function_ext_scope)) Index: config/i386/i386.opt =================================================================== --- config/i386/i386.opt (revision 197691) +++ config/i386/i386.opt (working copy) @@ -626,3 +626,7 @@ Split 32-byte AVX unaligned store mrtm Target Report Mask(ISA_RTM) Var(ix86_isa_flags) Save Support RTM built-in functions and code generation + +mgenerate-builtins +Target Report Var(generate_target_builtins) Save Init(1) +Generate all target builtins that are otherwise only generated when the approrpriate ISA is turned on. Index: config/i386/i386-c.c =================================================================== --- config/i386/i386-c.c (revision 197691) +++ config/i386/i386-c.c (working copy) @@ -54,6 +54,9 @@ ix86_target_macros_internal (HOST_WIDE_INT isa_fla int last_arch_char = ix86_arch_string[arch_len - 1]; int last_tune_char = ix86_tune_string[tune_len - 1]; + if (generate_target_builtins) + def_or_undef (parse_in, "__ALL_ISA__"); + /* Built-ins based on -march=. */ switch (arch) { Index: testsuite/gcc.target/i386/intrinsics_4.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_4.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_4.c (revision 0) @@ -0,0 +1,11 @@ +/* Test to check if a target specific builtin used in a function without the + appropriate ISA support generates an error. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -mno-sse4.1" } */ + +#include +__m128i foo(__m128i *V) +{ + return __builtin_ia32_movntdqa (V); /* { dg-error "'__builtin_ia32_movntdqa' needs isa option -m32 -msse4.1" } */ +} Index: testsuite/gcc.target/i386/intrinsics_1.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_1.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_1.c (revision 0) @@ -0,0 +1,13 @@ +/* Test case to check if intrinsics and function specific target + optimizations work together. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -msse -mno-sse4.1" } */ + +#include + +__attribute__((target("sse4.1"))) +__m128i foo(__m128i *V) +{ + return _mm_stream_load_si128(V); +} Index: testsuite/gcc.target/i386/intrinsics_2.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_2.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_2.c (revision 0) @@ -0,0 +1,19 @@ +/* Ok, to have SSE return in non-SSE functions marked as + extern, "gnu_inline". */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -msse -mno-sse4.1" } */ + +#include + +extern __inline __attribute__ ((__gnu_inline__)) +__m128i bar (__m128i *V) +{ + return _mm_stream_load_si128(V); +} + +__attribute__((target("sse4.1"))) +__m128i foo(__m128i *V) +{ + return bar (V); +} Index: testsuite/gcc.target/i386/intrinsics_3.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_3.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_3.c (revision 0) @@ -0,0 +1,11 @@ +/* Using vector types without SSE enabled should generate an error. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -mno-sse" } */ + +typedef long long _m128i __attribute__((vector_size(16),__may_alias__)); + +int foo (_m128i V) /* { dg-warning "SSE vector argument without SSE enabled changes the ABI" } */ +{ + return 0; +} Index: testsuite/gcc.target/i386/intrinsics_5.c =================================================================== --- testsuite/gcc.target/i386/intrinsics_5.c (revision 0) +++ testsuite/gcc.target/i386/intrinsics_5.c (revision 0) @@ -0,0 +1,13 @@ +/* Test case to check if -mno-generate-builtins will break use of intrinsics + when the appropriate ISA is not specified. */ + +/* { dg-do compile } */ +/* { dg-options "-O2 -mno-generate-builtins -mno-sse4.1" } */ + +#include +__m128i foo(__m128i *V) /* { dg-error "unknown type name" } */ +{ + return V; +} + +/* { dg-excess-errors "\"SSE4.1 instruction set not enabled\"" } */ Index: config/i386/lzcntintrin.h =================================================================== --- config/i386/lzcntintrin.h (revision 197691) +++ config/i386/lzcntintrin.h (working copy) @@ -25,7 +25,7 @@ # error "Never use directly; include instead." #endif -#ifndef __LZCNT__ +#if !defined (__LZCNT__) && !defined (__ALL_ISA__) # error "LZCNT instruction is not enabled" #endif /* __LZCNT__ */ Index: config/i386/lwpintrin.h =================================================================== --- config/i386/lwpintrin.h (revision 197691) +++ config/i386/lwpintrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _LWPINTRIN_H_INCLUDED #define _LWPINTRIN_H_INCLUDED -#ifndef __LWP__ +#if !defined (__LWP__) && !defined (__ALL_ISA__) # error "LWP instruction set not enabled" #else Index: config/i386/xopintrin.h =================================================================== --- config/i386/xopintrin.h (revision 197691) +++ config/i386/xopintrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _XOPMMINTRIN_H_INCLUDED #define _XOPMMINTRIN_H_INCLUDED -#ifndef __XOP__ +#if !defined (__XOP__) && !defined (__ALL_ISA__) # error "XOP instruction set not enabled" #else Index: config/i386/fmaintrin.h =================================================================== --- config/i386/fmaintrin.h (revision 197691) +++ config/i386/fmaintrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _FMAINTRIN_H_INCLUDED #define _FMAINTRIN_H_INCLUDED -#ifndef __FMA__ +#if !defined (__FMA__) && !defined (__ALL_ISA__) # error "FMA instruction set not enabled" #else Index: config/i386/bmiintrin.h =================================================================== --- config/i386/bmiintrin.h (revision 197691) +++ config/i386/bmiintrin.h (working copy) @@ -25,7 +25,7 @@ # error "Never use directly; include instead." #endif -#ifndef __BMI__ +#if !defined (__BMI__) && !defined (__ALL_ISA__) # error "BMI instruction set not enabled" #endif /* __BMI__ */ Index: config/i386/fma4intrin.h =================================================================== --- config/i386/fma4intrin.h (revision 197691) +++ config/i386/fma4intrin.h (working copy) @@ -28,7 +28,7 @@ #ifndef _FMA4INTRIN_H_INCLUDED #define _FMA4INTRIN_H_INCLUDED -#ifndef __FMA4__ +#if !defined (__FMA4__) && !defined (__ALL_ISA__) # error "FMA4 instruction set not enabled" #else Index: config/i386/nmmintrin.h =================================================================== --- config/i386/nmmintrin.h (revision 197691) +++ config/i386/nmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _NMMINTRIN_H_INCLUDED #define _NMMINTRIN_H_INCLUDED -#ifndef __SSE4_2__ +#if !defined (__SSE4_2__) && !defined (__ALL_ISA__) # error "SSE4.2 instruction set not enabled" #else /* We just include SSE4.1 header file. */ Index: config/i386/tbmintrin.h =================================================================== --- config/i386/tbmintrin.h (revision 197691) +++ config/i386/tbmintrin.h (working copy) @@ -25,7 +25,7 @@ # error "Never use directly; include instead." #endif -#ifndef __TBM__ +#if !defined (__TBM__) && !defined (__ALL_ISA__) # error "TBM instruction set not enabled" #endif /* __TBM__ */ Index: config/i386/smmintrin.h =================================================================== --- config/i386/smmintrin.h (revision 197691) +++ config/i386/smmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _SMMINTRIN_H_INCLUDED #define _SMMINTRIN_H_INCLUDED -#ifndef __SSE4_1__ +#if !defined (__SSE4_1__) && !defined (__ALL_ISA__) # error "SSE4.1 instruction set not enabled" #else Index: config/i386/wmmintrin.h =================================================================== --- config/i386/wmmintrin.h (revision 197691) +++ config/i386/wmmintrin.h (working copy) @@ -30,7 +30,7 @@ /* We need definitions from the SSE2 header file. */ #include -#if !defined (__AES__) && !defined (__PCLMUL__) +#if !defined (__AES__) && !defined (__PCLMUL__) && !defined (__ALL_ISA__) # error "AES/PCLMUL instructions not enabled" #else Index: config/i386/popcntintrin.h =================================================================== --- config/i386/popcntintrin.h (revision 197691) +++ config/i386/popcntintrin.h (working copy) @@ -21,7 +21,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see . */ -#ifndef __POPCNT__ +#if !defined (__POPCNT__) && !defined (__ALL_ISA__) # error "POPCNT instruction set not enabled" #endif /* __POPCNT__ */ Index: config/i386/f16cintrin.h =================================================================== --- config/i386/f16cintrin.h (revision 197691) +++ config/i386/f16cintrin.h (working copy) @@ -25,7 +25,7 @@ # error "Never use directly; include or instead." #endif -#ifndef __F16C__ +#if !defined (__F16C__) && !defined (__ALL_ISA__) # error "F16C instruction set not enabled" #else Index: config/i386/pmmintrin.h =================================================================== --- config/i386/pmmintrin.h (revision 197691) +++ config/i386/pmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _PMMINTRIN_H_INCLUDED #define _PMMINTRIN_H_INCLUDED -#ifndef __SSE3__ +#if !defined (__SSE3__) && !defined (__ALL_ISA__) # error "SSE3 instruction set not enabled" #else Index: config/i386/bmi2intrin.h =================================================================== --- config/i386/bmi2intrin.h (revision 197691) +++ config/i386/bmi2intrin.h (working copy) @@ -25,7 +25,7 @@ # error "Never use directly; include instead." #endif -#ifndef __BMI2__ +#if !defined (__BMI2__) && !defined (__ALL_ISA__) # error "BMI2 instruction set not enabled" #endif /* __BMI2__ */ Index: config/i386/tmmintrin.h =================================================================== --- config/i386/tmmintrin.h (revision 197691) +++ config/i386/tmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _TMMINTRIN_H_INCLUDED #define _TMMINTRIN_H_INCLUDED -#ifndef __SSSE3__ +#if !defined (__SSSE3__) && !defined (__ALL_ISA__) # error "SSSE3 instruction set not enabled" #else Index: config/i386/xmmintrin.h =================================================================== --- config/i386/xmmintrin.h (revision 197691) +++ config/i386/xmmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _XMMINTRIN_H_INCLUDED #define _XMMINTRIN_H_INCLUDED -#ifndef __SSE__ +#if !defined (__SSE__) && !defined (__ALL_ISA__) # error "SSE instruction set not enabled" #else Index: config/i386/mmintrin.h =================================================================== --- config/i386/mmintrin.h (revision 197691) +++ config/i386/mmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _MMINTRIN_H_INCLUDED #define _MMINTRIN_H_INCLUDED -#ifndef __MMX__ +#if !defined (__MMX__) && !defined (__ALL_ISA__) # error "MMX instruction set not enabled" #else /* The Intel API is flexible enough that we must allow aliasing with other Index: config/i386/ammintrin.h =================================================================== --- config/i386/ammintrin.h (revision 197691) +++ config/i386/ammintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _AMMINTRIN_H_INCLUDED #define _AMMINTRIN_H_INCLUDED -#ifndef __SSE4A__ +#if !defined (__SSE4A__) && !defined (__ALL_ISA__) # error "SSE4A instruction set not enabled" #else Index: config/i386/emmintrin.h =================================================================== --- config/i386/emmintrin.h (revision 197691) +++ config/i386/emmintrin.h (working copy) @@ -27,7 +27,7 @@ #ifndef _EMMINTRIN_H_INCLUDED #define _EMMINTRIN_H_INCLUDED -#ifndef __SSE2__ +#if !defined (__SSE2__) && !defined (__ALL_ISA__) # error "SSE2 instruction set not enabled" #else