From: Uros Bizjak
Date: Wed, 14 Nov 2012 17:53:53 +0100
Subject: Re: [PATCH, i386]: Implement post-reload vzeroupper insertion pass
To: gcc-patches@gcc.gnu.org
Cc: Vladimir Yakovlev, "H.J. Lu", Igor Zamyatin
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 198963

On Sun, Nov 11, 2012 at 9:45 PM, Uros Bizjak wrote:
>
>> Regarding vzeroupper insertion pass - we will use gcc pass manager to
>> insert a target-dependant pass directly after reload ...
>
> ... like attached patch. The patch inserts vzeroupper pass directly
> after reload, so spills from 256bit registers are considered when
> processing AVX_U128 entity. The patched gcc reruns mode-switching
> pass, so an export of entry function from mode-switching is needed.

2012-11-14  Uros Bizjak
            Vladimir Yakovlev

        PR target/47440
        * config/i386/i386.c (gate_insert_vzeroupper): New function.
        (rest_of_handle_insert_vzeroupper): Ditto.
        (struct rtl_opt_pass pass_insert_vzeroupper): New.
        (ix86_option_override): Register vzeroupper insertion pass here.
        (ix86_check_avx256_register): Handle SUBREGs properly.
        (ix86_init_machine_status): Remove
        optimize_mode_switching[AVX_U128] initialization.

Bootstrapped and regression tested on x86_64-pc-linux-gnu {,-m32} AVX
target and committed to mainline SVN.

Uros.

Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 193502)
+++ config/i386/i386.c (working copy)
@@ -2301,6 +2301,52 @@ static const char *const cpu_names[TARGET_CPU_DEFA
   "btver2"
 };
 
+static bool
+gate_insert_vzeroupper (void)
+{
+  return TARGET_VZEROUPPER;
+}
+
+static unsigned int
+rest_of_handle_insert_vzeroupper (void)
+{
+  int i;
+
+  /* vzeroupper instructions are inserted immediately after reload to
+     account for possible spills from 256bit registers.  The pass
+     reuses mode switching infrastructure by re-running mode insertion
+     pass, so disable entities that have already been processed.  */
+  for (i = 0; i < MAX_386_ENTITIES; i++)
+    ix86_optimize_mode_switching[i] = 0;
+
+  ix86_optimize_mode_switching[AVX_U128] = 1;
+
+  /* Call optimize_mode_switching.  */
+  pass_mode_switching.pass.execute ();
+  return 0;
+}
+
+struct rtl_opt_pass pass_insert_vzeroupper =
+{
+ {
+  RTL_PASS,
+  "vzeroupper",                         /* name */
+  OPTGROUP_NONE,                        /* optinfo_flags */
+  gate_insert_vzeroupper,               /* gate */
+  rest_of_handle_insert_vzeroupper,     /* execute */
+  NULL,                                 /* sub */
+  NULL,                                 /* next */
+  0,                                    /* static_pass_number */
+  TV_NONE,                              /* tv_id */
+  0,                                    /* properties_required */
+  0,                                    /* properties_provided */
+  0,                                    /* properties_destroyed */
+  0,                                    /* todo_flags_start */
+  TODO_df_finish | TODO_verify_rtl_sharing |
+  0,                                    /* todo_flags_finish */
+ }
+};
+
 /* Return true if a red-zone is in use.  */
 
 static inline bool
@@ -3705,7 +3751,16 @@ ix86_option_override_internal (bool main_args_p)
 static void
 ix86_option_override (void)
 {
+  static struct register_pass_info insert_vzeroupper_info
+    = { &pass_insert_vzeroupper.pass, "reload",
+        1, PASS_POS_INSERT_AFTER
+      };
+
   ix86_option_override_internal (true);
+
+
+  /* This needs to be done at start up.  It's convenient to do it here.  */
+  register_pass (&insert_vzeroupper_info);
 }
 
 /* Update register usage after having seen the compiler flags.  */
@@ -14988,10 +15043,15 @@ output_387_binary_op (rtx insn, rtx *operands)
 /* Check if a 256bit AVX register is referenced inside of EXP.  */
 
 static int
-ix86_check_avx256_register (rtx *exp, void *data ATTRIBUTE_UNUSED)
+ix86_check_avx256_register (rtx *pexp, void *data ATTRIBUTE_UNUSED)
 {
-  if (REG_P (*exp)
-      && VALID_AVX256_REG_OR_OI_MODE (GET_MODE (*exp)))
+  rtx exp = *pexp;
+
+  if (GET_CODE (exp) == SUBREG)
+    exp = SUBREG_REG (exp);
+
+  if (REG_P (exp)
+      && VALID_AVX256_REG_OR_OI_MODE (GET_MODE (exp)))
     return 1;
 
   return 0;
@@ -23449,7 +23509,6 @@ ix86_init_machine_status (void)
   f = ggc_alloc_cleared_machine_function ();
   f->use_fast_prologue_epilogue_nregs = -1;
   f->call_abi = ix86_abi;
-  f->optimize_mode_switching[AVX_U128] = TARGET_VZEROUPPER;
 
   return f;
 }
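
For readers following the ix86_check_avx256_register hunk, here is a minimal standalone sketch of the before/after logic. It is not GCC code: the toy_* types and functions are hypothetical stand-ins for rtx, GET_CODE, SUBREG_REG and VALID_AVX256_REG_OR_OI_MODE, and only two vector modes are modelled. It assumes the check is applied directly to an expression that may be a SUBREG wrapping a 256bit register.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's RTL codes and machine modes.  */
enum toy_code { TOY_REG, TOY_SUBREG, TOY_OTHER };
enum toy_mode { TOY_V4SF /* 128bit */, TOY_V8SF /* 256bit */ };

struct toy_rtx
{
  enum toy_code code;
  enum toy_mode mode;
  const struct toy_rtx *inner;  /* Operand of a SUBREG, otherwise NULL.  */
};

/* Toy counterpart of VALID_AVX256_REG_OR_OI_MODE.  */
bool
toy_is_256bit_mode (enum toy_mode mode)
{
  return mode == TOY_V8SF;
}

/* Pre-patch shape: inspects only the expression it is handed.  */
bool
toy_check_old (const struct toy_rtx *exp)
{
  return exp->code == TOY_REG && toy_is_256bit_mode (exp->mode);
}

/* Post-patch shape: strip one SUBREG, then test the inner register.
   Assumes inner is non-NULL whenever code is TOY_SUBREG.  */
bool
toy_check_new (const struct toy_rtx *exp)
{
  if (exp->code == TOY_SUBREG)
    exp = exp->inner;

  return exp->code == TOY_REG && toy_is_256bit_mode (exp->mode);
}
```

In this toy model, a TOY_V4SF SUBREG wrapping a TOY_V8SF register is rejected by toy_check_old and accepted by toy_check_new, while a bare TOY_V8SF register is accepted by both.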