From patchwork Mon Jul 31 05:35:13 2017
X-Patchwork-Submitter: Jeff Law
X-Patchwork-Id: 795544
To: gcc-patches
From: Jeff Law
Subject: [PATCH][RFA/RFC] Stack clash mitigation patch 01/08 - V3
Message-ID: <4194c6f8-4efb-3eaa-f165-367b252605c0@redhat.com>
Date: Sun, 30 Jul 2017 23:35:13 -0600
This patch introduces the stack clash protection options.

Changes since V2:

Adds two new params.  The first controls the size of the guard area.
This controls the threshold for when a function prologue requires
probes.  The second controls the probing interval -- ie, once probes
are needed, how often do we emit them.  These are really meant more
for developers to experiment with than users.  Regardless I did go
ahead and document them.

It also adds some sanity checking WRT combining stack clash protection
with -fstack-check.

Jeff

	* common.opt (-fstack-clash-protection): New option.
	* flag-types.h (enum stack_check_type): Note difference between
	-fstack-check= and -fstack-clash-protection.
	* params.c (set_param_value): Verify stack-clash-protection params
	are powers of two.
	* params.def (PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE): New PARAM.
	(PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL): Likewise.
	* toplev.c (process_options): Issue warnings/errors for cases
	not handled with -fstack-clash-protection.
	* doc/invoke.texi (-fstack-clash-protection): Document new option.
	(-fstack-check): Note additional problem with -fstack-check=generic.
	Note that -fstack-check is primarily for Ada and refer users to
	-fstack-clash-protection for stack clash protection.  Document new
	params for stack clash protection.
	* toplev.c (process_options): Handle -fstack-clash-protection.

	testsuite/
	* gcc.dg/stack-check-2.c: New test.
	* lib/target-supports.exp
	(check_effective_target_supports_stack_clash_protection): New function.
	(check_effective_target_frame_pointer_for_non_leaf): Likewise.
	(check_effective_target_caller_implicit_probes): Likewise.

diff --git a/gcc/common.opt b/gcc/common.opt
index e81165c..cfaf2bc 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2306,6 +2306,11 @@ fstack-check
 Common Alias(fstack-check=, specific, no)
 Insert stack checking code into the program.  Same as -fstack-check=specific.
 
+fstack-clash-protection
+Common Report Var(flag_stack_clash_protection)
+Insert code to probe each page of stack space as it is allocated to protect
+from stack-clash style attacks.
+
 fstack-limit
 Common Var(common_deferred_options) Defer

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 3e5cee8..8095dc2 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -10128,6 +10128,20 @@ compilation without.  The value for compilation
 with profile feedback needs to be more conservative (higher) in order
 to make tracer effective.
 
+@item stack-clash-protection-guard-size
+The size (in bytes) of the stack guard.  Acceptable values are powers of 2
+between 4096 and 134217728; the default is 4096.  Higher values may reduce
+the number of explicit probes, but a value larger than the operating system
+provided guard will leave code vulnerable to stack clash style attacks.
+
+@item stack-clash-protection-probe-interval
+Stack clash protection involves probing stack space as it is allocated.  This
+param controls the maximum distance (in bytes) between probes into the stack.
+Acceptable values are powers of 2 between 4096 and 65536; the default is
+4096.  Higher values may reduce the number of explicit probes, but a value
+larger than the operating system provided guard will leave code vulnerable to
+stack clash style attacks.
+
 @item max-cse-path-length
 The maximum number of basic blocks on path that CSE considers.
@@ -11333,7 +11347,8 @@ target support in the compiler but comes with the following drawbacks:
 
 @enumerate
 @item
 Modified allocation strategy for large objects: they are always
-allocated dynamically if their size exceeds a fixed threshold.
+allocated dynamically if their size exceeds a fixed threshold.  Note this
+may change the semantics of some code.
 
 @item
 Fixed limit on the size of the static frame of functions: when it is

@@ -11348,6 +11363,25 @@ generic implementation, code performance is hampered.
 
 Note that old-style stack checking is also the fallback method for
 @samp{specific} if no target support has been added in the compiler.
 
+@samp{-fstack-check=} is designed for Ada's needs to detect infinite recursion
+and stack overflows.  @samp{specific} is an excellent choice when compiling
+Ada code.  However, it is not generally sufficient to protect against
+stack-clash attacks.  To protect against those you want
+@samp{-fstack-clash-protection}.
+
+@item -fstack-clash-protection
+@opindex fstack-clash-protection
+Generate code to prevent stack clash style attacks.  When this option is
+enabled, the compiler will only allocate one page of stack space at a time
+and each page is accessed immediately after allocation.  Thus, it prevents
+allocations from jumping over any stack guard page provided by the
+operating system.
+
+Most targets do not fully support stack clash protection.  However, even on
+those targets @option{-fstack-clash-protection} will protect dynamic stack
+allocations.  @option{-fstack-clash-protection} may also provide limited
+protection for static stack allocations if the target supports
+@option{-fstack-check=specific}.
+
 @item -fstack-limit-register=@var{reg}
 @itemx -fstack-limit-symbol=@var{sym}
 @itemx -fno-stack-limit

diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 5faade5..8874cba 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -166,7 +166,14 @@ enum permitted_flt_eval_methods
   PERMITTED_FLT_EVAL_METHODS_C11
 };
 
-/* Type of stack check. */
+/* Type of stack check.
+
+   Stack checking is designed to detect infinite recursion for Ada
+   programs.  Furthermore stack checking tries to ensure, in that
+   scenario, that enough stack space is left to run a signal handler.
+
+   -fstack-check= does not prevent stack-clash style attacks.  For that
+   you want -fstack-clash-protection.  */
 enum stack_check_type
 {
   /* Do not check the stack.  */
diff --git a/gcc/params.c b/gcc/params.c
index fab0ffa..8afe4c4 100644
--- a/gcc/params.c
+++ b/gcc/params.c
@@ -209,6 +209,11 @@ set_param_value (const char *name, int value,
     error ("maximum value of parameter %qs is %u",
 	   compiler_params[i].option,
 	   compiler_params[i].max_value);
+  else if ((strcmp (name, "stack-clash-protection-guard-size") == 0
+	    || strcmp (name, "stack-clash-protection-probe-interval") == 0)
+	   && exact_log2 (value) == -1)
+    error ("value of parameter %qs must be a power of 2",
+	   compiler_params[i].option);
   else
     set_param_value_internal ((compiler_param) i, value,
 			      params, params_set, true);

diff --git a/gcc/params.def b/gcc/params.def
index 6b07518..2270031 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -213,6 +213,16 @@ DEFPARAM(PARAM_STACK_FRAME_GROWTH,
	 "Maximal stack frame growth due to inlining (in percent).",
	 1000, 0, 0)
 
+DEFPARAM(PARAM_STACK_CLASH_PROTECTION_GUARD_SIZE,
+	 "stack-clash-protection-guard-size",
+	 "Minimum size (in bytes) of the stack guard, must be a power of 2 >= 4096.",
+	 4096, 4096, 134217728)
+
+DEFPARAM(PARAM_STACK_CLASH_PROTECTION_PROBE_INTERVAL,
+	 "stack-clash-protection-probe-interval",
+	 "Interval (in bytes) in which to probe the stack.",
+	 4096, 1024, 65536)
+
 /* The GCSE optimization will be disabled if it would require
    significantly more memory than this value.  */
 DEFPARAM(PARAM_MAX_GCSE_MEMORY,

diff --git a/gcc/testsuite/gcc.dg/stack-check-2.c b/gcc/testsuite/gcc.dg/stack-check-2.c
new file mode 100644
index 0000000..196c4bb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/stack-check-2.c
@@ -0,0 +1,66 @@
+/* The goal here is to ensure that we never consider a call to a noreturn
+   function as a potential tail call.
+
+   Right now GCC discovers potential tail calls by looking at the
+   predecessors of the exit block.  A call to a noreturn function
+   has no successors and thus can never match that first filter.
+
+   But that could change one day and we want to catch it.  The problem
+   is the compiler could potentially optimize a tail call to a noreturn
+   function, even if the caller has a frame.  That breaks the assumption
+   that calls probe *sp when saving the return address that some targets
+   depend on to elide stack probes.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstack-clash-protection -fdump-tree-tailc -fdump-tree-optimized" } */
+/* { dg-require-effective-target supports_stack_clash_protection } */
+
+extern void foo (void) __attribute__ ((__noreturn__));
+
+
+void
+test_direct_1 (void)
+{
+  foo ();
+}
+
+void
+test_direct_2 (void)
+{
+  return foo ();
+}
+
+void (*indirect)(void) __attribute__ ((noreturn));
+
+
+void
+test_indirect_1 ()
+{
+  (*indirect)();
+}
+
+void
+test_indirect_2 (void)
+{
+  return (*indirect)();
+}
+
+
+typedef void (*pvfn)() __attribute__ ((noreturn));
+
+void (*indirect_casted)(void);
+
+void
+test_indirect_casted_1 ()
+{
+  (*(pvfn)indirect_casted)();
+}
+
+void
+test_indirect_casted_2 (void)
+{
+  return (*(pvfn)indirect_casted)();
+}
+/* { dg-final { scan-tree-dump-not "tail call" "tailc" } } */
+/* { dg-final { scan-tree-dump-not "tail call" "optimized" } } */
+

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index fe5e777..25c3c80 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -8468,3 +8468,78 @@ proc check_effective_target_arm_coproc4_ok { } {
     return [check_cached_effective_target arm_coproc4_ok \
 	check_effective_target_arm_coproc4_ok_nocache]
 }
+
+# Return 1 if the target has support for stack probing designed
+# to avoid stack-clash style attacks.
+#
+# This is used to restrict the stack-clash mitigation tests to
+# just those targets that have been explicitly supported.
+#
+# In addition to the prologue work on those targets, each target's
+# properties should be described in the functions below so that
+# tests do not become a mess of unreadable target conditions.
+#
+proc check_effective_target_supports_stack_clash_protection { } {
+    if { [istarget aarch*-*-*] || [istarget x86_64-*-*]
+	 || [istarget i?86-*-*] || [istarget s390*-*-*]
+	 || [istarget powerpc*-*-*] || [istarget rs6000*-*-*] } {
+	return 1
+    }
+    return 0
+}
+
+# Return 1 if the target creates a frame pointer for non-leaf functions.
+# Note we ignore cases where we apply tail call optimization here.
+proc check_effective_target_frame_pointer_for_non_leaf { } {
+    if { [istarget aarch*-*-*] } {
+	return 1
+    }
+    return 0
+}
+
+# Return 1 if the target's calling sequence or its ABI
+# create implicit stack probes at or prior to function entry.
+proc check_effective_target_caller_implicit_probes { } {
+
+    # On x86/x86_64 the call instruction itself pushes the return
+    # address onto the stack.  That is an implicit probe of *sp.
+    if { [istarget x86_64-*-*] || [istarget i?86-*-*] } {
+	return 1
+    }
+
+    # On PPC, the ABI mandates that the address of the outer
+    # frame be stored at *sp.  Thus each allocation of stack
+    # space is itself an implicit probe of *sp.
+    if { [istarget powerpc*-*-*] || [istarget rs6000*-*-*] } {
+	return 1
+    }
+
+    # s390's ABI has a register save area allocated by the
+    # caller for use by the callee.  Its mere existence does
+    # not constitute a probe by the caller, but when the slots
+    # are used by the callee those stores are implicit probes.
+    if { [istarget s390*-*-*] } {
+	return 1
+    }
+
+    # Not strictly true on aarch64, but we have agreed that we will
+    # consider any function that pushes SP more than 3kbytes into
+    # the guard page as broken.  This essentially means that we can
+    # consider the aarch64 as having a caller implicit probe at
+    # *(sp + 1k).
+    if { [istarget aarch64*-*-*] } {
+	return 1
+    }
+
+    return 0
+}
+
+# Targets that potentially realign the stack pointer often cause residual
+# stack allocations and make it difficult to eliminate loops or residual
+# allocations for dynamic stack allocations.
+proc check_effective_target_callee_realigns_stack { } {
+    if { [istarget x86_64-*-*] || [istarget i?86-*-*] } {
+	return 1
+    }
+    return 0
+}

diff --git a/gcc/toplev.c b/gcc/toplev.c
index e6c69a4..24a00df 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1591,6 +1591,26 @@ process_options (void)
       flag_associative_math = 0;
     }
 
+  /* -fstack-clash-protection is not currently supported on targets
+     where the stack grows up.  */
+  if (flag_stack_clash_protection && !STACK_GROWS_DOWNWARD)
+    {
+      warning_at (UNKNOWN_LOCATION, 0,
+		  "-fstack-clash-protection is not supported on targets "
+		  "where the stack grows from lower to higher addresses");
+      flag_stack_clash_protection = 0;
+    }
+
+  /* We cannot support -fstack-check= and -fstack-clash-protection at
+     the same time.  */
+  if (flag_stack_check != NO_STACK_CHECK && flag_stack_clash_protection)
+    {
+      warning_at (UNKNOWN_LOCATION, 0,
+		  "-fstack-check= and -fstack-clash-protection are mutually "
+		  "exclusive.  Disabling -fstack-check=");
+      flag_stack_check = NO_STACK_CHECK;
+    }
+
   /* With -fcx-limited-range, we do cheap and quick complex arithmetic.  */
   if (flag_cx_limited_range)
     flag_complex_method = 0;
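
For anyone who wants to poke at the behavior while reviewing, here is a
small, self-contained sketch of the two cases the invoke.texi text above
talks about (a large static frame and a dynamic allocation).  It is not
part of the patch; the file name, the consume() helper and the exact
command line are only illustrative.

/* stack-clash-example.c -- illustrative only, not part of the patch.
   Assumed compile command, on a target listed in
   check_effective_target_supports_stack_clash_protection:

     gcc -O2 -fstack-clash-protection \
         --param stack-clash-protection-probe-interval=4096 \
         -S stack-clash-example.c  */

#include <string.h>

extern void consume (char *);	/* hypothetical sink to keep buffers live */

/* Static frame much larger than a 4k guard page.  With
   -fstack-clash-protection the prologue is expected to touch the stack
   at least every probe-interval bytes instead of moving the stack
   pointer past the guard in a single adjustment.  */
void
big_static_frame (void)
{
  char buf[256 * 1024];
  memset (buf, 0, sizeof buf);
  consume (buf);
}

/* Dynamic allocation (VLA).  Per the documentation above, even targets
   without full prologue support still probe dynamic allocations like
   this one as the space is allocated.  */
void
big_dynamic_frame (unsigned n)
{
  char p[n];
  memset (p, 0, n);
  consume (p);
}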