From: Thomas Klein
Date: Fri, 24 Jun 2011 15:51:40 +0000
To: gcc-patches@gcc.gnu.org
Subject: Ping: C-family stack check for threads
Message-ID: <4E04B28C.2070802@web.de>
X-Patchwork-Id: 101799

Hi,

This is a ping of http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01226.html,
repeating my request.

I would like to have a stack check for threads that have only a small
amount of stack space each. (I'm using an ARM Cortex-M3 microcontroller
with a stack size of 1 KByte per thread.) Each thread has its own limit
address. The thread scheduler can calculate the limit and store this value
in a global variable. The compiler can then generate code that checks the
stack for overflow at function entry.

In principle this can be done this way:
- push registers as usual
- figure out whether one or two work registers can be used directly
  without an extra push
- if not enough registers are found, push the required work registers
  onto the stack
- load the limit address into the first work register
- load the value at the limit address (into the same register)
- if the stack pointer is going to be extended (e.g. for local variables),
  load this size value too (the second work register can be used here)
- compare for overflow
- if an overflow occurs, "call" the stack_failure function
- pop the work registers that were pushed before
- continue the function prologue as usual, e.g. extend the stack pointer

The ARM target has an option "-mapcs-stack-check", but this is more or
less not working (the implementation seems to be missing).
There are also architecture-independent options like
"-fstack-check=generic", "-fstack-limit-symbol=current_stack_limit" or
"-fstack-limit-register=r6" that can be used.

The generic stack check does a probe at the end of the function prologue
phase (e.g. by writing 12K ahead of the current stack pointer position).
If this stack space is not available, the probe may generate a fault.
This requires that the CPU has an MPU or an MMU.
For machines with a small memory space an additional mechanism should be
available.

The option "-fstack-check" can be extended by the switches "direct" and
"indirect" to emit compare code in the function prologue.
If the switch "direct" is given, the address of "-fstack-limit-symbol"
represents the limit itself.
If the switch "indirect" is given, "-fstack-limit-symbol" is a kind of
global variable that needs to be read before the comparison.

I have added a proposal to show how this behavior can be integrated on the
ARM architecture.

The generated code looks like this, e.g. when using
"-fstack-check=indirect -fstack-limit-symbol=stack_limit_var":

->	push	{r0}
->	ldr	r0, .LSPCHK0
->	ldr	r0, [r0]
->	cmp	sp, r0
->	bhs	.LSPCHK1
->	push	{lr}
->	bl	__thumb_stack_failure
->	.align	2
-> .LSPCHK0:
->	.word	stack_limit_var
-> .LSPCHK1:
->	pop	{r0}

Regards
  Thomas Klein

gcc/ChangeLog

2011-06-24  Thomas Klein

	* opts.c (common_handle_option): introduce additional stack checking
	parameters "direct" and "indirect"
	* flag-types.h (enum stack_check_type): Likewise
	* explow.c (allocate_dynamic_stack_space):
	- suppress stack probing if parameter "direct", "indirect" or if a
	stack-limit is given
	- do additional read of limit value if parameter "indirect" and a
	stack-limit symbol is given
	- emit a call to a stack_failure function [as an alternative to a trap
	call]
	(function probe_stack_range): if allowed to override the range probe
	emit generic_limit_check_stack
	* config/arm/arm.c (stack_check_output_function): new function to write
	the stack check code sequence to the assembler file (inside prologue)
	(stack_check_work_registers): new function to find possible working
	registers [only used by "stack check"]
	(arm_expand_prologue): stack check integration for ARM and Thumb-2
	(thumb1_output_function_prologue): stack check integration for Thumb-1
	* config/arm/arm.md (probe_stack): do not emit code when parameters
	"direct" or "indirect" given, emit move code as in gcc/explow.c
	[function emit_stack_probe]
	(probe_stack_done): dummy to make sure probe_stack insns are not
	optimized away
	(generic_limit_check_stack): if stack-limit and parameter "generic" is
	given use the limit the same way as in function
	allocate_dynamic_stack_space
	(stack_check): ARM/Thumb-2 insn to output function
	stack_check_output_function
	(stack_failure): failure call used in function
	allocate_dynamic_stack_space [similar to a trap but avoid conflict
	with builtin_trap]

 ;; being inserted into the upper 16 bits of the register.
 (define_insn "*arm_movtas_ze"

Index: gcc/opts.c
===================================================================
--- gcc/opts.c (revision 175346)
+++ gcc/opts.c (working copy)
@@ -1629,6 +1629,12 @@ common_handle_option (struct gcc_options *opts,
                         : STACK_CHECK_STATIC_BUILTIN
                           ? STATIC_BUILTIN_STACK_CHECK
                           : GENERIC_STACK_CHECK;
+      else if (!strcmp (arg, "indirect"))
+        /* This is an other stack checking method. */
+        opts->x_flag_stack_check = INDIRECT_STACK_CHECK;
+      else if (!strcmp (arg, "direct"))
+        /* This is an other stack checking method. */
+        opts->x_flag_stack_check = DIRECT_STACK_CHECK;
       else
         warning_at (loc, 0, "unknown stack check parameter \"%s\"", arg);
       break;
Index: gcc/function.c
===================================================================
--- gcc/function.c (revision 175346)
+++ gcc/function.c (working copy)
@@ -4810,7 +4810,9 @@ expand_function_start (tree subr)
     }
 
   /* If we are doing generic stack checking, the probe should go here. */
-  if (flag_stack_check == GENERIC_STACK_CHECK)
+  if( flag_stack_check /*== GENERIC_STACK_CHECK
+      || flag_stack_check == STATIC_BUILTIN_STACK_CHECK
+      || flag_stack_check == FULL_BUILTIN_STACK_CHECK */)
     stack_check_probe_note = emit_note (NOTE_INSN_DELETED);
 
   /* Make sure there is a line number after the function entry setup code. */
Index: gcc/flag-types.h
===================================================================
--- gcc/flag-types.h (revision 175346)
+++ gcc/flag-types.h (working copy)
@@ -153,7 +153,15 @@ enum stack_check_type
 
   /* Check the stack and entirely rely on the target configuration
      files, i.e. do not use the generic mechanism at all. */
-  FULL_BUILTIN_STACK_CHECK
+  FULL_BUILTIN_STACK_CHECK,
+
+  /* Check the stack (if possible) before allocation of local variables at
+     each function entry. The stack limit is directly given e.g. by address
+     of a symbol */
+  DIRECT_STACK_CHECK,
+  /* Check the stack (if possible) before allocation of local variables at
+     each function entry. The stack limit is given by global variable. */
+  INDIRECT_STACK_CHECK
 };
 
 /* Names for the different levels of -Wstrict-overflow=N. The numeric
Index: gcc/explow.c
===================================================================
--- gcc/explow.c (revision 175346)
+++ gcc/explow.c (working copy)
@@ -1356,7 +1356,12 @@ allocate_dynamic_stack_space (rtx size, unsigned s
 
   /* If needed, check that we have the required amount of stack. Take into
      account what has already been checked. */
-  if (STACK_CHECK_MOVING_SP)
+  if ( STACK_CHECK_MOVING_SP
+#ifdef HAVE_generic_limit_check_stack
+      || crtl->limit_stack
+#endif
+      || flag_stack_check == DIRECT_STACK_CHECK
+      || flag_stack_check == INDIRECT_STACK_CHECK)
     ;
   else if (flag_stack_check == GENERIC_STACK_CHECK)
     probe_stack_range (STACK_OLD_CHECK_PROTECT + STACK_CHECK_MAX_FRAME_SIZE,
@@ -1390,19 +1395,32 @@ allocate_dynamic_stack_space (rtx size, unsigned s
   /* Check stack bounds if necessary. */
   if (crtl->limit_stack)
     {
+      rtx limit_rtx;
       rtx available;
       rtx space_available = gen_label_rtx ();
+      if ( GET_CODE (stack_limit_rtx) == SYMBOL_REF
+          && flag_stack_check == INDIRECT_STACK_CHECK)
+        limit_rtx = expand_unop (Pmode, mov_optab,
+                                 gen_rtx_MEM (Pmode, stack_limit_rtx),
+                                 NULL_RTX, 1);
+      else
+        limit_rtx = stack_limit_rtx;
 #ifdef STACK_GROWS_DOWNWARD
       available = expand_binop (Pmode, sub_optab,
-                                stack_pointer_rtx, stack_limit_rtx,
+                                stack_pointer_rtx, limit_rtx,
                                 NULL_RTX, 1, OPTAB_WIDEN);
 #else
       available = expand_binop (Pmode, sub_optab,
-                                stack_limit_rtx, stack_pointer_rtx,
+                                limit_rtx, stack_pointer_rtx,
                                 NULL_RTX, 1, OPTAB_WIDEN);
 #endif
       emit_cmp_and_jump_insns (available, size, GEU, NULL_RTX, Pmode, 1,
                                space_available);
+#ifdef HAVE_stack_failure
+      if (HAVE_stack_failure)
+        emit_insn (gen_stack_failure ());
+      else
+#endif
 #ifdef HAVE_trap
       if (HAVE_trap)
         emit_insn (gen_trap ());
@@ -1545,6 +1563,13 @@ probe_stack_range (HOST_WIDE_INT first, rtx size)
       return;
     }
 #endif
+#ifdef HAVE_generic_limit_check_stack
+  else if (HAVE_generic_limit_check_stack)
+    {
+      rtx addr = memory_address (Pmode,stack_pointer_rtx);
+      emit_insn (gen_generic_limit_check_stack (addr));
+    }
+#endif
 
   /* Otherwise we have to generate explicit probes. If we have a constant
      small number of them to generate, that's the easy case. */
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c (revision 175346)
+++ gcc/config/arm/arm.c (working copy)
@@ -14628,6 +14628,283 @@ arm_output_function_prologue (FILE *f, HOST_WIDE_I
 
 }
 
+/*
+ * Write prolouge part of stack check into asm file.
+ * For Thumb this may look like this:
+ *   push {rsym,ramn}
+ *   ldr rsym, .LSPCHK0
+ *   ldr rsym, [rsym]
+ *   ldr ramn, .LSPCHK0 + 4
+ *   add rsym, rsym, ramn
+ *   cmp sp, rsym
+ *   bhs .LSPCHK1
+ *   push {lr}
+ *   bl __thumb_stack_failure
+ * .align 2
+ * .LSPCHK0:
+ *   .word symbol_addr_of(stack_limit_rtx)
+ *   .word lenght_of(amount)
+ * .LSPCHK1:
+ *   pop {rsym,ramn}
+ */
+void
+stack_check_output_function (FILE *f, int reg0, int reg1, unsigned amount,
+                             unsigned numregs)
+{
+  unsigned amount_needsreg;
+  bool amount_const_ok, is_non_opt_thumb2, is_thumb2_hi_reg[2];
+  bool issym=false;
+  static unsigned spchk_labelno = 0;
+  char ok_lable_str[256];
+  char pool_lable_str[256];
+
+  if (TARGET_THUMB1)
+    amount_const_ok = (amount < 256);
+  else
+    amount_const_ok = const_ok_for_arm (amount);
+
+  if (GET_CODE (stack_limit_rtx) == SYMBOL_REF) /*stack_limit_rtx*/
+    {
+      issym = true;
+      amount_needsreg = !amount_const_ok;
+    }
+  else
+    amount_needsreg = (amount > 0);
+
+  is_non_opt_thumb2 = (TARGET_THUMB2 && !(optimize_size || optimize >= 2));
+  is_thumb2_hi_reg[0] = (TARGET_THUMB2 && reg0>7);
+  is_thumb2_hi_reg[1] = (TARGET_THUMB2 && reg1>7);
+
+  /*build labels for later use*/
+  if ( (issym && !(is_non_opt_thumb2 || is_thumb2_hi_reg[0]))
+      ||(amount && !amount_const_ok
+         && !((issym && is_thumb2_hi_reg[1])
+              || (!issym && is_thumb2_hi_reg[0])
+              || is_non_opt_thumb2)))
+    ASM_GENERATE_INTERNAL_LABEL (pool_lable_str, "LSPCHK", spchk_labelno++);
+  ASM_GENERATE_INTERNAL_LABEL (ok_lable_str, "LSPCHK", spchk_labelno++);
+
+  if (issym && amount) /*need temp regs for limit and amount*/
+    {
+      if (numregs >= 2)
+        ; /*have 2 regs => no need to push*/
+      else if (numregs == 1)
+        {
+          if (amount_needsreg)
+            {
+              /*have one reg but need two regs => push temp reg for amount*/
+              if (TARGET_ARM)
+                asm_fprintf (f, "\tstr\t%r, [%r, #-4]!\n", reg1, SP_REGNUM);
+              else
+                asm_fprintf (f, "\tpush\t{%r}\n", reg1);
+              /*due to additional push try to correct amount*/
+              if (amount >= 4)
+                {
+                  if (amount_const_ok)
+                    {
+                      if (TARGET_THUMB1 || const_ok_for_arm(amount - 4))
+                        amount -= 4;
+                      /*on Thumb2 or ARM may not corrected; shouldn't hurt*/
+                    }
+                  else /*will be loaded from pool*/
+                    amount -= 4;
+                }
+            }
+        }
+      else if (amount_needsreg)
+        {
+          /*have no reg but need two => push temp regs for limit and amount*/
+          if (TARGET_ARM)
+            asm_fprintf (f, "\tstmfd\t%r!, {%r,%r}\n", SP_REGNUM, reg0, reg1);
+          else
+            asm_fprintf (f, "\tpush\t{%r,%r}\n", reg0, reg1);
+          /*due to additional push try to correct amount*/
+          if (amount >= 8)
+            {
+              if (amount_const_ok)
+                {
+                  if (TARGET_THUMB1 || const_ok_for_arm(amount - 8))
+                    amount -= 8;
+                  /*on Thumb2 or ARM may not corrected; shouldn't hurt*/
+                }
+              else /*will be loaded from pool*/
+                amount -= 8;
+            }
+        }
+      else
+        {
+          /*have no reg but need one reg => push temp reg for limit*/
+          if (TARGET_ARM)
+            asm_fprintf (f, "\tstr\t%r, [%r, #-4]!\n", reg0, SP_REGNUM);
+          else
+            asm_fprintf (f, "\tpush\t{%r}\n", reg0);
+          /*due to additional push try to correct amount*/
+          if (amount >= 4)
+            {
+              if (amount_const_ok)
+                {
+                  if (TARGET_THUMB1 || const_ok_for_arm(amount - 4))
+                    amount -= 4;
+                  /*on Thumb2 or ARM may not corrected; shouldn't hurt*/
+                }
+              else /*will be loaded from pool*/
+                amount -= 4;
+            }
+        }
+    }
+  else if ((issym || amount_needsreg) && numregs == 0)
+    { /*push temp reg either for limit or amount*/
+      if (TARGET_ARM)
+        asm_fprintf (f, "\tstr\t%r, [%r, #-4]!\n", reg0, SP_REGNUM);
+      else
+        asm_fprintf (f, "\tpush\t{%r}\n", reg0);
+    }
+
+  if (issym)
+    {
+      if (is_non_opt_thumb2 || is_thumb2_hi_reg[0])
+        {
+          const char *str ;
+          str = (const char *) XSTR (stack_limit_rtx, 0);
+          asm_fprintf (f, "\tmovw\t%r, #:lower16:%s\n", reg0, str);
+          asm_fprintf (f, "\tmovt\t%r, #:upper16:%s\n", reg0, str);
+        }
+      else
+        {
+          asm_fprintf (f, "\tldr\t%r, ", reg0);
+          assemble_name (f, pool_lable_str); /* =stack_limit_rtx */
+          fputs ("\n", f);
+        }
+
+      if (flag_stack_check == INDIRECT_STACK_CHECK)
+        asm_fprintf (f, "\tldr\t%r, [%r]\n", reg0, reg0);
+      if (amount)
+        {
+          if (amount_const_ok)
+            {
+              if (TARGET_32BIT)
+                asm_fprintf (f, "\tadds\t%r, %r, #%d\n", reg0, reg0, amount);
+              else
+                asm_fprintf (f, "\tadd\t%r, %r, #%d\n", reg0, reg0, amount);
+            }
+          else
+            {
+              if (is_non_opt_thumb2 || is_thumb2_hi_reg[1])
+                {
+                  asm_fprintf (f, "\tmovw\t%r, #0x%X\n", reg1, amount&0xFFFF);
+                  asm_fprintf (f, "\tmovt\t%r, #0x%X\n", reg1,
+                               (amount>>16)&0xFFFF);
+                }
+              else
+                {
+                  asm_fprintf (f, "\tldr\t%r, ", reg1);
+                  assemble_name (f, pool_lable_str); /* =amount */
+                  if (is_thumb2_hi_reg[0])
+                    fputs ("\n", f);
+                  else
+                    fputs (" + 4\n", f);
+                }
+              asm_fprintf (f, "\tadd\t%r, %r, %r\n", reg0, reg0, reg1);
+            }
+        }
+      asm_fprintf (f, "\tcmp\t%r, %r\n", SP_REGNUM, reg0);
+    }
+  else if (amount)
+    {
+      if (amount_const_ok)
+        asm_fprintf (f, "\tmov\t%r, #%d\n", reg0, amount);
+      else
+        {
+          if (is_non_opt_thumb2 || is_thumb2_hi_reg[0])
+            {
+              asm_fprintf (f, "\tmovw\t%r, #0x%X\n", reg0, amount&0xFFFF);
+              asm_fprintf (f, "\tmovt\t%r, #0x%X\n", reg0,(amount>>16)&0xFFFF);
+            }
+          else
+            {
+              asm_fprintf (f, "\tldr\t%r, ", reg0);
+              assemble_name (f, pool_lable_str); /* amount */
+              fputs ("\n", f);
+            }
+        }
+      asm_fprintf (f, "\tadd\t%r, %r, %r\n", reg0,reg0,REGNO(stack_limit_rtx));
+      asm_fprintf (f, "\tcmp\t%r, %r\n", SP_REGNUM, reg0);
+    }
+  else
+    asm_fprintf (f, "\tcmp\t%r, %r\n", SP_REGNUM, REGNO(stack_limit_rtx));
+  asm_fprintf (f, "\tbhs\t");
+  assemble_name (f, ok_lable_str);
+  fputs ("\n", f);
+
+  if (TARGET_ARM)
+    {
+      asm_fprintf (f, "\tstr\t%r, [%r, #-4]!\n", LR_REGNUM, SP_REGNUM);
+      asm_fprintf (f, "\tbl\t__arm_stack_failure\t%@ stack check\n");
+    }
+  else
+    {
+      asm_fprintf (f, "\tpush\t{%r}\n", LR_REGNUM);
+      asm_fprintf (f, "\tbl\t__thumb_stack_failure\t%@ stack check\n");
+    }
+
+  /*pool*/
+  if ( (issym && !(is_non_opt_thumb2 || is_thumb2_hi_reg[0]))
+      ||(amount && !amount_const_ok
+         && !( (issym && is_thumb2_hi_reg[1])
+              || (!issym && is_thumb2_hi_reg[0])
+              || is_non_opt_thumb2)))
+    {
+      /*temp regs: collect values from here*/
+      if (!TARGET_ARM)
+        ASM_OUTPUT_ALIGN (f, 2);
+      ASM_OUTPUT_LABEL(f,pool_lable_str);
+      if (issym && !(is_non_opt_thumb2 || is_thumb2_hi_reg[0]))
+        assemble_aligned_integer (UNITS_PER_WORD, stack_limit_rtx);
+      if (amount && !amount_const_ok
+          && !( (issym && is_thumb2_hi_reg[1])
+               || (!issym && is_thumb2_hi_reg[0])
+               || is_non_opt_thumb2))
+        assemble_aligned_integer (UNITS_PER_WORD, GEN_INT (amount));
+    }
+  ASM_OUTPUT_LABEL(f,ok_lable_str);
+  if (issym && amount) /*pop temp regs used by limit and amount*/
+    {
+      if (numregs >= 2)
+        ; /*no need to pop*/
+      else if (numregs == 1)
+        {
+          if (amount_needsreg)
+            {
+              if (TARGET_ARM)
+                asm_fprintf (f, "\tldr\t%r, [%r, #4]!\n", reg1, SP_REGNUM);
+              else
+                asm_fprintf (f, "\tpop\t{%r}\n", reg1);
+            }
+        }
+      else if (amount_needsreg)
+        {
+          if (TARGET_ARM)
+            asm_fprintf (f, "\tldmfd\t%r!, {%r,%r}\n", SP_REGNUM, reg0, reg1);
+          else
+            asm_fprintf (f, "\tpop\t{%r,%r}\n", reg0, reg1);
+        }
+      else
+        {
+          if (TARGET_ARM)
+            asm_fprintf (f, "\tldr\t%r, [%r, #4]!\n", reg0, SP_REGNUM);
+          else
+            asm_fprintf (f, "\tpop\t{%r}\n", reg0);
+        }
+    }
+  else if ((issym || amount_needsreg) && numregs == 0)
+    { /*pop temp reg used by limit or amount*/
+      if (TARGET_ARM)
+        asm_fprintf (f, "\tldr\t%r, [%r, #4]!\n", reg0, SP_REGNUM);
+      else
+        asm_fprintf (f, "\tpop\t{%r}\n", reg0);
+    }
+}
+
 const char *
 arm_output_epilogue (rtx sibling)
 {
@@ -15800,6 +16077,72 @@ thumb_set_frame_pointer (arm_stack_offsets *offset
   RTX_FRAME_RELATED_P (insn) = 1;
 }
 
+/*search for possible work registers for stack-check operation at prologue
+ return the number of register that can be used without extra push/pop */
+
+static int
+stack_check_work_registers (rtx *workreg)
+{
+  int reg, i, k, n, nregs;
+
+  if (crtl->args.info.pcs_variant <= ARM_PCS_AAPCS_LOCAL)
+    {
+      nregs = crtl->args.info.aapcs_next_ncrn;
+    }
+  else
+    nregs = crtl->args.info.nregs;
+
+
+  n = 0;
+  i = 0;
+  /* check if we can use one of the argument registers r0..r3 as long as they
+   * not holding data*/
+  for (reg = 0; reg <= LAST_ARG_REGNUM && i < 2; reg++)
+    {
+      if ( !df_regs_ever_live_p (reg)
+          || (cfun->machine->uses_anonymous_args && crtl->args.pretend_args_size
+                > (LAST_ARG_REGNUM - reg) * UNITS_PER_WORD)
+          || (!cfun->machine->uses_anonymous_args && nregs < reg + 1)
+         )
+        {
+          workreg[i++] = gen_rtx_REG (SImode, reg);
+          n = (reg + 1) % 4;
+        }
+    }
+
+  /* otherwise try to use r4..r7*/
+  for (reg = LAST_ARG_REGNUM + 1; reg <= LAST_LO_REGNUM && i < 2; reg++)
+    {
+      if ( df_regs_ever_live_p (reg)
+          && !fixed_regs[reg]
+          && reg != FP_REGNUM )
+        {
+          workreg[i++] = gen_rtx_REG (SImode, reg);
+        }
+    }
+
+  if (TARGET_32BIT)
+    {
+      /* ARM and Thumb-2 can use high regs. */
+      for (reg = FIRST_HI_REGNUM; reg <= LAST_HI_REGNUM && i < 2; reg ++)
+        if ( df_regs_ever_live_p (reg)
+            && !fixed_regs[reg]
+            && reg != FP_REGNUM )
+          {
+            workreg[i++] = gen_rtx_REG (SImode, reg);
+          }
+    }
+
+  k = i;
+  /* if not enough found to be uses without extra push,
+   * collect next from r0..r4*/
+  for ( ; i<2; i++)
+    workreg[i] = gen_rtx_REG (SImode, n++);
+
+  return k;
+}
+
+
 /* Generate the prologue instructions for entry into an ARM or Thumb-2
    function. */
 void
@@ -16049,6 +16392,24 @@ arm_expand_prologue (void)
     current_function_static_stack_size
       = offsets->outgoing_args - offsets->saved_args;
 
+  if ( crtl->limit_stack
+      && !(IS_INTERRUPT (func_type))
+      && ( flag_stack_check == DIRECT_STACK_CHECK
+          || flag_stack_check == INDIRECT_STACK_CHECK)
+      && (offsets->outgoing_args - offsets->saved_args) > 0
+     )
+    {
+      rtx reg[2], num_temp_regs;
+
+      amount = GEN_INT (offsets->outgoing_args - saved_regs
+                        - offsets->saved_args);
+      num_temp_regs = GEN_INT (stack_check_work_registers(reg));
+      insn = gen_stack_check (stack_pointer_rtx,
+                              reg[0], reg[1], stack_limit_rtx,
+                              amount, num_temp_regs);
+      insn = emit_insn (insn);
+    }
+
   if (offsets->outgoing_args != offsets->saved_args + saved_regs)
     {
       /* This add can produce multiple insns for a large constant, so we
@@ -21403,6 +21764,26 @@ thumb1_output_function_prologue (FILE *f, HOST_WID
           thumb_pushpop (f, pushable_regs, 1, &cfa_offset, real_regs_mask);
         }
     }
+
+  if( crtl->limit_stack
+     && ( flag_stack_check == DIRECT_STACK_CHECK
+         || flag_stack_check == INDIRECT_STACK_CHECK)
+     && (offsets->outgoing_args - offsets->saved_args)
+    )
+    {
+      unsigned amount, numregs;
+      int reg0, reg1;
+      rtx reg[2];
+
+      amount = offsets->outgoing_args - offsets->saved_regs;
+      amount -= 4 * thumb1_extra_regs_pushed (offsets, true);
+
+      numregs = stack_check_work_registers(reg);
+      reg0 = REGNO (reg[0]);
+      reg1 = REGNO (reg[1]);
+
+      stack_check_output_function (f, reg0, reg1, amount, numregs);
+    }
 }
 
 /* Handle the case of a double word load into a low register from
Index: gcc/config/arm/arm.md
===================================================================
--- gcc/config/arm/arm.md (revision 175346)
+++ gcc/config/arm/arm.md (working copy)
@@ -105,6 +105,7 @@
   UNSPEC_SYMBOL_OFFSET  ; The offset of the start of the symbol from
                         ; another symbolic address.
   UNSPEC_MEMORY_BARRIER ; Represent a memory barrier.
+  UNSPEC_PROBE_STACK    ; probe stack memory reference
 ])
 
 ;; UNSPEC_VOLATILE Usage:
@@ -10741,6 +10742,113 @@
 ;;
 
+(define_expand "probe_stack"
+  [(match_operand 0 "memory_operand" "")]
+  "TARGET_EITHER"
+{
+  if ( flag_stack_check == DIRECT_STACK_CHECK
+      || flag_stack_check == INDIRECT_STACK_CHECK)
+    ;
+  else
+    {
+      emit_move_insn (operands[0], const0_rtx);
+      emit_insn (gen_probe_stack_done ());
+      emit_insn (gen_blockage ());
+    }
+  DONE;
+}
+)
+
+(define_insn "probe_stack_done"
+  [(unspec_volatile [(const_int 0)] UNSPEC_PROBE_STACK)]
+  "TARGET_EITHER"
+  {return \"@ probe stack done\";}
+  [(set_attr "type" "store1")
+   (set_attr "length" "0")]
+)
+
+(define_expand "generic_limit_check_stack"
+  [(match_operand 0 "memory_operand" "")]
+  "crtl->limit_stack
+   && flag_stack_check != DIRECT_STACK_CHECK
+   && flag_stack_check != INDIRECT_STACK_CHECK"
+{
+  rtx label = gen_label_rtx ();
+  rtx addr = copy_rtx (operands[0]);
+  addr = gen_rtx_fmt_ee (MINUS, Pmode, addr, GEN_INT (0));
+  addr = force_operand (addr, NULL_RTX);
+  emit_insn (gen_blockage ());
+  emit_cmp_and_jump_insns (stack_limit_rtx, addr, LEU, NULL_RTX, Pmode, 1,
+                           label);
+  emit_insn (gen_stack_failure ());
+  emit_label (label);
+  emit_insn (gen_blockage ());
+  DONE;
+}
+)
+
+(define_insn "stack_check"
+  [(set
+     (match_operand:SI 0 "register_operand" "=k")
+     (match_operand:SI 3 "general_operand" "sr")
+   )
+   (match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")
+   (match_operand:SI 4 "general_operand" "i")
+   (match_operand:SI 5 "general_operand" "i")
+   (clobber (reg:CC CC_REGNUM))
+  ]
+  "TARGET_32BIT
+   && (operands[3] == stack_limit_rtx)
+   && (GET_CODE (operands[4]) == CONST_INT)
+   && (GET_CODE (operands[5]) == CONST_INT)"
+  "*
+  {
+    int reg0, reg1;
+    unsigned amount, numregs;
+    extern void stack_check_output_function (FILE *, int, int, unsigned,
+                                             unsigned);
+
+    reg0 = REGNO (operands[1]);
+    reg1 = REGNO (operands[2]);
+    amount = INTVAL (operands[4]);
+    numregs = INTVAL (operands[5]);
+
+    stack_check_output_function (asm_out_file, reg0, reg1, amount, numregs);
+  }
+  return \"\";
+  "
+  [(set_attr "conds" "clob")
+   (set (attr "length")
+        (if_then_else (eq_attr "is_thumb" "yes")
+                      (const_int 44)
+                      (const_int 52)))]
+)
+
+(define_insn "stack_failure"
+  [(trap_if (const_int 1) (const_int 0))]
+  "TARGET_EITHER"
+  "*
+  {
+    rtx ops[2];
+
+    ops[0] = stack_pointer_rtx;
+    ops[1] = gen_rtx_REG (SImode, LR_REGNUM);
+    if (TARGET_ARM)
+      {
+        output_asm_insn (\"str\\t%1, [%0, #-4]!\", ops);
+        output_asm_insn (\"bl\\t__arm_stack_failure\\t%@ trap call\", ops);
+      }
+    else
+      {
+        output_asm_insn (\"push\\t{%1}\", ops);
+        output_asm_insn (\"bl\\t__thumb_stack_failure\\t%@ trap call\", ops);
+      }
+  }
+  return \"\";
+  "
+)
+
 ;; We only care about the lower 16 bits of the constant