Message ID | 87bo28z2ki.fsf@sandifor-thinkpad.stglab.manchester.uk.ibm.com |
---|---|
State | New |
Headers | show |
On Tue, Oct 29, 2013 at 12:25:33PM +0000, Richard Sandiford wrote: > >> Any updates on this one? Note that this bug is a huge blocker for > >> using AddressSanitizer on ARM platforms. > > > > Sorry for the delay, I finally found time to look at it. > > While your patch fixes the issue, I wonder if for the case where > > shadow_mem before the asan_clear_shadow call is already of the form > > (mem (reg)) it isn't better to just adjust next asan_clear_shadow > > base from the end of the cleared region rather than from the start of it, > > because the end will be already in the right pseudo, while with your > > approach it needs to be done in a different register and potentially > > increase register pressure. So here is (completely untested) alternate fix: > > Hmm, it just seems wrong to be assigning to registers returned by force_reg > and expand_binop though. Would this variation be OK? Why? If it is a pseudo, it is certainly a pseudo that isn't used for anything else, as it is the result of (base >> 3) + constant, if it isn't a pseudo, then supposedly it is better not to just keep adding the offsets to a base and thus not to use the end address from asan_clear_shadow. Your patch would force using the end address even if shadow_base is initially say some_reg + 32, I think it is better to use shadow_mem some_reg + 72 next time instead of some_other_reg + 40. Jakub
Jakub Jelinek <jakub@redhat.com> writes: > On Tue, Oct 29, 2013 at 12:25:33PM +0000, Richard Sandiford wrote: >> >> Any updates on this one? Note that this bug is a huge blocker for >> >> using AddressSanitizer on ARM platforms. >> > >> > Sorry for the delay, I finally found time to look at it. >> > While your patch fixes the issue, I wonder if for the case where >> > shadow_mem before the asan_clear_shadow call is already of the form >> > (mem (reg)) it isn't better to just adjust next asan_clear_shadow >> > base from the end of the cleared region rather than from the start of it, >> > because the end will be already in the right pseudo, while with your >> > approach it needs to be done in a different register and potentially >> > increase register pressure. So here is (completely untested) alternate fix: >> >> Hmm, it just seems wrong to be assigning to registers returned by force_reg >> and expand_binop though. Would this variation be OK? > > Why? Well, that was the traditional rule: The caller must not alter the value in the register we return, since we mark it as a "constant" register. */ rtx force_reg (enum machine_mode mode, rtx x) { but maybe it doesn't matter now, since the constant register flag has gone and since we seem to use REG_EQUAL rather than REG_EQUIV... > If it is a pseudo, it is certainly a pseudo that isn't used for > anything else, as it is the result of (base >> 3) + constant, if it isn't a > pseudo, then supposedly it is better not to just keep adding the offsets to > a base and thus not to use the end address from asan_clear_shadow. > Your patch would force using the end address even if shadow_base > is initially say some_reg + 32, I think it is better to use shadow_mem > some_reg + 72 next time instead of some_other_reg + 40. But I thought the register pressure thing was to avoid having the original base live across the loop. Doesn't that apply for reg + offset as well as plain reg? If we have to force some_reg + 32 into a temporary and then loop on it, using the new base should avoid keeping both it and some_reg live at the same time. Plus smaller offsets are often better on some targets. Thanks, Richard
On Tue, Oct 29, 2013 at 01:06:21PM +0000, Richard Sandiford wrote: > > If it is a pseudo, it is certainly a pseudo that isn't used for > > anything else, as it is the result of (base >> 3) + constant, if it isn't a > > pseudo, then supposedly it is better not to just keep adding the offsets to > > a base and thus not to use the end address from asan_clear_shadow. > > Your patch would force using the end address even if shadow_base > > is initially say some_reg + 32, I think it is better to use shadow_mem > > some_reg + 72 next time instead of some_other_reg + 40. > > But I thought the register pressure thing was to avoid having the > original base live across the loop. Doesn't that apply for reg + offset > as well as plain reg? If we have to force some_reg + 32 into a > temporary and then loop on it, using the new base should avoid keeping > both it and some_reg live at the same time. Plus smaller offsets are > often better on some targets. On the other side, if we force the address to be just register based when it previously wasn't, it makes life harder for alias analysis etc., at least in the past say on ia64 I remember the fact that it didn't allow anything but registers as valid addresses often resulted in tons of missed optimizations), so in the end I guess I have nothing against the original patch (with whatever form of the copy to reg call is desirable). Jakub
> so in the end I guess I have nothing against the original patch > (with whatever form of the copy to reg call is desirable). So ok to commit? -Y
On Wed, Oct 30, 2013 at 11:32:14AM +0400, Yury Gribov wrote: > > so in the end I guess I have nothing against the original patch > > (with whatever form of the copy to reg call is desirable). > > So ok to commit? Ok with the change suggested by Richard, I think it was: addr = copy_to_mode_reg (Pmode, XEXP (shadow_mem, 0)); Jakub
>> So ok to commit? > Ok with the change suggested by Richard, I think it was: addr = copy_to_mode_reg (Pmode, XEXP (shadow_mem, 0)); Done, r204251. Tested against x64 and ARM. -Y
Index: gcc/asan.c =================================================================== --- gcc/asan.c (revision 204124) +++ gcc/asan.c (working copy) @@ -876,9 +876,10 @@ } /* Clear shadow memory at SHADOW_MEM, LEN bytes. Can't call a library call here - though. */ + though. If the address of the next byte is in a known register, return + that register, otherwise return null. */ -static void +static rtx asan_clear_shadow (rtx shadow_mem, HOST_WIDE_INT len) { rtx insn, insns, top_label, end, addr, tmp, jump; @@ -893,12 +894,12 @@ if (insn == NULL_RTX) { emit_insn (insns); - return; + return 0; } gcc_assert ((len & 3) == 0); top_label = gen_label_rtx (); - addr = force_reg (Pmode, XEXP (shadow_mem, 0)); + addr = copy_to_mode_reg (Pmode, XEXP (shadow_mem, 0)); shadow_mem = adjust_automodify_address (shadow_mem, SImode, addr, 0); end = force_reg (Pmode, plus_constant (Pmode, addr, len)); emit_label (top_label); @@ -912,6 +913,7 @@ jump = get_last_insn (); gcc_assert (JUMP_P (jump)); add_int_reg_note (jump, REG_BR_PROB, REG_BR_PROB_BASE * 80 / 100); + return addr; } /* Insert code to protect stack vars. The prologue sequence should be emitted @@ -1048,7 +1050,15 @@ (last_offset - prev_offset) >> ASAN_SHADOW_SHIFT); prev_offset = last_offset; - asan_clear_shadow (shadow_mem, last_size >> ASAN_SHADOW_SHIFT); + rtx end_addr = asan_clear_shadow (shadow_mem, + last_size >> ASAN_SHADOW_SHIFT); + if (end_addr) + { + shadow_mem + = adjust_automodify_address (shadow_mem, VOIDmode, end_addr, + last_size >> ASAN_SHADOW_SHIFT); + prev_offset += last_size; + } last_offset = offset; last_size = 0; }