Message ID | 86e95146-d9de-f898-84b0-6ca3a8da76af@redhat.com |
---|---|
State | New |
Series | [committed,PR,tree-optimization/86010] More aggressively trim partially dead mem* and str* calls |

Hi Jeff,

On Fri, 6 Jul 2018 at 05:44, Jeff Law <law@redhat.com> wrote:
>
> As noted in BZ 86010 we can be more aggressive when trimming tails of
> mem* or str* calls in gimple DSE since trimming a tail doesn't affect
> alignment and residuals are usually handled pretty efficiently in libc.
>
> Additionally, if the total number of live bytes left is smaller than a
> word, then it's highly likely we'll open-code the mem* or str* routine.
> So we allow more aggressive trimming in that case too.
>
> What's left to be able to close out 86010 is to identify when a memory
> store could be merged with a subsequent memset. I'm skeptical of the
> importance of that optimization, though perhaps it comes up often enough
> with structure initializations to be worth doing.
>
> Bootstrapped and regression tested on x86_64-linux-gnu. Installing on
> the trunk.
>

This is causing a regression on arm and i686:
gcc.dg/tree-ssa/pr30375.c: pattern found 0 times
FAIL: gcc.dg/tree-ssa/pr30375.c scan-tree-dump-times dse1
"MEM\\[\\(struct _s \\*\\)&signInfo \\+ [0-9]+B\\] = {}" 1

> Jeff

On 07/06/2018 06:08 AM, Christophe Lyon wrote:
> Hi Jeff,
>
> On Fri, 6 Jul 2018 at 05:44, Jeff Law <law@redhat.com> wrote:
>>
>> As noted in BZ 86010 we can be more aggressive when trimming tails of
>> mem* or str* calls in gimple DSE since trimming a tail doesn't affect
>> alignment and residuals are usually handled pretty efficiently in libc.
>>
>> Additionally, if the total number of live bytes left is smaller than a
>> word, then it's highly likely we'll open-code the mem* or str* routine.
>> So we allow more aggressive trimming in that case too.
>>
>> What's left to be able to close out 86010 is to identify when a memory
>> store could be merged with a subsequent memset. I'm skeptical of the
>> importance of that optimization, though perhaps it comes up often enough
>> with structure initializations to be worth doing.
>>
>> Bootstrapped and regression tested on x86_64-linux-gnu. Installing on
>> the trunk.
>>
>
> This is causing a regression on arm and i686:
> gcc.dg/tree-ssa/pr30375.c: pattern found 0 times
> FAIL: gcc.dg/tree-ssa/pr30375.c scan-tree-dump-times dse1
> "MEM\\[\\(struct _s \\*\\)&signInfo \\+ [0-9]+B\\] = {}" 1
>
>> Jeff

Thanks. I'm on it.

jeff

diff --git a/gcc/tree-ssa-dse.c b/gcc/tree-ssa-dse.c
index 1af50a0..ebc4a1e 100644
--- a/gcc/tree-ssa-dse.c
+++ b/gcc/tree-ssa-dse.c
@@ -240,11 +240,14 @@ compute_trims (ao_ref *ref, sbitmap live, int *trim_head, int *trim_tail,
 
   /* Now identify how much, if any of the tail we can chop off.  */
   HOST_WIDE_INT const_size;
+  int last_live = bitmap_last_set_bit (live);
   if (ref->size.is_constant (&const_size))
     {
       int last_orig = (const_size / BITS_PER_UNIT) - 1;
-      int last_live = bitmap_last_set_bit (live);
-      *trim_tail = (last_orig - last_live) & ~0x1;
+      /* We can leave inconvenient amounts on the tail as
+         residual handling in mem* and str* functions is usually
+         reasonably efficient.  */
+      *trim_tail = last_orig - last_live;
     }
   else
     *trim_tail = 0;
@@ -252,7 +255,12 @@ compute_trims (ao_ref *ref, sbitmap live, int *trim_head, int *trim_tail,
   /* Identify how much, if any of the head we can chop off.  */
   int first_orig = 0;
   int first_live = bitmap_first_set_bit (live);
-  *trim_head = (first_live - first_orig) & ~0x1;
+  *trim_head = first_live - first_orig;
+
+  /* If more than a word remains, then make sure to keep the
+     starting point at least word aligned.  */
+  if (last_live - first_live > UNITS_PER_WORD)
+    *trim_head &= (UNITS_PER_WORD - 1);
 
   if ((*trim_head || *trim_tail)
       && dump_file && (dump_flags & TDF_DETAILS))
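
To make the change in trimming arithmetic concrete, here is a small standalone sketch of the head/tail computation before and after the patch. It is not GCC code: the live bytes are a plain `bool` array rather than an `sbitmap`, `WORD_BYTES` is a hypothetical stand-in for `UNITS_PER_WORD` (assumed to be 8), and the helper names are made up for illustration. One deliberate difference: the posted hunk masks the head with `(UNITS_PER_WORD - 1)`, which keeps only the low bits, whereas rounding a trim down to a word multiple needs the complemented mask. The sketch uses the complemented form, which is my reading of the comment in the patch and not something this thread confirms.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the target's UNITS_PER_WORD.  */
#define WORD_BYTES 8

/* Index of the first live byte, or -1 if no byte is live.  */
static int first_live_byte(const bool *live, int n)
{
    for (int i = 0; i < n; i++)
        if (live[i])
            return i;
    return -1;
}

/* Index of the last live byte, or -1 if no byte is live.  */
static int last_live_byte(const bool *live, int n)
{
    for (int i = n - 1; i >= 0; i--)
        if (live[i])
            return i;
    return -1;
}

/* Pre-patch behaviour: both trims are rounded down to an even
   number of bytes.  */
static void trims_before(const bool *live, int n, int *head, int *tail)
{
    int first = first_live_byte(live, n);
    int last = last_live_byte(live, n);
    assert(first >= 0);  /* A fully dead store is not handled here.  */
    *head = first & ~0x1;
    *tail = ((n - 1) - last) & ~0x1;
}

/* Post-patch behaviour as I read it: the tail is trimmed exactly;
   the head is trimmed exactly unless more than a word of live bytes
   remains, in which case it is rounded down to a word multiple.
   ASSUMPTION: the complemented mask; the posted diff omits the '~'.  */
static void trims_after(const bool *live, int n, int *head, int *tail)
{
    int first = first_live_byte(live, n);
    int last = last_live_byte(live, n);
    assert(first >= 0);
    *tail = (n - 1) - last;
    *head = first;
    if (last - first > WORD_BYTES)
        *head &= ~(WORD_BYTES - 1);
}
```

For a 16-byte store where only bytes 3 through 12 are live, the old rule trims 2 bytes from each end, while the new rule trims all 3 dead tail bytes but leaves the head alone so the start stays word aligned. When only bytes 5 through 7 are live, less than a word remains, so the new rule trims the head by the full 5 bytes.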