Message ID | 20130221140555.GE2928@two.firstfloor.org |
---|---|
State | New |
Headers | show |
On 21/02/13 14:05, Andi Kleen wrote: > An auto generated program with a 6.4mio line asm statement gave > with 4.7 and 4.8: > > xxx.c:6400017:1: internal compiler error: in account_size_time, at > ipa-inline-analysis.c:601 > > The problem is that the inliner counts the number of lines in the asm > statement and multiplies that with a weight. With the weight this > overflows 32bit signed int, and triggers an assert for negative time. > > Fix this by limiting the number of lines to 1000 for asm cost > estimation. The RTL backend also does similar multiplications for > jump shortening. I haven't tried to address this, but presumably > it's less likely to result in a failure. > > Passes test suite on x86_64-linux. > > Ok for 4.7 and 4.8? > > 2013-02-17 Andi Kleen <ak@linux.intel.com> > > * tree-inline.c (estimate_num_insns): Limit asm cost to 1000. > > diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c > index 2a1b692..7f8f2f2 100644 > --- a/gcc/tree-inline.c > +++ b/gcc/tree-inline.c > @@ -3595,7 +3595,14 @@ estimate_num_insns (gimple stmt, eni_weights *weights) > return 0; > > case GIMPLE_ASM: > - return asm_str_count (gimple_asm_string (stmt)); > + { > + int count = asm_str_count (gimple_asm_string (stmt)); > + /* 1000 means infinity. This avoids overflows later > + with very long asm statements. */ > + if (count > 1000) > + count = 1000; > + return count; > + } > > case GIMPLE_RESX: > /* This is either going to be an external function call with one > That doesn't sound enough, unless there is already code out there that respects this count. 1000 at 4 bytes per instruction is only 4k. More that small enough for the rest of the compiler to think that it could jump around such blocks cheaply. I think a limit of 1M or more might be more appropriate. R.
> That doesn't sound enough, unless there is already code out there > that respects this count. 1000 at 4 bytes per instruction is only > 4k. More that small enough for the rest of the compiler to think > that it could jump around such blocks cheaply. > > I think a limit of 1M or more might be more appropriate. I got an overflow for 6.4M, so 1M would be dangerously near that. 100k perhaps ? This was not for jump shortening, but the inliner heuristics. In the worst case we could separate the two, would be a larger patch though. -Andi
> This was not for jump shortening, but the inliner heuristics. > > In the worst case we could separate the two, would be a larger > patch though. Actually it's already separated, I don't affect the jump shortening at all. Only the inliner code adds the limit. So it would depend whether 1000 * weight is large enough for inlining. Honza? -Andi
> > This was not for jump shortening, but the inliner heuristics. > > > > In the worst case we could separate the two, would be a larger > > patch though. > > Actually it's already separated, I don't affect the jump shortening > at all. Only the inliner code adds the limit. > > So it would depend whether 1000 * weight is large enough for inlining. > Honza? I think it is safely more than enough. 1000*weight makes the function safely uninlinable. So only it cuts of precision in large function growth/large unit growth logic. The logic is skewed here: large function growth is there really for compiler visible code to prevent non-linear algorithm explosion. For large unit growth I duno - in a way those gigantic asm statements that are not going to be duplicated can be tought as not part of the unit for this putpose, too. So I am fine with the cutoff. We may need to add more overflow guards (we already have quite few for time) that makes me wonder if all this should not be done all in saturating arithmetic now when it can be done theoretically with one C++ class? Honza > > -Andi
On 21/02/13 15:59, Andi Kleen wrote: >> That doesn't sound enough, unless there is already code out there >> that respects this count. 1000 at 4 bytes per instruction is only >> 4k. More that small enough for the rest of the compiler to think >> that it could jump around such blocks cheaply. >> >> I think a limit of 1M or more might be more appropriate. > > I got an overflow for 6.4M, so 1M would be dangerously near that. > 100k perhaps ? > > This was not for jump shortening, but the inliner heuristics. > My mistake. However, the real problem here is that you're summing a potentially large number of items on the expectation that the result will be small enough to be scaled successfully by the weight function. A better patch would be to a apply a much larger limit at the time when the count of instructions is finally used, rather than picking some arbitrary number here in the hope that it won't cause overflow later on. Yes, that's a bigger patch, but it would be a more robust solution than picking an arbitrary number here. > In the worst case we could separate the two, would be a larger > patch though. > > -Andi >
> So I am fine with the cutoff. We may need to add more overflow guards (we > already have quite few for time) that makes me wonder if all this should not be > done all in saturating arithmetic now when it can be done theoretically with one > C++ class? Sounds like a good idea, although I don't know how intrusive that is. But can try. Still would like to use the 1000 for the 4.7 backport. Ok? -Andi
On Thu, Feb 21, 2013 at 07:19:10PM +0100, Andi Kleen wrote: > > So I am fine with the cutoff. We may need to add more overflow guards (we > > already have quite few for time) that makes me wonder if all this should not be > > done all in saturating arithmetic now when it can be done theoretically with one > > C++ class? > > Sounds like a good idea, although I don't know how intrusive that is. But can > try. I looked at this now and it would be very intrusive. Not only in tree-inline, but also various callers elsewhere. And there's no saturating class either, so that would need to be added too. I think I prefer the simple solution. BTW this is a regression, old gccs compiled this. -Andi
Andi Kleen <andi@firstfloor.org> writes: Ping! This problem is still open, and afaik no better solution has been proposed. It is also a regression. Is it ok to commit if I rerun the tests and they pass? > An auto generated program with a 6.4mio line asm statement gave > with 4.7 and 4.8: > > xxx.c:6400017:1: internal compiler error: in account_size_time, at > ipa-inline-analysis.c:601 > > The problem is that the inliner counts the number of lines in the asm > statement and multiplies that with a weight. With the weight this > overflows 32bit signed int, and triggers an assert for negative time. > > Fix this by limiting the number of lines to 1000 for asm cost > estimation. The RTL backend also does similar multiplications for > jump shortening. I haven't tried to address this, but presumably > it's less likely to result in a failure. > > Passes test suite on x86_64-linux. > > Ok for 4.7 and 4.8? > > 2013-02-17 Andi Kleen <ak@linux.intel.com> > > * tree-inline.c (estimate_num_insns): Limit asm cost to 1000. > > diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c > index 2a1b692..7f8f2f2 100644 > --- a/gcc/tree-inline.c > +++ b/gcc/tree-inline.c > @@ -3595,7 +3595,14 @@ estimate_num_insns (gimple stmt, eni_weights *weights) > return 0; > > case GIMPLE_ASM: > - return asm_str_count (gimple_asm_string (stmt)); > + { > + int count = asm_str_count (gimple_asm_string (stmt)); > + /* 1000 means infinity. This avoids overflows later > + with very long asm statements. */ > + if (count > 1000) > + count = 1000; > + return count; > + } > > case GIMPLE_RESX: > /* This is either going to be an external function call with one
> Andi Kleen <andi@firstfloor.org> writes: > > Ping! > > This problem is still open, and afaik no better solution has been > proposed. It is also a regression. > > Is it ok to commit if I rerun the tests and they pass? OK. Honza
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index 2a1b692..7f8f2f2 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -3595,7 +3595,14 @@ estimate_num_insns (gimple stmt, eni_weights *weights) return 0; case GIMPLE_ASM: - return asm_str_count (gimple_asm_string (stmt)); + { + int count = asm_str_count (gimple_asm_string (stmt)); + /* 1000 means infinity. This avoids overflows later + with very long asm statements. */ + if (count > 1000) + count = 1000; + return count; + } case GIMPLE_RESX: /* This is either going to be an external function call with one