Patchwork Update source location for PRE inserted stmt

login
register
mail settings
Submitter Dehao Chen
Date Oct. 30, 2012, 11 p.m.
Message ID <CAO2gOZWF4=vpgWhX+duS7B5z3dN_cvy7AjkefZ18hi2QsY6SPA@mail.gmail.com>
Download mbox | patch
Permalink /patch/195647/
State New
Headers show

Comments

Dehao Chen - Oct. 30, 2012, 11 p.m.
Hi,

This patch aims to improve debugging of optimized code. It ensures
that PRE inserted statements have the same source location as the
statement at the insertion point, instead of UNKNOWN_LOCATION.

Bootstrapped on x86_64, and passed gcc regression tests and gdb
regression tests.

Is it okay for trunk?

Thanks,
Dehao

gcc/ChangeLog:
2012-10-30  Dehao Chen  <dehao@google.com>

        * tree-ssa-pre.c (insert_into_pred_update_location): New Function.
        (insert_into_preds_of_block): Update source location for inserted stmts.

gcc/testsuite/ChangeLog:
2012-10-30  Dehao Chen  <dehao@google.com>

        * gcc.dg/debug/dwarf2/pre.c: New testcase.
Steven Bosscher - Oct. 30, 2012, 11:38 p.m.
On Wed, Oct 31, 2012 at 12:00 AM, Dehao Chen wrote:
> This patch aims to improve debugging of optimized code. It ensures
> that PRE inserted statements have the same source location as the
> statement at the insertion point, instead of UNKNOWN_LOCATION.

Wrong patch attached.

However, is it really better to have the location of the insertion
point than to have UNKNOWN_LOCATION? It's not where the value is
computed in the source program...

Ciao!
Steven
Xinliang David Li - Oct. 30, 2012, 11:57 p.m.
It will make the location info for the newly synthesized stmt more
deterministic, I think.

David

On Tue, Oct 30, 2012 at 4:38 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
> On Wed, Oct 31, 2012 at 12:00 AM, Dehao Chen wrote:
>> This patch aims to improve debugging of optimized code. It ensures
>> that PRE inserted statements have the same source location as the
>> statement at the insertion point, instead of UNKNOWN_LOCATION.
>
> Wrong patch attached.
>
> However, is it really better to have the location of the insertion
> point than to have UNKNOWN_LOCATION? It's not where the value is
> computed in the source program...
>
> Ciao!
> Steven
Richard Guenther - Oct. 31, 2012, 9:34 a.m.
On Wed, Oct 31, 2012 at 12:57 AM, Xinliang David Li <davidxl@google.com> wrote:
> It will make the location info for the newly synthesized stmt more
> deterministic, I think.

Maybe, but it will increase the jumpiness in the debugger without actually
being accurate, no?  For example if the partially redundant expression is

  i + j;

then when computed at the insertion point the values of i and j do not
necessarily reflect the computed value!  Instead we may compute the
result of i + j using completely different components / operation.

Thus I think inserted expressions should not have any debug information
at all because they do not correspond to a source line.

Richard.

> David
>
> On Tue, Oct 30, 2012 at 4:38 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>> On Wed, Oct 31, 2012 at 12:00 AM, Dehao Chen wrote:
>>> This patch aims to improve debugging of optimized code. It ensures
>>> that PRE inserted statements have the same source location as the
>>> statement at the insertion point, instead of UNKNOWN_LOCATION.
>>
>> Wrong patch attached.
>>
>> However, is it really better to have the location of the insertion
>> point than to have UNKNOWN_LOCATION? It's not where the value is
>> computed in the source program...
>>
>> Ciao!
>> Steven
Dehao Chen - Oct. 31, 2012, 2:41 p.m.
On Wed, Oct 31, 2012 at 2:34 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Oct 31, 2012 at 12:57 AM, Xinliang David Li <davidxl@google.com> wrote:
>> It will make the location info for the newly synthesized stmt more
>> deterministic, I think.
>
> Maybe, but it will increase the jumpiness in the debugger without actually
> being accurate, no?  For example if the partially redundant expression is

Sometimes, it is necessary to have a jump in gdb, especially when
control flow changes. For the test case I gave, people may want to
know if the branch in line 10 is taken or not. Without this patch, the
gdb simply jump from line 10 to line 15 if the branch is not taken.
But with the patch, people get the correct control flow.

One more thing, for AutoFDO to work, we want to make sure debug info
is correct at control flow boundaries. That is why we would prefer
deterministic location info, instead of randomly inherit debug info
from previous BB.

Thanks,
Dehao

>
>   i + j;
>
> then when computed at the insertion point the values of i and j do not
> necessarily reflect the computed value!  Instead we may compute the
> result of i + j using completely different components / operation.
>
> Thus I think inserted expressions should not have any debug information
> at all because they do not correspond to a source line.
>
> Richard.
>
>> David
>>
>> On Tue, Oct 30, 2012 at 4:38 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>> On Wed, Oct 31, 2012 at 12:00 AM, Dehao Chen wrote:
>>>> This patch aims to improve debugging of optimized code. It ensures
>>>> that PRE inserted statements have the same source location as the
>>>> statement at the insertion point, instead of UNKNOWN_LOCATION.
>>>
>>> Wrong patch attached.
>>>
>>> However, is it really better to have the location of the insertion
>>> point than to have UNKNOWN_LOCATION? It's not where the value is
>>> computed in the source program...
>>>
>>> Ciao!
>>> Steven
Xinliang David Li - Oct. 31, 2012, 7:02 p.m.
Dehao's patch will make the debugging of the following code (-g -O2)
less jumpy.   After the testing of x > 0, it should go to line 'a++'.
 Without the fix, when stepping through 'abc', the lines covered are
6, 4, 11, 13. With the fix it should be 6, 9, 11, 13 -- much better.

David




1. int x;

2. __attribute__((noinline)) int abc (int *a)
3. {
4.  int ret = 0;
5.
6.  if (x > 0)
7.    ret += *a;
8.  else
9.    a++;
10.
11.  ret += *a;
12.  return ret;
13 }


int main()
{
  int a = 0;

   x = -1;
   return abc ( &a);

}

On Wed, Oct 31, 2012 at 2:34 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Oct 31, 2012 at 12:57 AM, Xinliang David Li <davidxl@google.com> wrote:
>> It will make the location info for the newly synthesized stmt more
>> deterministic, I think.
>
> Maybe, but it will increase the jumpiness in the debugger without actually
> being accurate, no?  For example if the partially redundant expression is
>
>   i + j;
>
> then when computed at the insertion point the values of i and j do not
> necessarily reflect the computed value!  Instead we may compute the
> result of i + j using completely different components / operation.
>
> Thus I think inserted expressions should not have any debug information
> at all because they do not correspond to a source line.
>
> Richard.
>
>> David
>>
>> On Tue, Oct 30, 2012 at 4:38 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>> On Wed, Oct 31, 2012 at 12:00 AM, Dehao Chen wrote:
>>>> This patch aims to improve debugging of optimized code. It ensures
>>>> that PRE inserted statements have the same source location as the
>>>> statement at the insertion point, instead of UNKNOWN_LOCATION.
>>>
>>> Wrong patch attached.
>>>
>>> However, is it really better to have the location of the insertion
>>> point than to have UNKNOWN_LOCATION? It's not where the value is
>>> computed in the source program...
>>>
>>> Ciao!
>>> Steven
Richard Guenther - Nov. 4, 2012, 4:07 p.m.
On Wed, Oct 31, 2012 at 8:02 PM, Xinliang David Li <davidxl@google.com> wrote:
> Dehao's patch will make the debugging of the following code (-g -O2)
> less jumpy.   After the testing of x > 0, it should go to line 'a++'.
>  Without the fix, when stepping through 'abc', the lines covered are
> 6, 4, 11, 13. With the fix it should be 6, 9, 11, 13 -- much better.

I am not convinced.  Btw, you do not comment my example at all.
For less jumpiness no line number for inserted stmts works as well.

Richard.

> David
>
>
>
>
> 1. int x;
>
> 2. __attribute__((noinline)) int abc (int *a)
> 3. {
> 4.  int ret = 0;
> 5.
> 6.  if (x > 0)
> 7.    ret += *a;
> 8.  else
> 9.    a++;
> 10.
> 11.  ret += *a;
> 12.  return ret;
> 13 }
>
>
> int main()
> {
>   int a = 0;
>
>    x = -1;
>    return abc ( &a);
>
> }
>
> On Wed, Oct 31, 2012 at 2:34 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Wed, Oct 31, 2012 at 12:57 AM, Xinliang David Li <davidxl@google.com> wrote:
>>> It will make the location info for the newly synthesized stmt more
>>> deterministic, I think.
>>
>> Maybe, but it will increase the jumpiness in the debugger without actually
>> being accurate, no?  For example if the partially redundant expression is
>>
>>   i + j;
>>
>> then when computed at the insertion point the values of i and j do not
>> necessarily reflect the computed value!  Instead we may compute the
>> result of i + j using completely different components / operation.
>>
>> Thus I think inserted expressions should not have any debug information
>> at all because they do not correspond to a source line.
>>
>> Richard.
>>
>>> David
>>>
>>> On Tue, Oct 30, 2012 at 4:38 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>>> On Wed, Oct 31, 2012 at 12:00 AM, Dehao Chen wrote:
>>>>> This patch aims to improve debugging of optimized code. It ensures
>>>>> that PRE inserted statements have the same source location as the
>>>>> statement at the insertion point, instead of UNKNOWN_LOCATION.
>>>>
>>>> Wrong patch attached.
>>>>
>>>> However, is it really better to have the location of the insertion
>>>> point than to have UNKNOWN_LOCATION? It's not where the value is
>>>> computed in the source program...
>>>>
>>>> Ciao!
>>>> Steven
Dehao Chen - Nov. 4, 2012, 5:42 p.m.
2. __attribute__((noinline)) int abc (int *a)
3. {
4.  int ret = 0;
5.
6.  if (x > 0)
7.    ret += *a;
8.  else
9.    a++;
10.
11.  ret += *a;
12.  return ret;
13 }

In terms of jumpiness, without the patch, the jump sequence is:
2 -> 13 -> 4 -> 11 -> 13

With the patch, the jump sequence is:
2-> 9 -> 4 -> 11 -> 13

I think the patch does not increase the jumpiness, while it
significantly improves coverage.

Thanks,
Dehao
Xinliang David Li - Nov. 4, 2012, 10:19 p.m.
On Sun, Nov 4, 2012 at 8:07 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Oct 31, 2012 at 8:02 PM, Xinliang David Li <davidxl@google.com> wrote:
>> Dehao's patch will make the debugging of the following code (-g -O2)
>> less jumpy.   After the testing of x > 0, it should go to line 'a++'.
>>  Without the fix, when stepping through 'abc', the lines covered are
>> 6, 4, 11, 13. With the fix it should be 6, 9, 11, 13 -- much better.
>
> I am not convinced.  Btw, you do not comment my example at all.

Because I was not sure what the example was trying to illustrate.
Let's make it more concrete:

Original program:

if (...)
{
    .. = i + j;
}
else
{
   // (1)
   ..
}

... = i + j;  // (2)

After PRE:

if (...)
{
   t = i + j;
   .. = t;
}
else
{
    // insertion point at 1
    t = i + j;   // New statement
   ..
}

.. = t; // (2)


What is the 'different operation/component' you are referring to in the example?


> For less jumpiness no line number for inserted stmts works as well.


Those compiler generated statements do not have source origins, but
they need to have debug location information attached so that
information collected for them can be correlated with program
constructs (such as CFG). One of the natural way is to use the source
location associated with the program point where they are inserted.

David

>
> Richard.
>
>> David
>>
>>
>>
>>
>> 1. int x;
>>
>> 2. __attribute__((noinline)) int abc (int *a)
>> 3. {
>> 4.  int ret = 0;
>> 5.
>> 6.  if (x > 0)
>> 7.    ret += *a;
>> 8.  else
>> 9.    a++;
>> 10.
>> 11.  ret += *a;
>> 12.  return ret;
>> 13 }
>>
>>
>> int main()
>> {
>>   int a = 0;
>>
>>    x = -1;
>>    return abc ( &a);
>>
>> }
>>
>> On Wed, Oct 31, 2012 at 2:34 AM, Richard Biener
>> <richard.guenther@gmail.com> wrote:
>>> On Wed, Oct 31, 2012 at 12:57 AM, Xinliang David Li <davidxl@google.com> wrote:
>>>> It will make the location info for the newly synthesized stmt more
>>>> deterministic, I think.
>>>
>>> Maybe, but it will increase the jumpiness in the debugger without actually
>>> being accurate, no?  For example if the partially redundant expression is
>>>
>>>   i + j;
>>>
>>> then when computed at the insertion point the values of i and j do not
>>> necessarily reflect the computed value!  Instead we may compute the
>>> result of i + j using completely different components / operation.
>>>
>>> Thus I think inserted expressions should not have any debug information
>>> at all because they do not correspond to a source line.
>>>
>>> Richard.
>>>
>>>> David
>>>>
>>>> On Tue, Oct 30, 2012 at 4:38 PM, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>>>> On Wed, Oct 31, 2012 at 12:00 AM, Dehao Chen wrote:
>>>>>> This patch aims to improve debugging of optimized code. It ensures
>>>>>> that PRE inserted statements have the same source location as the
>>>>>> statement at the insertion point, instead of UNKNOWN_LOCATION.
>>>>>
>>>>> Wrong patch attached.
>>>>>
>>>>> However, is it really better to have the location of the insertion
>>>>> point than to have UNKNOWN_LOCATION? It's not where the value is
>>>>> computed in the source program...
>>>>>
>>>>> Ciao!
>>>>> Steven
Eric Botcazou - Nov. 5, 2012, 11:54 a.m.
> Those compiler generated statements do not have source origins, but
> they need to have debug location information attached so that
> information collected for them can be correlated with program
> constructs (such as CFG). One of the natural way is to use the source
> location associated with the program point where they are inserted.

No, there is nothing natural (and this can even be wrong).  The statements 
must have the source location corresponding to the source construct they are 
generated for, which may be totally different from that of the insertion 
point.  Yes, that can generate jumpiness in GDB, but this is far better that 
breaking the coverage info by giving the same source location to instructions 
that have different coverage status.
Dehao Chen - Nov. 5, 2012, 3:36 p.m.
> No, there is nothing natural (and this can even be wrong).  The statements
> must have the source location corresponding to the source construct they are
> generated for, which may be totally different from that of the insertion
> point.  Yes, that can generate jumpiness in GDB, but this is far better that
> breaking the coverage info by giving the same source location to instructions
> that have different coverage status.

For the unittest I provided, setting the inserted stmt with
UNKNOWN_LOCATION will:

* break the coverage
* increase jumpiness in gdb

Setting location to UNKNOWN_LOCATION is like setting it to random
because compiler may put this stmt as the entry point of a BB (as in
the unittest). Thus setting deterministic locations for inserted stmt
will improve debugability and reduce jumpiness.

Thanks,
Dehao

>
> --
> Eric Botcazou
Xinliang David Li - Nov. 5, 2012, 5 p.m.
For the example I listed, the new statement is generated for source
construct at program point (2). However unlike simple code motion, (2)
is not going away after PRE. How would setting the location of the new
statement at the insertion point break coverage? Besides, the new
statement won't create 'false' coverage for the insertion point either
as the there are existing statements at that point where the new stmt
inherits the location from.

David

On Mon, Nov 5, 2012 at 3:54 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> Those compiler generated statements do not have source origins, but
>> they need to have debug location information attached so that
>> information collected for them can be correlated with program
>> constructs (such as CFG). One of the natural way is to use the source
>> location associated with the program point where they are inserted.
>
> No, there is nothing natural (and this can even be wrong).  The statements
> must have the source location corresponding to the source construct they are
> generated for, which may be totally different from that of the insertion
> point.  Yes, that can generate jumpiness in GDB, but this is far better that
> breaking the coverage info by giving the same source location to instructions
> that have different coverage status.
>
> --
> Eric Botcazou
Diego Novillo - Nov. 15, 2012, 1:11 p.m.
On 2012-11-05 06:54 , Eric Botcazou wrote:
>> Those compiler generated statements do not have source origins, but
>> they need to have debug location information attached so that
>> information collected for them can be correlated with program
>> constructs (such as CFG). One of the natural way is to use the source
>> location associated with the program point where they are inserted.
>
> No, there is nothing natural (and this can even be wrong).  The statements
> must have the source location corresponding to the source construct they are
> generated for, which may be totally different from that of the insertion
> point.  Yes, that can generate jumpiness in GDB, but this is far better that
> breaking the coverage info by giving the same source location to instructions
> that have different coverage status.

But UNKNOWN_LOCATION is effectively wrong as well.  If other 
optimizations move the statements above the inserted instruction, then 
the new instruction ends up inheriting whatever location happens to be 
in the previous statement.

Fixing the location when the statement is inserted will reduce this 
randomness.  I'm not sure I understand why you think this will break 
coverage.  The examples given in this thread were helped by this 
location fixing.

Do you have any counterexample?  I may be missing something here.


Thanks.  Diego.
Eric Botcazou - Nov. 15, 2012, 4:46 p.m.
> But UNKNOWN_LOCATION is effectively wrong as well.  If other
> optimizations move the statements above the inserted instruction, then
> the new instruction ends up inheriting whatever location happens to be
> in the previous statement.

That's correct, UNKNOWN_LOCATION isn't a panacea either, although it can help 
in some cases.  The only safe thing to do is to put the exact source location 
of the statement, i.e. the location of the source construct for which the 
statement has been generated.

> Fixing the location when the statement is inserted will reduce this
> randomness.  I'm not sure I understand why you think this will break
> coverage.  The examples given in this thread were helped by this
> location fixing.

What do you call randomness exactly?  GDB jumpiness?  I'm a little wary of 
fixing GDB jumpiness by distorting the debug info.  Ian has clearly outlined 
the correct approach to addressing it.

Note that I didn't specifically reply to Dehao's patch here, which might be 
acceptable in the end, only to David's seemingly general statement about the 
"natural way" of putting a location on these statements.
Dehao Chen - Nov. 15, 2012, 5:05 p.m.
On Thu, Nov 15, 2012 at 8:46 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>
> > But UNKNOWN_LOCATION is effectively wrong as well.  If other
> > optimizations move the statements above the inserted instruction, then
> > the new instruction ends up inheriting whatever location happens to be
> > in the previous statement.
>
> That's correct, UNKNOWN_LOCATION isn't a panacea either, although it can help
> in some cases.  The only safe thing to do is to put the exact source location
> of the statement, i.e. the location of the source construct for which the
> statement has been generated.
>
> > Fixing the location when the statement is inserted will reduce this
> > randomness.  I'm not sure I understand why you think this will break
> > coverage.  The examples given in this thread were helped by this
> > location fixing.
>
> What do you call randomness exactly?  GDB jumpiness?  I'm a little wary of
> fixing GDB jumpiness by distorting the debug info.  Ian has clearly outlined
> the correct approach to addressing it.

The randomness here means that if we set UNKNOWN_LOCATION to insn, it
can get source location anywhere. Even with some small code layout
changes, the location for that insn could change. I would hope that in
the future, we add an assertion when emitting instruction to enforce
that INSN_LOCATION is never UNKNOWN_LOCATION. So when generate new
instructions/stmts, if we can surely know where it is coming from, it
is fine. Otherwise, instead of using UNKNOWN_LOCATION, we will set its
location to where it is inserted. How does that sound?

Thanks,
Dehao

>
>
> Note that I didn't specifically reply to Dehao's patch here, which might be
> acceptable in the end, only to David's seemingly general statement about the
> "natural way" of putting a location on these statements.
>
> --
> Eric Botcazou
Eric Botcazou - Nov. 15, 2012, 5:23 p.m.
> The randomness here means that if we set UNKNOWN_LOCATION to insn, it
> can get source location anywhere. Even with some small code layout
> changes, the location for that insn could change. I would hope that in
> the future, we add an assertion when emitting instruction to enforce
> that INSN_LOCATION is never UNKNOWN_LOCATION. So when generate new
> instructions/stmts, if we can surely know where it is coming from, it
> is fine. Otherwise, instead of using UNKNOWN_LOCATION, we will set its
> location to where it is inserted. How does that sound?

Still the same problem: you cannot make that a general rule, since you don't 
know the coverage status of the instruction just above the insertion point.
If a later optimization moves the new statements around, you may well end up 
with wrong coverage info.
Xinliang David Li - Nov. 15, 2012, 6:30 p.m.
I probably made too general statement in this topic. However for the
PRE case, I believe the choice of not using UNKNOWN location is still
better.

thanks,

David

On Thu, Nov 15, 2012 at 9:23 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> The randomness here means that if we set UNKNOWN_LOCATION to insn, it
>> can get source location anywhere. Even with some small code layout
>> changes, the location for that insn could change. I would hope that in
>> the future, we add an assertion when emitting instruction to enforce
>> that INSN_LOCATION is never UNKNOWN_LOCATION. So when generate new
>> instructions/stmts, if we can surely know where it is coming from, it
>> is fine. Otherwise, instead of using UNKNOWN_LOCATION, we will set its
>> location to where it is inserted. How does that sound?
>
> Still the same problem: you cannot make that a general rule, since you don't
> know the coverage status of the instruction just above the insertion point.
> If a later optimization moves the new statements around, you may well end up
> with wrong coverage info.
>
> --
> Eric Botcazou
Dehao Chen - Nov. 15, 2012, 6:32 p.m.
Yeah, at least for the unittest I provided, the coverage info will be
wrong without the patch.

Thanks,
Dehao

On Thu, Nov 15, 2012 at 10:30 AM, Xinliang David Li <davidxl@google.com> wrote:
> I probably made too general statement in this topic. However for the
> PRE case, I believe the choice of not using UNKNOWN location is still
> better.
>
> thanks,
>
> David
>
> On Thu, Nov 15, 2012 at 9:23 AM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>>> The randomness here means that if we set UNKNOWN_LOCATION to insn, it
>>> can get source location anywhere. Even with some small code layout
>>> changes, the location for that insn could change. I would hope that in
>>> the future, we add an assertion when emitting instruction to enforce
>>> that INSN_LOCATION is never UNKNOWN_LOCATION. So when generate new
>>> instructions/stmts, if we can surely know where it is coming from, it
>>> is fine. Otherwise, instead of using UNKNOWN_LOCATION, we will set its
>>> location to where it is inserted. How does that sound?
>>
>> Still the same problem: you cannot make that a general rule, since you don't
>> know the coverage status of the instruction just above the insertion point.
>> If a later optimization moves the new statements around, you may well end up
>> with wrong coverage info.
>>
>> --
>> Eric Botcazou
Richard Guenther - Nov. 25, 2012, 12:40 p.m.
On Thu, Nov 15, 2012 at 5:46 PM, Eric Botcazou <ebotcazou@adacore.com> wrote:
>> But UNKNOWN_LOCATION is effectively wrong as well.  If other
>> optimizations move the statements above the inserted instruction, then
>> the new instruction ends up inheriting whatever location happens to be
>> in the previous statement.
>
> That's correct, UNKNOWN_LOCATION isn't a panacea either, although it can help
> in some cases.  The only safe thing to do is to put the exact source location
> of the statement, i.e. the location of the source construct for which the
> statement has been generated.

There may not be a source location for a generated statement, the computed
value might not have been computed in the source at any point even.  Consider
re-association of an expression and then a re-associated part becoming
partially redundant.

 if ()
   tem = a + b;
 tem2 = a + c  + b;

after re-assoc + PRE:

 if ()
   tem = a + b;
else
   tem' = a + b;  // which sloc here?
tem'' = PHI <tem, tem'>  //  or here?  on the args?
tem2 = tem'' + c;

so what source location would you use for the inserted expression?  If
UNKNOWN is not
correct here then I think we need an explicit NOWHERE?

>> Fixing the location when the statement is inserted will reduce this
>> randomness.  I'm not sure I understand why you think this will break
>> coverage.  The examples given in this thread were helped by this
>> location fixing.
>
> What do you call randomness exactly?  GDB jumpiness?  I'm a little wary of
> fixing GDB jumpiness by distorting the debug info.  Ian has clearly outlined
> the correct approach to addressing it.
>
> Note that I didn't specifically reply to Dehao's patch here, which might be
> acceptable in the end, only to David's seemingly general statement about the
> "natural way" of putting a location on these statements.

Richard.

> --
> Eric Botcazou

Patch

Index: testsuite/g++.dg/debug/dwarf2/block.C
===================================================================
--- testsuite/g++.dg/debug/dwarf2/block.C	(revision 0)
+++ testsuite/g++.dg/debug/dwarf2/block.C	(revision 0)
@@ -0,0 +1,29 @@ 
+// Compiler should not generate too many lexical blocks for this function.
+// { dg-do compile { target { i?86-*-* x86_64-*-* } } }
+// { dg-options "-O0 -fno-exceptions -g -dA" }
+
+union UElement {
+    void* pointer;
+    int integer;
+};
+struct UColToken {
+  unsigned source;
+  unsigned char **rulesToParseHdl;
+};
+
+int uhash_hashTokens(const union UElement k)
+{
+  int hash = 0;
+  struct UColToken *key = (struct UColToken *)k.pointer;
+  if (key != 0) {
+    int len = (key->source & 0xFF000000)>>24;
+    int inc = ((len - 32) / 32) + 1;
+    const unsigned char *p = (key->source & 0x00FFFFFF)
+			     + *(key->rulesToParseHdl);
+    const unsigned char *limit = p + len;
+    hash = *p + *limit;
+  }
+  return hash;
+}
+
+// { dg-final { scan-assembler-not "LBB10" } }
Index: tree-eh.c
===================================================================
--- tree-eh.c	(revision 192809)
+++ tree-eh.c	(working copy)
@@ -739,6 +739,7 @@  do_return_redirection (struct goto_queue_node *q,
     gimple_seq_add_seq (&q->repl_stmt, mod);
 
   x = gimple_build_goto (finlab);
+  gimple_set_location (x, q->location);
   gimple_seq_add_stmt (&q->repl_stmt, x);
 }
 
@@ -758,6 +759,7 @@  do_goto_redirection (struct goto_queue_node *q, tr
     gimple_seq_add_seq (&q->repl_stmt, mod);
 
   x = gimple_build_goto (finlab);
+  gimple_set_location (x, q->location);
   gimple_seq_add_stmt (&q->repl_stmt, x);
 }
 
@@ -857,6 +859,7 @@  frob_into_branch_around (gimple tp, eh_region regi
       if (!over)
 	over = create_artificial_label (loc);
       x = gimple_build_goto (over);
+      gimple_set_location (x, loc);
       gimple_seq_add_stmt (&cleanup, x);
     }
   gimple_seq_add_seq (&eh_seq, cleanup);
@@ -1085,6 +1088,7 @@  lower_try_finally_nofallthru (struct leh_state *st
 	  emit_post_landing_pad (&eh_seq, tf->region);
 
 	  x = gimple_build_goto (lab);
+	  gimple_set_location (x, gimple_location (tf->try_finally_expr));
 	  gimple_seq_add_stmt (&eh_seq, x);
 	}
     }
@@ -1223,6 +1227,7 @@  lower_try_finally_copy (struct leh_state *state, s
 
       tmp = lower_try_finally_fallthru_label (tf);
       x = gimple_build_goto (tmp);
+      gimple_set_location (x, tf_loc);
       gimple_seq_add_stmt (&new_stmt, x);
     }
 
@@ -1395,6 +1400,7 @@  lower_try_finally_switch (struct leh_state *state,
 
       tmp = lower_try_finally_fallthru_label (tf);
       x = gimple_build_goto (tmp);
+      gimple_set_location (x, tf_loc);
       gimple_seq_add_stmt (&switch_body, x);
     }
 
@@ -1423,6 +1429,7 @@  lower_try_finally_switch (struct leh_state *state,
       gimple_seq_add_stmt (&eh_seq, x);
 
       x = gimple_build_goto (finally_label);
+      gimple_set_location (x, tf_loc);
       gimple_seq_add_stmt (&eh_seq, x);
 
       tmp = build_int_cst (integer_type_node, eh_index);
Index: expr.c
===================================================================
--- expr.c	(revision 192809)
+++ expr.c	(working copy)
@@ -5030,7 +5030,7 @@  store_expr (tree exp, rtx target, int call_param_p
 {
   rtx temp;
   rtx alt_rtl = NULL_RTX;
-  location_t loc = EXPR_LOCATION (exp);
+  location_t loc = curr_insn_location ();
 
   if (VOID_TYPE_P (TREE_TYPE (exp)))
     {
@@ -7869,31 +7869,7 @@  expand_expr_real (tree exp, rtx target, enum machi
       return ret ? ret : const0_rtx;
     }
 
-  /* If this is an expression of some kind and it has an associated line
-     number, then emit the line number before expanding the expression.
-
-     We need to save and restore the file and line information so that
-     errors discovered during expansion are emitted with the right
-     information.  It would be better of the diagnostic routines
-     used the file/line information embedded in the tree nodes rather
-     than globals.  */
-  if (cfun && EXPR_HAS_LOCATION (exp))
-    {
-      location_t saved_location = input_location;
-      location_t saved_curr_loc = curr_insn_location ();
-      input_location = EXPR_LOCATION (exp);
-      set_curr_insn_location (input_location);
-
-      ret = expand_expr_real_1 (exp, target, tmode, modifier, alt_rtl);
-
-      input_location = saved_location;
-      set_curr_insn_location (saved_curr_loc);
-    }
-  else
-    {
-      ret = expand_expr_real_1 (exp, target, tmode, modifier, alt_rtl);
-    }
-
+  ret = expand_expr_real_1 (exp, target, tmode, modifier, alt_rtl);
   return ret;
 }
 
@@ -9250,8 +9226,15 @@  expand_expr_real_1 (tree exp, rtx target, enum mac
 	g = SSA_NAME_DEF_STMT (exp);
       if (g)
 	{
-	  rtx r = expand_expr_real (gimple_assign_rhs_to_tree (g), target,
-				    tmode, modifier, NULL);
+	  rtx r;
+	  location_t saved_loc = input_location;
+
+	  input_location = gimple_location (g);
+	  set_curr_insn_location (input_location);
+	  r = expand_expr_real (gimple_assign_rhs_to_tree (g), target,
+				tmode, modifier, NULL);
+	  input_location = saved_loc;
+	  set_curr_insn_location (saved_loc);
 	  if (REG_P (r) && !REG_EXPR (r))
 	    set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (exp), r);
 	  return r;
@@ -9481,7 +9464,7 @@  expand_expr_real_1 (tree exp, rtx target, enum mac
 	       with non-BLKmode values.  */
 	    gcc_assert (GET_MODE (ret) != BLKmode);
 
-	    val = build_decl (EXPR_LOCATION (exp),
+	    val = build_decl (curr_insn_location (),
 			      VAR_DECL, NULL, TREE_TYPE (exp));
 	    DECL_ARTIFICIAL (val) = 1;
 	    DECL_IGNORED_P (val) = 1;