diff mbox

Postpone __LINE__ evaluation to the end of #line directives

Message ID 1385548162-1883-1-git-send-email-mtewoodbury@gmail.com
State New
Headers show

Commit Message

Max Woodbury Nov. 27, 2013, 10:29 a.m. UTC
From: Max TenEyck Woodbury <max+git@mtew.isa-geek.net>

Copyright 2013 assigned to the Free Software Foundation.
---
 gcc/testsuite/gcc.dg/cpp/line4.c | 19 +++++++++++++++----
 libcpp/directives.c              |  9 ++++++++-
 libcpp/internal.h                |  1 +
 libcpp/macro.c                   |  3 +++
 4 files changed, 27 insertions(+), 5 deletions(-)

Comments

Marek Polacek Nov. 27, 2013, 10:46 a.m. UTC | #1
On Wed, Nov 27, 2013 at 05:29:22AM -0500, mtewoodbury@gmail.com wrote:
> From: Max TenEyck Woodbury <max+git@mtew.isa-geek.net>

This patch is badly missing a description.  You also want to mention
the PR number, if this fixes a bug.  I guess this is to fix PR58687.
 
> Copyright 2013 assigned to the Free Software Foundation.
> ---
>  gcc/testsuite/gcc.dg/cpp/line4.c | 19 +++++++++++++++----
>  libcpp/directives.c              |  9 ++++++++-
>  libcpp/internal.h                |  1 +
>  libcpp/macro.c                   |  3 +++
>  4 files changed, 27 insertions(+), 5 deletions(-)

You didn't attach a ChangeLog entry.

The patch has numerous formatting issues, see below.

> diff --git a/gcc/testsuite/gcc.dg/cpp/line4.c b/gcc/testsuite/gcc.dg/cpp/line4.c
> index 84dbf96..0120a2b 100644
> --- a/gcc/testsuite/gcc.dg/cpp/line4.c
> +++ b/gcc/testsuite/gcc.dg/cpp/line4.c
> @@ -13,7 +13,18 @@ enum { i = __LINE__ };
>  enum { j = __LINE__ };
>  
>  #line 16  /* N.B. the _next_ line is line 16.  */
> -
> -char array1[i        == 44 ? 1 : -1];
> -char array2[j        == 90 ? 1 : -1];
> -char array3[__LINE__ == 19 ? 1 : -1];
> +                                                     /* __LINE__ should be 16 */
> +char array1[i        == 44 ? 1 : -1];                                   /* 17 */
> +char array2[j        == 90 ? 1 : -1];                                   /* 18 */
> +char array3[__LINE__ == 19 ? 1 : -1];                                   /* 19 */
> +                                                                        /* 20 */
> +# line __LINE__ /* N.B. the __LINE__ sequence should _not_ change here. */

Two spaces after '.'.

> +# line __LINE__ /* N.B. nor shoud a multi-line comment change the fact

"should"

> diff --git a/libcpp/directives.c b/libcpp/directives.c
> index 65b2034..adb04a5 100644
> --- a/libcpp/directives.c
> +++ b/libcpp/directives.c
> @@ -900,7 +900,9 @@ do_line (cpp_reader *pfile)
>    bool wrapped;
>  
>    /* #line commands expand macros.  */
> +  ++pfile->state.in_directive;          /* Request special __LINE__ handling. */

Don't put the comments next to the statements, instead put them above
the statements.

>    token = cpp_get_token (pfile);
> +  --pfile->state.in_directive;                              /* Cancle request */

"Cancel"

> -  if (CPP_PEDANTIC (pfile) && (new_lineno == 0 || new_lineno > cap || wrapped))
> +  if (CPP_PEDANTIC (pfile) && (new_lineno == 0 ||
> +      (new_lineno > cap && new_lineno != CUR__LINE__) ||
> +      wrapped))

|| operator shouldn't end the line, put it on the start of the next
line.  See existing code for how it should look like.

>      cpp_error (pfile, CPP_DL_PEDWARN, "line number out of range");
>    else if (wrapped)
>      cpp_error (pfile, CPP_DL_WARNING, "line number out of range");
> @@ -936,6 +940,9 @@ do_line (cpp_reader *pfile)
>      }
>  
>    skip_rest_of_line (pfile);
> +  if ( new_lineno == CUR__LINE__ )                  /* Postponed evaluation ? */
> +    new_lineno = linemap_get_expansion_line (pfile->line_table,
> +                                             pfile->line_table->highest_line);

No space after '(' and before ')'.  No space before '?'.

> +#define CUR__LINE__ -1U

The trailing underscores look weird to me.  Thanks,

	Marek
Joseph Myers Nov. 27, 2013, 4:21 p.m. UTC | #2
On Wed, 27 Nov 2013, mtewoodbury@gmail.com wrote:

> Copyright 2013 assigned to the Free Software Foundation.

FWIW I don't see this in copyright.list yet.  If you sent the paperwork 
(whether paper mail or scans) to the FSF over a week ago and haven't had 
it acknowledged, please chase up assign@gnu.org, and keep chasing them up 
weekly until it's acknowledged and a maintainer confirms it's now listed 
in copyright.list.

> diff --git a/gcc/testsuite/gcc.dg/cpp/line4.c b/gcc/testsuite/gcc.dg/cpp/line4.c

I think it's best to leave this test as-is and add a new test line9.c 
specifically for this case of "#line __LINE__".  Changing existing tests 
unnecessarily complicates establishing whether a newly seen failure is a 
regression or not.

I think it would be worth including tests where the __LINE__ expansion 
(which as you note should be the line number of the line *after* the 
directive) is involved in concatenation or stringized, e.g.

#define xstr(x) #x
#define str(x) xstr(x)
#line 1 str(__LINE__)

which should set __FILE__ to be the stringized number of the line after 
the directive.  (I hope these cases will just work given your patch, so 
this is just a matter of adding more to the testcase.)
Max Woodbury Nov. 27, 2013, 6:18 p.m. UTC | #3
On 11/27/2013 05:46 AM, Marek Polacek wrote:> On Wed, Nov 27, 2013 at 
05:29:22AM -0500, mtewoodbury@gmail.com wrote:
 >> From: Max TenEyck Woodbury <max+git@mtew.isa-geek.net>
 >
 > This patch is badly missing a description.  You also want to mention
 > the PR number, if this fixes a bug.  I guess this is to fix PR58687.
 >

I am new to GNU patching.  There is some description on Bugzilla.  At
least one other group I have worked with disliked bug number references
in patches.

Please point me to the formatting requirements for change log entries...

On 11/27/2013 11:21 AM, Joseph S. Myers wrote:
> On Wed, 27 Nov 2013, mtewoodbury@gmail.com wrote:
>
>> Copyright 2013 assigned to the Free Software Foundation.
>
> FWIW I don't see this in copyright.list yet.  If you sent the paperwork
> (whether paper mail or scans) to the FSF over a week ago and haven't had
> it acknowledged, please chase up assign@gnu.org, and keep chasing them up
> weekly until it's acknowledged and a maintainer confirms it's now listed
> in copyright.list.

I am a new at GNU patching.  I had hoped that a simple assignment in
the document would be sufficient.  I really do not want to turn
Everything I do over to FSF...

>
>> diff --git a/gcc/testsuite/gcc.dg/cpp/line4.c b/gcc/testsuite/gcc.dg/cpp/line4.c
>
> I think it's best to leave this test as-is and add a new test line9.c
> specifically for this case of "#line __LINE__".  Changing existing tests
> unnecessarily complicates establishing whether a newly seen failure is a
> regression or not.

HMM?  Not sure I understand that line of reasoning.  The test is about
setting the line number.  This just adds the __LINE__ special case.

>
> I think it would be worth including tests where the __LINE__ expansion
> (which as you note should be the line number of the line *after* the
> directive) is involved in concatenation or stringized, e.g.
>
> #define xstr(x) #x
> #define str(x) xstr(x)
> #line 1 str(__LINE__)
>
> which should set __FILE__ to be the stringized number of the line after
> the directive.  (I hope these cases will just work given your patch, so
> this is just a matter of adding more to the testcase.)
>

This patch would not do that.  It only fixes the use of __LINE__ when
used to reset the line number.  There are at least two ways this could
be implemented and keep with the standard:

One would be to set the new line value as soon as it is seen in the
'# line' directive and let the normal __LINE__ tracking take care of
the end-of-line increments.  That would require a larger change than
this patch and would have implications when it comes to error handling.

The other is to postpone macro expansion until the end of the directive
has been seen.  That would require two passes over the token sequence;
one to build the sequence and a second to evaluate it.  Doing that is
'the right thing' in my opinion but it should be applied to all the
other directives that permit <pp-token> substitution as well.  I tried
to do it that way initially, but that requires a deeper understanding
of the current implementation than I was able to build quickly.  I
would like to see a 'theory of operation' document that contained such
details, as some commercial software shops require, but that does not
seem to be how it is done here.

So I punted...
Joseph Myers Nov. 27, 2013, 7:03 p.m. UTC | #4
On Wed, 27 Nov 2013, Max Woodbury wrote:

> On 11/27/2013 05:46 AM, Marek Polacek wrote:> On Wed, Nov 27, 2013 at
> 05:29:22AM -0500, mtewoodbury@gmail.com wrote:
> >> From: Max TenEyck Woodbury <max+git@mtew.isa-geek.net>
> >
> > This patch is badly missing a description.  You also want to mention
> > the PR number, if this fixes a bug.  I guess this is to fix PR58687.
> >
> 
> I am new to GNU patching.  There is some description on Bugzilla.  At
> least one other group I have worked with disliked bug number references
> in patches.

It's best for the gcc-patches submission to be self-contained, with the 
explanation of the issue, why it's a bug, the approach taken to fix it and 
the rationale for that approach.

> Please point me to the formatting requirements for change log entries...

Looking at existing examples in the ChangeLog files is the simplest way of 
getting the hang of them, but

http://gcc.gnu.org/codingconventions.html#ChangeLogs

points to the GNU Coding Standards which give the detailed specification 
at:

http://www.gnu.org/prep/standards/html_node/Change-Logs.html#Change-Logs

> On 11/27/2013 11:21 AM, Joseph S. Myers wrote:
> > On Wed, 27 Nov 2013, mtewoodbury@gmail.com wrote:
> > 
> > > Copyright 2013 assigned to the Free Software Foundation.
> > 
> > FWIW I don't see this in copyright.list yet.  If you sent the paperwork
> > (whether paper mail or scans) to the FSF over a week ago and haven't had
> > it acknowledged, please chase up assign@gnu.org, and keep chasing them up
> > weekly until it's acknowledged and a maintainer confirms it's now listed
> > in copyright.list.
> 
> I am a new at GNU patching.  I had hoped that a simple assignment in
> the document would be sufficient.  I really do not want to turn
> Everything I do over to FSF...

You can assign copyright in just an individual patch, or all past and 
future changes to an individual GNU project, but in either case there's 
still paperwork involved (which can be done electronically in some cases).

http://git.savannah.gnu.org/cgit/gnulib.git/tree/doc/Copyright/request-assign.changes

(for assigning changes already done)

http://git.savannah.gnu.org/cgit/gnulib.git/plain/doc/Copyright/request-assign.future

(for assigning past and future changes)

> > > diff --git a/gcc/testsuite/gcc.dg/cpp/line4.c
> > > b/gcc/testsuite/gcc.dg/cpp/line4.c
> > 
> > I think it's best to leave this test as-is and add a new test line9.c
> > specifically for this case of "#line __LINE__".  Changing existing tests
> > unnecessarily complicates establishing whether a newly seen failure is a
> > regression or not.
> 
> HMM?  Not sure I understand that line of reasoning.  The test is about
> setting the line number.  This just adds the __LINE__ special case.

There are eight different lineN.c tests for different aspects of setting 
the line number.  This is a new aspect not previously considered.

It's best that a given test, such as gcc.dg/cpp/line4.c, always stays 
testing the same things, and that new things are tested in new tests.  If 
someone sees line4.c failing and searches for information about that test, 
it's least confusing if any past information about that test and its 
failures is still relevant to the test as it is today - if the test has 
only changed, since it was added, as actually required to reflect 
deliberate changes to what should be expected, and not simply to test new 
things not previously covered by the testsuite at all.

> The other is to postpone macro expansion until the end of the directive
> has been seen.  That would require two passes over the token sequence;
> one to build the sequence and a second to evaluate it.  Doing that is
> 'the right thing' in my opinion but it should be applied to all the
> other directives that permit <pp-token> substitution as well.  I tried
> to do it that way initially, but that requires a deeper understanding
> of the current implementation than I was able to build quickly.  I
> would like to see a 'theory of operation' document that contained such
> details, as some commercial software shops require, but that does not
> seem to be how it is done here.

There's cppinternals.texi, though it may be out of date.
Joseph Myers Nov. 27, 2013, 9:10 p.m. UTC | #5
On further consideration, I'm not convinced there's a bug here at all; I 
don't think it's sufficiently defined in the standard what the current 
token is for the purposes of line numbering when __LINE__ gets expanded to 
be able to say that one or the other value in a directive is wrong.  As 
far as I can tell after checking the reflector email archives, the only 
time WG14 considered anything vaguely relevant was C90 DR#173, which 
didn't receive a response.  
<https://groups.google.com/forum/#!original/comp.std.c/ulBqfulqO_0/P8AzVfP8U5oJ> 
is a statement by the standard editor that some cases were deliberately 
left ambiguous.  So that suggests the bug should be closed as INVALID (but 
if you think this case should be fully defined, you could always file a DR 
with WG14).
Max Woodbury Nov. 28, 2013, 3:51 a.m. UTC | #6
On 11/27/2013 04:10 PM, Joseph S. Myers wrote:
> On further consideration, I'm not convinced there's a bug here at all; I
> don't think it's sufficiently defined in the standard what the current
> token is for the purposes of line numbering when __LINE__ gets expanded to
> be able to say that one or the other value in a directive is wrong.  As
> far as I can tell after checking the reflector email archives, the only
> time WG14 considered anything vaguely relevant was C90 DR#173, which
> didn't receive a response.
> <https://groups.google.com/forum/#!original/comp.std.c/ulBqfulqO_0/P8AzVfP8U5oJ>
> is a statement by the standard editor that some cases were deliberately
> left ambiguous.  So that suggests the bug should be closed as INVALID (but
> if you think this case should be fully defined, you could always file a DR
> with WG14).
>
PLEASE think about this a bit more.

There should be a way to change the __FILE__ value without changing the
line number sequencing.  Whatever that mechanism is, it should NOT
introduce maintenance problems that involve counting lines of code.

A little Googeling quickly turns up examples that make it clear that:

     #line __LINE__ "new__FILE__value"

is that expected mechanism,

A little additional thought should make it clear that multi-line
comments should NOT screw up resetting the line number,  They are
supposed to be removed in a translation phase before the phase where
directives are identified.

There is also the way the standard defines the two fixed forms for
'#line' first and then allows for <pp-token> substitution for cases
that do not match the prescribed forms as a separate form.  All three
forms require that the <end-of-line> be seen before the directive is
processed which means any substitution for __LINE__ should only take
place AFTER the <end-of-line> that ends the directive has been seen.

In other words, if you processed the text in multiple phases the way
the standard requires, you would not substitute the value for the
__LINE__ token until after the end of the directive has been seen.
Thus the problem only arises because this implementation folds the
translation phases into a single pass over the text and takes an
improper short-cut as it does so.  The standard explicitly warns
against this kind of mistake.

It looks to me like WG14 tried quite hard to make the expected form
work the way most people expect it to.

So I have STRONG objections to calling the bug report 'INVALID'!
Joseph Myers Nov. 28, 2013, 4:34 p.m. UTC | #7
On Wed, 27 Nov 2013, Max Woodbury wrote:

> There should be a way to change the __FILE__ value without changing the
> line number sequencing.  Whatever that mechanism is, it should NOT
> introduce maintenance problems that involve counting lines of code.

I think that #line is mainly intended for use by code generators that 
generate C code, rather than directly by people writing C programs.  Such 
a code generator can easily manage counting lines of code.

> A little Googeling quickly turns up examples that make it clear that:
> 
>     #line __LINE__ "new__FILE__value"
> 
> is that expected mechanism,

You'll find any number of examples online based on misconceptions about 
the C languages, possibly together with what one particular implementation 
does.  Any recommendation to do things based on an area where the editor 
of the standard has said the ambiguity in the standard is deliberate is 
clearly a bad recommendation.  Recommendations on use of C should be based 
on areas where the standard is clear and implementations agree.

> In other words, if you processed the text in multiple phases the way
> the standard requires, you would not substitute the value for the
> __LINE__ token until after the end of the directive has been seen.
> Thus the problem only arises because this implementation folds the
> translation phases into a single pass over the text and takes an
> improper short-cut as it does so.  The standard explicitly warns
> against this kind of mistake.

The standard itself mixes up the phases.  Recall that the definition of 
line number is "one greater than the number of new-line characters read or 
introduced in translation phase 1 (5.1.1.2) while processing the source 
file to the current token" (where "current token" is never defined).  If 
the phases were completely separate, by your reasoning every newline has 
been processed in phase 1 before any of phases 2, 3 or 4 do anything, and 
so all line numbers relate to the end of the file.  There is absolutely 
nothing to say that the newline at the end of the #line directive has been 
read "while processing the source file to the current token" (if __LINE__ 
in the #line directive is the current token) but that the newline after it 
hasn't been read; if anything, the phases imply that all newlines have 
been read.

This case is just as ambiguous as the case of a multi-line macro call, 
where __LINE__ gets expanded somewhere in the macro arguments, and the 
line number can be that of the macro name, or of the closing parenthesis 
of the call, or somewhere in between, and the standard does not make a 
conformance distinction between those choices.

So, I don't think we should make complicated changes to implement one 
particular choice in an area of deliberate ambiguity without direction 
from WG14 to eliminate the ambiguity in the standard.  Instead, we can let 
the choices be whatever is most natural in the implementation.  If you 
believe the standard is defective in not defining certain things, I advise 
filing a DR (or, when next open for revisions, proposing a paper at a 
meeting to change the definition as you think appropriate).
diff mbox

Patch

diff --git a/gcc/testsuite/gcc.dg/cpp/line4.c b/gcc/testsuite/gcc.dg/cpp/line4.c
index 84dbf96..0120a2b 100644
--- a/gcc/testsuite/gcc.dg/cpp/line4.c
+++ b/gcc/testsuite/gcc.dg/cpp/line4.c
@@ -13,7 +13,18 @@  enum { i = __LINE__ };
 enum { j = __LINE__ };
 
 #line 16  /* N.B. the _next_ line is line 16.  */
-
-char array1[i        == 44 ? 1 : -1];
-char array2[j        == 90 ? 1 : -1];
-char array3[__LINE__ == 19 ? 1 : -1];
+                                                     /* __LINE__ should be 16 */
+char array1[i        == 44 ? 1 : -1];                                   /* 17 */
+char array2[j        == 90 ? 1 : -1];                                   /* 18 */
+char array3[__LINE__ == 19 ? 1 : -1];                                   /* 19 */
+                                                                        /* 20 */
+# line __LINE__ /* N.B. the __LINE__ sequence should _not_ change here. */
+                                                                        /* 22 */
+char array4[__LINE__ == 23 ? 1: -1];                                    /* 23 */
+char array5[__LINE__ == 23 ? -1: 1];                                    /* 24 */
+                                                                        /* 25 */
+# line __LINE__ /* N.B. nor shoud a multi-line comment change the fact
+                   that the __LINE__ sequence should _not_ change here. */
+                                                                        /* 28 */
+char array6[__LINE__ == 29 ? 1: -1];                                    /* 29 */
+char array7[__LINE__ == 27 ? -1: 1];                                    /* 30 */
diff --git a/libcpp/directives.c b/libcpp/directives.c
index 65b2034..adb04a5 100644
--- a/libcpp/directives.c
+++ b/libcpp/directives.c
@@ -900,7 +900,9 @@  do_line (cpp_reader *pfile)
   bool wrapped;
 
   /* #line commands expand macros.  */
+  ++pfile->state.in_directive;          /* Request special __LINE__ handling. */
   token = cpp_get_token (pfile);
+  --pfile->state.in_directive;                              /* Cancle request */
   if (token->type != CPP_NUMBER
       || strtolinenum (token->val.str.text, token->val.str.len,
 		       &new_lineno, &wrapped))
@@ -914,7 +916,9 @@  do_line (cpp_reader *pfile)
       return;
     }
 
-  if (CPP_PEDANTIC (pfile) && (new_lineno == 0 || new_lineno > cap || wrapped))
+  if (CPP_PEDANTIC (pfile) && (new_lineno == 0 ||
+      (new_lineno > cap && new_lineno != CUR__LINE__) ||
+      wrapped))
     cpp_error (pfile, CPP_DL_PEDWARN, "line number out of range");
   else if (wrapped)
     cpp_error (pfile, CPP_DL_WARNING, "line number out of range");
@@ -936,6 +940,9 @@  do_line (cpp_reader *pfile)
     }
 
   skip_rest_of_line (pfile);
+  if ( new_lineno == CUR__LINE__ )                  /* Postponed evaluation ? */
+    new_lineno = linemap_get_expansion_line (pfile->line_table,
+                                             pfile->line_table->highest_line);
   _cpp_do_file_change (pfile, LC_RENAME_VERBATIM, new_file, new_lineno,
 		       map_sysp);
 }
diff --git a/libcpp/internal.h b/libcpp/internal.h
index 5321458..268de86 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -604,6 +604,7 @@  cpp_in_primary_file (cpp_reader *pfile)
 {
   return pfile->line_table->depth == 1;
 }
+#define CUR__LINE__ -1U
 
 /* In macro.c */
 extern void _cpp_free_definition (cpp_hashnode *);
diff --git a/libcpp/macro.c b/libcpp/macro.c
index e359d15..47e41b6 100644
--- a/libcpp/macro.c
+++ b/libcpp/macro.c
@@ -309,6 +309,9 @@  _cpp_builtin_macro_text (cpp_reader *pfile, cpp_hashnode *node)
       /* If __LINE__ is embedded in a macro, it must expand to the
 	 line of the macro's invocation, not its definition.
 	 Otherwise things like assert() will not work properly.  */
+      if ( pfile->state.in_directive > 1 )             /* In #line directive? */
+        number = CUR__LINE__;                  /* yes, postpone the lookup... */
+      else
       number = linemap_get_expansion_line (pfile->line_table,
 					   CPP_OPTION (pfile, traditional)
 					   ? pfile->line_table->highest_line