preprocessor: Fix tracking of system header state [PR60014, PR60723]

Message ID 5dce970b21e788deaa3d08f21995d8cb3cdb3752.1665263871.git.lhyatt@gmail.com
State Accepted, archived
Headers
Series preprocessor: Fix tracking of system header state [PR60014, PR60723] |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Lewis Hyatt Oct. 8, 2022, 9:18 p.m. UTC
  The token_streamer class (which implements gcc mode -E and
-save-temps/-no-integrated-cpp) needs to keep track whether the last tokens
output were in a system header, so that it can generate line marker
annotations as necessary for a downstream consumer to reconstruct the
state. The logic for tracking it, which was added by r5-1863 to resolve
PR60723, has some edge case issues as revealed by the three new test
cases. The first, coming from the original PR60014, was incidentally fixed by
r9-1926 for unrelated reasons. The other two were still failing on master
prior to this commit. Such code paths were not realizable prior to r13-1544,
which made it possible for the token streamer to see CPP_PRAGMA tokens in more
contexts.

The two main issues being corrected here are:

1) print.prev_was_system_token needs to indicate whether the previous token
output was in a system location. However, it was not being set on every token,
only on those that triggered the main code path; specifically it was not
triggered on a CPP_PRAGMA token. Testcase 2 covers this case.

2) The token_streamer uses a variable "line_marker_emitted" to remember
whether a line marker has been emitted while processing a given token, so that
it wouldn't be done more than once in case multiple conditions requiring a
line marker are true. There was no reason for this to be a member variable
that retains its value from token to token, since it is just needed for
tracking the state locally while processing a single given token. The fact
that it could retain its value for a subsequent token is rather difficult to
observe, but testcase 3 demonstrates incorrect behavior resulting from
that. Moving this to a local variable also simplifies understanding the
control flow going forward.

gcc/c-family/ChangeLog:

	PR preprocessor/60014
	PR preprocessor/60723
	* c-ppoutput.cc (class token_streamer): Remove member
	line_marker_emitted to...
	(token_streamer::stream): ...a local variable here. Set
	print.prev_was_system_token on all code paths.

gcc/testsuite/ChangeLog:

	PR preprocessor/60014
	PR preprocessor/60723
	* gcc.dg/cpp/pr60014-1.c: New test.
	* gcc.dg/cpp/pr60014-1.h: New test.
	* gcc.dg/cpp/pr60014-2.c: New test.
	* gcc.dg/cpp/pr60014-2.h: New test.
	* gcc.dg/cpp/pr60014-3.c: New test.
	* gcc.dg/cpp/pr60014-3.h: New test.
---

Notes:
    Hello-
    
    I tried to describe it all in the commit message, please see also
    https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60014#c8 for more
    details. bootstrap+regtest all languages looks good on x86-64 Linux. Please
    let me know if it looks OK? Thanks!
    
    -Lewis

 gcc/c-family/c-ppoutput.cc           | 17 ++++++++++-------
 gcc/testsuite/gcc.dg/cpp/pr60014-1.c |  9 +++++++++
 gcc/testsuite/gcc.dg/cpp/pr60014-1.h |  5 +++++
 gcc/testsuite/gcc.dg/cpp/pr60014-2.c |  5 +++++
 gcc/testsuite/gcc.dg/cpp/pr60014-2.h |  5 +++++
 gcc/testsuite/gcc.dg/cpp/pr60014-3.c | 16 ++++++++++++++++
 gcc/testsuite/gcc.dg/cpp/pr60014-3.h |  2 ++
 7 files changed, 52 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-1.c
 create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-1.h
 create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-2.c
 create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-2.h
 create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-3.c
 create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-3.h
  

Comments

Jeff Law Oct. 12, 2022, 5:09 p.m. UTC | #1
On 10/8/22 15:18, Lewis Hyatt via Gcc-patches wrote:
> The token_streamer class (which implements gcc mode -E and
> -save-temps/-no-integrated-cpp) needs to keep track whether the last tokens
> output were in a system header, so that it can generate line marker
> annotations as necessary for a downstream consumer to reconstruct the
> state. The logic for tracking it, which was added by r5-1863 to resolve
> PR60723, has some edge case issues as revealed by the three new test
> cases. The first, coming from the original PR60014, was incidentally fixed by
> r9-1926 for unrelated reasons. The other two were still failing on master
> prior to this commit. Such code paths were not realizable prior to r13-1544,
> which made it possible for the token streamer to see CPP_PRAGMA tokens in more
> contexts.
>
> The two main issues being corrected here are:
>
> 1) print.prev_was_system_token needs to indicate whether the previous token
> output was in a system location. However, it was not being set on every token,
> only on those that triggered the main code path; specifically it was not
> triggered on a CPP_PRAGMA token. Testcase 2 covers this case.
>
> 2) The token_streamer uses a variable "line_marker_emitted" to remember
> whether a line marker has been emitted while processing a given token, so that
> it wouldn't be done more than once in case multiple conditions requiring a
> line marker are true. There was no reason for this to be a member variable
> that retains its value from token to token, since it is just needed for
> tracking the state locally while processing a single given token. The fact
> that it could retain its value for a subsequent token is rather difficult to
> observe, but testcase 3 demonstrates incorrect behavior resulting from
> that. Moving this to a local variable also simplifies understanding the
> control flow going forward.
>
> gcc/c-family/ChangeLog:
>
> 	PR preprocessor/60014
> 	PR preprocessor/60723
> 	* c-ppoutput.cc (class token_streamer): Remove member
> 	line_marker_emitted to...
> 	(token_streamer::stream): ...a local variable here. Set
> 	print.prev_was_system_token on all code paths.
>
> gcc/testsuite/ChangeLog:
>
> 	PR preprocessor/60014
> 	PR preprocessor/60723
> 	* gcc.dg/cpp/pr60014-1.c: New test.
> 	* gcc.dg/cpp/pr60014-1.h: New test.
> 	* gcc.dg/cpp/pr60014-2.c: New test.
> 	* gcc.dg/cpp/pr60014-2.h: New test.
> 	* gcc.dg/cpp/pr60014-3.c: New test.
> 	* gcc.dg/cpp/pr60014-3.h: New test.

OK

jeff
  

Patch

diff --git a/gcc/c-family/c-ppoutput.cc b/gcc/c-family/c-ppoutput.cc
index 98081ccfbb0..a99d9e9c5ca 100644
--- a/gcc/c-family/c-ppoutput.cc
+++ b/gcc/c-family/c-ppoutput.cc
@@ -184,15 +184,13 @@  class token_streamer
   bool avoid_paste;
   bool do_line_adjustments;
   bool in_pragma;
-  bool line_marker_emitted;
 
  public:
   token_streamer (cpp_reader *pfile)
     :avoid_paste (false),
     do_line_adjustments (cpp_get_options (pfile)->lang != CLK_ASM
 			 && !flag_no_line_commands),
-    in_pragma (false),
-    line_marker_emitted (false)
+    in_pragma (false)
     {
       gcc_assert (!print.streamer);
       print.streamer = this;
@@ -227,7 +225,14 @@  token_streamer::stream (cpp_reader *pfile, const cpp_token *token,
   if (token->type == CPP_EOF)
     return;
 
+  /* Keep track when we move into and out of system locations.  */
+  const bool is_system_token = in_system_header_at (loc);
+  const bool system_state_changed
+    = (is_system_token != print.prev_was_system_token);
+  print.prev_was_system_token = is_system_token;
+
   /* Subtle logic to output a space if and only if necessary.  */
+  bool line_marker_emitted = false;
   if (avoid_paste)
     {
       unsigned src_line = LOCATION_LINE (loc);
@@ -301,19 +306,17 @@  token_streamer::stream (cpp_reader *pfile, const cpp_token *token,
       if (do_line_adjustments
 	  && !in_pragma
 	  && !line_marker_emitted
-	  && print.prev_was_system_token != !!in_system_header_at (loc)
+	  && system_state_changed
 	  && !is_location_from_builtin_token (loc))
 	/* The system-ness of this token is different from the one of
 	   the previous token.  Let's emit a line change to mark the
 	   new system-ness before we emit the token.  */
 	{
-	  do_line_change (pfile, token, loc, false);
-	  print.prev_was_system_token = !!in_system_header_at (loc);
+	  line_marker_emitted = do_line_change (pfile, token, loc, false);
 	}
       if (!in_pragma || should_output_pragmas ())
 	{
 	  cpp_output_token (token, print.outf);
-	  line_marker_emitted = false;
 	  print.printed = true;
 	}
     }
diff --git a/gcc/testsuite/gcc.dg/cpp/pr60014-1.c b/gcc/testsuite/gcc.dg/cpp/pr60014-1.c
new file mode 100644
index 00000000000..de52b30c161
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/pr60014-1.c
@@ -0,0 +1,9 @@ 
+/* { dg-do compile } */
+/* { dg-options "-save-temps -Wint-conversion" } */
+#include "pr60014-1.h"
+int main ()
+{
+    X(a, 
+      b);
+    char *should_warn = 1; /* { dg-warning {-Wint-conversion} } */
+}
diff --git a/gcc/testsuite/gcc.dg/cpp/pr60014-1.h b/gcc/testsuite/gcc.dg/cpp/pr60014-1.h
new file mode 100644
index 00000000000..50c159c44ee
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/pr60014-1.h
@@ -0,0 +1,5 @@ 
+#pragma GCC system_header
+
+/* N.B. the semicolon in the macro definition is important, since it produces a
+   second token from this system header on the same line as the __LINE__ token.  */
+#define X(a, b) __LINE__;
diff --git a/gcc/testsuite/gcc.dg/cpp/pr60014-2.c b/gcc/testsuite/gcc.dg/cpp/pr60014-2.c
new file mode 100644
index 00000000000..115c9858ec7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/pr60014-2.c
@@ -0,0 +1,5 @@ 
+/* { dg-do compile } */
+/* { dg-options "-save-temps -Wint-conversion" } */
+#include "pr60014-2.h"
+X
+char *should_warn = 1; /* { dg-warning {-Wint-conversion} } */
diff --git a/gcc/testsuite/gcc.dg/cpp/pr60014-2.h b/gcc/testsuite/gcc.dg/cpp/pr60014-2.h
new file mode 100644
index 00000000000..455f1ed2e5b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/pr60014-2.h
@@ -0,0 +1,5 @@ 
+#pragma GCC system_header
+
+/* N.B. the semicolon in the macro definition is important, since it produces a
+   second token from this system header on the same line as the _Pragma.  */
+#define X _Pragma("GCC diagnostic push");
diff --git a/gcc/testsuite/gcc.dg/cpp/pr60014-3.c b/gcc/testsuite/gcc.dg/cpp/pr60014-3.c
new file mode 100644
index 00000000000..c4306035f05
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/pr60014-3.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-save-temps -Wint-conversion" } */
+#include "pr60014-3.h"
+
+/* The line continuation on the next line is what triggers the problem here,
+   because it synchronizes the output line between the input source and the
+   preprocessed output (whereas without the line continuation, the
+   preprocessed output would be off by one line from having output a #pragma
+   on a line by itself). Therefore, the token streamer doesn't have a reason
+   to generate a line marker purely based on the line number. That gives it
+   the chance to consider whether instead it needs to generate a line marker
+   based on a change of the "in-system-header" state, allowing us to test that
+   it comes to the right conclusion, which it did not, prior to this commit to
+   resolve PR60014.  */
+P(GCC diagnostic) \
+const char *should_warn = 1; /* { dg-warning {-Wint-conversion} } */
diff --git a/gcc/testsuite/gcc.dg/cpp/pr60014-3.h b/gcc/testsuite/gcc.dg/cpp/pr60014-3.h
new file mode 100644
index 00000000000..aedf038d95f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/cpp/pr60014-3.h
@@ -0,0 +1,2 @@ 
+#pragma GCC system_header
+#define P(x) _Pragma(#x)