preprocessor: Fix tracking of system header state [PR60014, PR60723]
Checks
Commit Message
The token_streamer class (which implements gcc mode -E and
-save-temps/-no-integrated-cpp) needs to keep track whether the last tokens
output were in a system header, so that it can generate line marker
annotations as necessary for a downstream consumer to reconstruct the
state. The logic for tracking it, which was added by r5-1863 to resolve
PR60723, has some edge case issues as revealed by the three new test
cases. The first, coming from the original PR60014, was incidentally fixed by
r9-1926 for unrelated reasons. The other two were still failing on master
prior to this commit. Such code paths were not realizable prior to r13-1544,
which made it possible for the token streamer to see CPP_PRAGMA tokens in more
contexts.
The two main issues being corrected here are:
1) print.prev_was_system_token needs to indicate whether the previous token
output was in a system location. However, it was not being set on every token,
only on those that triggered the main code path; specifically it was not
triggered on a CPP_PRAGMA token. Testcase 2 covers this case.
2) The token_streamer uses a variable "line_marker_emitted" to remember
whether a line marker has been emitted while processing a given token, so that
it wouldn't be done more than once in case multiple conditions requiring a
line marker are true. There was no reason for this to be a member variable
that retains its value from token to token, since it is just needed for
tracking the state locally while processing a single given token. The fact
that it could retain its value for a subsequent token is rather difficult to
observe, but testcase 3 demonstrates incorrect behavior resulting from
that. Moving this to a local variable also simplifies understanding the
control flow going forward.
gcc/c-family/ChangeLog:
PR preprocessor/60014
PR preprocessor/60723
* c-ppoutput.cc (class token_streamer): Remove member
line_marker_emitted to...
(token_streamer::stream): ...a local variable here. Set
print.prev_was_system_token on all code paths.
gcc/testsuite/ChangeLog:
PR preprocessor/60014
PR preprocessor/60723
* gcc.dg/cpp/pr60014-1.c: New test.
* gcc.dg/cpp/pr60014-1.h: New test.
* gcc.dg/cpp/pr60014-2.c: New test.
* gcc.dg/cpp/pr60014-2.h: New test.
* gcc.dg/cpp/pr60014-3.c: New test.
* gcc.dg/cpp/pr60014-3.h: New test.
---
Notes:
Hello-
I tried to describe it all in the commit message, please see also
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60014#c8 for more
details. bootstrap+regtest all languages looks good on x86-64 Linux. Please
let me know if it looks OK? Thanks!
-Lewis
gcc/c-family/c-ppoutput.cc | 17 ++++++++++-------
gcc/testsuite/gcc.dg/cpp/pr60014-1.c | 9 +++++++++
gcc/testsuite/gcc.dg/cpp/pr60014-1.h | 5 +++++
gcc/testsuite/gcc.dg/cpp/pr60014-2.c | 5 +++++
gcc/testsuite/gcc.dg/cpp/pr60014-2.h | 5 +++++
gcc/testsuite/gcc.dg/cpp/pr60014-3.c | 16 ++++++++++++++++
gcc/testsuite/gcc.dg/cpp/pr60014-3.h | 2 ++
7 files changed, 52 insertions(+), 7 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-1.c
create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-1.h
create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-2.c
create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-2.h
create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-3.c
create mode 100644 gcc/testsuite/gcc.dg/cpp/pr60014-3.h
Comments
On 10/8/22 15:18, Lewis Hyatt via Gcc-patches wrote:
> The token_streamer class (which implements gcc mode -E and
> -save-temps/-no-integrated-cpp) needs to keep track whether the last tokens
> output were in a system header, so that it can generate line marker
> annotations as necessary for a downstream consumer to reconstruct the
> state. The logic for tracking it, which was added by r5-1863 to resolve
> PR60723, has some edge case issues as revealed by the three new test
> cases. The first, coming from the original PR60014, was incidentally fixed by
> r9-1926 for unrelated reasons. The other two were still failing on master
> prior to this commit. Such code paths were not realizable prior to r13-1544,
> which made it possible for the token streamer to see CPP_PRAGMA tokens in more
> contexts.
>
> The two main issues being corrected here are:
>
> 1) print.prev_was_system_token needs to indicate whether the previous token
> output was in a system location. However, it was not being set on every token,
> only on those that triggered the main code path; specifically it was not
> triggered on a CPP_PRAGMA token. Testcase 2 covers this case.
>
> 2) The token_streamer uses a variable "line_marker_emitted" to remember
> whether a line marker has been emitted while processing a given token, so that
> it wouldn't be done more than once in case multiple conditions requiring a
> line marker are true. There was no reason for this to be a member variable
> that retains its value from token to token, since it is just needed for
> tracking the state locally while processing a single given token. The fact
> that it could retain its value for a subsequent token is rather difficult to
> observe, but testcase 3 demonstrates incorrect behavior resulting from
> that. Moving this to a local variable also simplifies understanding the
> control flow going forward.
>
> gcc/c-family/ChangeLog:
>
> PR preprocessor/60014
> PR preprocessor/60723
> * c-ppoutput.cc (class token_streamer): Remove member
> line_marker_emitted to...
> (token_streamer::stream): ...a local variable here. Set
> print.prev_was_system_token on all code paths.
>
> gcc/testsuite/ChangeLog:
>
> PR preprocessor/60014
> PR preprocessor/60723
> * gcc.dg/cpp/pr60014-1.c: New test.
> * gcc.dg/cpp/pr60014-1.h: New test.
> * gcc.dg/cpp/pr60014-2.c: New test.
> * gcc.dg/cpp/pr60014-2.h: New test.
> * gcc.dg/cpp/pr60014-3.c: New test.
> * gcc.dg/cpp/pr60014-3.h: New test.
OK
jeff
@@ -184,15 +184,13 @@ class token_streamer
bool avoid_paste;
bool do_line_adjustments;
bool in_pragma;
- bool line_marker_emitted;
public:
token_streamer (cpp_reader *pfile)
:avoid_paste (false),
do_line_adjustments (cpp_get_options (pfile)->lang != CLK_ASM
&& !flag_no_line_commands),
- in_pragma (false),
- line_marker_emitted (false)
+ in_pragma (false)
{
gcc_assert (!print.streamer);
print.streamer = this;
@@ -227,7 +225,14 @@ token_streamer::stream (cpp_reader *pfile, const cpp_token *token,
if (token->type == CPP_EOF)
return;
+ /* Keep track when we move into and out of system locations. */
+ const bool is_system_token = in_system_header_at (loc);
+ const bool system_state_changed
+ = (is_system_token != print.prev_was_system_token);
+ print.prev_was_system_token = is_system_token;
+
/* Subtle logic to output a space if and only if necessary. */
+ bool line_marker_emitted = false;
if (avoid_paste)
{
unsigned src_line = LOCATION_LINE (loc);
@@ -301,19 +306,17 @@ token_streamer::stream (cpp_reader *pfile, const cpp_token *token,
if (do_line_adjustments
&& !in_pragma
&& !line_marker_emitted
- && print.prev_was_system_token != !!in_system_header_at (loc)
+ && system_state_changed
&& !is_location_from_builtin_token (loc))
/* The system-ness of this token is different from the one of
the previous token. Let's emit a line change to mark the
new system-ness before we emit the token. */
{
- do_line_change (pfile, token, loc, false);
- print.prev_was_system_token = !!in_system_header_at (loc);
+ line_marker_emitted = do_line_change (pfile, token, loc, false);
}
if (!in_pragma || should_output_pragmas ())
{
cpp_output_token (token, print.outf);
- line_marker_emitted = false;
print.printed = true;
}
}
new file mode 100644
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-save-temps -Wint-conversion" } */
+#include "pr60014-1.h"
+int main ()
+{
+ X(a,
+ b);
+ char *should_warn = 1; /* { dg-warning {-Wint-conversion} } */
+}
new file mode 100644
@@ -0,0 +1,5 @@
+#pragma GCC system_header
+
+/* N.B. the semicolon in the macro definition is important, since it produces a
+ second token from this system header on the same line as the __LINE__ token. */
+#define X(a, b) __LINE__;
new file mode 100644
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-save-temps -Wint-conversion" } */
+#include "pr60014-2.h"
+X
+char *should_warn = 1; /* { dg-warning {-Wint-conversion} } */
new file mode 100644
@@ -0,0 +1,5 @@
+#pragma GCC system_header
+
+/* N.B. the semicolon in the macro definition is important, since it produces a
+ second token from this system header on the same line as the _Pragma. */
+#define X _Pragma("GCC diagnostic push");
new file mode 100644
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-save-temps -Wint-conversion" } */
+#include "pr60014-3.h"
+
+/* The line continuation on the next line is what triggers the problem here,
+ because it synchronizes the output line between the input source and the
+ preprocessed output (whereas without the line continuation, the
+ preprocessed output would be off by one line from having output a #pragma
+ on a line by itself). Therefore, the token streamer doesn't have a reason
+ to generate a line marker purely based on the line number. That gives it
+ the chance to consider whether instead it needs to generate a line marker
+ based on a change of the "in-system-header" state, allowing us to test that
+ it comes to the right conclusion, which it did not, prior to this commit to
+ resolve PR60014. */
+P(GCC diagnostic) \
+const char *should_warn = 1; /* { dg-warning {-Wint-conversion} } */
new file mode 100644
@@ -0,0 +1,2 @@
+#pragma GCC system_header
+#define P(x) _Pragma(#x)