[v3,0/4] diagnostics: libcpp: Overhaul locations for _Pragma tokens

Message ID 20230721230851.1981434-1-lhyatt@gmail.com
Headers
Series diagnostics: libcpp: Overhaul locations for _Pragma tokens |

Message

Lewis Hyatt July 21, 2023, 11:08 p.m. UTC
  Hello-

This is an update to the v2 patch series last sent in January:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609473.html

While I did not receive any feedback on the v2 patches yet, they did need some
rebasing on top of other recent commits to input.cc, so I thought it would be
helpful to send them again now. The patches have not otherwise changed from
v2, and the above-linked message explains how all the patches fit in with the
original v1 series sent last November.

Dave, I would appreciate it very much if you could please let me know what you
think of this approach? I feel like the diagnostics we currently
output for _Pragmas are worth improving. As a reminder, say for this example:

=====
 #define S "GCC diagnostic ignored \"oops"
 _Pragma(S)
=====

We currently output:

=====
file.cpp:2:24: warning: missing terminating " character
    2 | _Pragma(S)
      |                        ^
=====

While after these patches, we would output:

======
<generated>:1:24: warning: missing terminating " character
    1 | GCC diagnostic ignored "oops
      |                        ^
file.cpp:2:1: note: in <_Pragma directive>
    2 | _Pragma(S)
      | ^~~~~~~
======

Thanks!

-Lewis
  

Comments

David Malcolm July 28, 2023, 10:22 p.m. UTC | #1
On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> Hello-
> 
> This is an update to the v2 patch series last sent in January:
> https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609473.html
> 
> While I did not receive any feedback on the v2 patches yet, they did
> need some
> rebasing on top of other recent commits to input.cc, so I thought it
> would be
> helpful to send them again now. The patches have not otherwise
> changed from
> v2, and the above-linked message explains how all the patches fit in
> with the
> original v1 series sent last November.
> 
> Dave, I would appreciate it very much if you could please let me know
> what you
> think of this approach? I feel like the diagnostics we currently
> output for _Pragmas are worth improving. As a reminder, say for this
> example:
> 
> =====
>  #define S "GCC diagnostic ignored \"oops"
>  _Pragma(S)
> =====
> 
> We currently output:
> 
> =====
> file.cpp:2:24: warning: missing terminating " character
>     2 | _Pragma(S)
>       |                        ^
> =====
> 
> While after these patches, we would output:
> 
> ======
> <generated>:1:24: warning: missing terminating " character
>     1 | GCC diagnostic ignored "oops
>       |                        ^
> file.cpp:2:1: note: in <_Pragma directive>
>     2 | _Pragma(S)
>       | ^~~~~~~
> ======
> 
> Thanks!

Hi Lewis; sorry for not responding to the v2 patches.

I've started looking at the v3 patches in detail, but I have some high-
level questions about memory usage:

Am I right in thinking that the effect of this patch is that for every
_Pragma in the source we will create a new line_map_ordinary, and a new
buffer for the stringified content of that _Pragma, and that these
allocations will persist for the rest of the compilation?  (plus a
little extra allocation within the "location_t" space from 0 to
0x7fffffff).

It sounds like this will probably be a rounding error that won't be
noticable in profiling, but did you attempt any such measurement of the
memory usage before/after this patch on some real-world projects?

Thanks
Dave
  
Lewis Hyatt July 29, 2023, 2:27 p.m. UTC | #2
On Fri, Jul 28, 2023 at 6:22 PM David Malcolm <dmalcolm@redhat.com> wrote:
>
> On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> > Hello-
> >
> > This is an update to the v2 patch series last sent in January:
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609473.html
> >
> > While I did not receive any feedback on the v2 patches yet, they did
> > need some
> > rebasing on top of other recent commits to input.cc, so I thought it
> > would be
> > helpful to send them again now. The patches have not otherwise
> > changed from
> > v2, and the above-linked message explains how all the patches fit in
> > with the
> > original v1 series sent last November.
> >
> > Dave, I would appreciate it very much if you could please let me know
> > what you
> > think of this approach? I feel like the diagnostics we currently
> > output for _Pragmas are worth improving. As a reminder, say for this
> > example:
> >
> > =====
> >  #define S "GCC diagnostic ignored \"oops"
> >  _Pragma(S)
> > =====
> >
> > We currently output:
> >
> > =====
> > file.cpp:2:24: warning: missing terminating " character
> >     2 | _Pragma(S)
> >       |                        ^
> > =====
> >
> > While after these patches, we would output:
> >
> > ======
> > <generated>:1:24: warning: missing terminating " character
> >     1 | GCC diagnostic ignored "oops
> >       |                        ^
> > file.cpp:2:1: note: in <_Pragma directive>
> >     2 | _Pragma(S)
> >       | ^~~~~~~
> > ======
> >
> > Thanks!
>
> Hi Lewis; sorry for not responding to the v2 patches.
>
> I've started looking at the v3 patches in detail, but I have some high-
> level questions about memory usage:
>
> Am I right in thinking that the effect of this patch is that for every
> _Pragma in the source we will create a new line_map_ordinary, and a new
> buffer for the stringified content of that _Pragma, and that these
> allocations will persist for the rest of the compilation?  (plus a
> little extra allocation within the "location_t" space from 0 to
> 0x7fffffff).
>
> It sounds like this will probably be a rounding error that won't be
> noticable in profiling, but did you attempt any such measurement of the
> memory usage before/after this patch on some real-world projects?
>
> Thanks
> Dave
>

Thanks for looking at the patches, I appreciate it whenever you have
time to get to them.

This is a fair point about the memory usage, basically it means that
each instance of a _Pragma has comparable memory footprint to a macro
definition. (In addition to the overheads you mentioned, it also
creates a macro map to generate a virtual location for the tokens, so
that it's able to output the "in expansion of _Pragma" note. That part
can be disabled with -ftrack-macro-expansion=0 at least.)

I had the sense that _Pragma isn't used often enough for that to be a
problem, but agreed it is worth checking. (I really hope this memory
usage isn't an issue since there are also numerous PRs complaining
about 32-bit limitations in location tracking, that make it tempting
to explore 64-bit line maps or some other option someday too.)

I tried one thing now, wxWidgets uses a lot of diagnostic pragmas
wrapped up inside macros that use _Pragma. (See
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55578). The testsuite
contains a file allheaders.cpp which includes the whole library, so I
tried compiling this into a pch, which I believe measures the entire
memory footprint including the ordinary and macro line maps and the
_Pragma strings. The resulting PCH sizes were:

279000173 bytes before the changes
279491345 bytes after the changes

So 0.1% bigger. Happy to check other projects too, do you have any
standard gotos? Maybe firefox or something I take it.

I see your other response on patch #1, I am thinking about that and
will reply later. Thanks again!

-Lewis
  
David Malcolm July 29, 2023, 4:03 p.m. UTC | #3
On Sat, 2023-07-29 at 10:27 -0400, Lewis Hyatt wrote:
> On Fri, Jul 28, 2023 at 6:22 PM David Malcolm <dmalcolm@redhat.com>
> wrote:
> > 
> > On Fri, 2023-07-21 at 19:08 -0400, Lewis Hyatt wrote:
> > > Hello-
> > > 
> > > This is an update to the v2 patch series last sent in January:
> > > https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609473.html
> > > 
> > > While I did not receive any feedback on the v2 patches yet, they
> > > did
> > > need some
> > > rebasing on top of other recent commits to input.cc, so I thought
> > > it
> > > would be
> > > helpful to send them again now. The patches have not otherwise
> > > changed from
> > > v2, and the above-linked message explains how all the patches fit
> > > in
> > > with the
> > > original v1 series sent last November.
> > > 
> > > Dave, I would appreciate it very much if you could please let me
> > > know
> > > what you
> > > think of this approach? I feel like the diagnostics we currently
> > > output for _Pragmas are worth improving. As a reminder, say for
> > > this
> > > example:
> > > 
> > > =====
> > >  #define S "GCC diagnostic ignored \"oops"
> > >  _Pragma(S)
> > > =====
> > > 
> > > We currently output:
> > > 
> > > =====
> > > file.cpp:2:24: warning: missing terminating " character
> > >     2 | _Pragma(S)
> > >       |                        ^
> > > =====
> > > 
> > > While after these patches, we would output:
> > > 
> > > ======
> > > <generated>:1:24: warning: missing terminating " character
> > >     1 | GCC diagnostic ignored "oops
> > >       |                        ^
> > > file.cpp:2:1: note: in <_Pragma directive>
> > >     2 | _Pragma(S)
> > >       | ^~~~~~~
> > > ======
> > > 
> > > Thanks!
> > 
> > Hi Lewis; sorry for not responding to the v2 patches.
> > 
> > I've started looking at the v3 patches in detail, but I have some
> > high-
> > level questions about memory usage:
> > 
> > Am I right in thinking that the effect of this patch is that for
> > every
> > _Pragma in the source we will create a new line_map_ordinary, and a
> > new
> > buffer for the stringified content of that _Pragma, and that these
> > allocations will persist for the rest of the compilation?  (plus a
> > little extra allocation within the "location_t" space from 0 to
> > 0x7fffffff).
> > 
> > It sounds like this will probably be a rounding error that won't be
> > noticable in profiling, but did you attempt any such measurement of
> > the
> > memory usage before/after this patch on some real-world projects?
> > 
> > Thanks
> > Dave
> > 
> 
> Thanks for looking at the patches, I appreciate it whenever you have
> time to get to them.
> 
> This is a fair point about the memory usage, basically it means that
> each instance of a _Pragma has comparable memory footprint to a macro
> definition. (In addition to the overheads you mentioned, it also
> creates a macro map to generate a virtual location for the tokens, so
> that it's able to output the "in expansion of _Pragma" note. That
> part
> can be disabled with -ftrack-macro-expansion=0 at least.)
> 
> I had the sense that _Pragma isn't used often enough for that to be a
> problem, but agreed it is worth checking. (I really hope this memory
> usage isn't an issue since there are also numerous PRs complaining
> about 32-bit limitations in location tracking, that make it tempting
> to explore 64-bit line maps or some other option someday too.)
> 
> I tried one thing now, wxWidgets uses a lot of diagnostic pragmas
> wrapped up inside macros that use _Pragma. (See
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55578). The testsuite
> contains a file allheaders.cpp which includes the whole library, so I
> tried compiling this into a pch, which I believe measures the entire
> memory footprint including the ordinary and macro line maps and the
> _Pragma strings. The resulting PCH sizes were:
> 
> 279000173 bytes before the changes
> 279491345 bytes after the changes
> 
> So 0.1% bigger. Happy to check other projects too, do you have any
> standard gotos? Maybe firefox or something I take it.

Thanks for doing that test; I think that slight increase on a heavy
user of _Pragma is acceptable.
> 
> I see your other response on patch #1, I am thinking about that and
> will reply later. Thanks again!

Thanks.  Hope that my patch #1 response makes sense and that I'm not
missing something about the way this works.

Dave