[RESEND,0/1] RFC: P1689R5 support

Message ID 20221004151200.1275636-1-ben.boeckel@kitware.com
Headers
Series RFC: P1689R5 support |

Message

Ben Boeckel Oct. 4, 2022, 3:11 p.m. UTC
  This patch adds initial support for ISO C++'s [P1689R5][], a format for
describing C++ module requirements and provisions based on the source
code. This is required because compiling C++ with modules is not
embarrassingly parallel and need to be ordered to ensure that `import
some_module;` can be satisfied in time by making sure that the TU with
`export import some_module;` is compiled first.

[P1689R5]: https://isocpp.org/files/papers/P1689R5.html

I'd like feedback on the approach taken here with respect to the
user-visible flags. I'll also note that header units are not supported
at this time because the current `-E` behavior with respect to `import
<some_header>;` is to search for an appropriate `.gcm` file which is not
something such a "scan" can support. A new mode will likely need to be
created (e.g., replacing `-E` with `-fc++-module-scanning` or something)
where headers are looked up "normally" and processed only as much as
scanning requires.

Testing is currently happening in CMake's CI using a prior revision of
this patch (the differences are basically the changelog, some style, and
`trtbd` instead of `p1689r5` as the format name).

For testing within GCC, I'll work on the following:

- scanning non-module source
- scanning module-importing source (`import X;`)
- scanning module-exporting source (`export module X;`)
- scanning module implementation unit (`module X;`)
- flag combinations?

Are there existing tools for handling JSON output for testing purposes?
Basically, something that I can add to the test suite that doesn't care
about whitespace, but checks the structure (with sensible replacements
for absolute paths where relevant)?

For the record, Clang has patches with similar flags and behavior by
Chuanqi Xu here:

    https://reviews.llvm.org/D134269

with the same flags (though using my old `trtbd` spelling for the
format name).

Thanks,

--Ben

Ben Boeckel (1):
  p1689r5: initial support

 gcc/ChangeLog           |   9 ++
 gcc/c-family/ChangeLog  |   6 +
 gcc/c-family/c-opts.cc  |  40 ++++++-
 gcc/c-family/c.opt      |  12 ++
 gcc/cp/ChangeLog        |   5 +
 gcc/cp/module.cc        |   3 +-
 gcc/doc/invoke.texi     |  15 +++
 gcc/fortran/ChangeLog   |   5 +
 gcc/fortran/cpp.cc      |   4 +-
 gcc/genmatch.cc         |   2 +-
 gcc/input.cc            |   4 +-
 libcpp/ChangeLog        |  11 ++
 libcpp/include/cpplib.h |  12 +-
 libcpp/include/mkdeps.h |  17 ++-
 libcpp/init.cc          |  14 ++-
 libcpp/mkdeps.cc        | 235 ++++++++++++++++++++++++++++++++++++++--
 16 files changed, 368 insertions(+), 26 deletions(-)


base-commit: d812e8cb2a920fd75768e16ca8ded59ad93c172f
  

Comments

Jason Merrill Oct. 10, 2022, 8:21 p.m. UTC | #1
On 10/4/22 11:11, Ben Boeckel wrote:
> This patch adds initial support for ISO C++'s [P1689R5][], a format for
> describing C++ module requirements and provisions based on the source
> code. This is required because compiling C++ with modules is not
> embarrassingly parallel and need to be ordered to ensure that `import
> some_module;` can be satisfied in time by making sure that the TU with
> `export import some_module;` is compiled first.
> 
> [P1689R5]: https://isocpp.org/files/papers/P1689R5.html
> 
> I'd like feedback on the approach taken here with respect to the
> user-visible flags. I'll also note that header units are not supported
> at this time because the current `-E` behavior with respect to `import
> <some_header>;` is to search for an appropriate `.gcm` file which is not
> something such a "scan" can support. A new mode will likely need to be
> created (e.g., replacing `-E` with `-fc++-module-scanning` or something)
> where headers are looked up "normally" and processed only as much as
> scanning requires.
> 
> Testing is currently happening in CMake's CI using a prior revision of
> this patch (the differences are basically the changelog, some style, and
> `trtbd` instead of `p1689r5` as the format name).
> 
> For testing within GCC, I'll work on the following:
> 
> - scanning non-module source
> - scanning module-importing source (`import X;`)
> - scanning module-exporting source (`export module X;`)
> - scanning module implementation unit (`module X;`)
> - flag combinations?
> 
> Are there existing tools for handling JSON output for testing purposes?

David Malcolm would probably know best about JSON wrangling.

> Basically, something that I can add to the test suite that doesn't care
> about whitespace, but checks the structure (with sensible replacements
> for absolute paths where relevant)?

Various tests in g++.dg/debug/dwarf2 handle that sort of thing with regexps.

> For the record, Clang has patches with similar flags and behavior by
> Chuanqi Xu here:
> 
>      https://reviews.llvm.org/D134269
> 
> with the same flags (though using my old `trtbd` spelling for the
> format name).
> 
> Thanks,
> 
> --Ben
> 
> Ben Boeckel (1):
>    p1689r5: initial support
> 
>   gcc/ChangeLog           |   9 ++
>   gcc/c-family/ChangeLog  |   6 +
>   gcc/c-family/c-opts.cc  |  40 ++++++-
>   gcc/c-family/c.opt      |  12 ++
>   gcc/cp/ChangeLog        |   5 +
>   gcc/cp/module.cc        |   3 +-
>   gcc/doc/invoke.texi     |  15 +++
>   gcc/fortran/ChangeLog   |   5 +
>   gcc/fortran/cpp.cc      |   4 +-
>   gcc/genmatch.cc         |   2 +-
>   gcc/input.cc            |   4 +-
>   libcpp/ChangeLog        |  11 ++
>   libcpp/include/cpplib.h |  12 +-
>   libcpp/include/mkdeps.h |  17 ++-
>   libcpp/init.cc          |  14 ++-
>   libcpp/mkdeps.cc        | 235 ++++++++++++++++++++++++++++++++++++++--
>   16 files changed, 368 insertions(+), 26 deletions(-)
> 
> 
> base-commit: d812e8cb2a920fd75768e16ca8ded59ad93c172f
  
David Malcolm Oct. 13, 2022, 5:08 p.m. UTC | #2
On Mon, 2022-10-10 at 16:21 -0400, Jason Merrill wrote:
> On 10/4/22 11:11, Ben Boeckel wrote:
> > This patch adds initial support for ISO C++'s [P1689R5][], a format
> > for
> > describing C++ module requirements and provisions based on the
> > source
> > code. This is required because compiling C++ with modules is not
> > embarrassingly parallel and need to be ordered to ensure that
> > `import
> > some_module;` can be satisfied in time by making sure that the TU
> > with
> > `export import some_module;` is compiled first.
> > 
> > [P1689R5]: https://isocpp.org/files/papers/P1689R5.html
> > 
> > I'd like feedback on the approach taken here with respect to the
> > user-visible flags. I'll also note that header units are not
> > supported
> > at this time because the current `-E` behavior with respect to
> > `import
> > <some_header>;` is to search for an appropriate `.gcm` file which
> > is not
> > something such a "scan" can support. A new mode will likely need to
> > be
> > created (e.g., replacing `-E` with `-fc++-module-scanning` or
> > something)
> > where headers are looked up "normally" and processed only as much
> > as
> > scanning requires.
> > 
> > Testing is currently happening in CMake's CI using a prior revision
> > of
> > this patch (the differences are basically the changelog, some
> > style, and
> > `trtbd` instead of `p1689r5` as the format name).
> > 
> > For testing within GCC, I'll work on the following:
> > 
> > - scanning non-module source
> > - scanning module-importing source (`import X;`)
> > - scanning module-exporting source (`export module X;`)
> > - scanning module implementation unit (`module X;`)
> > - flag combinations?
> > 
> > Are there existing tools for handling JSON output for testing
> > purposes?
> 
> David Malcolm would probably know best about JSON wrangling.

Unfortunately our JSON output doesn't make any guarantees about the
ordering of keys within an object, so the precise textual output
changes from run to run.  I've coped with that in my test cases by
limiting myself to simple regexes of fragments of the JSON output.

Martin Liska [CCed] went much further in
4e275dccfc2467b3fe39012a3dd2a80bac257dd0 by adding a run-gcov-pytest
DejaGnu directive, allowing for test cases for gcov to be written in
Python, which can thus test much more interesting assertions about the
generated JSON.

Dave

> 
> > Basically, something that I can add to the test suite that doesn't
> > care
> > about whitespace, but checks the structure (with sensible
> > replacements
> > for absolute paths where relevant)?
> 
> Various tests in g++.dg/debug/dwarf2 handle that sort of thing with
> regexps.
> 
> > For the record, Clang has patches with similar flags and behavior
> > by
> > Chuanqi Xu here:
> > 
> >      https://reviews.llvm.org/D134269
> > 
> > with the same flags (though using my old `trtbd` spelling for the
> > format name).
> > 
> > Thanks,
> > 
> > --Ben
> > 
> > Ben Boeckel (1):
> >    p1689r5: initial support
> > 
> >   gcc/ChangeLog           |   9 ++
> >   gcc/c-family/ChangeLog  |   6 +
> >   gcc/c-family/c-opts.cc  |  40 ++++++-
> >   gcc/c-family/c.opt      |  12 ++
> >   gcc/cp/ChangeLog        |   5 +
> >   gcc/cp/module.cc        |   3 +-
> >   gcc/doc/invoke.texi     |  15 +++
> >   gcc/fortran/ChangeLog   |   5 +
> >   gcc/fortran/cpp.cc      |   4 +-
> >   gcc/genmatch.cc         |   2 +-
> >   gcc/input.cc            |   4 +-
> >   libcpp/ChangeLog        |  11 ++
> >   libcpp/include/cpplib.h |  12 +-
> >   libcpp/include/mkdeps.h |  17 ++-
> >   libcpp/init.cc          |  14 ++-
> >   libcpp/mkdeps.cc        | 235
> > ++++++++++++++++++++++++++++++++++++++--
> >   16 files changed, 368 insertions(+), 26 deletions(-)
> > 
> > 
> > base-commit: d812e8cb2a920fd75768e16ca8ded59ad93c172f
>
  
Ben Boeckel Oct. 18, 2022, 12:22 p.m. UTC | #3
On Thu, Oct 13, 2022 at 13:08:46 -0400, David Malcolm wrote:
> On Mon, 2022-10-10 at 16:21 -0400, Jason Merrill wrote:
> > David Malcolm would probably know best about JSON wrangling.
> 
> Unfortunately our JSON output doesn't make any guarantees about the
> ordering of keys within an object, so the precise textual output
> changes from run to run.  I've coped with that in my test cases by
> limiting myself to simple regexes of fragments of the JSON output.
> 
> Martin Liska [CCed] went much further in
> 4e275dccfc2467b3fe39012a3dd2a80bac257dd0 by adding a run-gcov-pytest
> DejaGnu directive, allowing for test cases for gcov to be written in
> Python, which can thus test much more interesting assertions about the
> generated JSON.

Ok, if Python is acceptable, I'll use its stdlib to do "fancy" things.
Part of this is because I want to assert that unnecessary fields don't
exist and that sounds…unlikely to be possible in any maintainable way
(assuming it is possible) with regexen. `jq` could help immensely, but
that is probably a bridge too far :) .

Thanks,

--Ben
  
Martin Liška Oct. 19, 2022, 7:21 a.m. UTC | #4
On 10/18/22 14:22, Ben Boeckel wrote:
> On Thu, Oct 13, 2022 at 13:08:46 -0400, David Malcolm wrote:
>> On Mon, 2022-10-10 at 16:21 -0400, Jason Merrill wrote:
>>> David Malcolm would probably know best about JSON wrangling.
>>
>> Unfortunately our JSON output doesn't make any guarantees about the
>> ordering of keys within an object, so the precise textual output
>> changes from run to run.  I've coped with that in my test cases by
>> limiting myself to simple regexes of fragments of the JSON output.
>>
>> Martin Liska [CCed] went much further in
>> 4e275dccfc2467b3fe39012a3dd2a80bac257dd0 by adding a run-gcov-pytest
>> DejaGnu directive, allowing for test cases for gcov to be written in
>> Python, which can thus test much more interesting assertions about the
>> generated JSON.
> 
> Ok, if Python is acceptable, I'll use its stdlib to do "fancy" things.
> Part of this is because I want to assert that unnecessary fields don't
> exist and that sounds…unlikely to be possible in any maintainable way
> (assuming it is possible) with regexen. `jq` could help immensely, but
> that is probably a bridge too far :) .

Yes, please use Python if you have a more complicated output verification.

Examples I introduced:
./gcc/testsuite/g++.dg/gcov/test-pr98273.py
./gcc/testsuite/g++.dg/gcov/test-gcov-17.py

Martin

> 
> Thanks,
> 
> --Ben