c, analyzer: support named constants in analyzer [PR106302]
Checks
Commit Message
The analyzer's file-descriptor state machine tracks the access mode of
opened files, so that it can emit -Wanalyzer-fd-access-mode-mismatch.
To do this, its symbolic execution needs to "know" the values of the
constants "O_RDONLY", "O_WRONLY", and "O_ACCMODE". Currently
analyzer/sm-fd.cc simply uses these values directly from the build-time
header files, but these are the values on the host, not those from the
target, which could be different (PR analyzer/106302).
In an earlier discussion of this issue:
https://gcc.gnu.org/pipermail/gcc/2022-June/238954.html
we talked about adding a target hook for this.
However, I've also been experimenting with extending the fd state
machine to track sockets (PR analyzer/106140). For this, it's useful to
"know" the values of the constants "SOCK_STREAM" and "SOCK_DGRAM".
Unfortunately, these seem to have many arbitrary differences from target
to target.
For example: Linux/glibc general has SOCK_STREAM == 1, SOCK_DGRAM == 2,
as does AIX, but annoyingly, e.g. Linux on MIPS has them the other way
around.
It seems to me that as the analyzer grows more ambitious modeling of the
behavior of APIs (perhaps via plugins) it's more likely that the
analyzer will need to know the values of named constants, which might
not even exist on the host.
For example, at LPC it was suggested to me that -fanalyzer could check
rules about memory management inside the Linux kernel (probably via a
plugin), but doing so involves a bunch of GFP_* flags (see PR 107472).
So rather than trying to capture all this knowledge in a target hook,
this patch attempts to get at named constant values from the user's
source code.
The patch adds an interface for frontends to call into the analyzer as
the translation unit finishes. The analyzer can then call back into the
frontend to ask about the values of the named constants it cares about
whilst the frontend's data structures are still around.
The patch implements this for the C frontend, which looks up the names
by looking for named CONST_DECLs (which handles enum values). Failing
that, it attempts to look up the values of macros but only the simplest
cases are supported (a non-traditional macro with a single CPP_NUMBER
token). It does this by building a buffer containing the macro
definition and rerunning a lexer on it.
The analyzer gracefully handles the cases where named values aren't
found (such as anything more complicated than described above).
The patch ports the analyzer to use this mechanism for "O_RDONLY",
"O_WRONLY", and "O_ACCMODE". I have successfully tested my socket patch
to also use this for "SOCK_STREAM" and "SOCK_DGRAM", so the technique
seems to work.
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Are the C frontend parts OK for trunk?
Thanks
Dave
gcc/ChangeLog:
PR analyzer/106302
* Makefile.in (ANALYZER_OBJS): Add analyzer/analyzer-language.o.
(GTFILES): Add analyzer/analyzer-language.cc.
gcc/analyzer/ChangeLog:
PR analyzer/106302
* analyzer-language.cc: New file.
* analyzer-language.h: New file.
* analyzer.h (get_stashed_constant_by_name): New decl.
(log_stashed_constants): New decl.
* engine.cc (impl_run_checkers): Call log_stashed_constants.
* region-model-impl-calls.cc
(region_model::impl_call_analyzer_dump_named_constant): New.
* region-model.cc (region_model::on_stmt_pre): Handle
__analyzer_dump_named_constant.
* region-model.h
(region_model::impl_call_analyzer_dump_named_constant): New decl.
* sm-fd.cc (fd_state_machine::m_O_ACCMODE): New.
(fd_state_machine::m_O_RDONLY): New.
(fd_state_machine::m_O_WRONLY): New.
(fd_state_machine::fd_state_machine): Initialize the new fields.
(fd_state_machine::get_access_mode_from_flag): Use the new fields,
rather than using the host values.
gcc/c/ChangeLog:
PR analyzer/106302
* c-parser.cc: Include "analyzer/analyzer-language.h" and "toplev.h".
(class ana::c_translation_unit): New.
(c_parser_translation_unit): Call ana::on_finish_translation_unit.
gcc/ChangeLog:
* doc/analyzer.texi: Document __analyzer_dump_named_constant.
gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/analyzer-decls.h
(__analyzer_dump_named_constant): New decl.
* gcc.dg/analyzer/fd-4.c (void): Likewise.
(O_ACCMODE): Define.
* gcc.dg/analyzer/fd-access-mode-enum.c: New test, based on .
* gcc.dg/analyzer/fd-5.c: ...this. Rename to...
* gcc.dg/analyzer/fd-access-mode-macros.c: ...this.
(O_ACCMODE): Define.
* gcc.dg/analyzer/fd-access-mode-target-headers.c: New test, also
based on fd-5.c.
(test_sm_fd_constants): New.
* gcc.dg/analyzer/fd-dup-1.c (O_ACCMODE): Define.
* gcc.dg/analyzer/named-constants-via-enum.c: New test.
* gcc.dg/analyzer/named-constants-via-macros-2.c: New test.
* gcc.dg/analyzer/named-constants-via-macros.c: New test.
Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
gcc/Makefile.in | 2 +
gcc/analyzer/analyzer-language.cc | 110 ++++++++++++++++++
gcc/analyzer/analyzer-language.h | 48 ++++++++
gcc/analyzer/analyzer.h | 3 +
gcc/analyzer/engine.cc | 1 +
gcc/analyzer/region-model-impl-calls.cc | 28 +++++
gcc/analyzer/region-model.cc | 4 +
gcc/analyzer/region-model.h | 2 +
gcc/analyzer/sm-fd.cc | 30 +++--
gcc/c/c-parser.cc | 91 +++++++++++++++
gcc/doc/analyzer.texi | 17 +++
.../gcc.dg/analyzer/analyzer-decls.h | 3 +
gcc/testsuite/gcc.dg/analyzer/fd-4.c | 1 +
.../gcc.dg/analyzer/fd-access-mode-enum.c | 60 ++++++++++
.../{fd-5.c => fd-access-mode-macros.c} | 1 +
.../analyzer/fd-access-mode-target-headers.c | 56 +++++++++
gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c | 1 +
.../analyzer/named-constants-via-enum.c | 20 ++++
.../analyzer/named-constants-via-macros-2.c | 15 +++
.../analyzer/named-constants-via-macros.c | 19 +++
20 files changed, 502 insertions(+), 10 deletions(-)
create mode 100644 gcc/analyzer/analyzer-language.cc
create mode 100644 gcc/analyzer/analyzer-language.h
create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-access-mode-enum.c
rename gcc/testsuite/gcc.dg/analyzer/{fd-5.c => fd-access-mode-macros.c} (98%)
create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-headers.c
create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-via-enum.c
create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros-2.c
create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros.c
Comments
I'd like to "ping" the C frontend parts of this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604739.html
FWIW I've just posted the socket patch that I referred to, here:
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605281.html
which depends on this patch.
Thanks
Dave
On Mon, 2022-10-31 at 15:07 -0400, David Malcolm wrote:
> The analyzer's file-descriptor state machine tracks the access mode
> of
> opened files, so that it can emit -Wanalyzer-fd-access-mode-mismatch.
>
> To do this, its symbolic execution needs to "know" the values of the
> constants "O_RDONLY", "O_WRONLY", and "O_ACCMODE". Currently
> analyzer/sm-fd.cc simply uses these values directly from the build-
> time
> header files, but these are the values on the host, not those from
> the
> target, which could be different (PR analyzer/106302).
>
> In an earlier discussion of this issue:
> https://gcc.gnu.org/pipermail/gcc/2022-June/238954.html
> we talked about adding a target hook for this.
>
> However, I've also been experimenting with extending the fd state
> machine to track sockets (PR analyzer/106140). For this, it's useful
> to
> "know" the values of the constants "SOCK_STREAM" and "SOCK_DGRAM".
> Unfortunately, these seem to have many arbitrary differences from
> target
> to target.
>
> For example: Linux/glibc general has SOCK_STREAM == 1, SOCK_DGRAM ==
> 2,
> as does AIX, but annoyingly, e.g. Linux on MIPS has them the other
> way
> around.
>
> It seems to me that as the analyzer grows more ambitious modeling of
> the
> behavior of APIs (perhaps via plugins) it's more likely that the
> analyzer will need to know the values of named constants, which might
> not even exist on the host.
>
> For example, at LPC it was suggested to me that -fanalyzer could
> check
> rules about memory management inside the Linux kernel (probably via a
> plugin), but doing so involves a bunch of GFP_* flags (see PR
> 107472).
>
> So rather than trying to capture all this knowledge in a target hook,
> this patch attempts to get at named constant values from the user's
> source code.
>
> The patch adds an interface for frontends to call into the analyzer
> as
> the translation unit finishes. The analyzer can then call back into
> the
> frontend to ask about the values of the named constants it cares
> about
> whilst the frontend's data structures are still around.
>
> The patch implements this for the C frontend, which looks up the
> names
> by looking for named CONST_DECLs (which handles enum values).
> Failing
> that, it attempts to look up the values of macros but only the
> simplest
> cases are supported (a non-traditional macro with a single CPP_NUMBER
> token). It does this by building a buffer containing the macro
> definition and rerunning a lexer on it.
>
> The analyzer gracefully handles the cases where named values aren't
> found (such as anything more complicated than described above).
>
> The patch ports the analyzer to use this mechanism for "O_RDONLY",
> "O_WRONLY", and "O_ACCMODE". I have successfully tested my socket
> patch
> to also use this for "SOCK_STREAM" and "SOCK_DGRAM", so the technique
> seems to work.
>
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
>
> Are the C frontend parts OK for trunk?
>
> Thanks
> Dave
>
> gcc/ChangeLog:
> PR analyzer/106302
> * Makefile.in (ANALYZER_OBJS): Add analyzer/analyzer-
> language.o.
> (GTFILES): Add analyzer/analyzer-language.cc.
>
> gcc/analyzer/ChangeLog:
> PR analyzer/106302
> * analyzer-language.cc: New file.
> * analyzer-language.h: New file.
> * analyzer.h (get_stashed_constant_by_name): New decl.
> (log_stashed_constants): New decl.
> * engine.cc (impl_run_checkers): Call log_stashed_constants.
> * region-model-impl-calls.cc
> (region_model::impl_call_analyzer_dump_named_constant): New.
> * region-model.cc (region_model::on_stmt_pre): Handle
> __analyzer_dump_named_constant.
> * region-model.h
> (region_model::impl_call_analyzer_dump_named_constant): New
> decl.
> * sm-fd.cc (fd_state_machine::m_O_ACCMODE): New.
> (fd_state_machine::m_O_RDONLY): New.
> (fd_state_machine::m_O_WRONLY): New.
> (fd_state_machine::fd_state_machine): Initialize the new
> fields.
> (fd_state_machine::get_access_mode_from_flag): Use the new
> fields,
> rather than using the host values.
>
> gcc/c/ChangeLog:
> PR analyzer/106302
> * c-parser.cc: Include "analyzer/analyzer-language.h" and
> "toplev.h".
> (class ana::c_translation_unit): New.
> (c_parser_translation_unit): Call
> ana::on_finish_translation_unit.
>
> gcc/ChangeLog:
> * doc/analyzer.texi: Document __analyzer_dump_named_constant.
>
> gcc/testsuite/ChangeLog:
> * gcc.dg/analyzer/analyzer-decls.h
> (__analyzer_dump_named_constant): New decl.
> * gcc.dg/analyzer/fd-4.c (void): Likewise.
> (O_ACCMODE): Define.
> * gcc.dg/analyzer/fd-access-mode-enum.c: New test, based on .
> * gcc.dg/analyzer/fd-5.c: ...this. Rename to...
> * gcc.dg/analyzer/fd-access-mode-macros.c: ...this.
> (O_ACCMODE): Define.
> * gcc.dg/analyzer/fd-access-mode-target-headers.c: New test,
> also
> based on fd-5.c.
> (test_sm_fd_constants): New.
> * gcc.dg/analyzer/fd-dup-1.c (O_ACCMODE): Define.
> * gcc.dg/analyzer/named-constants-via-enum.c: New test.
> * gcc.dg/analyzer/named-constants-via-macros-2.c: New test.
> * gcc.dg/analyzer/named-constants-via-macros.c: New test.
>
> Signed-off-by: David Malcolm <dmalcolm@redhat.com>
> ---
> gcc/Makefile.in | 2 +
> gcc/analyzer/analyzer-language.cc | 110
> ++++++++++++++++++
> gcc/analyzer/analyzer-language.h | 48 ++++++++
> gcc/analyzer/analyzer.h | 3 +
> gcc/analyzer/engine.cc | 1 +
> gcc/analyzer/region-model-impl-calls.cc | 28 +++++
> gcc/analyzer/region-model.cc | 4 +
> gcc/analyzer/region-model.h | 2 +
> gcc/analyzer/sm-fd.cc | 30 +++--
> gcc/c/c-parser.cc | 91 +++++++++++++++
> gcc/doc/analyzer.texi | 17 +++
> .../gcc.dg/analyzer/analyzer-decls.h | 3 +
> gcc/testsuite/gcc.dg/analyzer/fd-4.c | 1 +
> .../gcc.dg/analyzer/fd-access-mode-enum.c | 60 ++++++++++
> .../{fd-5.c => fd-access-mode-macros.c} | 1 +
> .../analyzer/fd-access-mode-target-headers.c | 56 +++++++++
> gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c | 1 +
> .../analyzer/named-constants-via-enum.c | 20 ++++
> .../analyzer/named-constants-via-macros-2.c | 15 +++
> .../analyzer/named-constants-via-macros.c | 19 +++
> 20 files changed, 502 insertions(+), 10 deletions(-)
> create mode 100644 gcc/analyzer/analyzer-language.cc
> create mode 100644 gcc/analyzer/analyzer-language.h
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-access-mode-
> enum.c
> rename gcc/testsuite/gcc.dg/analyzer/{fd-5.c => fd-access-mode-
> macros.c} (98%)
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/fd-access-mode-
> target-headers.c
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-
> via-enum.c
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-
> via-macros-2.c
> create mode 100644 gcc/testsuite/gcc.dg/analyzer/named-constants-
> via-macros.c
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index f672e6ea549..4539934cb03 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1249,6 +1249,7 @@ C_COMMON_OBJS = c-family/c-common.o c-family/c-
> cppbuiltin.o c-family/c-dump.o \
> ANALYZER_OBJS = \
> analyzer/analysis-plan.o \
> analyzer/analyzer.o \
> + analyzer/analyzer-language.o \
> analyzer/analyzer-logging.o \
> analyzer/analyzer-pass.o \
> analyzer/analyzer-selftests.o \
> @@ -2741,6 +2742,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h
> $(srcdir)/coretypes.h \
> $(srcdir)/internal-fn.h \
> $(srcdir)/calls.cc \
> $(srcdir)/omp-general.h \
> + $(srcdir)/analyzer/analyzer-language.cc \
> @all_gtfiles@
>
> # Compute the list of GT header files from the corresponding C
> sources,
> diff --git a/gcc/analyzer/analyzer-language.cc
> b/gcc/analyzer/analyzer-language.cc
> new file mode 100644
> index 00000000000..ba4352b729a
> --- /dev/null
> +++ b/gcc/analyzer/analyzer-language.cc
> @@ -0,0 +1,110 @@
> +/* Interface between analyzer and frontends.
> + Copyright (C) 2022 Free Software Foundation, Inc.
> + Contributed by David Malcolm <dmalcolm@redhat.com>.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful, but
> +WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +#include "config.h"
> +#define INCLUDE_MEMORY
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tree.h"
> +#include "stringpool.h"
> +#include "analyzer/analyzer.h"
> +#include "analyzer/analyzer-language.h"
> +#include "analyzer/analyzer-logging.h"
> +
> +/* Map from identifier to INTEGER_CST. */
> +static GTY (()) hash_map <tree, tree> *analyzer_stashed_constants;
> +
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Call into TU to try to find a value for NAME.
> + If found, stash its value within analyzer_stashed_constants. */
> +
> +static void
> +maybe_stash_named_constant (const translation_unit &tu, const char
> *name)
> +{
> + if (!analyzer_stashed_constants)
> + analyzer_stashed_constants = hash_map<tree, tree>::create_ggc
> ();
> +
> + tree id = get_identifier (name);
> + if (tree t = tu.lookup_constant_by_id (id))
> + {
> + gcc_assert (TREE_CODE (t) == INTEGER_CST);
> + analyzer_stashed_constants->put (id, t);
> + }
> +}
> +
> +/* Hook for frontend to call into analyzer when TU finishes.
> + This exists so that the analyzer can stash named constant values
> from
> + header files (e.g. macros and enums) for later use when modeling
> the
> + behaviors of APIs.
> +
> + By doing it this way, the analyzer can use the precise values for
> those
> + constants from the user's headers, rather than attempting to
> model them
> + as properties of the target. */
> +
> +void
> +on_finish_translation_unit (const translation_unit &tu)
> +{
> + /* Bail if the analyzer isn't enabled. */
> + if (!flag_analyzer)
> + return;
> +
> + /* Stash named constants for use by sm-fd.cc */
> + maybe_stash_named_constant (tu, "O_ACCMODE");
> + maybe_stash_named_constant (tu, "O_RDONLY");
> + maybe_stash_named_constant (tu, "O_WRONLY");
> +}
> +
> +/* Lookup NAME in the named constants stashed when the frontend TU
> finished.
> + Return either an INTEGER_CST, or NULL_TREE. */
> +
> +tree
> +get_stashed_constant_by_name (const char *name)
> +{
> + if (!analyzer_stashed_constants)
> + return NULL_TREE;
> + tree id = get_identifier (name);
> + if (tree *slot = analyzer_stashed_constants->get (id))
> + {
> + gcc_assert (TREE_CODE (*slot) == INTEGER_CST);
> + return *slot;
> + }
> + return NULL_TREE;
> +}
> +
> +/* Log all stashed named constants to LOGGER. */
> +
> +void
> +log_stashed_constants (logger *logger)
> +{
> + gcc_assert (logger);
> + LOG_SCOPE (logger);
> + if (analyzer_stashed_constants)
> + for (auto iter : *analyzer_stashed_constants)
> + logger->log ("%qE: %qE", iter.first, iter.second);
> +}
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
> +#include "gt-analyzer-language.h"
> diff --git a/gcc/analyzer/analyzer-language.h
> b/gcc/analyzer/analyzer-language.h
> new file mode 100644
> index 00000000000..33c4dd60623
> --- /dev/null
> +++ b/gcc/analyzer/analyzer-language.h
> @@ -0,0 +1,48 @@
> +/* Interface between analyzer and frontends.
> + Copyright (C) 2022 Free Software Foundation, Inc.
> + Contributed by David Malcolm <dmalcolm@redhat.com>.
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it
> +under the terms of the GNU General Public License as published by
> +the Free Software Foundation; either version 3, or (at your option)
> +any later version.
> +
> +GCC is distributed in the hope that it will be useful, but
> +WITHOUT ANY WARRANTY; without even the implied warranty of
> +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +General Public License for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3. If not see
> +<http://www.gnu.org/licenses/>. */
> +
> +#ifndef GCC_ANALYZER_LANGUAGE_H
> +#define GCC_ANALYZER_LANGUAGE_H
> +
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Abstract base class for representing a specific TU
> + to the analyzer. */
> +
> +class translation_unit
> +{
> + public:
> + /* Attempt to look up an value for identifier ID (e.g. in the
> headers that
> + have been seen). If it is defined and an integer (e.g. either
> as a
> + macro or enum), return the INTEGER_CST value, otherwise return
> NULL. */
> + virtual tree lookup_constant_by_id (tree id) const = 0;
> +};
> +
> +/* Analyzer hook for frontends to call at the end of the TU. */
> +
> +void on_finish_translation_unit (const translation_unit &tu);
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
> +#endif /* GCC_ANALYZER_LANGUAGE_H */
> diff --git a/gcc/analyzer/analyzer.h b/gcc/analyzer/analyzer.h
> index a2d79e4a59f..e0c2ef77405 100644
> --- a/gcc/analyzer/analyzer.h
> +++ b/gcc/analyzer/analyzer.h
> @@ -312,6 +312,9 @@ public:
> virtual bool terminate_path_p () const = 0;
> };
>
> +extern tree get_stashed_constant_by_name (const char *name);
> +extern void log_stashed_constants (logger *logger);
> +
> } // namespace ana
>
> extern bool is_special_named_call_p (const gcall *call, const char
> *funcname,
> diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
> index 52978dd0d37..40ee14c1eae 100644
> --- a/gcc/analyzer/engine.cc
> +++ b/gcc/analyzer/engine.cc
> @@ -6001,6 +6001,7 @@ impl_run_checkers (logger *logger)
> logger->log ("BITS_BIG_ENDIAN: %i", BITS_BIG_ENDIAN ? 1 : 0);
> logger->log ("BYTES_BIG_ENDIAN: %i", BYTES_BIG_ENDIAN ? 1 :
> 0);
> logger->log ("WORDS_BIG_ENDIAN: %i", WORDS_BIG_ENDIAN ? 1 :
> 0);
> + log_stashed_constants (logger);
> }
>
> /* If using LTO, ensure that the cgraph nodes have function
> bodies. */
> diff --git a/gcc/analyzer/region-model-impl-calls.cc
> b/gcc/analyzer/region-model-impl-calls.cc
> index 52c4205cbeb..3e06f9eacd2 100644
> --- a/gcc/analyzer/region-model-impl-calls.cc
> +++ b/gcc/analyzer/region-model-impl-calls.cc
> @@ -350,6 +350,34 @@ region_model::impl_call_analyzer_dump_escaped
> (const gcall *call)
> pp_formatted_text (&pp));
> }
>
> +/* Handle a call to "__analyzer_dump_named_constant".
> +
> + Look up the given name, and emit a warning describing the
> + state of the corresponding stashed value.
> +
> + This is for use when debugging, and for DejaGnu tests. */
> +
> +void
> +region_model::
> +impl_call_analyzer_dump_named_constant (const gcall *call,
> + region_model_context *ctxt)
> +{
> + call_details cd (call, this, ctxt);
> + const char *name = cd.get_arg_string_literal (0);
> + if (!name)
> + {
> + error_at (call->location, "cannot determine name");
> + return;
> + }
> + tree value = get_stashed_constant_by_name (name);
> + if (value)
> + warning_at (call->location, 0, "named constant %qs has value
> %qE",
> + name, value);
> + else
> + warning_at (call->location, 0, "named constant %qs has unknown
> value",
> + name);
> +}
> +
> /* Handle a call to "__analyzer_eval" by evaluating the input
> and dumping as a dummy warning, so that test cases can use
> dg-warning to validate the result (and so unexpected warnings
> will
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-
> model.cc
> index 7c44fc9e253..d33be9c9905 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -1224,6 +1224,10 @@ region_model::on_stmt_pre (const gimple *stmt,
> impl_call_analyzer_dump_capacity (call, ctxt);
> else if (is_special_named_call_p (call,
> "__analyzer_dump_escaped", 0))
> impl_call_analyzer_dump_escaped (call);
> + else if (is_special_named_call_p (call,
> +
> "__analyzer_dump_named_constant",
> + 1))
> + impl_call_analyzer_dump_named_constant (call, ctxt);
> else if (is_special_named_call_p (call,
> "__analyzer_dump_path", 0))
> {
> /* Handle the builtin "__analyzer_dump_path" by queuing a
> diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-
> model.h
> index 19e8043daa4..16429703cef 100644
> --- a/gcc/analyzer/region-model.h
> +++ b/gcc/analyzer/region-model.h
> @@ -343,6 +343,8 @@ class region_model
> void impl_call_analyzer_dump_capacity (const gcall *call,
> region_model_context *ctxt);
> void impl_call_analyzer_dump_escaped (const gcall *call);
> + void impl_call_analyzer_dump_named_constant (const gcall *call,
> + region_model_context
> *ctxt);
> void impl_call_analyzer_eval (const gcall *call,
> region_model_context *ctxt);
> void impl_call_analyzer_get_unknown_ptr (const call_details &cd);
> diff --git a/gcc/analyzer/sm-fd.cc b/gcc/analyzer/sm-fd.cc
> index ae846cd6ec8..f3a5582c4b8 100644
> --- a/gcc/analyzer/sm-fd.cc
> +++ b/gcc/analyzer/sm-fd.cc
> @@ -157,6 +157,11 @@ public:
> /* State for a file descriptor that we do not want to track
> anymore . */
> state_t m_stop;
>
> + /* Stashed constant values from the frontend. These could be
> NULL. */
> + tree m_O_ACCMODE;
> + tree m_O_RDONLY;
> + tree m_O_WRONLY;
> +
> private:
> void on_open (sm_context *sm_ctxt, const supernode *node, const
> gimple *stmt,
> const gcall *call) const;
> @@ -684,7 +689,10 @@ fd_state_machine::fd_state_machine (logger
> *logger)
> m_valid_write_only (add_state ("fd-valid-write-only")),
> m_invalid (add_state ("fd-invalid")),
> m_closed (add_state ("fd-closed")),
> - m_stop (add_state ("fd-stop"))
> + m_stop (add_state ("fd-stop")),
> + m_O_ACCMODE (get_stashed_constant_by_name ("O_ACCMODE")),
> + m_O_RDONLY (get_stashed_constant_by_name ("O_RDONLY")),
> + m_O_WRONLY (get_stashed_constant_by_name ("O_WRONLY"))
> {
> }
>
> @@ -707,16 +715,18 @@ fd_state_machine::is_valid_fd_p (state_t s)
> const
> enum access_mode
> fd_state_machine::get_access_mode_from_flag (int flag) const
> {
> - /* FIXME: this code assumes the access modes on the host and
> - target are the same, which in practice might not be the case.
> */
> -
> - if ((flag & O_ACCMODE) == O_RDONLY)
> - {
> - return READ_ONLY;
> - }
> - else if ((flag & O_ACCMODE) == O_WRONLY)
> + if (m_O_ACCMODE && TREE_CODE (m_O_ACCMODE) == INTEGER_CST)
> {
> - return WRITE_ONLY;
> + const unsigned HOST_WIDE_INT mask_val = TREE_INT_CST_LOW
> (m_O_ACCMODE);
> + const unsigned HOST_WIDE_INT masked_flag = flag & mask_val;
> +
> + if (m_O_RDONLY && TREE_CODE (m_O_RDONLY) == INTEGER_CST)
> + if (masked_flag == TREE_INT_CST_LOW (m_O_RDONLY))
> + return READ_ONLY;
> +
> + if (m_O_WRONLY && TREE_CODE (m_O_WRONLY) == INTEGER_CST)
> + if (masked_flag == TREE_INT_CST_LOW (m_O_WRONLY))
> + return WRITE_ONLY;
> }
> return READ_WRITE;
> }
> diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
> index ca533c9c667..9e67afffa13 100644
> --- a/gcc/c/c-parser.cc
> +++ b/gcc/c/c-parser.cc
> @@ -72,6 +72,8 @@ along with GCC; see the file COPYING3. If not see
> #include "memmodel.h"
> #include "c-family/known-headers.h"
> #include "bitmap.h"
> +#include "analyzer/analyzer-language.h"
> +#include "toplev.h"
>
> /* We need to walk over decls with incomplete struct/union/enum
> types
> after parsing the whole translation unit.
> @@ -1662,6 +1664,87 @@ static bool
> c_parser_objc_diagnose_bad_element_prefix
> (c_parser *, struct c_declspecs *);
> static location_t c_parser_parse_rtl_body (c_parser *, char *);
>
> +#if ENABLE_ANALYZER
> +
> +namespace ana {
> +
> +/* Concrete implementation of ana::translation_unit for the C
> frontend. */
> +
> +class c_translation_unit : public translation_unit
> +{
> +public:
> + /* Implementation of translation_unit::lookup_constant_by_id for
> use by the
> + analyzer to look up named constants in the user's source code.
> */
> + tree lookup_constant_by_id (tree id) const final override
> + {
> + /* Consider decls. */
> + if (tree decl = lookup_name (id))
> + if (TREE_CODE (decl) == CONST_DECL)
> + if (tree value = DECL_INITIAL (decl))
> + if (TREE_CODE (value) == INTEGER_CST)
> + return value;
> +
> + /* Consider macros. */
> + cpp_hashnode *hashnode = C_CPP_HASHNODE (id);
> + if (cpp_macro_p (hashnode))
> + if (tree value = consider_macro (hashnode->value.macro))
> + return value;
> +
> + return NULL_TREE;
> + }
> +
> +private:
> + /* Attempt to get an INTEGER_CST from MACRO.
> + Only handle the simplest cases: where MACRO's definition is a
> single
> + token containing a number, by lexing the number again.
> + This will handle e.g.
> + #define NAME 42
> + and other bases but not negative numbers, parentheses or e.g.
> + #define NAME 1 << 7
> + as doing so would require a parser. */
> + tree consider_macro (cpp_macro *macro) const
> + {
> + if (macro->paramc > 0)
> + return NULL_TREE;
> + if (macro->kind == cmk_traditional)
> + return NULL_TREE;
> + if (macro->count != 1)
> + return NULL_TREE;
> + const cpp_token &tok = macro->exp.tokens[0];
> + if (tok.type != CPP_NUMBER)
> + return NULL_TREE;
> +
> + cpp_reader *old_parse_in = parse_in;
> + parse_in = cpp_create_reader (c_dialect_cxx () ? CLK_GNUCXX:
> CLK_GNUC89,
> + ident_hash, line_table);
> +
> + pretty_printer pp;
> + pp_string (&pp, (const char *)tok.val.str.text);
> + pp_newline (&pp);
> + cpp_push_buffer (parse_in,
> + (const unsigned char *)pp_formatted_text (&pp),
> + strlen (pp_formatted_text (&pp)),
> + 0);
> +
> + tree value;
> + location_t loc;
> + unsigned char cpp_flags;
> + c_lex_with_flags (&value, &loc, &cpp_flags, 0);
> +
> + cpp_destroy (parse_in);
> + parse_in = old_parse_in;
> +
> + if (value && TREE_CODE (value) == INTEGER_CST)
> + return value;
> +
> + return NULL_TREE;
> + }
> +};
> +
> +} // namespace ana
> +
> +#endif /* #if ENABLE_ANALYZER */
> +
> /* Parse a translation unit (C90 6.7, C99 6.9, C11 6.9).
>
> translation-unit:
> @@ -1722,6 +1805,14 @@ c_parser_translation_unit (c_parser *parser)
> "#pragma omp begin assumes", "#pragma omp end
> assumes");
> current_omp_begin_assumes = 0;
> }
> +
> +#if ENABLE_ANALYZER
> + if (flag_analyzer)
> + {
> + ana::c_translation_unit tu;
> + ana::on_finish_translation_unit (tu);
> + }
> +#endif
> }
>
> /* Parse an external declaration (C90 6.7, C99 6.9, C11 6.9).
> diff --git a/gcc/doc/analyzer.texi b/gcc/doc/analyzer.texi
> index ec49f951435..d61b55cec5a 100644
> --- a/gcc/doc/analyzer.texi
> +++ b/gcc/doc/analyzer.texi
> @@ -524,6 +524,23 @@ With a non-zero argument
>
> it will also dump all of the states within the ``processed'' nodes.
>
> +The builtin @code{__analyzer_dump_named_constant} will emit a
> warning
> +during analysis describing what is known about the value of a given
> +named constant, for parts of the analyzer that interact with target
> +headers.
> +
> +For example:
> +
> +@smallexample
> +__analyzer_dump_named_constant ("O_RDONLY");
> +@end smallexample
> +
> +might emit the warning:
> +
> +@smallexample
> +warning: named constant 'O_RDONLY' has value '1'
> +@end smallexample
> +
> @smallexample
> __analyzer_dump_region_model ();
> @end smallexample
> diff --git a/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> b/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> index 4478d740b58..d9a32ed9370 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> +++ b/gcc/testsuite/gcc.dg/analyzer/analyzer-decls.h
> @@ -31,6 +31,9 @@ extern void __analyzer_dump_escaped (void);
> will also dump all of the states within those nodes. */
> extern void __analyzer_dump_exploded_nodes (int);
>
> +/* Emit a warning describing what is known about the value of NAME.
> */
> +extern void __analyzer_dump_named_constant (const char *name);
> +
> /* Emit a placeholder "note" diagnostic with a path to this call
> site,
> if the analyzer finds a feasible path to it. */
> extern void __analyzer_dump_path (void);
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> index 842a26b4364..994bad84342 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-4.c
> @@ -8,6 +8,7 @@ void close(int fd);
> int write (int fd, void *buf, int nbytes);
> int read (int fd, void *buf, int nbytes);
>
> +#define O_ACCMODE 0xf
> #define O_RDONLY 0
> #define O_WRONLY 1
> #define O_RDWR 2
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-enum.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-enum.c
> new file mode 100644
> index 00000000000..5226569c437
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-enum.c
> @@ -0,0 +1,60 @@
> +int open(const char *, int mode);
> +void close(int fd);
> +int write (int fd, void *buf, int nbytes);
> +int read (int fd, void *buf, int nbytes);
> +
> +/* Example of these flags as an enum, and with
> + non-standard values for them. */
> +
> +enum {
> + O_RDONLY = 0x10,
> + O_WRONLY = 0x20,
> + O_RDWR = 0x40,
> +
> + O_ACCMODE = 0xf0
> +};
> +
> +void f (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'f' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void
> +test_1 (const char *path)
> +{
> + int fd = open (path, O_RDWR);
> + close(fd);
> + f(fd); /* { dg-warning "'f' on closed file descriptor 'fd'" } */
> + /* { dg-message "\\(3\\) 'f' on closed file descriptor 'fd';
> 'close' was at \\(2\\)" "" { target *-*-* } .-1 } */
> +}
> +
> +void g (int fd) __attribute__((fd_arg_read(1))); /* { dg-message
> "argument 1 of 'g' must be a readable file descriptor, due to
> '__attribute__\\(\\(fd_arg_read\\(1\\)\\)\\)'" } */
> +
> +void
> +test_2 (const char *path)
> +{
> + int fd = open (path, O_WRONLY);
> + if (fd != -1)
> + {
> + g (fd); /* { dg-warning "'g' on write-only file descriptor 'fd'"
> } */
> + }
> + close (fd);
> +}
> +
> +void h (int fd) __attribute__((fd_arg_write(1))); /* { dg-message
> "argument 1 of 'h' must be a writable file descriptor, due to
> '__attribute__\\(\\(fd_arg_write\\(1\\)\\)\\)'" } */
> +void
> +test_3 (const char *path)
> +{
> + int fd = open (path, O_RDONLY);
> + if (fd != -1)
> + {
> + h (fd); /* { dg-warning "'h' on read-only file descriptor 'fd'"
> } */
> + }
> + close(fd);
> +}
> +
> +void ff (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'ff' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void test_4 (const char *path)
> +{
> + int fd = open (path, O_RDWR);
> + ff (fd); /* { dg-warning "'ff' on possibly invalid file descriptor
> 'fd'" } */
> + close(fd);
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-5.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-macros.c
> similarity index 98%
> rename from gcc/testsuite/gcc.dg/analyzer/fd-5.c
> rename to gcc/testsuite/gcc.dg/analyzer/fd-access-mode-macros.c
> index c18b2adcbe5..f9a6931a5db 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/fd-5.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-macros.c
> @@ -6,6 +6,7 @@ int read (int fd, void *buf, int nbytes);
> #define O_RDONLY 0
> #define O_WRONLY 1
> #define O_RDWR 2
> +#define O_ACCMODE 0x3
>
> void f (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'f' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
>
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-
> headers.c b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-
> headers.c
> new file mode 100644
> index 00000000000..b76eb667d50
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-access-mode-target-headers.c
> @@ -0,0 +1,56 @@
> +#include <sys/stat.h>
> +#include <fcntl.h>
> +#include <unistd.h>
> +#include "analyzer-decls.h"
> +
> +void f (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'f' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void
> +test_1 (const char *path)
> +{
> + int fd = open (path, O_RDWR);
> + close(fd);
> + f(fd); /* { dg-warning "'f' on closed file descriptor 'fd'" } */
> + /* { dg-message "\\(3\\) 'f' on closed file descriptor 'fd';
> 'close' was at \\(2\\)" "" { target *-*-* } .-1 } */
> +}
> +
> +void g (int fd) __attribute__((fd_arg_read(1))); /* { dg-message
> "argument 1 of 'g' must be a readable file descriptor, due to
> '__attribute__\\(\\(fd_arg_read\\(1\\)\\)\\)'" } */
> +
> +void
> +test_2 (const char *path)
> +{
> + int fd = open (path, O_WRONLY);
> + if (fd != -1)
> + {
> + g (fd); /* { dg-warning "'g' on write-only file descriptor 'fd'"
> } */
> + }
> + close (fd);
> +}
> +
> +void h (int fd) __attribute__((fd_arg_write(1))); /* { dg-message
> "argument 1 of 'h' must be a writable file descriptor, due to
> '__attribute__\\(\\(fd_arg_write\\(1\\)\\)\\)'" } */
> +void
> +test_3 (const char *path)
> +{
> + int fd = open (path, O_RDONLY);
> + if (fd != -1)
> + {
> + h (fd); /* { dg-warning "'h' on read-only file descriptor 'fd'"
> } */
> + }
> + close(fd);
> +}
> +
> +void ff (int fd) __attribute__((fd_arg(1))); /* { dg-message
> "argument 1 of 'ff' must be an open file descriptor, due to
> '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
> +
> +void test_4 (const char *path)
> +{
> + int fd = open (path, O_RDWR);
> + ff (fd); /* { dg-warning "'ff' on possibly invalid file descriptor
> 'fd'" } */
> + close(fd);
> +}
> +
> +void test_sm_fd_constants (void)
> +{
> + __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has value '\[0-9\]+'" } */
> + __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has value '\[0-9\]+'" } */
> + __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has value '\[0-9\]+'" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> b/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> index b4f43e7f0ef..bb58e9d9d5e 100644
> --- a/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> +++ b/gcc/testsuite/gcc.dg/analyzer/fd-dup-1.c
> @@ -8,6 +8,7 @@ int read (int fd, void *buf, int nbytes);
> #define O_RDONLY 0
> #define O_WRONLY 1
> #define O_RDWR 2
> +#define O_ACCMODE 3
>
> void test_1 (const char *path)
> {
> diff --git a/gcc/testsuite/gcc.dg/analyzer/named-constants-via-enum.c
> b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-enum.c
> new file mode 100644
> index 00000000000..e6b77b8dd18
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-enum.c
> @@ -0,0 +1,20 @@
> +#include "analyzer-decls.h"
> +
> +/* Various constants used by the fd state machine. */
> +enum {
> + O_ACCMODE = 42,
> + O_RDONLY = 0x1,
> + O_WRONLY = 010
> +};
> +
> +void test_sm_fd_constants (void)
> +{
> + __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has value '42'" } */
> + __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has value '1'" } */
> + __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has value '8'" } */
> +}
> +
> +void test_unknown (void)
> +{
> + __analyzer_dump_named_constant ("UNKNOWN"); /* { dg-warning "named
> constant 'UNKNOWN' has unknown value" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/named-constants-via-
> macros-2.c b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-
> macros-2.c
> new file mode 100644
> index 00000000000..9c019e7c5ef
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros-2.c
> @@ -0,0 +1,15 @@
> +#include "analyzer-decls.h"
> +
> +/* Various constants used by the fd state machine, as macros
> + that can't be handled. */
> +
> +#define O_ACCMODE (
> +#define O_RDONLY "foo"
> +#define O_WRONLY int
> +
> +void test_sm_fd_constants (void)
> +{
> + __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has unknown value" } */
> + __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has unknown value" } */
> + __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has unknown value" } */
> +}
> diff --git a/gcc/testsuite/gcc.dg/analyzer/named-constants-via-
> macros.c b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros.c
> new file mode 100644
> index 00000000000..2022f98e5b6
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/analyzer/named-constants-via-macros.c
> @@ -0,0 +1,19 @@
> +#include "analyzer-decls.h"
> +
> +/* Various constants used by the fd state machine. */
> +
> +#define O_ACCMODE 42
> +#define O_RDONLY 0x1
> +#define O_WRONLY 010
> +
> +void test_sm_fd_constants (void)
> +{
> + __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning
> "named constant 'O_ACCMODE' has value '42'" } */
> + __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning
> "named constant 'O_RDONLY' has value '1'" } */
> + __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning
> "named constant 'O_WRONLY' has value '8'" } */
> +}
> +
> +void test_unknown (void)
> +{
> + __analyzer_dump_named_constant ("UNKNOWN"); /* { dg-warning "named
> constant 'UNKNOWN' has unknown value" } */
> +}
@@ -1249,6 +1249,7 @@ C_COMMON_OBJS = c-family/c-common.o c-family/c-cppbuiltin.o c-family/c-dump.o \
ANALYZER_OBJS = \
analyzer/analysis-plan.o \
analyzer/analyzer.o \
+ analyzer/analyzer-language.o \
analyzer/analyzer-logging.o \
analyzer/analyzer-pass.o \
analyzer/analyzer-selftests.o \
@@ -2741,6 +2742,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
$(srcdir)/internal-fn.h \
$(srcdir)/calls.cc \
$(srcdir)/omp-general.h \
+ $(srcdir)/analyzer/analyzer-language.cc \
@all_gtfiles@
# Compute the list of GT header files from the corresponding C sources,
new file mode 100644
@@ -0,0 +1,110 @@
+/* Interface between analyzer and frontends.
+ Copyright (C) 2022 Free Software Foundation, Inc.
+ Contributed by David Malcolm <dmalcolm@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#include "config.h"
+#define INCLUDE_MEMORY
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "stringpool.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/analyzer-language.h"
+#include "analyzer/analyzer-logging.h"
+
+/* Map from identifier to INTEGER_CST. */
+static GTY (()) hash_map <tree, tree> *analyzer_stashed_constants;
+
+#if ENABLE_ANALYZER
+
+namespace ana {
+
+/* Call into TU to try to find a value for NAME.
+ If found, stash its value within analyzer_stashed_constants. */
+
+static void
+maybe_stash_named_constant (const translation_unit &tu, const char *name)
+{
+ if (!analyzer_stashed_constants)
+ analyzer_stashed_constants = hash_map<tree, tree>::create_ggc ();
+
+ tree id = get_identifier (name);
+ if (tree t = tu.lookup_constant_by_id (id))
+ {
+ gcc_assert (TREE_CODE (t) == INTEGER_CST);
+ analyzer_stashed_constants->put (id, t);
+ }
+}
+
+/* Hook for frontend to call into analyzer when TU finishes.
+ This exists so that the analyzer can stash named constant values from
+ header files (e.g. macros and enums) for later use when modeling the
+ behaviors of APIs.
+
+ By doing it this way, the analyzer can use the precise values for those
+ constants from the user's headers, rather than attempting to model them
+ as properties of the target. */
+
+void
+on_finish_translation_unit (const translation_unit &tu)
+{
+ /* Bail if the analyzer isn't enabled. */
+ if (!flag_analyzer)
+ return;
+
+ /* Stash named constants for use by sm-fd.cc */
+ maybe_stash_named_constant (tu, "O_ACCMODE");
+ maybe_stash_named_constant (tu, "O_RDONLY");
+ maybe_stash_named_constant (tu, "O_WRONLY");
+}
+
+/* Lookup NAME in the named constants stashed when the frontend TU finished.
+ Return either an INTEGER_CST, or NULL_TREE. */
+
+tree
+get_stashed_constant_by_name (const char *name)
+{
+ if (!analyzer_stashed_constants)
+ return NULL_TREE;
+ tree id = get_identifier (name);
+ if (tree *slot = analyzer_stashed_constants->get (id))
+ {
+ gcc_assert (TREE_CODE (*slot) == INTEGER_CST);
+ return *slot;
+ }
+ return NULL_TREE;
+}
+
+/* Log all stashed named constants to LOGGER. */
+
+void
+log_stashed_constants (logger *logger)
+{
+ gcc_assert (logger);
+ LOG_SCOPE (logger);
+ if (analyzer_stashed_constants)
+ for (auto iter : *analyzer_stashed_constants)
+ logger->log ("%qE: %qE", iter.first, iter.second);
+}
+
+} // namespace ana
+
+#endif /* #if ENABLE_ANALYZER */
+
+#include "gt-analyzer-language.h"
new file mode 100644
@@ -0,0 +1,48 @@
+/* Interface between analyzer and frontends.
+ Copyright (C) 2022 Free Software Foundation, Inc.
+ Contributed by David Malcolm <dmalcolm@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3. If not see
+<http://www.gnu.org/licenses/>. */
+
+#ifndef GCC_ANALYZER_LANGUAGE_H
+#define GCC_ANALYZER_LANGUAGE_H
+
+#if ENABLE_ANALYZER
+
+namespace ana {
+
+/* Abstract base class for representing a specific TU
+ to the analyzer. */
+
+class translation_unit
+{
+ public:
+ /* Attempt to look up an value for identifier ID (e.g. in the headers that
+ have been seen). If it is defined and an integer (e.g. either as a
+ macro or enum), return the INTEGER_CST value, otherwise return NULL. */
+ virtual tree lookup_constant_by_id (tree id) const = 0;
+};
+
+/* Analyzer hook for frontends to call at the end of the TU. */
+
+void on_finish_translation_unit (const translation_unit &tu);
+
+} // namespace ana
+
+#endif /* #if ENABLE_ANALYZER */
+
+#endif /* GCC_ANALYZER_LANGUAGE_H */
@@ -312,6 +312,9 @@ public:
virtual bool terminate_path_p () const = 0;
};
+extern tree get_stashed_constant_by_name (const char *name);
+extern void log_stashed_constants (logger *logger);
+
} // namespace ana
extern bool is_special_named_call_p (const gcall *call, const char *funcname,
@@ -6001,6 +6001,7 @@ impl_run_checkers (logger *logger)
logger->log ("BITS_BIG_ENDIAN: %i", BITS_BIG_ENDIAN ? 1 : 0);
logger->log ("BYTES_BIG_ENDIAN: %i", BYTES_BIG_ENDIAN ? 1 : 0);
logger->log ("WORDS_BIG_ENDIAN: %i", WORDS_BIG_ENDIAN ? 1 : 0);
+ log_stashed_constants (logger);
}
/* If using LTO, ensure that the cgraph nodes have function bodies. */
@@ -350,6 +350,34 @@ region_model::impl_call_analyzer_dump_escaped (const gcall *call)
pp_formatted_text (&pp));
}
+/* Handle a call to "__analyzer_dump_named_constant".
+
+ Look up the given name, and emit a warning describing the
+ state of the corresponding stashed value.
+
+ This is for use when debugging, and for DejaGnu tests. */
+
+void
+region_model::
+impl_call_analyzer_dump_named_constant (const gcall *call,
+ region_model_context *ctxt)
+{
+ call_details cd (call, this, ctxt);
+ const char *name = cd.get_arg_string_literal (0);
+ if (!name)
+ {
+ error_at (call->location, "cannot determine name");
+ return;
+ }
+ tree value = get_stashed_constant_by_name (name);
+ if (value)
+ warning_at (call->location, 0, "named constant %qs has value %qE",
+ name, value);
+ else
+ warning_at (call->location, 0, "named constant %qs has unknown value",
+ name);
+}
+
/* Handle a call to "__analyzer_eval" by evaluating the input
and dumping as a dummy warning, so that test cases can use
dg-warning to validate the result (and so unexpected warnings will
@@ -1224,6 +1224,10 @@ region_model::on_stmt_pre (const gimple *stmt,
impl_call_analyzer_dump_capacity (call, ctxt);
else if (is_special_named_call_p (call, "__analyzer_dump_escaped", 0))
impl_call_analyzer_dump_escaped (call);
+ else if (is_special_named_call_p (call,
+ "__analyzer_dump_named_constant",
+ 1))
+ impl_call_analyzer_dump_named_constant (call, ctxt);
else if (is_special_named_call_p (call, "__analyzer_dump_path", 0))
{
/* Handle the builtin "__analyzer_dump_path" by queuing a
@@ -343,6 +343,8 @@ class region_model
void impl_call_analyzer_dump_capacity (const gcall *call,
region_model_context *ctxt);
void impl_call_analyzer_dump_escaped (const gcall *call);
+ void impl_call_analyzer_dump_named_constant (const gcall *call,
+ region_model_context *ctxt);
void impl_call_analyzer_eval (const gcall *call,
region_model_context *ctxt);
void impl_call_analyzer_get_unknown_ptr (const call_details &cd);
@@ -157,6 +157,11 @@ public:
/* State for a file descriptor that we do not want to track anymore . */
state_t m_stop;
+ /* Stashed constant values from the frontend. These could be NULL. */
+ tree m_O_ACCMODE;
+ tree m_O_RDONLY;
+ tree m_O_WRONLY;
+
private:
void on_open (sm_context *sm_ctxt, const supernode *node, const gimple *stmt,
const gcall *call) const;
@@ -684,7 +689,10 @@ fd_state_machine::fd_state_machine (logger *logger)
m_valid_write_only (add_state ("fd-valid-write-only")),
m_invalid (add_state ("fd-invalid")),
m_closed (add_state ("fd-closed")),
- m_stop (add_state ("fd-stop"))
+ m_stop (add_state ("fd-stop")),
+ m_O_ACCMODE (get_stashed_constant_by_name ("O_ACCMODE")),
+ m_O_RDONLY (get_stashed_constant_by_name ("O_RDONLY")),
+ m_O_WRONLY (get_stashed_constant_by_name ("O_WRONLY"))
{
}
@@ -707,16 +715,18 @@ fd_state_machine::is_valid_fd_p (state_t s) const
enum access_mode
fd_state_machine::get_access_mode_from_flag (int flag) const
{
- /* FIXME: this code assumes the access modes on the host and
- target are the same, which in practice might not be the case. */
-
- if ((flag & O_ACCMODE) == O_RDONLY)
- {
- return READ_ONLY;
- }
- else if ((flag & O_ACCMODE) == O_WRONLY)
+ if (m_O_ACCMODE && TREE_CODE (m_O_ACCMODE) == INTEGER_CST)
{
- return WRITE_ONLY;
+ const unsigned HOST_WIDE_INT mask_val = TREE_INT_CST_LOW (m_O_ACCMODE);
+ const unsigned HOST_WIDE_INT masked_flag = flag & mask_val;
+
+ if (m_O_RDONLY && TREE_CODE (m_O_RDONLY) == INTEGER_CST)
+ if (masked_flag == TREE_INT_CST_LOW (m_O_RDONLY))
+ return READ_ONLY;
+
+ if (m_O_WRONLY && TREE_CODE (m_O_WRONLY) == INTEGER_CST)
+ if (masked_flag == TREE_INT_CST_LOW (m_O_WRONLY))
+ return WRITE_ONLY;
}
return READ_WRITE;
}
@@ -72,6 +72,8 @@ along with GCC; see the file COPYING3. If not see
#include "memmodel.h"
#include "c-family/known-headers.h"
#include "bitmap.h"
+#include "analyzer/analyzer-language.h"
+#include "toplev.h"
/* We need to walk over decls with incomplete struct/union/enum types
after parsing the whole translation unit.
@@ -1662,6 +1664,87 @@ static bool c_parser_objc_diagnose_bad_element_prefix
(c_parser *, struct c_declspecs *);
static location_t c_parser_parse_rtl_body (c_parser *, char *);
+#if ENABLE_ANALYZER
+
+namespace ana {
+
+/* Concrete implementation of ana::translation_unit for the C frontend. */
+
+class c_translation_unit : public translation_unit
+{
+public:
+ /* Implementation of translation_unit::lookup_constant_by_id for use by the
+ analyzer to look up named constants in the user's source code. */
+ tree lookup_constant_by_id (tree id) const final override
+ {
+ /* Consider decls. */
+ if (tree decl = lookup_name (id))
+ if (TREE_CODE (decl) == CONST_DECL)
+ if (tree value = DECL_INITIAL (decl))
+ if (TREE_CODE (value) == INTEGER_CST)
+ return value;
+
+ /* Consider macros. */
+ cpp_hashnode *hashnode = C_CPP_HASHNODE (id);
+ if (cpp_macro_p (hashnode))
+ if (tree value = consider_macro (hashnode->value.macro))
+ return value;
+
+ return NULL_TREE;
+ }
+
+private:
+ /* Attempt to get an INTEGER_CST from MACRO.
+ Only handle the simplest cases: where MACRO's definition is a single
+ token containing a number, by lexing the number again.
+ This will handle e.g.
+ #define NAME 42
+ and other bases but not negative numbers, parentheses or e.g.
+ #define NAME 1 << 7
+ as doing so would require a parser. */
+ tree consider_macro (cpp_macro *macro) const
+ {
+ if (macro->paramc > 0)
+ return NULL_TREE;
+ if (macro->kind == cmk_traditional)
+ return NULL_TREE;
+ if (macro->count != 1)
+ return NULL_TREE;
+ const cpp_token &tok = macro->exp.tokens[0];
+ if (tok.type != CPP_NUMBER)
+ return NULL_TREE;
+
+ cpp_reader *old_parse_in = parse_in;
+ parse_in = cpp_create_reader (c_dialect_cxx () ? CLK_GNUCXX: CLK_GNUC89,
+ ident_hash, line_table);
+
+ pretty_printer pp;
+ pp_string (&pp, (const char *)tok.val.str.text);
+ pp_newline (&pp);
+ cpp_push_buffer (parse_in,
+ (const unsigned char *)pp_formatted_text (&pp),
+ strlen (pp_formatted_text (&pp)),
+ 0);
+
+ tree value;
+ location_t loc;
+ unsigned char cpp_flags;
+ c_lex_with_flags (&value, &loc, &cpp_flags, 0);
+
+ cpp_destroy (parse_in);
+ parse_in = old_parse_in;
+
+ if (value && TREE_CODE (value) == INTEGER_CST)
+ return value;
+
+ return NULL_TREE;
+ }
+};
+
+} // namespace ana
+
+#endif /* #if ENABLE_ANALYZER */
+
/* Parse a translation unit (C90 6.7, C99 6.9, C11 6.9).
translation-unit:
@@ -1722,6 +1805,14 @@ c_parser_translation_unit (c_parser *parser)
"#pragma omp begin assumes", "#pragma omp end assumes");
current_omp_begin_assumes = 0;
}
+
+#if ENABLE_ANALYZER
+ if (flag_analyzer)
+ {
+ ana::c_translation_unit tu;
+ ana::on_finish_translation_unit (tu);
+ }
+#endif
}
/* Parse an external declaration (C90 6.7, C99 6.9, C11 6.9).
@@ -524,6 +524,23 @@ With a non-zero argument
it will also dump all of the states within the ``processed'' nodes.
+The builtin @code{__analyzer_dump_named_constant} will emit a warning
+during analysis describing what is known about the value of a given
+named constant, for parts of the analyzer that interact with target
+headers.
+
+For example:
+
+@smallexample
+__analyzer_dump_named_constant ("O_RDONLY");
+@end smallexample
+
+might emit the warning:
+
+@smallexample
+warning: named constant 'O_RDONLY' has value '1'
+@end smallexample
+
@smallexample
__analyzer_dump_region_model ();
@end smallexample
@@ -31,6 +31,9 @@ extern void __analyzer_dump_escaped (void);
will also dump all of the states within those nodes. */
extern void __analyzer_dump_exploded_nodes (int);
+/* Emit a warning describing what is known about the value of NAME. */
+extern void __analyzer_dump_named_constant (const char *name);
+
/* Emit a placeholder "note" diagnostic with a path to this call site,
if the analyzer finds a feasible path to it. */
extern void __analyzer_dump_path (void);
@@ -8,6 +8,7 @@ void close(int fd);
int write (int fd, void *buf, int nbytes);
int read (int fd, void *buf, int nbytes);
+#define O_ACCMODE 0xf
#define O_RDONLY 0
#define O_WRONLY 1
#define O_RDWR 2
new file mode 100644
@@ -0,0 +1,60 @@
+int open(const char *, int mode);
+void close(int fd);
+int write (int fd, void *buf, int nbytes);
+int read (int fd, void *buf, int nbytes);
+
+/* Example of these flags as an enum, and with
+ non-standard values for them. */
+
+enum {
+ O_RDONLY = 0x10,
+ O_WRONLY = 0x20,
+ O_RDWR = 0x40,
+
+ O_ACCMODE = 0xf0
+};
+
+void f (int fd) __attribute__((fd_arg(1))); /* { dg-message "argument 1 of 'f' must be an open file descriptor, due to '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
+
+void
+test_1 (const char *path)
+{
+ int fd = open (path, O_RDWR);
+ close(fd);
+ f(fd); /* { dg-warning "'f' on closed file descriptor 'fd'" } */
+ /* { dg-message "\\(3\\) 'f' on closed file descriptor 'fd'; 'close' was at \\(2\\)" "" { target *-*-* } .-1 } */
+}
+
+void g (int fd) __attribute__((fd_arg_read(1))); /* { dg-message "argument 1 of 'g' must be a readable file descriptor, due to '__attribute__\\(\\(fd_arg_read\\(1\\)\\)\\)'" } */
+
+void
+test_2 (const char *path)
+{
+ int fd = open (path, O_WRONLY);
+ if (fd != -1)
+ {
+ g (fd); /* { dg-warning "'g' on write-only file descriptor 'fd'" } */
+ }
+ close (fd);
+}
+
+void h (int fd) __attribute__((fd_arg_write(1))); /* { dg-message "argument 1 of 'h' must be a writable file descriptor, due to '__attribute__\\(\\(fd_arg_write\\(1\\)\\)\\)'" } */
+void
+test_3 (const char *path)
+{
+ int fd = open (path, O_RDONLY);
+ if (fd != -1)
+ {
+ h (fd); /* { dg-warning "'h' on read-only file descriptor 'fd'" } */
+ }
+ close(fd);
+}
+
+void ff (int fd) __attribute__((fd_arg(1))); /* { dg-message "argument 1 of 'ff' must be an open file descriptor, due to '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
+
+void test_4 (const char *path)
+{
+ int fd = open (path, O_RDWR);
+ ff (fd); /* { dg-warning "'ff' on possibly invalid file descriptor 'fd'" } */
+ close(fd);
+}
similarity index 98%
rename from gcc/testsuite/gcc.dg/analyzer/fd-5.c
rename to gcc/testsuite/gcc.dg/analyzer/fd-access-mode-macros.c
@@ -6,6 +6,7 @@ int read (int fd, void *buf, int nbytes);
#define O_RDONLY 0
#define O_WRONLY 1
#define O_RDWR 2
+#define O_ACCMODE 0x3
void f (int fd) __attribute__((fd_arg(1))); /* { dg-message "argument 1 of 'f' must be an open file descriptor, due to '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
new file mode 100644
@@ -0,0 +1,56 @@
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <unistd.h>
+#include "analyzer-decls.h"
+
+void f (int fd) __attribute__((fd_arg(1))); /* { dg-message "argument 1 of 'f' must be an open file descriptor, due to '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
+
+void
+test_1 (const char *path)
+{
+ int fd = open (path, O_RDWR);
+ close(fd);
+ f(fd); /* { dg-warning "'f' on closed file descriptor 'fd'" } */
+ /* { dg-message "\\(3\\) 'f' on closed file descriptor 'fd'; 'close' was at \\(2\\)" "" { target *-*-* } .-1 } */
+}
+
+void g (int fd) __attribute__((fd_arg_read(1))); /* { dg-message "argument 1 of 'g' must be a readable file descriptor, due to '__attribute__\\(\\(fd_arg_read\\(1\\)\\)\\)'" } */
+
+void
+test_2 (const char *path)
+{
+ int fd = open (path, O_WRONLY);
+ if (fd != -1)
+ {
+ g (fd); /* { dg-warning "'g' on write-only file descriptor 'fd'" } */
+ }
+ close (fd);
+}
+
+void h (int fd) __attribute__((fd_arg_write(1))); /* { dg-message "argument 1 of 'h' must be a writable file descriptor, due to '__attribute__\\(\\(fd_arg_write\\(1\\)\\)\\)'" } */
+void
+test_3 (const char *path)
+{
+ int fd = open (path, O_RDONLY);
+ if (fd != -1)
+ {
+ h (fd); /* { dg-warning "'h' on read-only file descriptor 'fd'" } */
+ }
+ close(fd);
+}
+
+void ff (int fd) __attribute__((fd_arg(1))); /* { dg-message "argument 1 of 'ff' must be an open file descriptor, due to '__attribute__\\(\\(fd_arg\\(1\\)\\)\\)'" } */
+
+void test_4 (const char *path)
+{
+ int fd = open (path, O_RDWR);
+ ff (fd); /* { dg-warning "'ff' on possibly invalid file descriptor 'fd'" } */
+ close(fd);
+}
+
+void test_sm_fd_constants (void)
+{
+ __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning "named constant 'O_ACCMODE' has value '\[0-9\]+'" } */
+ __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning "named constant 'O_RDONLY' has value '\[0-9\]+'" } */
+ __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning "named constant 'O_WRONLY' has value '\[0-9\]+'" } */
+}
@@ -8,6 +8,7 @@ int read (int fd, void *buf, int nbytes);
#define O_RDONLY 0
#define O_WRONLY 1
#define O_RDWR 2
+#define O_ACCMODE 3
void test_1 (const char *path)
{
new file mode 100644
@@ -0,0 +1,20 @@
+#include "analyzer-decls.h"
+
+/* Various constants used by the fd state machine. */
+enum {
+ O_ACCMODE = 42,
+ O_RDONLY = 0x1,
+ O_WRONLY = 010
+};
+
+void test_sm_fd_constants (void)
+{
+ __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning "named constant 'O_ACCMODE' has value '42'" } */
+ __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning "named constant 'O_RDONLY' has value '1'" } */
+ __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning "named constant 'O_WRONLY' has value '8'" } */
+}
+
+void test_unknown (void)
+{
+ __analyzer_dump_named_constant ("UNKNOWN"); /* { dg-warning "named constant 'UNKNOWN' has unknown value" } */
+}
new file mode 100644
@@ -0,0 +1,15 @@
+#include "analyzer-decls.h"
+
+/* Various constants used by the fd state machine, as macros
+ that can't be handled. */
+
+#define O_ACCMODE (
+#define O_RDONLY "foo"
+#define O_WRONLY int
+
+void test_sm_fd_constants (void)
+{
+ __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning "named constant 'O_ACCMODE' has unknown value" } */
+ __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning "named constant 'O_RDONLY' has unknown value" } */
+ __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning "named constant 'O_WRONLY' has unknown value" } */
+}
new file mode 100644
@@ -0,0 +1,19 @@
+#include "analyzer-decls.h"
+
+/* Various constants used by the fd state machine. */
+
+#define O_ACCMODE 42
+#define O_RDONLY 0x1
+#define O_WRONLY 010
+
+void test_sm_fd_constants (void)
+{
+ __analyzer_dump_named_constant ("O_ACCMODE"); /* { dg-warning "named constant 'O_ACCMODE' has value '42'" } */
+ __analyzer_dump_named_constant ("O_RDONLY"); /* { dg-warning "named constant 'O_RDONLY' has value '1'" } */
+ __analyzer_dump_named_constant ("O_WRONLY"); /* { dg-warning "named constant 'O_WRONLY' has value '8'" } */
+}
+
+void test_unknown (void)
+{
+ __analyzer_dump_named_constant ("UNKNOWN"); /* { dg-warning "named constant 'UNKNOWN' has unknown value" } */
+}