mid-end: Use integral time intervals in timevar.cc
Commit Message
>
> I think this is undesirable. With fused you mean we use FMA?
> I think you could use -ffp-contract=off for the TU instead.
>
> Note you can't use __attribute__((noinline)) literally since the
> host compiler might not support this.
>
> Richard.
>
Trying to make the timevar store integral time intervals.
Hope this is acceptable -- I had originally planned to use
`-ffp-contract` as agreed until I saw the email mentioning the old x86
bug in the same area which was not to do with floating point contraction
of operations (PR 99903) and figured it would be better to try and solve
both at the same time while making things in general a bit more robust.
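A minimal standalone illustration of the fusion effect described above
(not from the patch or the thread; the values are arbitrary): with an
explicit fused multiply-add, `a * a - 1.0` is rounded once instead of
twice, so the fused and unfused forms can disagree in the last bits --
the same kind of disagreement that made the time recorded in
timer::start differ from the one computed in timer::stop.

#include <cmath>
#include <cstdio>

int main ()
{
  double a = 1.0 + std::ldexp (1.0, -30);  /* exactly 1 + 2^-30 */
  double unfused = a * a - 1.0;            /* round the product, then subtract */
  double fused = std::fma (a, a, -1.0);    /* a*a - 1.0 rounded only once */
  /* With -ffp-contract=off this prints 0x1p-29 and 0x1.00000002p-29;
     with contraction enabled the first expression may itself be turned
     into an FMA, in which case the two agree.  */
  printf ("unfused = %a\nfused   = %a\nequal   = %d\n",
          unfused, fused, unfused == fused);
  return 0;
}

On hardware with FMA instructions, letting the compiler contract the
plain expression (e.g. with -ffp-contract=fast) can reproduce the fused
result, which is analogous to what happened once get_time and
timevar_accumulate were inlined into the same function.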
Comments
On Thu, 2023-08-03 at 14:38 +0100, Matthew Malcomson via Gcc-patches
wrote:
> >
> > I think this is undesirable. With fused you mean we use FMA?
> > I think you could use -ffp-contract=off for the TU instead.
> >
> > Note you can't use __attribute__((noinline)) literally since the
> > host compiler might not support this.
> >
> > Richard.
> >
>
>
> Trying to make the timevar store integral time intervals.
> Hope this is acceptable -- I had originally planned to use
> `-ffp-contract` as agreed until I saw the email mentioning the old x86
> bug in the same area which was not to do with floating point contraction
> of operations (PR 99903) and figured it would be better to try and solve
> both at the same time while making things in general a bit more robust.
> ________________________________________________________________________
>
>
> On some AArch64 bootstrapped builds, we were getting a flaky test
> because the floating point operations in `get_time` were being fused
> with the floating point operations in `timevar_accumulate`.
>
> This meant that the rounding behaviour of our multiplication with
> `ticks_to_msec` was different when used in `timer::start` and when
> performed in `timer::stop`. These extra inaccuracies led to the
> testcase `g++.dg/ext/timevar1.C` being flaky on some hardware.
>
> ------------------------------
> Avoiding the inlining which was agreed to be undesirable. Three
> alternative approaches:
> 1) Use `-ffp-contract=on` to avoid this particular optimisation.
> 2) Adjusting the code so that the "tolerance" is always of the order of
> a "tick".
> 3) Recording times and elapsed differences in integral values.
> - Could be in terms of a standard measurement (e.g. nanoseconds or
> microseconds).
> - Could be in terms of whatever integral value ("ticks" /
> seconds & µseconds / "clock ticks") is returned from the syscall
> chosen at configure time.
>
> While `-ffp-contract=on` removes the problem that I bumped into, there
> has been a similar bug on x86 that was to do with a different floating
> point problem that also happens after `get_time` and
> `timevar_accumulate` both being inlined into the same function. Hence
> it seems worth choosing a different approach.
>
> Of the two other solutions, recording measurements in integral values
> seems the most robust against slightly "off" measurements being
> presented to the user -- even though it could avoid the ICE that creates
> a flaky test.
>
> I considered storing time in whatever units our syscall returns and
> normalising them at the time we print out rather than normalising them
> to nanoseconds at the point we record our "current time". The logic
> being that normalisation could have some rounding effect (e.g. if
> TICKS_PER_SECOND is 3) that would be taken into account in calculations.
>
> I decided against it in order to give the values recorded in
> `timevar_time_def` some interpretive value so it's easier to read the
> code. Compared to the small rounding that would represent a tiny amount
> of time and AIUI cannot trigger the same kind of ICEs as we are
> attempting to fix, said interpretive value seems more valuable.
>
> Recording time in microseconds seemed reasonable since all obvious
> values for ticks and `getrusage` are at microsecond granularity or less
> precise. That said, since TICKS_PER_SECOND and CLOCKS_PER_SEC are both
> variables given to us by the host system I was not sure of that enough
> to make this decision.
>
> ------------------------------
> timer::all_zero is ignoring rows which are inconsequential to the user
> and would be printed out as all zeros. Since upon printing rows we
> convert to the same double value and print out the same precision as
> before, we return true/false based on the same amount of time as before.
>
> timer::print_row casts to a floating point measurement in units of
> seconds as was printed out before.
>
> timer::validate_phases -- I'm printing out nanoseconds here rather than
> floating point seconds since this is an error message for when things
> have "gone wrong"; printing out the actual nanoseconds that have been
> recorded seems like the best approach.
> N.b. since we now print out nanoseconds instead of a floating point value
> the padding requirements are different. Originally we were padding to
> 24 characters and printing 18 decimal places. This looked odd with the
> now visually smaller values getting printed. I judged 13 characters
> (corresponding to 2 hours) to be a reasonable point at which our
> alignment could start to degrade and this provides a more compact output
> for the majority of cases (checked by triggering the error case via
> GDB).
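(For concreteness, the arithmetic behind that 13-character choice:
2 hours = 7200 s = 7,200,000,000,000 ns, a 13-digit number, which is why
the new fprintf calls use a %13 field width for the nanosecond counts.)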
>
> ------------------------------
> N.b. I use a literal 1000000000 for "NANOSEC_PER_SEC". I believe this
> would fit in an integer on all hosts that GCC supports, but am not
> certain there are not strange integer sizes we support hence am pointing
> it out for special attention during review.
>
> ------------------------------
> No expected change in generated code.
> Bootstrapped and regtested on AArch64 with no regressions.
> Manually checked that flaky test is no longer flaky on the machine it
> was seen before.
>
> gcc/ChangeLog:
>
> PR middle-end/110316
> PR middle-end/9903
> * timevar.cc (NANOSEC_PER_SEC, TICKS_TO_NANOSEC,
> CLOCKS_TO_NANOSEC, nanosec_to_floating_sec, percent_of): New.
> (TICKS_TO_MSEC, CLOCKS_TO_MSEC): Remove these macros.
> (timer::validate_phases): Use integral arithmetic to check
> validity.
> (timer::print_row, timer::print): Convert from integral
> nanoseconds to floating point seconds before printing.
> (timer::all_zero): Change limit to nanosec count instead of
> fractional count of seconds.
> * timevar.h (struct timevar_time_def): Update all measurements
> to use uint64_t nanoseconds rather than seconds stored in a
> double.
>
>
>
> ############### Attachment also inlined for ease of reply ###############
>
>
> diff --git a/gcc/timevar.h b/gcc/timevar.h
> index ad46573160957c6461190204c5181ec9e904e204..882954a527e11a90873f25a45e9073149709dd05 100644
> --- a/gcc/timevar.h
> +++ b/gcc/timevar.h
> @@ -44,7 +44,7 @@
> */
>
> /* This structure stores the various varieties of time that can be
> - measured. Times are stored in seconds. The time may be an
> + measured. Times are stored in nanoseconds. The time may be an
> absolute time or a time difference; in the former case, the time
> base is undefined, except that the difference between two times
> produces a valid time difference. */
> @@ -52,14 +52,13 @@
> struct timevar_time_def
> {
> /* User time in this process. */
> - double user;
> + uint64_t user;
>
> - /* System time (if applicable for this host platform) in this
> - process. */
> - double sys;
> + /* System time (if applicable for this host platform) in this process. */
> + uint64_t sys;
>
> /* Wall clock time. */
> - double wall;
> + uint64_t wall;
>
> /* Garbage collector memory. */
> size_t ggc_mem;
Hi Matthew. I recently touched the timevar code (in r14-2881-
g75d623946d4b6e) to add support for serializing the timevar data in
JSON form as part of the SARIF output (PR analyzer/109361).
Looking at your patch, it looks like the baseline for the patch seems
to predate r14-2881-g75d623946d4b6e.
I don't have a strong opinion on the implementation choices in your
patch, but please can you rebase to beyond my recent change and make
sure that the SARIF serialization still works with your patch.
Specifically, please try compiling with
-ftime-report -fdiagnostics-format=sarif-file
and have a look at the generated .sarif file, e.g. via
python -m json.tool foo.c.sarif
which will pretty-print the JSON to stdout.
Currently I'm writing out the values as floating-point seconds, and
AFAIK my analyzer integration testsuite [1] is the only consumer of
this data.
[...snip...]
Thanks
Dave
[1]
https://github.com/davidmalcolm/gcc-analyzer-integration-tests/issues/5
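Putting Dave's two commands together (assuming a translation unit named
foo.c, matching the example file name above):

  gcc -c foo.c -ftime-report -fdiagnostics-format=sarif-file
  python -m json.tool foo.c.sarif

The second command pretty-prints the generated foo.c.sarif so the
serialized timing values can be compared before and after the patch.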
On 8/3/23 15:09, David Malcolm wrote:
>
> Hi Matthew. I recently touched the timevar code (in r14-2881-
> g75d623946d4b6e) to add support for serializing the timevar data in
> JSON form as part of the SARIF output (PR analyzer/109361).
>
> Looking at your patch, it looks like the baseline for the patch seems
> to predate r14-2881-g75d623946d4b6e.
>
> I don't have a strong opinion on the implementation choices in your
> patch, but please can you rebase to beyond my recent change and make
> sure that the SARIF serialization still works with your patch.
>
> Specifically, please try compiling with
> -ftime-report -fdiagnostics-format=sarif-file
> and have a look at the generated .sarif file, e.g. via
> python -m json.tool foo.c.sarif
> which will pretty-print the JSON to stdout.
>
> Currently I'm writing out the values as floating-point seconds, and
> AFAIK my analyzer integration testsuite [1] is the only consumer of
> this data.
Hi David,
Thanks for the heads-up. Will update the patch.
I read your last paragraph as suggesting that you'd be open to changing
the format. Is that correct?
I would initially assume that writing out the time as floating-point
seconds would still be most convenient for your use since it looks like
something to be presented to a person.
However, since I don't know much about the intended uses of SARIF in
general I figured I should double-check -- does that choice to remain
printing out floating-point seconds seem best to you?
>
> [...snip...]
>
> Thanks
> Dave
> [1]
> https://github.com/davidmalcolm/gcc-analyzer-integration-tests/issues/5
>
On Thu, 2023-08-03 at 15:54 +0100, Matthew Malcomson wrote:
> On 8/3/23 15:09, David Malcolm wrote:
> >
> > Hi Matthew. I recently touched the timevar code (in r14-2881-
> > g75d623946d4b6e) to add support for serializing the timevar data in
> > JSON form as part of the SARIF output (PR analyzer/109361).
> >
> > Looking at your patch, it looks like the baseline for the patch seems
> > to predate r14-2881-g75d623946d4b6e.
> >
> > I don't have a strong opinion on the implementation choices in your
> > patch, but please can you rebase to beyond my recent change and make
> > sure that the SARIF serialization still works with your patch.
> >
> > Specifically, please try compiling with
> > -ftime-report -fdiagnostics-format=sarif-file
> > and have a look at the generated .sarif file, e.g. via
> > python -m json.tool foo.c.sarif
> > which will pretty-print the JSON to stdout.
> >
> > Currently I'm writing out the values as floating-point seconds, and
> > AFAIK my analyzer integration testsuite [1] is the only consumer of
> > this data.
>
> Hi David,
>
> Thanks for the heads-up. Will update the patch.
>
> I read your last paragraph as suggesting that you'd be open to changing
> the format. Is that correct?
I suppose, but I'd prefer to keep the existing format.
>
> I would initially assume that writing out the time as floating-point
> seconds would still be most convenient for your use since it looks like
> something to be presented to a person.
Yes. I may be biased in that with -fanalyzer the times tend to be
measured in seconds rather than fractions of seconds, alas.
>
> However, since I don't know much about the intended uses of SARIF in
> general I figured I should double-check -- does that choice to remain
> printing out floating-point seconds seem best to you?
I'd prefer to keep the JSON output as floating-point seconds, if that's
not too much of a pain.
Dave
>
> >
> > [...snip...]
> >
> > Thanks
> > Dave
> > [1]
> > https://github.com/davidmalcolm/gcc-analyzer-integration-tests/issues/5
> >
>
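(A sketch of the conversion being agreed on here -- hypothetical code,
not the actual SARIF-serialization routine: with the patch the stored
values are integral nanoseconds, so keeping the JSON output in
floating-point seconds is just a scale at write-out time, mirroring the
nanosec_to_floating_sec macro in the patch below.)

  #include <cstdint>

  /* Hypothetical helper: convert a stored nanosecond count back to the
     floating-point seconds the JSON output already uses.  */
  static double
  nanosec_to_json_seconds (uint64_t nanosec)
  {
    return (double) nanosec * 1e-9;
  }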
On Thu, 3 Aug 2023, Matthew Malcomson wrote:
> >
> > I think this is undesirable. With fused you mean we use FMA?
> > I think you could use -ffp-contract=off for the TU instead.
> >
> > Note you can't use __attribute__((noinline)) literally since the
> > host compiler might not support this.
> >
> > Richard.
> >
>
>
> Trying to make the timevar store integral time intervals.
> Hope this is acceptable -- I had originally planned to use
> `-ffp-contract` as agreed until I saw the email mentioning the old x86
> bug in the same area which was not to do with floating point contraction
> of operations (PR 99903) and figured it would be better to try and solve
> both at the same time while making things in general a bit more robust.
> ________________________________________________________________________
>
>
> On some AArch64 bootstrapped builds, we were getting a flaky test
> because the floating point operations in `get_time` were being fused
> with the floating point operations in `timevar_accumulate`.
>
> This meant that the rounding behaviour of our multiplication with
> `ticks_to_msec` was different when used in `timer::start` and when
> performed in `timer::stop`. These extra inaccuracies led to the
> testcase `g++.dg/ext/timevar1.C` being flaky on some hardware.
>
> ------------------------------
> Avoiding the inlining which was agreed to be undesirable. Three
> alternative approaches:
> 1) Use `-ffp-contract=on` to avoid this particular optimisation.
> 2) Adjusting the code so that the "tolerance" is always of the order of
> a "tick".
> 3) Recording times and elapsed differences in integral values.
> - Could be in terms of a standard measurement (e.g. nanoseconds or
> microseconds).
> - Could be in terms of whatever integral value ("ticks" /
> seconds & µseconds / "clock ticks") is returned from the syscall
> chosen at configure time.
>
> While `-ffp-contract=on` removes the problem that I bumped into, there
> has been a similar bug on x86 that was to do with a different floating
> point problem that also happens after `get_time` and
> `timevar_accumulate` both being inlined into the same function. Hence
> it seems worth choosing a different approach.
>
> Of the two other solutions, recording measurements in integral values
> seems the most robust against slightly "off" measurements being
> presented to the user -- even though it could avoid the ICE that creates
> a flaky test.
>
> I considered storing time in whatever units our syscall returns and
> normalising them at the time we print out rather than normalising them
> to nanoseconds at the point we record our "current time". The logic
> being that normalisation could have some rounding effect (e.g. if
> TICKS_PER_SECOND is 3) that would be taken into account in calculations.
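(Concretely, with the patch TICKS_TO_NANOSEC becomes the integer quotient
NANOSEC_PER_SEC / TICKS_PER_SECOND: with the usual TICKS_PER_SECOND of
100 that is exactly 10,000,000 ns per tick, while a hypothetical
TICKS_PER_SECOND of 3 would truncate to 333,333,333 ns, silently dropping
a third of a nanosecond per tick -- the rounding being discussed here.)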
>
> I decided against it in order to give the values recorded in
> `timevar_time_def` some interpretive value so it's easier to read the
> code. Compared to the small rounding that would represent a tiny amount
> of time and AIUI cannot trigger the same kind of ICEs as we are
> attempting to fix, said interpretive value seems more valuable.
>
> Recording time in microseconds seemed reasonable since all obvious
> values for ticks and `getrusage` are at microsecond granularity or less
> precise. That said, since TICKS_PER_SECOND and CLOCKS_PER_SEC are both
> variables given to us by the host system I was not sure of that enough
> to make this decision.
>
> ------------------------------
> timer::all_zero is ignoring rows which are inconsequential to the user
> and would be printed out as all zeros. Since upon printing rows we
> convert to the same double value and print out the same precision as
> before, we return true/false based on the same amount of time as before.
>
> timer::print_row casts to a floating point measurement in units of
> seconds as was printed out before.
>
> timer::validate_phases -- I'm printing out nanoseconds here rather than
> floating point seconds since this is an error message for when things
> have "gone wrong"; printing out the actual nanoseconds that have been
> recorded seems like the best approach.
> N.b. since we now print out nanoseconds instead of a floating point value
> the padding requirements are different. Originally we were padding to
> 24 characters and printing 18 decimal places. This looked odd with the
> now visually smaller values getting printed. I judged 13 characters
> (corresponding to 2 hours) to be a reasonable point at which our
> alignment could start to degrade and this provides a more compact output
> for the majority of cases (checked by triggering the error case via
> GDB).
>
> ------------------------------
> N.b. I use a literal 1000000000 for "NANOSEC_PER_SEC". I believe this
> would fit in an integer on all hosts that GCC supports, but am not
> certain there are not strange integer sizes we support hence am pointing
> it out for special attention during review.
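An aside, not part of the patch: 1000000000 < 2^31 - 1 = 2147483647, so
the literal fits in any 32-bit signed integer, and the C/C++ standards
guarantee long is at least 32 bits, so it always fits in a long; whether
plain int is wide enough is exactly the question raised above. A
hypothetical compile-time check, assuming the host compiler accepts
static_assert and <climits>:

  #include <climits>
  static_assert (1000000000 <= INT_MAX,
                 "NANOSEC_PER_SEC must fit in an int");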
>
> ------------------------------
> No expected change in generated code.
> Bootstrapped and regtested on AArch64 with no regressions.
> Manually checked that flaky test is no longer flaky on the machine it
> was seen before.
The patch looks OK to me if it passes bootstrap / regtest and the
output of -ftime-report doesn't change (too much).
Thanks,
Richard.
> gcc/ChangeLog:
>
> PR middle-end/110316
> PR middle-end/9903
> * timevar.cc (NANOSEC_PER_SEC, TICKS_TO_NANOSEC,
> CLOCKS_TO_NANOSEC, nanosec_to_floating_sec, percent_of): New.
> (TICKS_TO_MSEC, CLOCKS_TO_MSEC): Remove these macros.
> (timer::validate_phases): Use integral arithmetic to check
> validity.
> (timer::print_row, timer::print): Convert from integral
> nanoseconds to floating point seconds before printing.
> (timer::all_zero): Change limit to nanosec count instead of
> fractional count of seconds.
> * timevar.h (struct timevar_time_def): Update all measurements
> to use uint64_t nanoseconds rather than seconds stored in a
> double.
>
>
>
> ############### Attachment also inlined for ease of reply ###############
>
>
> diff --git a/gcc/timevar.h b/gcc/timevar.h
> index ad46573160957c6461190204c5181ec9e904e204..882954a527e11a90873f25a45e9073149709dd05 100644
> --- a/gcc/timevar.h
> +++ b/gcc/timevar.h
> @@ -44,7 +44,7 @@
> */
>
> /* This structure stores the various varieties of time that can be
> - measured. Times are stored in seconds. The time may be an
> + measured. Times are stored in nanoseconds. The time may be an
> absolute time or a time difference; in the former case, the time
> base is undefined, except that the difference between two times
> produces a valid time difference. */
> @@ -52,14 +52,13 @@
> struct timevar_time_def
> {
> /* User time in this process. */
> - double user;
> + uint64_t user;
>
> - /* System time (if applicable for this host platform) in this
> - process. */
> - double sys;
> + /* System time (if applicable for this host platform) in this process. */
> + uint64_t sys;
>
> /* Wall clock time. */
> - double wall;
> + uint64_t wall;
>
> /* Garbage collector memory. */
> size_t ggc_mem;
> diff --git a/gcc/timevar.cc b/gcc/timevar.cc
> index d695297aae7f6b2a6de01a37fe86c2a232338df0..c6094b1c336c30e5a404ff8fd5c79795edbd59f7 100644
> --- a/gcc/timevar.cc
> +++ b/gcc/timevar.cc
> @@ -42,7 +42,7 @@ struct tms
> # define RUSAGE_SELF 0
> #endif
>
> -/* Calculation of scale factor to convert ticks to microseconds.
> +/* Calculation of scale factor to convert ticks to seconds.
> We mustn't use CLOCKS_PER_SEC except with clock(). */
> #if HAVE_SYSCONF && defined _SC_CLK_TCK
> # define TICKS_PER_SECOND sysconf (_SC_CLK_TCK) /* POSIX 1003.1-1996 */
> @@ -91,14 +91,15 @@ struct tms
> the underlying constants, and that can be very slow, so we have to
> precompute them. Whose wonderful idea was it to make all those
> _constants_ variable at run time, anyway? */
> +#define NANOSEC_PER_SEC 1000000000
> #ifdef USE_TIMES
> -static double ticks_to_msec;
> -#define TICKS_TO_MSEC (1 / (double)TICKS_PER_SECOND)
> +static uint64_t ticks_to_nanosec;
> +#define TICKS_TO_NANOSEC (NANOSEC_PER_SEC / TICKS_PER_SECOND)
> #endif
>
> #ifdef USE_CLOCK
> -static double clocks_to_msec;
> -#define CLOCKS_TO_MSEC (1 / (double)CLOCKS_PER_SEC)
> +static uint64_t clocks_to_nanosec;
> +#define CLOCKS_TO_NANOSEC (NANOSEC_PER_SEC / CLOCKS_PER_SEC)
> #endif
>
> /* Non-NULL if timevars should be used. In GCC, this happens with
> @@ -222,18 +223,20 @@ get_time (struct timevar_time_def *now)
> {
> #ifdef USE_TIMES
> struct tms tms;
> - now->wall = times (&tms) * ticks_to_msec;
> - now->user = tms.tms_utime * ticks_to_msec;
> - now->sys = tms.tms_stime * ticks_to_msec;
> + now->wall = times (&tms) * ticks_to_nanosec;
> + now->user = tms.tms_utime * ticks_to_nanosec;
> + now->sys = tms.tms_stime * ticks_to_nanosec;
> #endif
> #ifdef USE_GETRUSAGE
> struct rusage rusage;
> getrusage (RUSAGE_SELF, &rusage);
> - now->user = rusage.ru_utime.tv_sec + rusage.ru_utime.tv_usec * 1e-6;
> - now->sys = rusage.ru_stime.tv_sec + rusage.ru_stime.tv_usec * 1e-6;
> + now->user = rusage.ru_utime.tv_sec * NANOSEC_PER_SEC
> + + rusage.ru_utime.tv_usec * 1000;
> + now->sys = rusage.ru_stime.tv_sec * NANOSEC_PER_SEC
> + + rusage.ru_stime.tv_usec * 1000;
> #endif
> #ifdef USE_CLOCK
> - now->user = clock () * clocks_to_msec;
> + now->user = clock () * clocks_to_nanosec;
> #endif
> }
> }
> @@ -271,10 +274,10 @@ timer::timer () :
> /* Initialize configuration-specific state.
> Ideally this would be one-time initialization. */
> #ifdef USE_TIMES
> - ticks_to_msec = TICKS_TO_MSEC;
> + ticks_to_nanosec = TICKS_TO_NANOSEC;
> #endif
> #ifdef USE_CLOCK
> - clocks_to_msec = CLOCKS_TO_MSEC;
> + clocks_to_nanosec = CLOCKS_TO_NANOSEC;
> #endif
> }
>
> @@ -583,12 +586,11 @@ timer::validate_phases (FILE *fp) const
> {
> unsigned int /* timevar_id_t */ id;
> const timevar_time_def *total = &m_timevars[TV_TOTAL].elapsed;
> - double phase_user = 0.0;
> - double phase_sys = 0.0;
> - double phase_wall = 0.0;
> + uint64_t phase_user = 0;
> + uint64_t phase_sys = 0;
> + uint64_t phase_wall = 0;
> size_t phase_ggc_mem = 0;
> static char phase_prefix[] = "phase ";
> - const double tolerance = 1.000001; /* One part in a million. */
>
> for (id = 0; id < (unsigned int) TIMEVAR_LAST; ++id)
> {
> @@ -607,26 +609,32 @@ timer::validate_phases (FILE *fp) const
> }
> }
>
> - if (phase_user > total->user * tolerance
> - || phase_sys > total->sys * tolerance
> - || phase_wall > total->wall * tolerance
> - || phase_ggc_mem > total->ggc_mem * tolerance)
> + if (phase_user > total->user
> + || phase_sys > total->sys
> + || phase_wall > total->wall
> + || phase_ggc_mem > total->ggc_mem)
> {
>
> fprintf (fp, "Timing error: total of phase timers exceeds total time.\n");
> if (phase_user > total->user)
> - fprintf (fp, "user %24.18e > %24.18e\n", phase_user, total->user);
> + fprintf (fp, "user %13" PRIu64 " > %13" PRIu64 "\n",
> + phase_user, total->user);
> if (phase_sys > total->sys)
> - fprintf (fp, "sys %24.18e > %24.18e\n", phase_sys, total->sys);
> + fprintf (fp, "sys %13" PRIu64 " > %13" PRIu64 "\n",
> + phase_sys, total->sys);
> if (phase_wall > total->wall)
> - fprintf (fp, "wall %24.18e > %24.18e\n", phase_wall, total->wall);
> + fprintf (fp, "wall %13" PRIu64 " > %13" PRIu64 "\n",
> + phase_wall, total->wall);
> if (phase_ggc_mem > total->ggc_mem)
> - fprintf (fp, "ggc_mem %24lu > %24lu\n", (unsigned long)phase_ggc_mem,
> + fprintf (fp, "ggc_mem %13lu > %13lu\n", (unsigned long)phase_ggc_mem,
> (unsigned long)total->ggc_mem);
> gcc_unreachable ();
> }
> }
>
> +#define nanosec_to_floating_sec(NANO) ((double)(NANO) * 1e-9)
> +#define percent_of(TOTAL, SUBTOTAL) \
> + ((TOTAL) == 0 ? 0 : ((double)SUBTOTAL / TOTAL) * 100)
> /* Helper function for timer::print. */
> void
> @@ -640,22 +648,22 @@ timer::print_row (FILE *fp,
> #ifdef HAVE_USER_TIME
> /* Print user-mode time for this process. */
> fprintf (fp, "%7.2f (%3.0f%%)",
> - elapsed.user,
> - (total->user == 0 ? 0 : elapsed.user / total->user) * 100);
> + nanosec_to_floating_sec (elapsed.user),
> + percent_of (total->user, elapsed.user));
> #endif /* HAVE_USER_TIME */
>
> #ifdef HAVE_SYS_TIME
> /* Print system-mode time for this process. */
> fprintf (fp, "%7.2f (%3.0f%%)",
> - elapsed.sys,
> - (total->sys == 0 ? 0 : elapsed.sys / total->sys) * 100);
> + nanosec_to_floating_sec (elapsed.sys),
> + percent_of (total->sys, elapsed.sys));
> #endif /* HAVE_SYS_TIME */
>
> #ifdef HAVE_WALL_TIME
> /* Print wall clock time elapsed. */
> fprintf (fp, "%7.2f (%3.0f%%)",
> - elapsed.wall,
> - (total->wall == 0 ? 0 : elapsed.wall / total->wall) * 100);
> + nanosec_to_floating_sec (elapsed.wall),
> + percent_of (total->wall, elapsed.wall));
> #endif /* HAVE_WALL_TIME */
>
> /* Print the amount of ggc memory allocated. */
> @@ -673,7 +681,8 @@ timer::print_row (FILE *fp,
> bool
> timer::all_zero (const timevar_time_def &elapsed)
> {
> - const double tiny = 5e-3;
> + /* 5000000 nanosec == 5e-3 seconds. */
> + uint64_t tiny = 5000000;
> return (elapsed.user < tiny
> && elapsed.sys < tiny
> && elapsed.wall < tiny
> @@ -766,13 +775,13 @@ timer::print (FILE *fp)
> /* Print total time. */
> fprintf (fp, " %-35s:", "TOTAL");
> #ifdef HAVE_USER_TIME
> - fprintf (fp, "%7.2f ", total->user);
> + fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total->user));
> #endif
> #ifdef HAVE_SYS_TIME
> - fprintf (fp, "%8.2f ", total->sys);
> + fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->sys));
> #endif
> #ifdef HAVE_WALL_TIME
> - fprintf (fp, "%8.2f ", total->wall);
> + fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->wall));
> #endif
> fprintf (fp, PRsa (7) "\n", SIZE_AMOUNT (total->ggc_mem));
>
> @@ -790,6 +799,8 @@ timer::print (FILE *fp)
>
> validate_phases (fp);
> }
> +#undef nanosec_to_floating_sec
> +#undef percent_of
>
> /* Get the name of the topmost item. For use by jit for validating
> inputs to gcc_jit_timer_pop. */
>
>
>
>