[0/3,v3] genmatch: Speed up recompilation after changes to match.pd

Message ID 20230803142131.250087-1-andrzej.turko@gmail.com
Series genmatch: Speed up recompilation after changes to match.pd

Message

Andrzej Turko Aug. 3, 2023, 2:21 p.m. UTC
  The following reduces the number of object files that need to be rebuilt
after match.pd has been modified. Right now a change to match.pd which
adds/removes a line almost always forces recompilation of all files that
genmatch generates from it. This is because of unnecessary changes to
the generated .cc files:

1. Function names and ordering change as does the way the functions are
        distributed across multiple source files.
2. Code locations from match.pd are quoted directly (including line
        numbers) by logging fprintf calls.

This patch addresses those issues without changing the behaviour
of the generated code. The first one is solved by making sure that minor
changes to match.pd do not influence the order in which functions are
generated. The second one is solved by using a lookup table with line
numbers.
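
To illustrate the second point, here is a minimal sketch of direct versus
indirect line number logging. It is not the code from the patch: the names
sketch_line_table, format_direct and format_indirect, the message format
and the line numbers are all hypothetical and chosen only for this example.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for a generated lookup table with one entry per
// logging site.  In this sketch it would live in its own source file, so
// only that file changes when line numbers in match.pd shift.
static const int sketch_line_table[] = { 120, 245, 245, 310 };

// Direct variant: the line number (245 here) is part of the function body,
// so inserting a line above it in match.pd changes this function's text.
std::string format_direct ()
{
  char buf[64];
  std::snprintf (buf, sizeof buf, "Applying pattern %s:%d", "match.pd", 245);
  return buf;
}

// Indirect variant: the function only carries a stable site index and
// reads the current line number from the table at run time.
std::string format_indirect (int site_index)
{
  char buf[64];
  std::snprintf (buf, sizeof buf, "Applying pattern %s:%d", "match.pd",
                 sketch_line_table[site_index]);
  return buf;
}
```

Both variants produce the same message for the same site, e.g.
format_indirect (1) yields the same string as format_direct (), but only
the direct variant has to be regenerated when the line number moves.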

Now a change to a single function will trigger a rebuild of 4 object
files (the one with the function and the one with the lookup table, both
for gimple and generic) instead of all of them (20 by default).
For reference, this decreased the rebuild time with 48 threads from 3.5
minutes to 1.5 minutes on my machine.

V2:
        * Placed the change in Makefile.in in the correct commit.
        * Used a separate logging function to reduce size of the
        executable.

V3:
	* Fix a bug from 'genmatch: Log line numbers indirectly',
	which was introduced in V2.
       

As for Richard Biener's remarks on executable size (cc1plus):

1. The first version of the change decreased (sic!) the executable size
	by approximately 120 kB (.text and .data sections grew by
	14 kB and 2 kB respectively, but .debug_info section shrank by
	roughly 170 kB).
2. In the current version (V3) the binary size increases by 36 kB (.text
	grows by 3 kB and .rodata by 14 kB; the rest of the increase can
	mostly be attributed to debug sections).

One can choose between those variants just by taking the third commit
either from this or the first version of the patch series.


Possible optimization:

Currently, the lookup table for line numbers contains duplicate values.
Removing them would shrink the table by 40-50%, reducing the increase
in .data sections. Is it worth pursuing? And if so, would it be better
to integrate this into this patch series or to implement it separately?
Also, can I assume that genmatch is producing source code using a single
input file per invocation? Currently, this is the case.
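
A minimal sketch of what such a deduplication could look like, assuming
the table is built from a flat list of per-site line numbers. The names
dedup_result and dedup_lines are hypothetical and do not appear in
genmatch.cc.

```cpp
#include <cstddef>
#include <unordered_map>
#include <vector>

// Result of deduplication (hypothetical type for this sketch).
struct dedup_result
{
  std::vector<int> table;          // unique line numbers, first-seen order
  std::vector<std::size_t> index;  // index[i] is the table slot of site i
};

// Map each logging site to a slot in a table without duplicate values.
dedup_result dedup_lines (const std::vector<int> &lines)
{
  dedup_result r;
  std::unordered_map<int, std::size_t> slot;  // line number -> table slot
  for (int line : lines)
    {
      auto it = slot.find (line);
      if (it == slot.end ())
        {
          // First time this line number is seen: give it the next slot.
          it = slot.emplace (line, r.table.size ()).first;
          r.table.push_back (line);
        }
      r.index.push_back (it->second);
    }
  return r;
}
```

For the input { 120, 245, 245, 310, 120 } this gives the table
{ 120, 245, 310 } and the per-site indices { 0, 1, 1, 2, 0 }. Slots are
assigned in order of first appearance, so in this sketch shifting all line
numbers by a constant leaves the indices unchanged.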


Note for reviewers: I do not have write access.


Andrzej Turko (3):
  Support get_or_insert in ordered_hash_map
  genmatch: Reduce variability of generated code
  genmatch: Log line numbers indirectly

 gcc/Makefile.in               |  4 +-
 gcc/genmatch.cc               | 92 +++++++++++++++++++++++++++++------
 gcc/ordered-hash-map-tests.cc | 19 ++++++--
 gcc/ordered-hash-map.h        | 26 ++++++++++
 4 files changed, 119 insertions(+), 22 deletions(-)