[4/4] contrib: Add dg-out-generator.pl

Message ID: 20221210094303.2180127-5-arsen@aarsen.me
State: Accepted
Series: c++: Small tweaks to contracts

Checks

Context                 Check     Description
snail/gcc-patch-check   success   Github commit url

Commit Message

Arsen Arsenović Dec. 10, 2022, 9:43 a.m. UTC
This script is a helper for conveniently generating dg-output lines from
existing program output.  It takes care of escaping characters that are
special in Tcl and in AREs (Tcl advanced regular expressions).

contrib/ChangeLog:

	* dg-out-generator.pl: New file.
---
 contrib/dg-out-generator.pl | 67 +++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)
 create mode 100755 contrib/dg-out-generator.pl
  

Comments

Jason Merrill Dec. 15, 2022, 4:30 p.m. UTC | #1
On 12/10/22 04:43, Arsen Arsenović wrote:
> This script is a helper used to generate dg-output lines from an existing
> program output conveniently.  It takes care of escaping Tcl and ARE stuff.
> contrib/ChangeLog:
> 
> 	* dg-out-generator.pl: New file.
> ---
>   contrib/dg-out-generator.pl | 67 +++++++++++++++++++++++++++++++++++++
>   1 file changed, 67 insertions(+)
>   create mode 100755 contrib/dg-out-generator.pl
> 
> diff --git a/contrib/dg-out-generator.pl b/contrib/dg-out-generator.pl
> new file mode 100755
> index 00000000000..38aed2aa38d
> --- /dev/null
> +++ b/contrib/dg-out-generator.pl
> @@ -0,0 +1,67 @@
> +#!/usr/bin/env perl
> +#
> +# Copyright (C) 2022 GCC Contributors.
> +# Contributed by Arsen Arsenović.
> +#
> +# This script is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 3, or (at your option)
> +# any later version.
> +
> +# This script reads program output on STDIN, and out of it produces a block of
> +# dg-output lines that can be yanked at the end of a file.  It will escape
> +# special ARE and Tcl constructs automatically.
> +#
> +# Each argument passed on the command line is treated as a string to be
> +# replaced by ``.*'' in the final result.  This is intended to mask out build
> +# paths, filenames, etc.
> +#
> +# Usage example:
> +
> +# $ g++-13 -fcontracts -o test \
> +#  'g++.dg/contracts/contracts-access1.C' && \
> +#   ./test |& dg-out-generator.pl 'g++.dg/contracts/contracts-access1.C'
> +# // { dg-output "contract violation in function Base::b at .*:11: pub > 0(\n|\r\n|\r)*" }
> +# // { dg-output "\\\[level:default, role:default, continuation mode:never\\\](\n|\r\n|\r)*" }
> +# // { dg-output "terminate called without an active exception(\n|\r\n|\r)*" }
> +# You can now freely dump the above into your testcase.
> +
> +use strict;
> +use warnings;
> +use POSIX 'floor';
> +
> +my $escapees = '(' . join ('|', map { quotemeta } @ARGV) . ')';
> +
> +sub gboundary($)
> +{
> +  my $str = shift;
> +  my $sz = 10.0;
> +  for (;;)
> +    {
> +      my $bnd = join '', (map chr 64 + rand 27, 1 .. floor $sz);
> +      return $bnd unless index ($str, $bnd) >= 0;
> +      $sz += 0.1;
> +    }
> +}
> +
> +while (<STDIN>)
> +  {
> +    # Escape our escapees.
> +    my $boundary = gboundary $_;
> +    s/$escapees/$boundary/;
> +
> +    # Quote stuff special in Tcl ARE.
> +    s/([[\]*+?{}()\\])/\\$1/g;
> +
> +    # Then, special stuff in TCL itself.
> +    s/([\][\\])/\\$1/g;
> +
> +    # Newlines should be more tolerant.
> +    s/\n$/(\\n|\\r\\n|\\r)*/;
> +
> +    # Then split out the boundary, replacing it with .*.
> +    s/$boundary/.*/;
> +
> +    # Then, let's print it in a dg-output block.
> +    print "// { dg-output \"$_\" }\n";

I wonder if you want to wrap the pattern in {} instead of "" so you 
don't need the "special stuff in TCL itself" quoting?

Jason
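
For illustration, here is a minimal sketch of what the brace-wrapped variant asked about above might look like.  The helper name dg_output_braced is hypothetical and not part of the patch.  The sketch assumes that the ARE quoting step already backslash-escapes every brace in the text, so the Tcl word stays balanced, and that Tcl performs no substitution inside {}, leaving \n and \r for the regex engine to interpret.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): format one line of program
# output as a brace-quoted dg-output directive.
sub dg_output_braced
{
  my ($line) = @_;

  # Quote characters special in Tcl AREs (same character class as the patch).
  # Every brace in the text ends up preceded by a backslash.
  $line =~ s/([[\]*+?{}()\\])/\\$1/g;

  # Tolerate any newline convention, as in the patch.
  $line =~ s/\n$/(\\n|\\r\\n|\\r)*/;

  # No second, Tcl-level quoting pass: the pattern sits inside {}.
  return "// { dg-output {$line} }\n";
}

print dg_output_braced ("[level:default] {x}\n");
# prints: // { dg-output {\[level:default\] \{x\}(\n|\r\n|\r)*} }
```

Compared with the double-quoted form, each special character carries one backslash instead of three, which is the readability gain the question is after.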
  
Arsen Arsenović Dec. 15, 2022, 5:30 p.m. UTC | #2
Hi Jason,

Jason Merrill <jason@redhat.com> writes:

> I wonder if you want to wrap the pattern in {} instead of "" so you don't need
> the "special stuff in TCL itself" quoting?

{}s lack generality; for instance, try: puts {unbalanced \}} (the
backslash is kept in the output).  I could try to write a revision that
complies with the minimal escaping style when I take the opportunity to
address your other comment.

(Also, it just occurred to me that I forgot to escape dollar signs.)

Thanks, have a great day.
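
In the double-quoted form, the dollar sign matters at both levels: it is an end anchor in an ARE, and it starts variable substitution in a Tcl "" word.  The following is a minimal sketch of one way to cover it, not the revision the author went on to write.  The helper name dg_output_quoted is hypothetical; the only change from the posted patch is that '$' is added to both character classes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the patch): format one line of program
# output as a double-quoted dg-output directive, also escaping '$'.
sub dg_output_quoted
{
  my ($line) = @_;

  # ARE level: the patch's character class plus '$' (end anchor in an ARE).
  $line =~ s/([[\]\$*+?{}()\\])/\\$1/g;

  # Tcl level: the patch's character class plus '$' (variable substitution).
  $line =~ s/([\][\$\\])/\\$1/g;

  # Tolerate any newline convention, as in the patch.
  $line =~ s/\n$/(\\n|\\r\\n|\\r)*/;

  return "// { dg-output \"$line\" }\n";
}

print dg_output_quoted ('cost: $5' . "\n");
# prints: // { dg-output "cost: \\\$5(\n|\r\n|\r)*" }
```

Tcl reduces \\\$ to \$ when it parses the quoted word, and the regex engine then reads \$ as a literal dollar sign.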
  

Patch

diff --git a/contrib/dg-out-generator.pl b/contrib/dg-out-generator.pl
new file mode 100755
index 00000000000..38aed2aa38d
--- /dev/null
+++ b/contrib/dg-out-generator.pl
@@ -0,0 +1,67 @@ 
+#!/usr/bin/env perl
+#
+# Copyright (C) 2022 GCC Contributors.
+# Contributed by Arsen Arsenović.
+#
+# This script is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+
+# This script reads program output on STDIN, and out of it produces a block of
+# dg-output lines that can be yanked at the end of a file.  It will escape
+# special ARE and Tcl constructs automatically.
+#
+# Each argument passed on the command line is treated as a string to be
+# replaced by ``.*'' in the final result.  This is intended to mask out build
+# paths, filenames, etc.
+#
+# Usage example:
+
+# $ g++-13 -fcontracts -o test \
+#  'g++.dg/contracts/contracts-access1.C' && \
+#   ./test |& dg-out-generator.pl 'g++.dg/contracts/contracts-access1.C'
+# // { dg-output "contract violation in function Base::b at .*:11: pub > 0(\n|\r\n|\r)*" }
+# // { dg-output "\\\[level:default, role:default, continuation mode:never\\\](\n|\r\n|\r)*" }
+# // { dg-output "terminate called without an active exception(\n|\r\n|\r)*" }
+# You can now freely dump the above into your testcase.
+
+use strict;
+use warnings;
+use POSIX 'floor';
+
+my $escapees = '(' . join ('|', map { quotemeta } @ARGV) . ')';
+
+sub gboundary($)
+{
+  my $str = shift;
+  my $sz = 10.0;
+  for (;;)
+    {
+      my $bnd = join '', (map chr 64 + rand 27, 1 .. floor $sz);
+      return $bnd unless index ($str, $bnd) >= 0;
+      $sz += 0.1;
+    }
+}
+
+while (<STDIN>)
+  {
+    # Escape our escapees.
+    my $boundary = gboundary $_;
+    s/$escapees/$boundary/;
+
+    # Quote stuff special in Tcl ARE.
+    s/([[\]*+?{}()\\])/\\$1/g;
+
+    # Then, special stuff in TCL itself.
+    s/([\][\\])/\\$1/g;
+
+    # Newlines should be more tolerant.
+    s/\n$/(\\n|\\r\\n|\\r)*/;
+
+    # Then split out the boundary, replacing it with .*.
+    s/$boundary/.*/;
+
+    # Then, let's print it in a dg-output block.
+    print "// { dg-output \"$_\" }\n";
+  }
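
Two details of the masking step in the loop above are worth noting.  Both substitutions involving $boundary lack the /g flag, so only the first occurrence of a masked string on a line is replaced.  When no arguments are given, $escapees is '()', which matches the empty string at the start of each line, so every generated pattern gains a leading '.*'.  The sketch below shows one possible way to avoid both.  It swaps the random-boundary technique for split and join, and the helper names escape_are and mask_and_escape are hypothetical.  It covers masking and ARE quoting only, not the newline handling or the Tcl-level quoting.

```perl
use strict;
use warnings;

# Quote characters special in Tcl AREs (same character class as the patch).
sub escape_are
{
  my ($text) = @_;
  $text =~ s/([[\]*+?{}()\\])/\\$1/g;
  return $text;
}

# Hypothetical helper: replace every occurrence of each mask with ".*" and
# ARE-quote the text in between.
sub mask_and_escape
{
  my ($line, @masks) = @_;

  # Ignore empty masks; with none left there is nothing to split on.
  @masks = grep { length } @masks;
  return escape_are ($line) unless @masks;

  my $alternation = join '|', map { quotemeta } @masks;

  # A limit of -1 keeps trailing empty fields, so a mask at the very end of
  # the line still yields a trailing ".*".
  my @pieces = split /$alternation/, $line, -1;
  return join '.*', map { escape_are ($_) } @pieces;
}

print mask_and_escape ('/build/a.C:11: see /build/a.C', '/build/a.C'), "\n";
# prints: .*:11: see .*
```

Because the masked strings never pass through the quoting step, no placeholder has to survive it, and the collision check done by gboundary becomes unnecessary in this variant.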