From patchwork Thu Nov 3 13:43:06 2022
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 14878
Message-ID: <785436fa-0ef9-e424-030d-f7b2bdf9c935@arm.com>
Date: Thu, 3 Nov 2022 13:43:06 +0000
From: "Andre Vieira (lists)"
To: "gcc-patches@gcc.gnu.org"
Cc: Richard Sandiford, Richard Biener
Subject: [PATCH] ifcvt: Support bitfield lowering of multiple-exit loops

Hi,

With Tamar's patch (https://gcc.gnu.org/pipermail/gcc-patches/2022-November/604880.html) enabling the vectorization of early-breaks, I'd like to allow bitfield lowering in such loops. This requires relaxing the single-exit restriction so that loops with multiple exits are accepted when lowering bitfields.

In order to avoid an issue similar to PR107275, I hoisted the code that rejects loops containing certain kinds of gimple statements from 'if_convertible_loop_p_1' to 'get_loop_body_in_if_conv_order', so that we do not try to lower bitfields in loops we are not going to vectorize anyway.  This also ensures 'ifcvt_local_dce' doesn't accidentally remove statements it shouldn't, as it will never come across them.  I added a comment to make clear that the two are directly connected: if we were to enable vectorization of any other gimple statement, we should make sure both handle it.

Bootstrapped and regression tested on aarch64-none-linux-gnu and x86_64-pc-linux-gnu.

gcc/ChangeLog:

        * tree-if-conv.cc (if_convertible_loop_p_1): Move statement check
        loop from here ...
        (get_loop_body_in_if_conv_order): ... to here.
        (if_convertible_loop_p): Remove single_exit check.
        (tree_if_conversion): Move single_exit check to if-conversion part.

gcc/testsuite/ChangeLog:

        * gcc.dg/vect/vect-bitfield-read-1-not.c: New test.
        * gcc.dg/vect/vect-bitfield-read-2-not.c: New test.
        * gcc.dg/vect/vect-bitfield-read-8.c: New test.
        * gcc.dg/vect/vect-bitfield-read-9.c: New test.

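For readers less familiar with the transformation, the following is a rough, hand-written illustration of what "lowering" a bitfield read means for an early-exit loop like the one in vect-bitfield-read-8.c. It is only a sketch and not the code GCC generates. It assumes a little-endian GCC-style layout in which the 31-bit field occupies the low bits of a single 32-bit container, and it declares the field as 'signed int' to avoid relying on the signedness of a plain 'int' bitfield. The names 'load_field_lowered', 'sum_until_4_lowered' and 'self_check' are made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct s { signed int i : 31; };

/* ASSUMPTION: the bitfield lives in exactly one 32-bit container.  */
_Static_assert (sizeof (struct s) == sizeof (uint32_t),
                "sketch assumes a 32-bit bitfield container");

#define N 32
struct s A[N]
  = { {0}, {1}, {2}, {3}, {0}, {1}, {2}, {3},
      {0}, {1}, {2}, {3}, {0}, {1}, {2}, {3},
      {0}, {1}, {4}, {3}, {0}, {1}, {2}, {3},
      {0}, {1}, {2}, {3}, {0}, {1}, {2}, {3} };

/* Bitfield form: the loop has two exits (i == n and the early return).  */
int
sum_until_4 (const struct s *ptr, unsigned n)
{
  int res = 0;
  for (unsigned i = 0; i < n; ++i)
    {
      if (ptr[i].i == 4)
        return res;
      res += ptr[i].i;
    }
  return res;
}

/* Hand-lowered read: load the whole container, mask out the field and
   sign-extend it.  ASSUMPTION: the field is in the low 31 bits.  */
int32_t
load_field_lowered (const struct s *p)
{
  uint32_t word;
  memcpy (&word, p, sizeof word);
  uint32_t low31 = word & 0x7FFFFFFFu;
  /* Flip the field's sign bit (bit 30) and subtract it again.  */
  return (int32_t) (low31 ^ 0x40000000u) - 0x40000000;
}

/* Same loop with every bitfield read replaced by the lowered read.  */
int
sum_until_4_lowered (const struct s *ptr, unsigned n)
{
  int res = 0;
  for (unsigned i = 0; i < n; ++i)
    {
      int32_t val = load_field_lowered (&ptr[i]);
      if (val == 4)
        return res;
      res += val;
    }
  return res;
}

/* Both forms must agree on the sample data.  */
void
self_check (void)
{
  assert (sum_until_4 (A, N) == sum_until_4_lowered (A, N));
}
```

On the sample array both functions stop at the element holding 4 and return 25, matching RES in vect-bitfield-read-8.c. The point of the sketch is that, once the bitfield access is expressed as an ordinary load plus mask and sign-extension, the loop body contains only operations that map onto plain integer vector instructions.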
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1-not.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1-not.c
new file mode 100644
index 0000000000000000000000000000000000000000..0d91067ebb27b1db2b2352975c43bce8b4171e3f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-1-not.c
@@ -0,0 +1,60 @@
+/* { dg-require-effective-target vect_shift } */
+/* { dg-require-effective-target vect_long_long } */
+/* { dg-additional-options { "-fdump-tree-ifcvt-all" } } */
+
+#include
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s {
+    char a : 4;
+};
+
+#define N 32
+#define ELT0 {0}
+#define ELT1 {1}
+#define ELT2 {2}
+#define ELT3 {3}
+#define RES 56
+struct s A[N]
+  = { ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+    int res = 0;
+    for (int i = 0; i < n; ++i)
+      {
+        switch (ptr[i].a)
+          {
+          case 0:
+            res += ptr[i].a + 1;
+            break;
+          case 1:
+          case 2:
+          case 3:
+            res += ptr[i].a;
+            break;
+          default:
+            return 0;
+          }
+      }
+    return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "Bitfield OK to lower." "ifcvt" } } */
+
+
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2-not.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2-not.c
new file mode 100644
index 0000000000000000000000000000000000000000..4ac7b3fc0dfd1c9d0b5e94a2ba6a745545577ec1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-2-not.c
@@ -0,0 +1,49 @@
+/* { dg-require-effective-target vect_shift } */
+/* { dg-require-effective-target vect_long_long } */
+/* { dg-additional-options { "-fdump-tree-ifcvt-all" } } */
+
+#include
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s {
+    char a : 4;
+};
+
+#define N 32
+#define ELT0 {0}
+#define ELT1 {1}
+#define ELT2 {2}
+#define ELT3 {3}
+#define RES 48
+struct s A[N]
+  = { ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+    int res = 0;
+    for (int i = 0; i < n; ++i)
+      {
+        asm volatile ("" ::: "memory");
+        res += ptr[i].a;
+      }
+    return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "Bitfield OK to lower." "ifcvt" } } */
+
+
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-8.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-8.c
new file mode 100644
index 0000000000000000000000000000000000000000..52cfd33d937ae90f3fe9556716c90e098b768ac8
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-8.c
@@ -0,0 +1,49 @@
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_shift } */
+/* { dg-additional-options { "-fdump-tree-ifcvt-all" } } */
+
+#include
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s { int i : 31; };
+
+#define ELT0 {0}
+#define ELT1 {1}
+#define ELT2 {2}
+#define ELT3 {3}
+#define ELT4 {4}
+#define N 32
+#define RES 25
+struct s A[N]
+  = { ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT4, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+    int res = 0;
+    for (int i = 0; i < n; ++i)
+      {
+        if (ptr[i].i == 4)
+          return res;
+        res += ptr[i].i;
+      }
+
+    return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "Bitfield OK to lower." "ifcvt" } } */
+
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-9.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-9.c
new file mode 100644
index 0000000000000000000000000000000000000000..ab814698131a5905def181eeed85d8a3c62b924b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-9.c
@@ -0,0 +1,51 @@
+/* { dg-require-effective-target vect_shift } */
+/* { dg-require-effective-target vect_long_long } */
+/* { dg-additional-options { "-fdump-tree-ifcvt-all" } } */
+
+#include
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s {
+    unsigned i : 31;
+    char a : 4;
+};
+
+#define N 32
+#define ELT0 {0x7FFFFFFFUL, 0}
+#define ELT1 {0x7FFFFFFFUL, 1}
+#define ELT2 {0x7FFFFFFFUL, 2}
+#define ELT3 {0x7FFFFFFFUL, 3}
+#define ELT4 {0x7FFFFFFFUL, 4}
+#define RES 9
+struct s A[N]
+  = { ELT0, ELT4, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+      ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+    int res = 0;
+    for (int i = 0; i < n; ++i)
+      {
+        if (ptr[i].a)
+          return 9;
+        res += ptr[i].a;
+      }
+    return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+    abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "Bitfield OK to lower." "ifcvt" } } */
+
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index a83b013d2ad3d9066b1c2bc62282ec054483598f..639cd91f61517b4cadd50cca253eec3ccdf79b8d 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -1257,6 +1257,31 @@ get_loop_body_in_if_conv_order (const class loop *loop)
     }
   free (blocks_in_bfs_order);
   BITMAP_FREE (visited);
+
+  /* Go through loop and reject if-conversion or lowering of bitfields if we
+     encounter statements we do not believe the vectorizer will be able to
+     handle.  If adding a new type of statement here, make sure
+     'ifcvt_local_dce' is also able to handle it properly.  */
+  for (index = 0; index < loop->num_nodes; index++)
+    {
+      basic_block bb = blocks[index];
+      gimple_stmt_iterator gsi;
+
+      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+        switch (gimple_code (gsi_stmt (gsi)))
+          {
+          case GIMPLE_LABEL:
+          case GIMPLE_ASSIGN:
+          case GIMPLE_CALL:
+          case GIMPLE_DEBUG:
+          case GIMPLE_COND:
+            gimple_set_uid (gsi_stmt (gsi), 0);
+            break;
+          default:
+            free (blocks);
+            return NULL;
+          }
+    }
   return blocks;
 }
 
@@ -1429,26 +1454,6 @@ if_convertible_loop_p_1 (class loop *loop, vec *refs)
         exit_bb = bb;
     }
 
-  for (i = 0; i < loop->num_nodes; i++)
-    {
-      basic_block bb = ifc_bbs[i];
-      gimple_stmt_iterator gsi;
-
-      for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
-        switch (gimple_code (gsi_stmt (gsi)))
-          {
-          case GIMPLE_LABEL:
-          case GIMPLE_ASSIGN:
-          case GIMPLE_CALL:
-          case GIMPLE_DEBUG:
-          case GIMPLE_COND:
-            gimple_set_uid (gsi_stmt (gsi), 0);
-            break;
-          default:
-            return false;
-          }
-    }
-
   data_reference_p dr;
 
   innermost_DR_map
@@ -1560,14 +1565,6 @@ if_convertible_loop_p (class loop *loop, vec *refs)
       return false;
     }
 
-  /* More than one loop exit is too much to handle.  */
-  if (!single_exit (loop))
-    {
-      if (dump_file && (dump_flags & TDF_DETAILS))
-        fprintf (dump_file, "multiple exits\n");
-      return false;
-    }
-
   /* If one of the loop header's edge is an exit edge then do not
      apply if-conversion.  */
   FOR_EACH_EDGE (e, ei, loop->header->succs)
@@ -3510,9 +3507,6 @@ tree_if_conversion (class loop *loop, vec *preds)
       aggressive_if_conv = true;
     }
 
-  if (!single_exit (loop))
-    goto cleanup;
-
   /* If there are more than two BBs in the loop then there is at least one if
      to convert.  */
   if (loop->num_nodes > 2
@@ -3532,15 +3526,25 @@ tree_if_conversion (class loop *loop, vec *preds)
 
   if (loop->num_nodes > 2)
     {
-      need_to_ifcvt = true;
+      /* More than one loop exit is too much to handle.  */
+      if (!single_exit (loop))
+        {
+          if (dump_file && (dump_flags & TDF_DETAILS))
+            fprintf (dump_file, "Can not ifcvt due to multiple exits\n");
+        }
+      else
+        {
+          need_to_ifcvt = true;
 
-      if (!if_convertible_loop_p (loop, &refs) || !dbg_cnt (if_conversion_tree))
-        goto cleanup;
+          if (!if_convertible_loop_p (loop, &refs)
+              || !dbg_cnt (if_conversion_tree))
+            goto cleanup;
 
-      if ((need_to_predicate || any_complicated_phi)
-          && ((!flag_tree_loop_vectorize && !loop->force_vectorize)
-              || loop->dont_vectorize))
-        goto cleanup;
+          if ((need_to_predicate || any_complicated_phi)
+              && ((!flag_tree_loop_vectorize && !loop->force_vectorize)
+                  || loop->dont_vectorize))
+            goto cleanup;
+        }
     }
 
   if ((flag_tree_loop_vectorize || loop->force_vectorize)
@@ -3631,7 +3635,8 @@ tree_if_conversion (class loop *loop, vec *preds)
      PHIs, those are to be kept in sync with the non-if-converted copy.
      ???  We'll still keep dead stores though.  */
   exit_bbs = BITMAP_ALLOC (NULL);
-  bitmap_set_bit (exit_bbs, single_exit (loop)->dest->index);
+  for (edge exit : get_loop_exit_edges (loop))
+    bitmap_set_bit (exit_bbs, exit->dest->index);
   bitmap_set_bit (exit_bbs, loop->latch->index);
 
   std::pair *name_pair;