Message ID | cover.1686573640.git.linkw@linux.ibm.com |
---|---|
Headers |
To: gcc-patches@gcc.gnu.org Cc: richard.guenther@gmail.com, richard.sandiford@arm.com, segher@kernel.crashing.org, bergner@linux.ibm.com Subject: [PATCH 0/9] vect: Move costing next to the transform for vect load Date: Mon, 12 Jun 2023 21:03:21 -0500 Message-Id: <cover.1686573640.git.linkw@linux.ibm.com> X-Mailer: git-send-email 2.31.1 From: Kewen Lin via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: Kewen Lin <linkw@linux.ibm.com> |
Series |
vect: Move costing next to the transform for vect load
|
|
Message
Kewen.Lin
June 13, 2023, 2:03 a.m. UTC
This patch series follows Richi's suggestion at the link [1],
which suggests structuring vectorizable_load to make costing
next to the transform, in order to make it easier to keep
costing and the transform in sync. For now, it's a known
issue that what we cost can be inconsistent with what we
transform, as the case in PR82255 and some other associated
test cases in the patches of this series show.

Basically this patch series makes costing not call function
vect_model_load_cost any more. To make the review and
bisection easy, I organized the changes according to the
memory access types of vector load. For each memory access
type, firstly it follows the handlings in the function
vect_model_load_cost to avoid missing anything, then refines
it further by referring to the transform code. I also checked
them with some typical test cases to verify. Hope the
subjects of patches are clear enough.

The whole series can be bootstrapped and regtested
incrementally on:
  - x86_64-redhat-linux
  - aarch64-linux-gnu
  - powerpc64-linux-gnu P7, P8 and P9
  - powerpc64le-linux-gnu P8, P9 and P10

Considering that the current vector test buckets are mainly
tested without the cost model, I also verified the whole patch
series was neutral for SPEC2017 int/fp on Power9 at O2,
O3 and Ofast separately.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563624.html

Kewen Lin (9):
  vect: Move vect_model_load_cost next to the transform in vectorizable_load
  vect: Adjust vectorizable_load costing on VMAT_GATHER_SCATTER && gs_info.decl
  vect: Adjust vectorizable_load costing on VMAT_INVARIANT
  vect: Adjust vectorizable_load costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
  vect: Adjust vectorizable_load costing on VMAT_GATHER_SCATTER
  vect: Adjust vectorizable_load costing on VMAT_LOAD_STORE_LANES
  vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_REVERSE
  vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_PERMUTE
  vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS

 .../vect/costmodel/ppc/costmodel-pr82255.c    |  31 +
 .../costmodel/ppc/costmodel-vect-reversed.c   |  22 +
 gcc/testsuite/gcc.target/i386/pr70021.c       |   2 +-
 gcc/tree-vect-stmts.cc                        | 651 ++++++++++--------
 4 files changed, 432 insertions(+), 274 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c
 create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-reversed.c
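
For readers unfamiliar with the idea, the restructuring described in the cover letter can be pictured with a minimal sketch. It is not the code from gcc/tree-vect-stmts.cc: `AccessKind`, `LoadResult`, `vectorize_load_sketch` and the unit costs (one per statement) are hypothetical names and numbers chosen for illustration. Only the `costing_p` flag name is taken from the thread. The point is that one routine walks the same control flow in both phases, so each access type's cost sits directly beside the statements it would emit.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for a few of the VMAT_* memory access types.
enum class AccessKind { Contiguous, ContiguousReverse, Elementwise };

struct LoadResult {
  unsigned cost = 0;               // filled in only when costing_p is true
  std::vector<std::string> stmts;  // filled in only when costing_p is false
};

// One routine serves both phases.  With costing_p set it only accumulates
// an (invented) unit cost per statement; otherwise it records the
// statements it would generate.  Keeping both in the same branch is what
// makes it hard for the cost and the transform to drift apart.
LoadResult
vectorize_load_sketch (AccessKind kind, unsigned ncopies, unsigned nelts,
                       bool costing_p)
{
  assert (nelts > 0 && "a vector needs at least one lane");
  LoadResult res;
  for (unsigned i = 0; i < ncopies; ++i)
    switch (kind)
      {
      case AccessKind::Contiguous:
        if (costing_p)
          res.cost += 1;                       // one vector load
        else
          res.stmts.push_back ("vector_load");
        break;
      case AccessKind::ContiguousReverse:
        if (costing_p)
          res.cost += 2;                       // vector load + permute
        else
          {
            res.stmts.push_back ("vector_load");
            res.stmts.push_back ("vec_perm");
          }
        break;
      case AccessKind::Elementwise:
        if (costing_p)
          res.cost += nelts + 1;               // nelts scalar loads + construct
        else
          {
            for (unsigned j = 0; j < nelts; ++j)
              res.stmts.push_back ("scalar_load");
            res.stmts.push_back ("vec_construct");
          }
        break;
      }
  return res;
}
```

In this toy model every statement costs one unit, so calling the routine once with `costing_p` true and once with it false should give a cost equal to the number of recorded statements for every access kind. A real target cost model would of course weight the statements differently.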
Comments
Hi,

I'd like to gently ping this patch series:

https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621460.html

BR,
Kewen

on 2023/6/13 10:03, Kewen Lin via Gcc-patches wrote:
> This patch series follows Richi's suggestion at the link [1],
> which suggests structuring vectorizable_load to make costing
> next to the transform, in order to make it easier to keep
> costing and the transform in sync. For now, it's a known
> issue that what we cost can be inconsistent with what we
> transform, as the case in PR82255 and some other associated
> test cases in the patches of this series show.
>
> Basically this patch series makes costing not call function
> vect_model_load_cost any more. To make the review and
> bisection easy, I organized the changes according to the
> memory access types of vector load. For each memory access
> type, firstly it follows the handlings in the function
> vect_model_load_cost to avoid any missing, then refines
> further by referring to the transform code, I also checked
> them with some typical test cases to verify. Hope the
> subjects of patches are clear enough.
>
> The whole series can be bootstrapped and regtested
> incrementally on:
>   - x86_64-redhat-linux
>   - aarch64-linux-gnu
>   - powerpc64-linux-gnu P7, P8 and P9
>   - powerpc64le-linux-gnu P8, P9 and P10
>
> By considering the current vector test buckets are mainly
> tested without cost model, I also verified the whole patch
> series was neutral for SPEC2017 int/fp on Power9 at O2,
> O3 and Ofast separately.
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563624.html
>
> Kewen Lin (9):
>   vect: Move vect_model_load_cost next to the transform in vectorizable_load
>   vect: Adjust vectorizable_load costing on VMAT_GATHER_SCATTER && gs_info.decl
>   vect: Adjust vectorizable_load costing on VMAT_INVARIANT
>   vect: Adjust vectorizable_load costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
>   vect: Adjust vectorizable_load costing on VMAT_GATHER_SCATTER
>   vect: Adjust vectorizable_load costing on VMAT_LOAD_STORE_LANES
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_REVERSE
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_PERMUTE
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS
>
>  .../vect/costmodel/ppc/costmodel-pr82255.c    |  31 +
>  .../costmodel/ppc/costmodel-vect-reversed.c   |  22 +
>  gcc/testsuite/gcc.target/i386/pr70021.c       |   2 +-
>  gcc/tree-vect-stmts.cc                        | 651 ++++++++++--------
>  4 files changed, 432 insertions(+), 274 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-reversed.c
>
On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin <linkw@linux.ibm.com> wrote:
>
> This patch series follows Richi's suggestion at the link [1],
> which suggests structuring vectorizable_load to make costing
> next to the transform, in order to make it easier to keep
> costing and the transform in sync. For now, it's a known
> issue that what we cost can be inconsistent with what we
> transform, as the case in PR82255 and some other associated
> test cases in the patches of this series show.
>
> Basically this patch series makes costing not call function
> vect_model_load_cost any more. To make the review and
> bisection easy, I organized the changes according to the
> memory access types of vector load. For each memory access
> type, firstly it follows the handlings in the function
> vect_model_load_cost to avoid any missing, then refines
> further by referring to the transform code, I also checked
> them with some typical test cases to verify. Hope the
> subjects of patches are clear enough.
>
> The whole series can be bootstrapped and regtested
> incrementally on:
>   - x86_64-redhat-linux
>   - aarch64-linux-gnu
>   - powerpc64-linux-gnu P7, P8 and P9
>   - powerpc64le-linux-gnu P8, P9 and P10
>
> By considering the current vector test buckets are mainly
> tested without cost model, I also verified the whole patch
> series was neutral for SPEC2017 int/fp on Power9 at O2,
> O3 and Ofast separately.

I went through the series now and I like it overall (well, I suggested
the change).
Looking at the changes I think we want some followup to reduce the
mess in the final loop nest. We already have some VMAT_* cases handled
separately, maybe we can split out some more cases. Maybe we should
bite the bullet and duplicate that loop nest for the different VMAT_* cases.
Maybe we can merge some of the if (!costing_p) checks by clever
re-ordering. So what
this series doesn't improve is overall readability of the code (indent and our
80 char line limit).

The change also makes it more difficult(?) to separate analysis and transform
though in the end I hope that analysis will actually "code generate" to a (SLP)
data structure so the target will have a chance to see the actual flow of insns.

That said, I'd like to hear from Richard whether he thinks this is a step
in the right direction.

Are you willing to followup with doing the same re-structuring to
vectorizable_store?

OK from my side with the few comments addressed. The patch likely needs refresh
after the RVV changes in this area?

Thanks,
Richard.

> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-January/563624.html
>
> Kewen Lin (9):
>   vect: Move vect_model_load_cost next to the transform in vectorizable_load
>   vect: Adjust vectorizable_load costing on VMAT_GATHER_SCATTER && gs_info.decl
>   vect: Adjust vectorizable_load costing on VMAT_INVARIANT
>   vect: Adjust vectorizable_load costing on VMAT_ELEMENTWISE and VMAT_STRIDED_SLP
>   vect: Adjust vectorizable_load costing on VMAT_GATHER_SCATTER
>   vect: Adjust vectorizable_load costing on VMAT_LOAD_STORE_LANES
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_REVERSE
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS_PERMUTE
>   vect: Adjust vectorizable_load costing on VMAT_CONTIGUOUS
>
>  .../vect/costmodel/ppc/costmodel-pr82255.c    |  31 +
>  .../costmodel/ppc/costmodel-vect-reversed.c   |  22 +
>  gcc/testsuite/gcc.target/i386/pr70021.c       |   2 +-
>  gcc/tree-vect-stmts.cc                        | 651 ++++++++++--------
>  4 files changed, 432 insertions(+), 274 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-pr82255.c
>  create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-reversed.c
>
> --
> 2.31.1
>
Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin <linkw@linux.ibm.com> wrote:
>>
>> This patch series follows Richi's suggestion at the link [1],
>> which suggests structuring vectorizable_load to make costing
>> next to the transform, in order to make it easier to keep
>> costing and the transform in sync.

FTR, I was keeping quiet given that this was following an agreed plan :)

Thanks for organising the series this way. It made it easier to review.

>> For now, it's a known
>> issue that what we cost can be inconsistent with what we
>> transform, as the case in PR82255 and some other associated
>> test cases in the patches of this series show.
>>
>> Basically this patch series makes costing not call function
>> vect_model_load_cost any more. To make the review and
>> bisection easy, I organized the changes according to the
>> memory access types of vector load. For each memory access
>> type, firstly it follows the handlings in the function
>> vect_model_load_cost to avoid any missing, then refines
>> further by referring to the transform code, I also checked
>> them with some typical test cases to verify. Hope the
>> subjects of patches are clear enough.
>>
>> The whole series can be bootstrapped and regtested
>> incrementally on:
>>   - x86_64-redhat-linux
>>   - aarch64-linux-gnu
>>   - powerpc64-linux-gnu P7, P8 and P9
>>   - powerpc64le-linux-gnu P8, P9 and P10
>>
>> By considering the current vector test buckets are mainly
>> tested without cost model, I also verified the whole patch
>> series was neutral for SPEC2017 int/fp on Power9 at O2,
>> O3 and Ofast separately.
>
> I went through the series now and I like it overall (well, I suggested
> the change).
> Looking at the changes I think we want some followup to reduce the
> mess in the final loop nest. We already have some VMAT_* cases handled
> separately, maybe we can split out some more cases. Maybe we should
> bite the bullet and duplicate that loop nest for the different VMAT_* cases.
> Maybe we can merge some of the if (!costing_p) checks by clever
> re-ordering. So what
> this series doesn't improve is overall readability of the code (indent and our
> 80 char line limit).
>
> The change also makes it more difficult(?) to separate analysis and transform
> though in the end I hope that analysis will actually "code generate" to a (SLP)
> data structure so the target will have a chance to see the actual flow of insns.
>
> That said, I'd like to hear from Richard whether he thinks this is a step
> in the right direction.

Yeah, agree that it's probably better on balance. It's going to need a bit
of discipline to make sure that we don't accidentally change the IR
during the analysis phase, but I guess that already exists to a lesser
extent with the “before !vec_stmt”/“after !vec_stmt” split.

Thanks,
Richard
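
The discipline mentioned above (not changing the IR while only costing) could in principle be backed by a runtime check. The following is only a sketch of that idea under stated assumptions: `GuardedEmitter` and `emit_is_rejected` are hypothetical names, and nothing in the thread says GCC uses such a guard. It just shows how an emission helper that knows the current phase can turn an accidental IR change during analysis into an immediate error.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical wrapper around statement emission.  It refuses to record
// anything while the caller is only costing, so an accidental IR change
// during analysis fails loudly instead of silently altering the program.
class GuardedEmitter
{
public:
  explicit GuardedEmitter (bool costing_p) : costing_p_ (costing_p) {}

  void emit (const std::string &stmt)
  {
    if (costing_p_)
      throw std::logic_error ("IR modified during the analysis phase");
    ir_.push_back (stmt);
  }

  const std::vector<std::string> &ir () const { return ir_; }

private:
  bool costing_p_;
  std::vector<std::string> ir_;
};

// Helper for demonstration: returns true if emitting a statement is
// rejected in the given phase.
bool
emit_is_rejected (bool costing_p)
{
  GuardedEmitter em (costing_p);
  try
    {
      em.emit ("vector_load");
    }
  catch (const std::logic_error &)
    {
      return true;
    }
  return false;
}
```

With this guard, `emit_is_rejected (true)` reports a rejection and `emit_is_rejected (false)` does not. Whether a check like this is worth its overhead compared with review discipline alone is a design question the thread leaves open.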
Hi Richi,

Thanks for your review comments on this and some others!

on 2023/6/30 19:37, Richard Biener wrote:
> On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin <linkw@linux.ibm.com> wrote:
>>
>> This patch series follows Richi's suggestion at the link [1],
>> which suggests structuring vectorizable_load to make costing
>> next to the transform, in order to make it easier to keep
>> costing and the transform in sync. For now, it's a known
>> issue that what we cost can be inconsistent with what we
>> transform, as the case in PR82255 and some other associated
>> test cases in the patches of this series show.
>>
>> Basically this patch series makes costing not call function
>> vect_model_load_cost any more. To make the review and
>> bisection easy, I organized the changes according to the
>> memory access types of vector load. For each memory access
>> type, firstly it follows the handlings in the function
>> vect_model_load_cost to avoid any missing, then refines
>> further by referring to the transform code, I also checked
>> them with some typical test cases to verify. Hope the
>> subjects of patches are clear enough.
>>
>> The whole series can be bootstrapped and regtested
>> incrementally on:
>>   - x86_64-redhat-linux
>>   - aarch64-linux-gnu
>>   - powerpc64-linux-gnu P7, P8 and P9
>>   - powerpc64le-linux-gnu P8, P9 and P10
>>
>> By considering the current vector test buckets are mainly
>> tested without cost model, I also verified the whole patch
>> series was neutral for SPEC2017 int/fp on Power9 at O2,
>> O3 and Ofast separately.
>
> I went through the series now and I like it overall (well, I suggested
> the change).
> Looking at the changes I think we want some followup to reduce the
> mess in the final loop nest. We already have some VMAT_* cases handled
> separately, maybe we can split out some more cases. Maybe we should

At first glance, the simple parts look to be the handlings for
VMAT_LOAD_STORE_LANES, and VMAT_GATHER_SCATTER (with ifn and emulated).
It seems a bit straightforward if it's fine to duplicate the nested loop,
but may need to care about removing some useless code.

> bite the bullet and duplicate that loop nest for the different VMAT_* cases.
> Maybe we can merge some of the if (!costing_p) checks by clever
> re-ordering.

I've tried a bit to merge them if possible, like the place to check
VMAT_CONTIGUOUS, VMAT_CONTIGUOUS_REVERSE and VMAT_CONTIGUOUS_PERMUTE.
But will keep in mind for the following updates.

> So what
> this series doesn't improve is overall readability of the code (indent and our
> 80 char line limit).

Sorry about that.

>
> The change also makes it more difficult(?) to separate analysis and transform
> though in the end I hope that analysis will actually "code generate" to a (SLP)
> data structure so the target will have a chance to see the actual flow of insns.
>
> That said, I'd like to hear from Richard whether he thinks this is a step
> in the right direction.
>
> Are you willing to followup with doing the same re-structuring to
> vectorizable_store?

Yes, vectorizable_store was also pointed out in your original suggestion [1],
I planned to update this once this series meets your expectations and gets landed.

>
> OK from my side with the few comments addressed. The patch likely needs refresh
> after the RVV changes in this area?

Thanks! Yes, I've updated 2/9 and 3/9 according to your comments, and updated
5/9 and 9/9 as they had some conflicts when rebasing. Re-testing is ongoing,
do the updated versions look good to you? Is this series ok for trunk if all the
test runs go well again as before?

BR,
Kewen
On Mon, Jul 3, 2023 at 5:39 AM Kewen.Lin <linkw@linux.ibm.com> wrote:
>
> Hi Richi,
>
> Thanks for your review comments on this and some others!
>
> on 2023/6/30 19:37, Richard Biener wrote:
> > On Tue, Jun 13, 2023 at 4:07 AM Kewen Lin <linkw@linux.ibm.com> wrote:
> >>
> >> This patch series follows Richi's suggestion at the link [1],
> >> which suggests structuring vectorizable_load to make costing
> >> next to the transform, in order to make it easier to keep
> >> costing and the transform in sync. For now, it's a known
> >> issue that what we cost can be inconsistent with what we
> >> transform, as the case in PR82255 and some other associated
> >> test cases in the patches of this series show.
> >>
> >> Basically this patch series makes costing not call function
> >> vect_model_load_cost any more. To make the review and
> >> bisection easy, I organized the changes according to the
> >> memory access types of vector load. For each memory access
> >> type, firstly it follows the handlings in the function
> >> vect_model_load_cost to avoid any missing, then refines
> >> further by referring to the transform code, I also checked
> >> them with some typical test cases to verify. Hope the
> >> subjects of patches are clear enough.
> >>
> >> The whole series can be bootstrapped and regtested
> >> incrementally on:
> >>   - x86_64-redhat-linux
> >>   - aarch64-linux-gnu
> >>   - powerpc64-linux-gnu P7, P8 and P9
> >>   - powerpc64le-linux-gnu P8, P9 and P10
> >>
> >> By considering the current vector test buckets are mainly
> >> tested without cost model, I also verified the whole patch
> >> series was neutral for SPEC2017 int/fp on Power9 at O2,
> >> O3 and Ofast separately.
> >
> > I went through the series now and I like it overall (well, I suggested
> > the change).
> > Looking at the changes I think we want some followup to reduce the
> > mess in the final loop nest. We already have some VMAT_* cases handled
> > separately, maybe we can split out some more cases. Maybe we should
>
> At first glance, the simple parts look to be the handlings for
> VMAT_LOAD_STORE_LANES, and VMAT_GATHER_SCATTER (with ifn and emulated).
> It seems a bit straightforward if it's fine to duplicate the nested loop,
> but may need to care about removing some useless code.
>
> > bite the bullet and duplicate that loop nest for the different VMAT_* cases.
> > Maybe we can merge some of the if (!costing_p) checks by clever
> > re-ordering.
>
> I've tried a bit to merge them if possible, like the place to check
> VMAT_CONTIGUOUS, VMAT_CONTIGUOUS_REVERSE and VMAT_CONTIGUOUS_PERMUTE.
> But will keep in mind for the following updates.
>
> > So what
> > this series doesn't improve is overall readability of the code (indent and our
> > 80 char line limit).
>
> Sorry about that.
>
> >
> > The change also makes it more difficult(?) to separate analysis and transform
> > though in the end I hope that analysis will actually "code generate" to a (SLP)
> > data structure so the target will have a chance to see the actual flow of insns.
> >
> > That said, I'd like to hear from Richard whether he thinks this is a step
> > in the right direction.
> >
> > Are you willing to followup with doing the same re-structuring to
> > vectorizable_store?
>
> Yes, vectorizable_store was also pointed out in your original suggestion [1],
> I planned to update this once this series meets your expectations and gets landed.
>
> >
> > OK from my side with the few comments addressed. The patch likely needs refresh
> > after the RVV changes in this area?
>
> Thanks! Yes, I've updated 2/9 and 3/9 according to your comments, and updated
> 5/9 and 9/9 as they had some conflicts when rebasing. Re-testing is ongoing,
> do the updated versions look good to you? Is this series ok for trunk if all the
> test runs go well again as before?

Yes.

Thanks,
Richard.

> BR,
> Kewen