From patchwork Wed Oct 11 09:05:59 2023
X-Patchwork-Submitter: HAO CHEN GUI
X-Patchwork-Id: 151234
Message-ID: <88cb6668-66dc-26ba-461c-64dd097b8eba@linux.ibm.com>
Date: Wed, 11 Oct 2023 17:05:59 +0800
To: gcc-patches
Cc: Segher Boessenkool, David, "Kewen.Lin", Peter Bergner, richard.sandiford@arm.com
From: HAO CHEN GUI
Subject: [PATCH-1v2, expand] Enable vector mode for compare_by_pieces [PR111449]

Hi,
  Vector mode instructions are efficient on some targets (e.g. ppc64).
This patch enables vector mode for compare_by_pieces. The non-member
function widest_fixed_size_mode_for_size now takes a by_pieces_operation
as its second argument and decides from the operation type whether
vector mode is enabled. Currently only set and compare enable vector
mode, and each performs its own optab check.

  The test case is in the second patch, which is rs6000 specific.

  Compared to the last version, the main change is to enable vector mode
for compare_by_pieces in smallest_fixed_size_mode_for_size, which is
used for overlapping compares.

  Bootstrapped and tested on x86 and powerpc64-linux BE and LE with no
regressions.

Thanks
Gui Haochen

ChangeLog

Expand: Enable vector mode for pieces compares

Vector mode compare instructions are efficient for equality compare on
rs6000.  This patch refactors the code of pieces operation to enable
vector mode for compare.

gcc/
	PR target/111449
	* expr.cc (widest_fixed_size_mode_for_size): Enable vector mode for
	compare.  Replace the second argument with the type of pieces
	operation.  Add optab checks for vector mode used in compare.
	(by_pieces_ninsns): Pass the type of pieces operation to
	widest_fixed_size_mode_for_size.
	(class op_by_pieces_d): Define virtual function
	widest_fixed_size_mode_for_size and optab_checking.
	(op_by_pieces_d::op_by_pieces_d): Call outer function
	widest_fixed_size_mode_for_size.
	(op_by_pieces_d::get_usable_mode): Call class function
	widest_fixed_size_mode_for_size.
	(op_by_pieces_d::smallest_fixed_size_mode_for_size): Call
	optab_checking for different types of operations.
	(op_by_pieces_d::run): Call class function
	widest_fixed_size_mode_for_size.
	(class move_by_pieces_d): Declare function
	widest_fixed_size_mode_for_size.
	(move_by_pieces_d::widest_fixed_size_mode_for_size): Implement.
	(class store_by_pieces_d): Declare function
	widest_fixed_size_mode_for_size and optab_checking.
	(store_by_pieces_d::optab_checking): Implement.
	(store_by_pieces_d::widest_fixed_size_mode_for_size): Implement.
	(can_store_by_pieces): Pass the type of pieces operation to
	widest_fixed_size_mode_for_size.
	(class compare_by_pieces_d): Declare function
	widest_fixed_size_mode_for_size and optab_checking.
	(compare_by_pieces_d::compare_by_pieces_d): Set m_qi_vector_mode
	to true to enable vector mode.
	(compare_by_pieces_d::widest_fixed_size_mode_for_size): Implement.
	(compare_by_pieces_d::optab_checking): Implement.

patch.diff
diff --git a/gcc/expr.cc b/gcc/expr.cc
index 9a37bff1fdd..e83c0a378ed 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -992,8 +992,9 @@ alignment_for_piecewise_move (unsigned int max_pieces, unsigned int align)
    that is narrower than SIZE bytes.  */
 
 static fixed_size_mode
-widest_fixed_size_mode_for_size (unsigned int size, bool qi_vector)
+widest_fixed_size_mode_for_size (unsigned int size, by_pieces_operation op)
 {
+  bool qi_vector = ((op == COMPARE_BY_PIECES) || op == SET_BY_PIECES);
   fixed_size_mode result = NARROWEST_INT_MODE;
 
   gcc_checking_assert (size > 1);
@@ -1009,8 +1010,13 @@ widest_fixed_size_mode_for_size (unsigned int size, bool qi_vector)
           {
             if (GET_MODE_SIZE (candidate) >= size)
               break;
-            if (optab_handler (vec_duplicate_optab, candidate)
-                != CODE_FOR_nothing)
+            if ((op == SET_BY_PIECES
+                 && optab_handler (vec_duplicate_optab, candidate)
+                    != CODE_FOR_nothing)
+                || (op == COMPARE_BY_PIECES
+                    && optab_handler (mov_optab, mode)
+                       != CODE_FOR_nothing
+                    && can_compare_p (EQ, mode, ccp_jump)))
               result = candidate;
           }
 
@@ -1061,8 +1067,7 @@ by_pieces_ninsns (unsigned HOST_WIDE_INT l, unsigned int align,
     {
       /* NB: Round up L and ALIGN to the widest integer mode for
          MAX_SIZE.  */
-      mode = widest_fixed_size_mode_for_size (max_size,
-                                              op == SET_BY_PIECES);
+      mode = widest_fixed_size_mode_for_size (max_size, op);
       if (optab_handler (mov_optab, mode) != CODE_FOR_nothing)
         {
           unsigned HOST_WIDE_INT up = ROUND_UP (l, GET_MODE_SIZE (mode));
@@ -1076,8 +1081,7 @@ by_pieces_ninsns (unsigned HOST_WIDE_INT l, unsigned int align,
 
   while (max_size > 1 && l > 0)
     {
-      mode = widest_fixed_size_mode_for_size (max_size,
-                                              op == SET_BY_PIECES);
+      mode = widest_fixed_size_mode_for_size (max_size, op);
       enum insn_code icode;
 
       unsigned int modesize = GET_MODE_SIZE (mode);
@@ -1327,6 +1331,12 @@ class op_by_pieces_d
   virtual void finish_mode (machine_mode)
   {
   }
+  virtual fixed_size_mode widest_fixed_size_mode_for_size (unsigned int size)
+    = 0;
+  virtual bool optab_checking (fixed_size_mode)
+  {
+    return false;
+  }
 
  public:
   op_by_pieces_d (unsigned int, rtx, bool, rtx, bool, by_pieces_constfn,
@@ -1375,8 +1385,7 @@ op_by_pieces_d::op_by_pieces_d (unsigned int max_pieces, rtx to,
     {
       /* Find the mode of the largest comparison.  */
       fixed_size_mode mode
-        = widest_fixed_size_mode_for_size (m_max_size,
-                                           m_qi_vector_mode);
+        = ::widest_fixed_size_mode_for_size (m_max_size, COMPARE_BY_PIECES);
 
       m_from.decide_autoinc (mode, m_reverse, len);
       m_to.decide_autoinc (mode, m_reverse, len);
@@ -1401,7 +1410,7 @@ op_by_pieces_d::get_usable_mode (fixed_size_mode mode, unsigned int len)
       if (len >= size && prepare_mode (mode, m_align))
         break;
       /* widest_fixed_size_mode_for_size checks SIZE > 1.  */
-      mode = widest_fixed_size_mode_for_size (size, m_qi_vector_mode);
+      mode = widest_fixed_size_mode_for_size (size);
     }
   while (1);
   return mode;
@@ -1427,8 +1436,7 @@ op_by_pieces_d::smallest_fixed_size_mode_for_size (unsigned int size)
               break;
 
             if (GET_MODE_SIZE (candidate) >= size
-                && (optab_handler (vec_duplicate_optab, candidate)
-                    != CODE_FOR_nothing))
+                && optab_checking (candidate))
               return candidate;
           }
     }
@@ -1451,7 +1459,7 @@ op_by_pieces_d::run ()
 
   /* widest_fixed_size_mode_for_size checks M_MAX_SIZE > 1.  */
   fixed_size_mode mode
-    = widest_fixed_size_mode_for_size (m_max_size, m_qi_vector_mode);
+    = widest_fixed_size_mode_for_size (m_max_size);
   mode = get_usable_mode (mode, length);
 
   by_pieces_prev to_prev = { nullptr, mode };
@@ -1516,8 +1524,7 @@ op_by_pieces_d::run ()
           else
             {
               /* widest_fixed_size_mode_for_size checks SIZE > 1.  */
-              mode = widest_fixed_size_mode_for_size (size,
-                                                      m_qi_vector_mode);
+              mode = widest_fixed_size_mode_for_size (size);
               mode = get_usable_mode (mode, length);
             }
         }
@@ -1538,6 +1545,8 @@ class move_by_pieces_d : public op_by_pieces_d
   insn_gen_fn m_gen_fun;
   void generate (rtx, rtx, machine_mode) final override;
   bool prepare_mode (machine_mode, unsigned int) final override;
+  fixed_size_mode widest_fixed_size_mode_for_size (unsigned int)
+    final override;
 
  public:
   move_by_pieces_d (rtx to, rtx from, unsigned HOST_WIDE_INT len,
@@ -1626,14 +1635,24 @@ move_by_pieces (rtx to, rtx from, unsigned HOST_WIDE_INT len,
     return to;
 }
 
+fixed_size_mode
+move_by_pieces_d::widest_fixed_size_mode_for_size (unsigned int size)
+{
+  return ::widest_fixed_size_mode_for_size (size, MOVE_BY_PIECES);
+}
+
 /* Derived class from op_by_pieces_d, providing support for block move
    operations.  */
 
 class store_by_pieces_d : public op_by_pieces_d
 {
   insn_gen_fn m_gen_fun;
+
   void generate (rtx, rtx, machine_mode) final override;
   bool prepare_mode (machine_mode, unsigned int) final override;
+  fixed_size_mode widest_fixed_size_mode_for_size (unsigned int)
+    final override;
+  bool optab_checking (fixed_size_mode) final override;
 
  public:
   store_by_pieces_d (rtx to, by_pieces_constfn cfn, void *cfn_data,
@@ -1670,6 +1689,18 @@ store_by_pieces_d::generate (rtx op0, rtx op1, machine_mode)
   emit_insn (m_gen_fun (op0, op1));
 }
 
+bool
+store_by_pieces_d::optab_checking (fixed_size_mode mode)
+{
+  /* optab checking for memset.  */
+  if (m_qi_vector_mode
+      && optab_handler (vec_duplicate_optab, mode) != CODE_FOR_nothing)
+    return true;
+  else
+    return false;
+}
+
+
 /* Perform the final adjustment at the end of a string to obtain the
    correct return value for the block operation.
    Return value is based on RETMODE argument.  */
@@ -1686,6 +1717,13 @@ store_by_pieces_d::finish_retmode (memop_ret retmode)
   return m_to.adjust (QImode, m_offset);
 }
 
+fixed_size_mode
+store_by_pieces_d::widest_fixed_size_mode_for_size (unsigned int size)
+{
+  return ::widest_fixed_size_mode_for_size (size,
+            m_qi_vector_mode ? SET_BY_PIECES : STORE_BY_PIECES);
+}
+
 /* Determine whether the LEN bytes generated by CONSTFUN can be
    stored to memory using several move instructions.  CONSTFUNDATA is
    a pointer which will be passed as argument in every CONSTFUN call.
@@ -1730,7 +1768,8 @@ can_store_by_pieces (unsigned HOST_WIDE_INT len,
       while (max_size > 1 && l > 0)
         {
           fixed_size_mode mode
-            = widest_fixed_size_mode_for_size (max_size, memsetp);
+            = widest_fixed_size_mode_for_size (max_size,
+                memsetp ? SET_BY_PIECES : STORE_BY_PIECES);
 
           icode = optab_handler (mov_optab, mode);
           if (icode != CODE_FOR_nothing
@@ -1832,12 +1871,16 @@ class compare_by_pieces_d : public op_by_pieces_d
   void generate (rtx, rtx, machine_mode) final override;
   bool prepare_mode (machine_mode, unsigned int) final override;
   void finish_mode (machine_mode) final override;
+  fixed_size_mode widest_fixed_size_mode_for_size (unsigned int)
+    final override;
+  bool optab_checking (fixed_size_mode) final override;
+
  public:
   compare_by_pieces_d (rtx op0, rtx op1, by_pieces_constfn op1_cfn,
                        void *op1_cfn_data, HOST_WIDE_INT len, int align,
                        rtx_code_label *fail_label)
     : op_by_pieces_d (COMPARE_MAX_PIECES, op0, true, op1, true, op1_cfn,
-                      op1_cfn_data, len, align, false)
+                      op1_cfn_data, len, align, false, true)
   {
     m_fail_label = fail_label;
   }
@@ -1943,6 +1986,23 @@ compare_by_pieces (rtx arg0, rtx arg1, unsigned HOST_WIDE_INT len,
   return target;
 }
 
+
+fixed_size_mode
+compare_by_pieces_d::widest_fixed_size_mode_for_size (unsigned int size)
+{
+  return ::widest_fixed_size_mode_for_size (size, COMPARE_BY_PIECES);
+}
+
+bool
+compare_by_pieces_d::optab_checking (fixed_size_mode mode)
+{
+  if (optab_handler (mov_optab, mode) != CODE_FOR_nothing
+      && can_compare_p (EQ, mode, ccp_jump))
+    return true;
+  else
+    return false;
+}
+
 /* Emit code to move a block Y to a block X.  This may be done with
    string-move instructions, with multiple scalar move instructions,
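
For readers who want the shape of the refactoring without the GCC internals, here is a minimal standalone sketch of the dispatch this patch introduces: each by-pieces class overrides a virtual hook that forwards its operation kind to one free function, and that function enables vector modes only for set and compare, each with its own capability check. Everything in it is an assumption made for illustration: the names Op, Mode, widest_mode_for_size, can_duplicate and can_compare_eq are hypothetical, and the mode sizes and capability checks are a toy target description, not rs6000 or any real optab query.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for GCC's by_pieces_operation.
enum class Op { Move, Store, Set, Compare };

// Hypothetical stand-in for a fixed-size machine mode.
struct Mode { unsigned size; bool is_vector; };

// Toy target (assumption): integer modes up to 8 bytes and byte-vector
// modes of 16 and 32 bytes, both lists sorted from narrow to wide.
static const std::vector<Mode> int_modes = {{1, false}, {2, false}, {4, false}, {8, false}};
static const std::vector<Mode> vec_modes = {{16, true}, {32, true}};
constexpr unsigned word_size = 8;

// Toy capability checks standing in for the optab queries in the patch.
static bool can_duplicate (const Mode &m) { return m.size <= 32; }
static bool can_compare_eq (const Mode &m) { return m.size <= 16; }

// Free function: widest usable mode strictly narrower than SIZE bytes.
// Vector modes are considered only for Set and Compare, and each of the
// two operations applies its own support check.
static Mode widest_mode_for_size (unsigned size, Op op)
{
  assert (size > 1);
  const bool allow_vector = (op == Op::Set || op == Op::Compare);
  Mode result = int_modes.front ();

  if (allow_vector && size > word_size)
    {
      for (const Mode &m : vec_modes)
        {
          if (m.size >= size)
            break;
          if ((op == Op::Set && can_duplicate (m))
              || (op == Op::Compare && can_compare_eq (m)))
            result = m;
        }
      if (result.is_vector)
        return result;
    }

  // Fall back to the widest integer mode narrower than SIZE.
  for (const Mode &m : int_modes)
    if (m.size < size)
      result = m;
  return result;
}

// Base class with a pure virtual hook; derived classes supply the
// operation kind, so shared code never needs a qi_vector flag.
struct ByPieces
{
  virtual ~ByPieces () = default;
  virtual Mode widest_mode (unsigned size) const = 0;
};

struct MoveByPieces : ByPieces
{
  Mode widest_mode (unsigned size) const override
  { return widest_mode_for_size (size, Op::Move); }
};

struct SetByPieces : ByPieces
{
  Mode widest_mode (unsigned size) const override
  { return widest_mode_for_size (size, Op::Set); }
};

struct CompareByPieces : ByPieces
{
  Mode widest_mode (unsigned size) const override
  { return widest_mode_for_size (size, Op::Compare); }
};
```

On this toy target, a request for size 33 yields the 8-byte integer mode for a move, the 16-byte vector mode for a compare (the 32-byte vector fails the toy equality check), and the 32-byte vector mode for a set. For sizes at or below the word size, all three fall back to integer modes.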