From patchwork Thu Nov 17 06:15:49 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiufu Guo
X-Patchwork-Id: 21456
From: Jiufu Guo
To: gcc-patches@gcc.gnu.org
Cc: segher@kernel.crashing.org, dje.gcc@gmail.com, linkw@gcc.gnu.org,
 guojiufu@linux.ibm.com, rguenther@suse.de, jeffreyalaw@gmail.com
Subject: [PATCH V2] Use subscalar mode to move struct block for parameter
Date: Thu, 17 Nov 2022 14:15:49 +0800
Message-Id: <20221117061549.178481-1-guojiufu@linux.ibm.com>
X-Mailer: git-send-email 2.17.1

Hi,

As mentioned in the previous version of this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/604646.html
Suboptimal code is generated for "assigning from a parameter" or
"assigning to the return value".

This patch improves assignment from parameters in cases like the ones below:

/////case1.c
typedef struct SA {double a[3];long l; } A;
A ret_arg (A a) {return a;}
void st_arg (A a, A *p) {*p = a;}

////case2.c
typedef struct SA {double a[3];} A;
A ret_arg (A a) {return a;}
void st_arg (A a, A *p) {*p = a;}

With this patch, bootstrap and regtest pass on ppc64{,le} and x86_64.

* Besides asking for a review of this patch, I would like to ask for
comments on improving "assigning to returns".

On some targets (ppc64), for the case below:
////case3.c
typedef struct SA {double a[3]; long l; } A;
A ret_arg_pt (A *a) {return *a;}

The optimized GIMPLE code looks like:
  <retval> = *a_2(D);
  return <retval>;
Here, <retval> (aka. RESULT_DECL) is MEM, and "aggregate_value_p" returns
true for <retval>.

* For the case below, however, the generated code is still suboptimal.
////case4.c
typedef struct SA {double a[3];} A;
A ret_arg_pt (A *a) {return *a;}

The optimized GIMPLE code looks like:
  D.3951 = *a_2(D);
  return D.3951;
The "return/assign" stmts use D.3951 (VAR_DECL) instead of
<retval> (RESULT_DECL).
The mode of D.3951/<retval> is BLK.  The RTL of D.3951 is MEM, and the RTL
of <retval> is PARALLEL.  For PARALLEL, aggregate_value_p returns false.

In function expand_assignment, there is this code:
  if (TREE_CODE (to) == RESULT_DECL
      && (REG_P (to_rtx) || GET_CODE (to_rtx) == PARALLEL))
This code can handle "<retval>", but cannot handle "D.3951".

One way I am considering to handle this issue is to update the GIMPLE
sequence to:
"<retval> = *a_2(D); return <retval>;"
Another is to collect the VARs that are used by return stmts and, for
assignments to those VARs, use a sub-scalar mode for the block move.

Thanks for any comments and suggestions!

BR,
Jeff (Jiufu)

---
 gcc/expr.cc | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index d9407432ea5..420f9cf3662 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -6045,6 +6045,46 @@ expand_assignment (tree to, tree from, bool nontemporal)
       return;
     }
 
+  if (TREE_CODE (from) == PARM_DECL && DECL_INCOMING_RTL (from)
+      && TYPE_MODE (TREE_TYPE (from)) == BLKmode
+      && (GET_CODE (DECL_INCOMING_RTL (from)) == PARALLEL
+          || REG_P (DECL_INCOMING_RTL (from))))
+    {
+      rtx parm = DECL_INCOMING_RTL (from);
+
+      push_temp_slots ();
+      machine_mode mode;
+      mode = GET_CODE (parm) == PARALLEL
+               ? GET_MODE (XEXP (XVECEXP (parm, 0, 0), 0))
+               : word_mode;
+      int mode_size = GET_MODE_SIZE (mode).to_constant ();
+      int size = INTVAL (expr_size (from));
+
+      /* Whether and how the parameter is moved in a submode depends on its
+         size and position.  A heuristic limit is used here.  */
+      int hurstc_num = 8;
+      if (size < mode_size || (size % mode_size) != 0
+          || size > (mode_size * hurstc_num))
+        result = store_expr (from, to_rtx, 0, nontemporal, false);
+      else
+        {
+          rtx from_rtx
+            = expand_expr (from, NULL_RTX, GET_MODE (to_rtx), EXPAND_NORMAL);
+          for (int i = 0; i < size / mode_size; i++)
+            {
+              rtx temp = gen_reg_rtx (mode);
+              rtx src = adjust_address (from_rtx, mode, mode_size * i);
+              rtx dest = adjust_address (to_rtx, mode, mode_size * i);
+              emit_move_insn (temp, src);
+              emit_move_insn (dest, temp);
+            }
+          result = to_rtx;
+        }
+      preserve_temp_slots (result);
+      pop_temp_slots ();
+      return;
+    }
+
   /* Compute FROM and store the value in the rtx we got.  */
 
   push_temp_slots ();
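
For readers who want to try the size check in the hunk above outside of GCC, here is a minimal standalone C sketch. It is only a model of the condition and the per-chunk move loop, not GCC code: `use_submode_move`, `chunked_move`, and `HEURISTIC_MAX_CHUNKS` are hypothetical names made up for illustration, a byte buffer stands in for the pseudo register created by gen_reg_rtx, and memcpy stands in for the store_expr fallback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the patch's hurstc_num.  */
enum { HEURISTIC_MAX_CHUNKS = 8 };

/* Return 1 when a block of SIZE bytes would take the chunked path:
   SIZE must be a multiple of MODE_SIZE, at least one chunk long, and
   no more than HEURISTIC_MAX_CHUNKS chunks long.  */
static int
use_submode_move (long size, long mode_size)
{
  if (mode_size <= 0)
    return 0;
  if (size < mode_size || size % mode_size != 0
      || size > mode_size * HEURISTIC_MAX_CHUNKS)
    return 0;
  return 1;
}

/* Copy SIZE bytes from SRC to DEST.  On the chunked path, move one
   MODE_SIZE chunk at a time through a temporary and return the number
   of chunks moved.  Otherwise fall back to a plain block copy and
   return 0.  Assumes non-overlapping buffers.  */
static long
chunked_move (void *dest, const void *src, long size, long mode_size)
{
  unsigned char temp[16]; /* Models the temporary register.  */

  if (!use_submode_move (size, mode_size)
      || mode_size > (long) sizeof temp)
    {
      if (size > 0)
        memcpy (dest, src, (size_t) size);
      return 0;
    }

  for (long i = 0; i < size / mode_size; i++)
    {
      memcpy (temp, (const unsigned char *) src + mode_size * i,
              (size_t) mode_size);
      memcpy ((unsigned char *) dest + mode_size * i, temp,
              (size_t) mode_size);
    }
  return size / mode_size;
}
```

Assuming 8-byte double and long and an 8-byte chunk, the struct in case1.c (32 bytes) would be moved in 4 chunks and the one in case2.c (24 bytes) in 3, while anything above 64 bytes, or any size that is not a multiple of 8, would take the fallback path.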