From patchwork Wed Feb 15 09:18:29 2023
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 57436
Date: Wed, 15 Feb 2023 10:18:29 +0100
To: Segher Boessenkool
Cc: gcc-patches@gcc.gnu.org
Subject: [committed] powerpc: Fix up expansion for WIDEN_MULT_PLUS_EXPR [PR108787]
From: Jakub Jelinek
Reply-To: Jakub Jelinek

Hi!

WIDEN_MULT_PLUS_EXPR as documented has the factor operands with the same
precision and the addend and result another one at least twice as wide.
Similarly, {,u}maddMN4 is documented as

'maddMN4'
     Multiply operands 1 and 2, sign-extend them to mode N, add operand
     3, and store the result in operand 0.  Operands 1 and 2 have mode M
     and operands 0 and 3 have mode N.  Both modes must be integer or
     fixed-point modes and N must be twice the size of M.

     In other words, 'maddMN4' is like 'mulMN3' except that it also adds
     operand 3.

     These instructions are not allowed to 'FAIL'.

'umaddMN4'
     Like 'maddMN4', but zero-extend the multiplication operands instead
     of sign-extending them.

The PR103109 addition of these expanders to rs6000 didn't handle this
correctly though: it treated the last argument as also having mode M,
sign or zero extended into N.  Unfortunately this means incorrect code
generation whenever the last operand isn't really sign or zero extended
from DImode to TImode.

The following patch removes the maddditi4 expander altogether from
rs6000.md, because we'd need
        maddhd 9,3,4,5
        sradi 10,5,63
        maddld 3,3,4,5
        sub 9,9,10
        add 4,9,6
which is longer than
        mulld 9,3,4
        mulhd 4,3,4
        addc 3,9,5
        adde 4,4,6
and nothing would be able to optimize the case of last operand already
sign-extended from DImode to TImode into just
        mr 9,3
        maddld 3,3,4,5
        maddhd 4,9,4,5
or so.  It also fixes umaddditi4, so that it emits an add at the end to
add the high half of the last operand.  Fortunately, in this case, if the
high half of the last operand is known to be zero (i.e. the last operand
is zero extended from DImode to TImode), then combine will drop the
useless add.

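To make the difference concrete, here is a small C sketch (not part of the
patch) that models the arithmetic on any host with __int128 support.  The
function names umadd_documented, umadd_truncating, umadd_fixed and
umadd_fixed_matches are made up for illustration; they only mirror the
semantics described above, not the RTL the expander actually emits.

```c
#include <assert.h>
#include <stdint.h>

typedef unsigned __int128 u128;

/* Documented umaddMN4 semantics: operands 1 and 2 are zero-extended
   64-bit values, operand 3 is a full 128-bit addend.  */
static u128
umadd_documented (uint64_t a, uint64_t b, u128 c)
{
  return (u128) a * b + c;
}

/* Illustrative model of the pre-fix behaviour: only the low 64 bits of
   operand 3 take part, as if it had been zero extended from 64 bits.  */
static u128
umadd_truncating (uint64_t a, uint64_t b, u128 c)
{
  return (u128) a * b + (uint64_t) c;
}

/* Illustrative model of the fixed expansion: multiply-add with the low
   half of operand 3, then add its high half to the high result half.  */
static u128
umadd_fixed (uint64_t a, uint64_t b, u128 c)
{
  uint64_t c_lo = (uint64_t) c;
  uint64_t c_hi = (uint64_t) (c >> 64);
  u128 t = (u128) a * b + c_lo;       /* cannot overflow 128 bits */
  uint64_t lo = (uint64_t) t;         /* low half, cf. maddld */
  uint64_t hi = (uint64_t) (t >> 64); /* high half, cf. maddhdu */
  hi += c_hi;                         /* the extra add */
  return ((u128) hi << 64) | lo;
}

/* Returns 1 when the fixed model agrees with the documented semantics.  */
static int
umadd_fixed_matches (uint64_t a, uint64_t b, u128 c)
{
  return umadd_fixed (a, b, c) == umadd_documented (a, b, c);
}
```

With an addend such as ((u128) 1 << 64) | 7, umadd_truncating loses the
high half while umadd_fixed agrees with umadd_documented; when the high
half of the addend is zero, all three agree, which is the case where the
final add is redundant.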
If we wanted to get back the signed op1 * op2 + op3 with all inputs in
DImode and a TImode op0, we'd need to introduce a new tree code next to
WIDEN_MULT_PLUS_EXPR and a new expander next to maddMN4.  I'm afraid the
maddMN4 expander can't detect at expansion time whether the operand is
sign extended, especially because of SUBREGs and the awkwardness of
looking at earlier emitted instructions, and combine would need a
5-instruction combination.

Bootstrapped/regtested on powerpc64-linux (power7, tested -m32/-m64),
powerpc64le-linux (power8 and another on power9 with
--with-cpu-64=power9 --with-tune-64=power9), preapproved by Segher in the
PR, committed to trunk.

2023-02-15  Jakub Jelinek

	PR target/108787
	PR target/103109
	* config/rs6000/rs6000.md (maddditi4): Change into umaddditi4 only
	expander, change operand 3 to be TImode, emit maddlddi4 and
	umadddi4_highpart{,_le} with its low half and finally add the high
	half to the result.

	* gcc.dg/pr108787.c: New test.
	* gcc.target/powerpc/pr108787.c: New test.
	* gcc.target/powerpc/pr103109-1.c: Adjust expected instruction
	counts.

	Jakub

--- gcc/config/rs6000/rs6000.md.jj	2023-01-16 11:52:16.036734757 +0100
+++ gcc/config/rs6000/rs6000.md	2023-02-14 21:02:05.637399466 +0100
@@ -3226,25 +3226,40 @@ (define_insn "maddld4"
   "maddld %0,%1,%2,%3"
   [(set_attr "type" "mul")])
 
-(define_expand "maddditi4"
+;; umaddditi4 generally needs maddhdu + maddld + add instructions,
+;; unless last operand is zero extended from DImode, then needs
+;; maddhdu + maddld, which is both faster than mulld + mulhdu + addc + adde
+;; resp. mulld + mulhdu + addc + addze.
+;; We don't define maddditi4, as that one needs
+;; maddhd + sradi + maddld + add + sub and for last operand sign extended
+;; from DImode nothing is able to optimize it into maddhd + maddld, while
+;; without maddditi4 mulld + mulhd + addc + adde or
+;; mulld + mulhd + sradi + addc + adde is needed.  See PR108787.
+(define_expand "umaddditi4"
   [(set (match_operand:TI 0 "gpc_reg_operand")
         (plus:TI
-          (mult:TI (any_extend:TI (match_operand:DI 1 "gpc_reg_operand"))
-                   (any_extend:TI (match_operand:DI 2 "gpc_reg_operand")))
-          (any_extend:TI (match_operand:DI 3 "gpc_reg_operand"))))]
+          (mult:TI (zero_extend:TI (match_operand:DI 1 "gpc_reg_operand"))
+                   (zero_extend:TI (match_operand:DI 2 "gpc_reg_operand")))
+          (match_operand:TI 3 "gpc_reg_operand")))]
   "TARGET_MADDLD && TARGET_POWERPC64"
 {
   rtx op0_lo = gen_rtx_SUBREG (DImode, operands[0], BYTES_BIG_ENDIAN ? 8 : 0);
   rtx op0_hi = gen_rtx_SUBREG (DImode, operands[0], BYTES_BIG_ENDIAN ? 0 : 8);
+  rtx op3_lo = gen_rtx_SUBREG (DImode, operands[3], BYTES_BIG_ENDIAN ? 8 : 0);
+  rtx op3_hi = gen_rtx_SUBREG (DImode, operands[3], BYTES_BIG_ENDIAN ? 0 : 8);
+  rtx hi_temp = gen_reg_rtx (DImode);
 
-  emit_insn (gen_maddlddi4 (op0_lo, operands[1], operands[2], operands[3]));
+  emit_insn (gen_maddlddi4 (op0_lo, operands[1], operands[2], op3_lo));
 
   if (BYTES_BIG_ENDIAN)
-    emit_insn (gen_madddi4_highpart (op0_hi, operands[1], operands[2],
-                                     operands[3]));
+    emit_insn (gen_umadddi4_highpart (hi_temp, operands[1], operands[2],
+                                      op3_lo));
   else
-    emit_insn (gen_madddi4_highpart_le (op0_hi, operands[1], operands[2],
-                                        operands[3]));
+    emit_insn (gen_umadddi4_highpart_le (hi_temp, operands[1], operands[2],
+                                         op3_lo));
+
+  emit_insn (gen_adddi3 (op0_hi, hi_temp, op3_hi));
+
   DONE;
 })
 
--- gcc/testsuite/gcc.dg/pr108787.c.jj	2023-02-14 21:18:21.285191941 +0100
+++ gcc/testsuite/gcc.dg/pr108787.c	2023-02-14 21:17:03.545324004 +0100
@@ -0,0 +1,27 @@
+/* PR target/108787 */
+/* { dg-do run { target int128 } } */
+/* { dg-options "-O2" } */
+
+__attribute__((noipa)) unsigned __int128
+foo (unsigned long long x, unsigned long long y, unsigned long long z, unsigned long long u, unsigned long long v, unsigned long long w)
+{
+  unsigned __int128 r, d;
+  r = ((unsigned __int128) x * u);
+  d = ((unsigned __int128) y * w);
+  r += d;
+  d = ((unsigned __int128) z * v);
+  r += d;
+  return r;
+}
+
+int
+main ()
+{
+  if (__CHAR_BIT__ != 8 || __SIZEOF_LONG_LONG__ != 8 || __SIZEOF_INT128__ != 16)
+    return 0;
+  unsigned __int128 x = foo (0x3efe88da491ULL, 0xd105e9b4a44ULL, 0x4efa677b3dbULL, 0x42c052bac7bULL, 0x99638a13199cULL, 0x56b640d064ULL);
+  if ((unsigned long long) (x >> 64) != 0x000000000309ff93ULL
+      || (unsigned long long) x != 0xbd5c98fdf2bdbcafULL)
+    __builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/powerpc/pr108787.c.jj	2023-02-14 21:19:13.292434605 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr108787.c	2023-02-14 21:19:25.499256860 +0100
@@ -0,0 +1,6 @@
+/* PR target/108787 */
+/* { dg-do run { target int128 } } */
+/* { dg-require-effective-target p9vector_hw } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
+
+#include "../../gcc.dg/pr108787.c"
--- gcc/testsuite/gcc.target/powerpc/pr103109-1.c.jj	2022-08-19 16:00:05.198388230 +0200
+++ gcc/testsuite/gcc.target/powerpc/pr103109-1.c	2023-02-15 09:40:42.183123930 +0100
@@ -3,8 +3,8 @@
 /* { dg-require-effective-target int128 } */
 /* { dg-require-effective-target powerpc_p9modulo_ok } */
 /* { dg-require-effective-target has_arch_ppc64 } */
-/* { dg-final { scan-assembler-times {\mmaddld\M} 2 } } */
-/* { dg-final { scan-assembler-times {\mmaddhd\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmaddld\M} 1 } } */
+/* { dg-final { scan-assembler-times {\mmaddhd\M} 0 } } */
 /* { dg-final { scan-assembler-times {\mmaddhdu\M} 1 } } */
 
 #include "pr103109.h"