From patchwork Tue Oct 24 10:08:21 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 157362
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, kirill.yukhin@gmail.com, hongtao.liu@intel.com
Subject: [PATCH] i386: Avoid paradoxical subreg dests in vector zero_extend
Date: Tue, 24 Oct 2023 11:08:21 +0100
List-Id: Gcc-patches mailing list

For the V2HI -> V2SI zero extension in:

  typedef unsigned short v2hi __attribute__((vector_size(4)));
  typedef unsigned int v2si __attribute__((vector_size(8)));

  v2si f (v2hi x)
  { return (v2si) {x[0], x[1]}; }

ix86_expand_sse_extend would generate:

  (set (reg:V2HI 102)
       (const_vector:V2HI [(const_int 0 [0])
                           (const_int 0 [0])]))
  (set (subreg:V8HI (reg:V2HI 101) 0)
       (vec_select:V8HI
         (vec_concat:V16HI (subreg:V8HI (reg/v:V2HI 99 [ x ]) 0)
                           (subreg:V8HI (reg:V2HI 102) 0))
         (parallel [(const_int 0 [0])
                    (const_int 8 [0x8])
                    (const_int 1 [0x1])
                    (const_int 9 [0x9])
                    (const_int 2 [0x2])
                    (const_int 10 [0xa])
                    (const_int 3 [0x3])
                    (const_int 11 [0xb])])))
  (set (reg:V2SI 100)
       (subreg:V2SI (reg:V2HI 101) 0))
    (expr_list:REG_EQUAL (zero_extend:V2SI (reg/v:V2HI 99 [ x ])))

But using (subreg:V8HI (reg:V2HI 101) 0) as the destination of the
vec_select means that only the low 4 bytes of the destination are
stored, since reg 101 itself is only 4 bytes wide.  As a result, only
the lower half of reg 100 is well-defined.

This tends to work by accident when the register allocator ties
reg 101 to reg 100.  But it caused problems with the upcoming
late-combine pass, because we propagated the set of reg 100 into
its uses.

Tested on x86_64-linux-gnu.  OK to install?

Richard


gcc/
	* config/i386/i386-expand.cc (ix86_split_mmx_punpck): Allow the
	destination to be wider than the sources.  Take the mode from the
	first source.
	(ix86_expand_sse_extend): Pass the destination directly to
	ix86_split_mmx_punpck, rather than using a fresh register that
	is half the size.
---
 gcc/config/i386/i386-expand.cc | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 1eae9d7c78c..2361ff77af3 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -1110,7 +1110,9 @@ ix86_split_mmx_pack (rtx operands[], enum rtx_code code)
   ix86_move_vector_high_sse_to_mmx (op0);
 }
 
-/* Split MMX punpcklXX/punpckhXX with SSE punpcklXX.  */
+/* Split MMX punpcklXX/punpckhXX with SSE punpcklXX.  This is also used
+   for a full unpack of OPERANDS[1] and OPERANDS[2] into a wider
+   OPERANDS[0].  */
 
 void
 ix86_split_mmx_punpck (rtx operands[], bool high_p)
@@ -1118,7 +1120,7 @@ ix86_split_mmx_punpck (rtx operands[], bool high_p)
   rtx op0 = operands[0];
   rtx op1 = operands[1];
   rtx op2 = operands[2];
-  machine_mode mode = GET_MODE (op0);
+  machine_mode mode = GET_MODE (op1);
   rtx mask;
   /* The corresponding SSE mode.  */
   machine_mode sse_mode, double_sse_mode;
@@ -5660,7 +5662,7 @@ ix86_expand_sse_extend (rtx dest, rtx src, bool unsigned_p)
       gcc_unreachable ();
     }
 
-  ops[0] = gen_reg_rtx (imode);
+  ops[0] = dest;
 
   ops[1] = force_reg (imode, src);
 
@@ -5671,7 +5673,6 @@ ix86_expand_sse_extend (rtx dest, rtx src, bool unsigned_p)
 				  ops[1], pc_rtx, pc_rtx);
 
   ix86_split_mmx_punpck (ops, false);
-  emit_move_insn (dest, lowpart_subreg (GET_MODE (dest), ops[0], imode));
 }
 
 /* Unpack SRC into the next wider integer vector type.  UNSIGNED_P is