From patchwork Fri Oct 20 15:48:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Stubbs
X-Patchwork-Id: 156198
Message-ID: <63e907af-cde8-4f63-bba9-d39fcd5623fb@codesourcery.com>
Date: Fri, 20 Oct 2023 16:48:54 +0100
From: Andrew Stubbs
Subject: [PATCH] vect: Don't set excess bits in unform masks
To: "gcc-patches@gcc.gnu.org", Richard Biener
List-Id: Gcc-patches mailing list

This patch
fixes a wrong-code bug on amdgcn in which the excess "ones" in the
mask enable extra lanes that were supposed to be unused and are
therefore undefined.

Richi suggested an alternative approach involving narrower types and
then a zero-extend to the actual mask type. This solved the problem for
the specific test case that I had, but I'm not sure if it would work
with V2 and V4 modes (not that I've observed bad behaviour from them
anyway, but still). There were some other caveats involving "two-lane
division" that I don't fully understand, so I went with the simpler
implementation.

This patch does have the disadvantage of an additional "and"
instruction in the non-constant case even for machines that don't need
it. I'm not sure how to fix that without an additional target hook. (If
GCC could use the 64-lane vectors more effectively without the
assistance of artificially reduced sizes then this problem wouldn't
exist.)

OK to commit?

Andrew

vect: Don't set excess bits in uniform masks

AVX ignores any excess bits in the mask, but AMD GCN magically uses a
larger vector than was intended (the smaller sizes are "fake"), leading
to wrong-code.

gcc/ChangeLog:

        * expr.cc (store_constructor): Add "and" operation to uniform mask
        generation.

diff --git a/gcc/expr.cc b/gcc/expr.cc
index 4220cbd9f8f..fb4609f616e 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -7440,7 +7440,7 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size,
                     break;
                   }
             /* Use sign-extension for uniform boolean vectors with
-               integer modes.  */
+               integer modes.  Effectively "vec_duplicate" for bitmasks.  */
             if (!TREE_SIDE_EFFECTS (exp)
                 && VECTOR_BOOLEAN_TYPE_P (type)
                 && SCALAR_INT_MODE_P (mode)
@@ -7449,7 +7449,21 @@ store_constructor (tree exp, rtx target, int cleared, poly_int64 size,
               {
                 rtx op0 = force_reg (TYPE_MODE (TREE_TYPE (elt)),
                                      expand_normal (elt));
-                convert_move (target, op0, 0);
+                rtx tmp = gen_reg_rtx (mode);
+                convert_move (tmp, op0, 0);
+
+                if (known_ne (TYPE_VECTOR_SUBPARTS (type),
+                              GET_MODE_PRECISION (mode)))
+                  {
+                    /* Ensure no excess bits are set.
+                       GCN needs this, AVX does not.  */
+                    expand_binop (mode, and_optab, tmp,
+                                  GEN_INT ((1 << (TYPE_VECTOR_SUBPARTS (type)
+                                                  .to_constant())) - 1),
+                                  target, true, OPTAB_DIRECT);
+                  }
+                else
+                  emit_move_insn (target, tmp);
                 break;
               }
 
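
For anyone who wants to see the masking step outside the compiler, here
is a rough standalone sketch in plain C. It is not GCC code:
uniform_mask is a hypothetical helper, the mask is assumed to fit in a
64-bit integer, and the lane count is assumed to be between 1 and 64.
It only mirrors the arithmetic in the hunk above (sign-extend the
boolean, then "and" away the bits beyond the real lane count).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not a GCC function: build a uniform lane mask
   for NUNITS lanes stored in a 64-bit integer.  ELT is the boolean
   element value (zero or non-zero).  */
static uint64_t
uniform_mask (int elt, unsigned nunits)
{
  /* Assumption of this sketch: between 1 and 64 lanes.  */
  assert (nunits >= 1 && nunits <= 64);

  /* Sign-extending a one-bit boolean gives all zeros or all ones.  */
  uint64_t all = elt ? ~UINT64_C (0) : 0;

  /* Clear the bits above the real lane count.  A 64-bit unsigned
     operand keeps the shift well defined for every nunits below 64;
     with exactly 64 lanes there are no excess bits to clear.  */
  if (nunits < 64)
    all &= (UINT64_C (1) << nunits) - 1;

  return all;
}
```

With four lanes in a 64-bit mask, uniform_mask (1, 4) gives 0xf instead
of all ones, so the other 60 lanes stay disabled. The sketch shifts a
64-bit unsigned one, not a plain int, so that lane counts of 32 and
above stay within defined behaviour.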