From patchwork Thu Feb 22 20:02:43 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 205032
Message-ID: <80141fca-9a4e-459c-a31f-94fea55d9787@gmail.com>
Date: Thu, 22 Feb 2024 21:02:43 +0100
User-Agent: Mozilla Thunderbird
Cc: rdapp.gcc@gmail.com, jeffreyalaw
Content-Language: en-US
From: Robin Dapp
Subject: [PATCH] RISC-V: Fix vec_init for simple sequences [PR114028].
To: gcc-patches, palmer, Kito Cheng, "juzhe.zhong@rivai.ai"
List-Id: Gcc-patches mailing list

Hi,

for a vec_init (_a, _a, _a, _a) with _a of mode DImode we try to
construct a "superword" of two "_a"s.  This only works for inner modes
narrower than Pmode, where we can "shift and or" two halves into one
Pmode register.  This patch disallows the optimization for
inner_mode == Pmode and emits a simple broadcast in that case.

The test is compile-only rather than a run test because it requires
vlen=256 in qemu.  I can still adjust that, of course.

Regtested on rv64, rv32 is still running.

Regards
 Robin

gcc/ChangeLog:

	PR target/114028
	* config/riscv/riscv-v.cc (rvv_builder::can_duplicate_repeating_sequence_p):
	Return false if inner mode is already Pmode.
	(rvv_builder::is_repeating_sequence): New function.
	(expand_vec_init): Emit broadcast if sequence is all same.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/pr114028.c: New test.
---
 gcc/config/riscv/riscv-v.cc                   | 25 ++++++++++++++++++-
 .../gcc.target/riscv/rvv/autovec/pr114028.c   | 25 +++++++++++++++++++
 2 files changed, 49 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c

diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 0cfbd21ce6f..29d58deb995 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -443,6 +443,7 @@ public:
   }
 
   bool can_duplicate_repeating_sequence_p ();
+  bool is_repeating_sequence ();
   rtx get_merged_repeating_sequence ();
 
   bool repeating_sequence_use_merge_profitable_p ();
@@ -483,7 +484,8 @@ rvv_builder::can_duplicate_repeating_sequence_p ()
 {
   poly_uint64 new_size = exact_div (full_nelts (), npatterns ());
   unsigned int new_inner_size = m_inner_bits_size * npatterns ();
-  if (!int_mode_for_size (new_inner_size, 0).exists (&m_new_inner_mode)
+  if (m_inner_mode == Pmode
+      || !int_mode_for_size (new_inner_size, 0).exists (&m_new_inner_mode)
       || GET_MODE_SIZE (m_new_inner_mode) > UNITS_PER_WORD
       || !get_vector_mode (m_new_inner_mode, new_size).exists (&m_new_mode))
     return false;
@@ -492,6 +494,18 @@ rvv_builder::can_duplicate_repeating_sequence_p ()
   return nelts_per_pattern () == 1;
 }
 
+/* Return true if the vector is a simple sequence with one pattern and all
+   elements the same.  */
+bool
+rvv_builder::is_repeating_sequence ()
+{
+  if (npatterns () > 1)
+    return false;
+  if (full_nelts ().is_constant ())
+    return repeating_sequence_p (0, full_nelts ().to_constant (), 1);
+  return nelts_per_pattern () == 1;
+}
+
 /* Return true if it is a repeating sequence that using merge approach has
    better codegen than using default approach (slide1down).
 
@@ -2544,6 +2558,15 @@ expand_vec_init (rtx target, rtx vals)
     v.quick_push (XVECEXP (vals, 0, i));
   v.finalize ();
 
+  /* If the sequence is v = { a, a, a, a } just broadcast an element.  */
+  if (v.is_repeating_sequence ())
+    {
+      machine_mode mode = GET_MODE (target);
+      rtx dup = expand_vector_broadcast (mode, v.elt (0));
+      emit_move_insn (target, dup);
+      return;
+    }
+
   if (nelts > 3)
     {
       /* Case 1: Convert v = { a, b, a, b } into v = { ab, ab }.  */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c
new file mode 100644
index 00000000000..a451d85e3fe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/pr114028.c
@@ -0,0 +1,25 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv_zvl256b -O3" } */
+
+int a, d = 55003;
+long c = 0, h;
+long e = 1;
+short i;
+
+int
+main ()
+{
+  for (int g = 0; g < 16; g++)
+    {
+      d |= c;
+      short l = d;
+      i = l < 0 || a >> 4 ? d : a;
+      h = i - 8L;
+      e &= h;
+    }
+
+  if (e != 1)
+    __builtin_abort ();
+}
+
+/* { dg-final { scan-assembler-times "vmv\.v\.i\tv\[0-9\],0" 0 } } */
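
For readers unfamiliar with the "superword" trick mentioned in the cover text, here is a rough standalone sketch of the idea, not the code in riscv-v.cc. The names kWordBits, make_superword and all_same are made up for illustration, and the word size is assumed to be 64 bits as on rv64. It shows why two halves can be combined by "shift and or" only while both fit into one word, and what the all-same check amounts to on a plain array.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for the width of a Pmode register on rv64.
constexpr unsigned kWordBits = 64;

// Try to merge the two elements { lo, hi }, each elt_bits wide, into one
// word via "shift and or".  Returns nullopt when the two halves do not fit
// into a single word, e.g. for 64-bit elements on a 64-bit target.
std::optional<uint64_t>
make_superword (uint64_t lo, uint64_t hi, unsigned elt_bits)
{
  assert (elt_bits > 0 && elt_bits <= kWordBits);
  if (2 * elt_bits > kWordBits)
    return std::nullopt;
  // elt_bits is at most 32 here, so the shift below is well defined.
  uint64_t mask = (uint64_t{1} << elt_bits) - 1;
  return ((hi & mask) << elt_bits) | (lo & mask);
}

// Return true if every element equals the first one, i.e. the vector
// looks like v = { a, a, a, a } and a single broadcast would do.
bool
all_same (const std::vector<uint64_t> &v)
{
  for (uint64_t x : v)
    if (x != v.front ())
      return false;
  return true;
}
```

With 16-bit or 32-bit elements make_superword yields a packed value, while for 64-bit elements it yields nothing, which corresponds to the inner_mode == Pmode case that the patch now rejects in favor of a broadcast.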