From patchwork Fri Oct 20 17:25:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Prathamesh Kulkarni
X-Patchwork-Id: 156239
From: Prathamesh Kulkarni
Date: Fri, 20 Oct 2023 22:55:44 +0530
Subject: PR111754
To: gcc Patches, Richard Sandiford, Richard Biener

Hi,
For the following test-case:

typedef float __attribute__((__vector_size__ (16))) F;
F foo (F a, F b)
{
  F v = (F) { 9 };
  return __builtin_shufflevector (v, v, 1, 0, 1, 2);
}

Compiling with -O2 results in the following ICE:

foo.c: In function ‘foo’:
foo.c:6:10: internal compiler error: in decompose, at rtl.h:2314
    6 |   return __builtin_shufflevector (v, v, 1, 0, 1, 2);
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
0x7f3185 wi::int_traits >::decompose(long*, unsigned int, std::pair const&)
        ../../gcc/gcc/rtl.h:2314
0x7f3185 wide_int_ref_storage::wide_int_ref_storage >(std::pair const&)
        ../../gcc/gcc/wide-int.h:1089
0x7f3185 generic_wide_int >::generic_wide_int >(std::pair const&)
        ../../gcc/gcc/wide-int.h:847
0x7f3185 poly_int<1u, generic_wide_int > >::poly_int >(poly_int_full, std::pair const&)
        ../../gcc/gcc/poly-int.h:467
0x7f3185 poly_int<1u, generic_wide_int > >::poly_int >(std::pair const&)
        ../../gcc/gcc/poly-int.h:453
0x7f3185 wi::to_poly_wide(rtx_def const*, machine_mode)
        ../../gcc/gcc/rtl.h:2383
0x7f3185 rtx_vector_builder::step(rtx_def*, rtx_def*) const
        ../../gcc/gcc/rtx-vector-builder.h:122
0xfd4e1b vector_builder::elt(unsigned int) const
        ../../gcc/gcc/vector-builder.h:253
0xfd4d11 rtx_vector_builder::build()
        ../../gcc/gcc/rtx-vector-builder.cc:73
0xc21d9c const_vector_from_tree
        ../../gcc/gcc/expr.cc:13487
0xc21d9c expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
        ../../gcc/gcc/expr.cc:11059
0xaee682 expand_expr(tree_node*, rtx_def*, machine_mode, expand_modifier)
        ../../gcc/gcc/expr.h:310
0xaee682 expand_return
        ../../gcc/gcc/cfgexpand.cc:3809
0xaee682 expand_gimple_stmt_1
        ../../gcc/gcc/cfgexpand.cc:3918
0xaee682 expand_gimple_stmt
        ../../gcc/gcc/cfgexpand.cc:4044
0xaf28f0 expand_gimple_basic_block
        ../../gcc/gcc/cfgexpand.cc:6100
0xaf4996 execute
        ../../gcc/gcc/cfgexpand.cc:6835

IIUC, the issue is that fold_vec_perm returns a vector with float
element type and res_nelts_per_pattern == 3, and the compiler later
ICEs when it tries to derive element v[3], which is not present in the
encoding, while building the rtx vector in rtx_vector_builder::build():

  for (unsigned int i = 0; i < nelts; ++i)
    RTVEC_ELT (v, i) = elt (i);

The attached patch tries to fix this by returning false from
valid_mask_for_fold_vec_perm_cst_p if sel has a stepped sequence and
the input vector has a non-integral element type, so for VLA vectors it
will only build the result with a dup sequence (nelts_per_pattern < 3)
for non-integral element types.

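To make the failure mode above concrete, here is a minimal standalone sketch (C++17) of a pattern-encoded vector. It is a toy model, not GCC's vector_builder: the EncodedVector type and its steps_allowed flag are hypothetical names introduced only for illustration, and the float values are picked to resemble the test-case rather than taken from a dump. It shows that an element past the encoded ones can only be derived by extrapolating a step when nelts_per_pattern == 3, which has no meaning for a float element type.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Toy model of a pattern-encoded vector (hypothetical, not GCC code).
// The elements are split into NPATTERNS interleaved patterns and only the
// first NELTS_PER_PATTERN elements of each pattern are stored:
//   1 -> the pattern duplicates one value,
//   2 -> a leading value followed by a duplicated value,
//   3 -> a leading value followed by a stepped (linear) series.
template <typename T>
struct EncodedVector {
  std::size_t npatterns;
  std::size_t nelts_per_pattern;
  std::vector<T> encoded;  // npatterns * nelts_per_pattern values
  bool steps_allowed;      // assumption: true only for integer elements

  // Return element I, or std::nullopt when deriving it would require a
  // step on an element type that has no notion of one.
  std::optional<T> elt(std::size_t i) const {
    if (i < encoded.size())
      return encoded[i];  // explicitly present in the encoding

    std::size_t pattern = i % npatterns;
    std::size_t count = i / npatterns;
    std::size_t final_i = encoded.size() - npatterns + pattern;
    T last = encoded[final_i];
    if (nelts_per_pattern <= 2)
      return last;  // dup sequence: no extrapolation needed

    if (!steps_allowed)
      return std::nullopt;  // the situation that ICEs in the report

    T prev = encoded[final_i - npatterns];
    return last + static_cast<T>(count - 2) * (last - prev);
  }
};
```

With an integer encoding {1, 2, 3} (one pattern, three encoded elements) the model extrapolates elt(3) as 4. With a float encoding {0.0, 9.0, 0.0} of the same shape, elt(1) is still available, but elt(3) cannot be derived, which is roughly the point at which the real compiler asserts.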
For VLS vectors, this will still work for a stepped sequence, since it
will then use the "VLS exception" in fold_vec_perm_cst and set:
res_npatterns = res_nelts and
res_nelts_per_pattern = 1
and fold the above case to:

F foo (F a, F b)
{
  <bb 2> [local count: 1073741824]:
  return { 0.0, 9.0e+0, 0.0, 0.0 };
}

But I am not sure if this is entirely correct, since:
tree res = out_elts.build ();
will canonicalize the encoding and may result in a stepped sequence
(vector_builder::finalize() may reduce npatterns at the cost of
increasing nelts_per_pattern)?

PS: This issue is now latent after the PR111648 fix, since
valid_mask_for_fold_vec_perm_cst_p with sel = {1, 0, 1, ...} returns
false because the corresponding pattern in arg0 is not a natural
stepped sequence, and the case folds correctly using the VLS exception.
However, I guess the underlying issue of dealing with non-integral
element types in fold_vec_perm_cst still remains?

The patch passes bootstrap+test with and without SVE on
aarch64-linux-gnu, and on x86_64-linux-gnu.

Thanks,
Prathamesh

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 82299bb7f1d..cedfc9616e9 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -10642,6 +10642,11 @@ valid_mask_for_fold_vec_perm_cst_p (tree arg0, tree arg1,
   if (sel_nelts_per_pattern < 3)
     return true;
 
+  /* If SEL contains stepped sequence, ensure that we are dealing with
+     integral vector_cst.  */
+  if (!INTEGRAL_TYPE_P (TREE_TYPE (TREE_TYPE (arg0))))
+    return false;
+
   for (unsigned pattern = 0; pattern < sel_npatterns; pattern++)
     {
       poly_uint64 a1 = sel[pattern + sel_npatterns];
diff --git a/gcc/testsuite/gcc.dg/vect/pr111754.c b/gcc/testsuite/gcc.dg/vect/pr111754.c
new file mode 100644
index 00000000000..7c1c16875c7
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr111754.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+typedef float __attribute__((__vector_size__ (16))) F;
+
+F foo (F a, F b)
+{
+  F v = (F) { 9 };
+  return __builtin_shufflevector (v, v, 1, 0, 1, 2);
+}
+
+/* { dg-final { scan-tree-dump-not "VEC_PERM_EXPR" "optimized" } } */
+/* { dg-final { scan-tree-dump "return \{ 0.0, 9.0e\\+0, 0.0, 0.0 \}" "optimized" } } */
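
For reference, the "VLS exception" mentioned in the message amounts to treating every result element as its own pattern, so nothing is ever extrapolated. The sketch below is only a standalone model of that idea, not the code in fold_vec_perm_cst; permute_fixed_length is a hypothetical helper name. It selects each element directly from the concatenation of the two inputs, which is how the test-case's expected { 0.0, 9.0, 0.0, 0.0 } comes out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not GCC code): permute two fixed-length vectors by
// visiting every selector index explicitly, i.e. one pattern per element
// and one element per pattern.
std::vector<float> permute_fixed_length(const std::vector<float>& arg0,
                                        const std::vector<float>& arg1,
                                        const std::vector<std::size_t>& sel) {
  std::vector<float> result;
  result.reserve(sel.size());
  for (std::size_t index : sel) {
    // Indices [0, n) select from arg0 and [n, 2n) select from arg1;
    // at() throws std::out_of_range for anything beyond that.
    if (index < arg0.size())
      result.push_back(arg0.at(index));
    else
      result.push_back(arg1.at(index - arg0.size()));
  }
  return result;
}
```

Calling it with v = { 9, 0, 0, 0 } for both inputs and the selector { 1, 0, 1, 2 } yields { 0, 9, 0, 0 }, since v[1], v[0], v[1] and v[2] are read directly and no step is needed.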