From patchwork Thu Nov 23 11:36:24 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Prathamesh Kulkarni
X-Patchwork-Id: 168888
From: Prathamesh Kulkarni
Date: Thu, 23 Nov 2023 17:06:24 +0530
Subject: [aarch64] PR111702 - ICE in insert_regs after interleave+zip1 vector initialization patch
To: Richard Sandiford, gcc Patches

Hi Richard,
For the test-case mentioned in PR111702, compiling with
-O2 -frounding-math -fstack-protector-all results in the following ICE
during the cse2 pass:

test.c: In function 'foo':
test.c:119:1: internal compiler error: in insert_regs, at cse.cc:1120
  119 | }
      | ^
0xb7ebb0 insert_regs
        ../../gcc/gcc/cse.cc:1120
0x1f95134 merge_equiv_classes
        ../../gcc/gcc/cse.cc:1764
0x1f9b9ab cse_insn
        ../../gcc/gcc/cse.cc:4793
0x1f9fe30 cse_extended_basic_block
        ../../gcc/gcc/cse.cc:6577
0x1f9fe30 cse_main
        ../../gcc/gcc/cse.cc:6722
0x1fa0984 rest_of_handle_cse2
        ../../gcc/gcc/cse.cc:7620
0x1fa0984 execute
        ../../gcc/gcc/cse.cc:7675

This happens only with interleave+zip1 vector initialization combined with
-frounding-math -fstack-protector-all; it compiles OK without
-fstack-protector-all.
Also, it compiles OK with fallback sequence code-gen (with or without
-fstack-protector-all). Unfortunately, I haven't been able to reduce
the test-case further :/

From the test-case, it seems only the vector initializer for type J
uses the interleave+zip1 approach, while the rest of the vector
initializers use the fallback sequence.

J is defined as:
typedef _Float16 __attribute__((__vector_size__ (16))) J;

and the initializer is:
(J) { 11654, 4801, 5535, 9743, 61680}

interleave+zip1 sequence for above initializer J:
mode = V8HF

vals: (parallel:V8HF [
        (reg:HF 642)
        (reg:HF 645)
        (reg:HF 648)
        (reg:HF 651)
        (reg:HF 654)
        (const_double:HF 0.0 [0x0.0p+0]) repeated x3
    ])

target: (reg:V8HF 641)
seq:
(insn 1058 0 1059 (set (reg:V4HF 657)
        (const_vector:V4HF [
                (const_double:HF 0.0 [0x0.0p+0]) repeated x4
            ])) "test.c":81:8 -1
     (nil))

(insn 1059 1058 1060 (set (reg:V4HF 657)
        (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 642))
            (reg:V4HF 657)
            (const_int 1 [0x1]))) "test.c":81:8 -1
     (nil))

(insn 1060 1059 1061 (set (reg:V4HF 657)
        (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 648))
            (reg:V4HF 657)
            (const_int 2 [0x2]))) "test.c":81:8 -1
     (nil))

(insn 1061 1060 1062 (set (reg:V4HF 657)
        (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 654))
            (reg:V4HF 657)
            (const_int 4 [0x4]))) "test.c":81:8 -1
     (nil))

(insn 1062 1061 1063 (set (reg:V4HF 658)
        (const_vector:V4HF [
                (const_double:HF 0.0 [0x0.0p+0]) repeated x4
            ])) "test.c":81:8 -1
     (nil))

(insn 1063 1062 1064 (set (reg:V4HF 658)
        (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 645))
            (reg:V4HF 658)
            (const_int 1 [0x1]))) "test.c":81:8 -1
     (nil))

(insn 1064 1063 1065 (set (reg:V4HF 658)
        (vec_merge:V4HF (vec_duplicate:V4HF (reg:HF 651))
            (reg:V4HF 658)
            (const_int 2 [0x2]))) "test.c":81:8 -1
     (nil))

(insn 1065 1064 0 (set (reg:V8HF 641)
        (unspec:V8HF [
                (subreg:V8HF (reg:V4HF 657) 0)
                (subreg:V8HF (reg:V4HF 658) 0)
            ] UNSPEC_ZIP1)) "test.c":81:8 -1
     (nil))

It seems to me that the above sequence correctly initializes the
vector into r641?
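As a sanity check on the lane order, the sequence can be replayed in a
small stand-alone model. This is only a sketch under stated assumptions:
lanes hold pseudo-register numbers rather than _Float16 values (0 stands
for the constant 0.0), and the names Half4, Half8, vec_merge_dup, zip1 and
build_r641 are made up for illustration, not GCC or ACLE APIs.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy model: each lane stores a pseudo-register number, 0 means 0.0.
using Half4 = std::array<int, 4>;
using Half8 = std::array<int, 8>;

// Models (vec_merge (vec_duplicate scalar) v mask): every set bit i of
// `mask` replaces lane i of `v` with `scalar`.
Half4 vec_merge_dup(int scalar, Half4 v, unsigned mask) {
  assert(mask < 16u);  // only four lanes exist
  for (std::size_t i = 0; i < v.size(); ++i)
    if (mask & (1u << i))
      v[i] = scalar;
  return v;
}

// Models ZIP1 on the low four lanes of both operands:
// result = {a0, b0, a1, b1, a2, b2, a3, b3}.
Half8 zip1(const Half4 &a, const Half4 &b) {
  Half8 out{};
  for (std::size_t i = 0; i < a.size(); ++i) {
    out[2 * i] = a[i];
    out[2 * i + 1] = b[i];
  }
  return out;
}

// Replays insns 1058-1065 from the dump.
Half8 build_r641() {
  Half4 r657{};                        // insn 1058: all zeros
  r657 = vec_merge_dup(642, r657, 1);  // insn 1059: lane 0
  r657 = vec_merge_dup(648, r657, 2);  // insn 1060: lane 1
  r657 = vec_merge_dup(654, r657, 4);  // insn 1061: lane 2
  Half4 r658{};                        // insn 1062: all zeros
  r658 = vec_merge_dup(645, r658, 1);  // insn 1063: lane 0
  r658 = vec_merge_dup(651, r658, 2);  // insn 1064: lane 1
  return zip1(r657, r658);             // insn 1065
}
```

In this model build_r641 () yields the lanes in the order of the original
"vals" parallel: the five registers followed by three zeros.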
insns 1058-1061 construct r657 = { r642, r648, r654, 0 }
insns 1062-1064 construct r658 = { r645, r651, 0, 0 }
and zip1 will create r641 = { r642, r645, r648, r651, r654, 0, 0, 0 }

For the above test, it seems that with the interleave+zip1 approach and
-fstack-protector-all, in the cse pass, there are two separate
equivalence classes created for (const_int 1), which need to be merged
in cse_insn:

       if (elt->first_same_value != src_eqv_elt->first_same_value)
          {
            /* The REG_EQUAL is indicating that two formerly distinct
               classes are now equivalent.  So merge them.  */
            merge_equiv_classes (elt, src_eqv_elt);

elt equivalence chain:
Equivalence chain for (subreg:QI (reg:V16QI 671) 0):
(subreg:QI (reg:V16QI 671) 0)
(const_int 1 [0x1])

src_eqv_elt equivalence chain:
Equivalence chain for (const_int 1 [0x1]):
(reg:QI 34 v2)
(reg:QI 32 v0)
(reg:QI 34 v2)
(const_int 1 [0x1])
(vec_select:QI (reg:V16QI 671)
    (parallel [
            (const_int 1 [0x1])
        ]))
(vec_select:QI (reg:V16QI 32 v0)
    (parallel [
            (const_int 1 [0x1])
        ]))
(vec_select:QI (reg:V16QI 33 v1)
    (parallel [
            (const_int 2 [0x2])
        ]))
(vec_select:QI (reg:V16QI 33 v1)
    (parallel [
            (const_int 1 [0x1])
        ]))

The issue is that merge_equiv_classes doesn't seem to deal correctly
with multiple occurrences of the same register in class2 (src_eqv_elt),
which has two occurrences of (reg:QI 34 v2).

In merge_equiv_classes, on the first iteration, it will remove
(reg:QI 34) from reg_eqv_table by calling delete_reg_equiv (34), and
insert_regs will then create an entry for (reg:QI 34) in qty_table with
a new quantity number and create a new equivalence in reg_eqv_table.

When we again come across (reg:QI 34) in class2, it will unconditionally
remove the register from reg_eqv_table, thus making REG_QTY (34) = -35,
even though (reg:QI 34) is now present in the class1 chain.

Then in insert_regs, we have:
x: (reg:QI 34 v2)
classp:
(subreg:QI (reg:V16QI 671) 0)
(reg:QI 34 v2)
(const_int 1 [0x1])

And while iterating over the elements in classp, we end up with
regno == c_regno == 34.
However, as mentioned above, merge_equiv_classes has deleted the entry
for (reg:QI 34) from reg_eqv_table, so its quantity is no longer valid,
and we end up hitting the following assert:
gcc_assert (REGNO_QTY_VALID_P (c_regno));

I am not sure though why this is triggered only with the interleave+zip1
approach with -fstack-protector-all.

The attached (untested) patch is a workaround for the above issue:
in merge_equiv_classes, while iterating over the elements in class2, it
simply checks whether the element is a reg that is already inserted in
class1 with the same mode, and avoids deleting it from reg_eqv_table in
that case. This avoids hitting the assert, and the following is the
result of merging the above two classes:

Equivalence chain for (subreg:QI (reg:V16QI 671) 0):
(subreg:QI (reg:V16QI 671) 0)
(reg:QI 34 v2)
(reg:QI 32 v0)
(reg:QI 34 v2)
(const_int 1 [0x1])
(const_int 1 [0x1])
(vec_select:QI (reg:V16QI 671)
    (parallel [
            (const_int 1 [0x1])
        ]))
(vec_select:QI (reg:V16QI 33 v1)
    (parallel [
            (const_int 1 [0x1])
        ]))
(vec_select:QI (reg:V16QI 33 v1)
    (parallel [
            (const_int 2 [0x2])
        ]))
(vec_select:QI (reg:V16QI 32 v0)
    (parallel [
            (const_int 1 [0x1])
        ]))

That seems to be OK (?), but I am not sure whether this patch is in the
right direction, and it is also not efficient.
Could you please suggest how to proceed?

Thanks,
Prathamesh

diff --git a/gcc/cse.cc b/gcc/cse.cc
index f9603fdfd43..1e20be457c4 100644
--- a/gcc/cse.cc
+++ b/gcc/cse.cc
@@ -1747,7 +1747,16 @@ merge_equiv_classes (struct table_elt *class1, struct table_elt *class2)
 	  if (REG_P (exp))
 	    {
 	      need_rehash = REGNO_QTY_VALID_P (REGNO (exp));
-	      delete_reg_equiv (REGNO (exp));
+
+	      /* If reg is already inserted into class1 and has a valid new
+		 quantity, avoid deleting it from reg_eqv_table.  */
+	      table_elt *e;
+	      for (e = class1->first_same_value; e; e = e->next_same_value)
+		if (REG_P (e->exp) && REGNO (e->exp) == REGNO (exp)
+		    && e->mode == mode)
+		  break;
+	      if (e == NULL)
+		delete_reg_equiv (REGNO (exp));
 	    }
 
 	  if (REG_P (exp) && REGNO (exp) >= FIRST_PSEUDO_REGISTER)
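
For illustration only, the failure mode and the workaround can be
reproduced in a toy model outside GCC. The sketch below is not cse.cc
code: ToyCse, Result and replay_pr111702 are hypothetical names, the
classes hold only register numbers (all assumed to be in the same mode),
and insert_regs returns false at the point where the real function would
hit gcc_assert (REGNO_QTY_VALID_P (c_regno)). An invalid quantity is
encoded as -regno - 1, matching the REG_QTY (34) = -35 value above.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy stand-in for the register/quantity bookkeeping (assumption: one
// mode, registers only, no hash table).
struct ToyCse {
  std::vector<int> reg_qty;  // reg_qty[r] >= 0 means "valid quantity"
  int next_qty = 0;

  explicit ToyCse(int nregs) : reg_qty(nregs) {
    for (int r = 0; r < nregs; ++r)
      reg_qty[r] = -r - 1;  // every register starts without a quantity
  }

  bool qty_valid(int r) const { return reg_qty[r] >= 0; }

  void delete_reg_equiv(int r) {
    assert(r >= 0 && r < static_cast<int>(reg_qty.size()));
    reg_qty[r] = -r - 1;
  }

  // Returns false where the real insert_regs would assert.
  bool insert_regs(int r, const std::vector<int> &class1) {
    if (qty_valid(r))
      return true;
    if (class1.empty()) {  // no register to pair with: new quantity
      reg_qty[r] = next_qty++;
      return true;
    }
    int c = class1.front();  // first register already in the class
    if (!qty_valid(c))
      return false;  // regno == c_regno == 34 ends up here
    reg_qty[r] = reg_qty[c];
    return true;
  }

  // Moves every register of class2 into class1.
  bool merge(std::vector<int> &class1, const std::vector<int> &class2,
             bool with_workaround) {
    for (int r : class2) {
      bool present =
          std::find(class1.begin(), class1.end(), r) != class1.end();
      if (!(with_workaround && present))
        delete_reg_equiv(r);  // unconditional in the unpatched code
      if (!insert_regs(r, class1))
        return false;
      class1.push_back(r);
    }
    return true;
  }
};

struct Result {
  bool ok;
  int qty_of_34;
};

// Replays the register part of the two chains: class1 has no registers,
// class2 lists reg 34 twice.
Result replay_pr111702(bool with_workaround) {
  ToyCse cse(64);
  cse.reg_qty[34] = cse.reg_qty[32] = cse.next_qty++;
  std::vector<int> class1;
  const std::vector<int> class2{34, 32, 34};
  bool ok = cse.merge(class1, class2, with_workaround);
  return {ok, cse.reg_qty[34]};
}
```

Under these assumptions, replay_pr111702 (false) stops at the second
occurrence of reg 34 with its quantity invalidated, while
replay_pr111702 (true) completes and leaves reg 34 with a valid quantity.
The model says nothing about whether skipping the deletion is the right
fix in cse.cc itself.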