From patchwork Thu Dec 7 08:36:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 175017
Date: Thu, 7 Dec 2023 09:36:22 +0100
From: Jakub Jelinek
To: Richard Biener, Vladimir Makarov
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] Add IntegerRange for -param=min-nondebug-insn-uid= and fix vector growing in LRA and vec [PR112411]
Reply-To: Jakub Jelinek

Hi!

As documented, --param min-nondebug-insn-uid= is very useful in debugging
-fcompare-debug issues in RTL dumps; without it, it is really hard to find
differences.  With it, DEBUG_INSNs generally use low INSN_UIDs (1+) and
non-DEBUG_INSNs use INSN_UIDs from the parameter up.
For good results, the parameter should be larger than the number of
DEBUG_INSNs in all or at least the problematic functions, so I typically
use --param min-nondebug-insn-uid=10000 or
--param min-nondebug-insn-uid=1000.

The PR is about using --param min-nondebug-insn-uid=2147483647 (similar
behavior can be achieved with that minus some epsilon): INSN_UIDs for the
non-debug insns then wrap around and, as they are signed, all kinds of
things break.  Obviously, that can happen even without that option, but
functions containing more than 2147483647 insns usually fail to compile
much earlier because they run out of memory.

As it is a debugging option, I'd prefer not to impose any drastically small
limit on it, because if a function has a lot of DEBUG_INSNs, it is useful
to still start above them; otherwise the allocation of uids will DTRT even
for DEBUG_INSNs, but there will then be differences in non-DEBUG_INSN
allocations.

So, the following patch uses a 0x40000000 limit, half the maximum amount for
DEBUG_INSNs and half for non-DEBUG_INSNs; overflows remain possible with
it, but are very unlikely in the real world.

Note, using a large min-nondebug-insn-uid is very expensive in compile time
memory and compile time, because DF as well as various RTL passes use
arrays indexed by INSN_UIDs, e.g. LRA with sizeof (void *) elements, ditto
df (df->insns).

Now, in LRA I've run into ICEs already with
--param min-nondebug-insn-uid=0x2aaaaaaa on a 64-bit host.  It uses custom
vector management and wants to grow the allocation 1.5x when growing, but
all this computation is done in int, so already 0x2aaaaaab * 3 / 2 + 1
overflows to a negative value.  And unlike the vec.cc growing, which uses
unsigned int type for the above (and the + 1 is not there), it also doesn't
make sure that, if there is an overflow, it allocates at least as much as
needed.  vec.cc does
  if ...
  else
    /* Grow slower when large.  */
    alloc = (alloc * 3 / 2);

  /* If this is still too small, set it to the right size.  */
  if (alloc < desired)
    alloc = desired;
so even if there is an overflow during the * 1.5 computation, as long as
desired is still representable in the range of the alloced counter (31 bits
in both vec.h and LRA), it doesn't grow exponentially but at least works for
the current value.

So, one way to fix the LRA issue would be just to use
  lra_insn_recog_data_len = index * 3U / 2;
  if (lra_insn_recog_data_len <= index)
    lra_insn_recog_data_len = index + 1;
i.e. basically do what vec.cc does.  I thought we could do better for both
vec.cc and LRA on 64-bit hosts even without growing the allocated counters,
but now that I look at it again, perhaps we can't.
The above overflows already with original alloc or lra_insn_recog_data_len
0x55555556: 0x55555555 * 3U / 2 is still 0x7fffffff and so representable
in 32 bits, but 0x55555556 * 3U / 2 is 1.  I thought (and the patch
implements it) that we could use alloc * (size_t) 3 / 2 so that on 64-bit
hosts it wouldn't overflow that quickly, but 0x55555556 * (size_t) 3 / 2
there is 0x80000001, which is still ok in unsigned, but given that vec.h
then stores the counter into an unsigned m_alloc:31; bit-field, it is too
much.

The patch below is what I've actually bootstrapped/regtested on
x86_64-linux and i686-linux, but given the above I think I should drop the
vec.cc hunk and change (size_t) 3 in the LRA hunk to 3U.

With the lra.cc change, one can actually compile a simple function with -O0
on a 64-bit host with --param min-nondebug-insn-uid=0x40000000 (i.e. the new
limit), but that already needed quite a big part of my 32GB RAM + 24GB swap.
The patch adds a dg-skip-if for that case though, because such an option is
way too much for 32-bit hosts even at -O0 with an empty function, and with
-O3 on a longer function it is too much for an average 64-bit host as well.

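The wraparound arithmetic above can be checked with a small standalone C
sketch.  It is not GCC code: the helper names grow32, grow64, fits31 and
grow_clamped are hypothetical and only model the 32-bit unsigned growth,
the size_t-style 64-bit growth, the 31-bit counter limit, and the clamped
3U variant suggested for LRA.  The signed int case is deliberately left
out, since signed overflow is undefined behavior in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1.5x growth computed in 32-bit unsigned
   arithmetic, as with "* 3U / 2"; the multiplication wraps modulo 2^32.  */
static uint32_t
grow32 (uint32_t n)
{
  return (uint32_t) (n * UINT32_C (3)) / 2;
}

/* Hypothetical helper: the same growth computed in 64 bits, modelling
   "* (size_t) 3 / 2" on a 64-bit host; no wraparound for 32-bit inputs.  */
static uint64_t
grow64 (uint32_t n)
{
  return (uint64_t) n * 3 / 2;
}

/* Whether a value fits a 31-bit counter such as the m_alloc:31
   bit-field mentioned in the mail.  */
static int
fits31 (uint64_t v)
{
  return v <= UINT64_C (0x7fffffff);
}

/* Hypothetical model of the clamped growth: grow by 1.5x, but if the
   result wrapped (or did not grow), fall back to index + 1.  */
static uint32_t
grow_clamped (uint32_t index)
{
  uint32_t len = grow32 (index);

  assert (index < UINT32_MAX);	/* index + 1 must not wrap.  */
  if (len <= index)
    len = index + 1;
  return len;
}
```

With these helpers, grow32 (0x55555555) is 0x7fffffff while
grow32 (0x55555556) wraps to 1, and grow64 (0x55555556) is 0x80000001,
which no longer fits in 31 bits.  grow_clamped (0x55555556) falls back to
0x55555557, so the allocation stops growing exponentially there but still
covers the requested index.
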
Without the dg-skip-if I got on 64-bit host:
cc1: out of memory allocating 571230784744 bytes after a total of 2772992 bytes
and
cc1: out of memory allocating 1388 bytes after a total of 2002944 bytes
on 32-bit host.  A test requiring more than 532GB of RAM on 64-bit hosts
is just too much for our testsuite.

2023-12-07  Jakub Jelinek

	PR middle-end/112411
	* params.opt (-param=min-nondebug-insn-uid=): Add
	IntegerRange(0, 1073741824).
	* lra.cc (check_and_expand_insn_recog_data): Use (size_t) 3
	rather than 3 in * 3 / 2 computation and if the result is
	smaller or equal to index, use index + 1.
	* vec.cc (vec_prefix::calculate_allocation_1): Use (size_t) 3
	rather than 3 in * 3 / 2 computation.

	* gcc.dg/params/blocksort-part.c: Add dg-skip-if for
	--param min-nondebug-insn-uid=1073741824.

	Jakub

--- gcc/params.opt.jj	2023-11-02 07:49:18.010852541 +0100
+++ gcc/params.opt	2023-12-06 18:55:57.045420935 +0100
@@ -779,7 +779,7 @@ Common Joined UInteger Var(param_min_loo
 The minimum threshold for probability of semi-invariant condition statement to trigger loop split.
 
 -param=min-nondebug-insn-uid=
-Common Joined UInteger Var(param_min_nondebug_insn_uid) Param
+Common Joined UInteger Var(param_min_nondebug_insn_uid) Param IntegerRange(0, 1073741824)
 The minimum UID to be used for a nondebug insn.
 
 -param=min-size-for-stack-sharing=
--- gcc/lra.cc.jj	2023-12-05 13:17:29.642260866 +0100
+++ gcc/lra.cc	2023-12-06 19:52:01.759241999 +0100
@@ -768,7 +768,9 @@ check_and_expand_insn_recog_data (int in
   if (lra_insn_recog_data_len > index)
     return;
   old = lra_insn_recog_data_len;
-  lra_insn_recog_data_len = index * 3 / 2 + 1;
+  lra_insn_recog_data_len = index * (size_t) 3 / 2;
+  if (lra_insn_recog_data_len <= index)
+    lra_insn_recog_data_len = index + 1;
   lra_insn_recog_data = XRESIZEVEC (lra_insn_recog_data_t,
 				    lra_insn_recog_data,
 				    lra_insn_recog_data_len);
--- gcc/vec.cc.jj	2023-09-27 10:37:39.329838572 +0200
+++ gcc/vec.cc	2023-12-06 19:53:34.670940078 +0100
@@ -160,7 +160,7 @@ vec_prefix::calculate_allocation_1 (unsi
     alloc = alloc * 2;
   else
     /* Grow slower when large.  */
-    alloc = (alloc * 3 / 2);
+    alloc = (alloc * (size_t) 3 / 2);
 
   /* If this is still too small, set it to the right size.  */
   if (alloc < desired)
--- gcc/testsuite/gcc.dg/params/blocksort-part.c.jj	2020-01-12 11:54:37.463397567 +0100
+++ gcc/testsuite/gcc.dg/params/blocksort-part.c	2023-12-07 08:46:11.656974144 +0100
@@ -1,4 +1,5 @@
 /* { dg-skip-if "AArch64 does not support these bounds." { aarch64*-*-* } { "--param stack-clash-protection-*" } } */
+/* { dg-skip-if "For 32-bit hosts such param is too much and even for 64-bit might require hundreds of GB of RAM" { *-*-* } { "--param min-nondebug-insn-uid=1073741824" } } */
 
 /*-------------------------------------------------------------*/
 /*--- Block sorting machinery                               ---*/