From patchwork Fri Sep 15 02:34:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ken Matsui
X-Patchwork-Id: 140054
To: gcc-patches@gcc.gnu.org
Cc: libstdc++@gcc.gnu.org, Ken Matsui
Subject: [PATCH v13 16/40] c, c++: Use 16 bits for all use of enum rid for
 more keyword space
Date: Thu, 14 Sep 2023 19:34:56 -0700
Message-ID: <20230915023640.75216-17-kmatsui@gcc.gnu.org>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20230915023640.75216-1-kmatsui@gcc.gnu.org>
References: <20230915022305.74083-1-kmatsui@gcc.gnu.org>
 <20230915023640.75216-1-kmatsui@gcc.gnu.org>
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Ken Matsui via Gcc-patches
From: Ken Matsui
Reply-To: Ken Matsui

Now that RID_MAX has reached 255, every use of enum rid must be widened
from 8 bits to 16 bits to make room for more keywords.

For struct token_indent_info, the 8-bit increase does not change the
overall struct size because the extra 8 bits just consume 1 of the 2
bytes of external fragmentation.  Since reordering the fields would only
turn 1 byte of internal fragmentation into 1 byte of external
fragmentation, I keep the original field order.

For struct c_token, the 8-bit expansion with the original field order
would increase the overall struct size from 24 bytes to 32 bytes.  The
original struct has 4 bytes of internal fragmentation (after the
location field) and 3 bytes of external fragmentation.  Keeping the
original order with the 8-bit expansion gives 7 bytes of internal
fragmentation (3 bytes after the pragma_kind field + 4 bytes after the
location field) and 7 bytes of external fragmentation.  Since the
original field order was not optimal, reordering the fields results in
the same overall size as the original one.
I updated the field order to the most efficient order.

For struct cp_token, reordering the fields only reduces internal
fragmentation and does not reduce the overall struct size, so I keep the
original field order.  The original struct size was 16 bytes with 3 bits
of internal fragmentation.  With this 8-bit update, the overall size
would be 24 bytes, with no external fragmentation and 7 bytes + 3 bits
of internal fragmentation.

Suppose a pointer takes 8 bytes and int takes 4 bytes.  Then, struct
ht_identifier takes 16 bytes, and union _cpp_hashnode_value takes 8
bytes.  For struct cpp_hashnode, the 8-bit increase consumes 1 more
byte, resulting in 33 bytes excluding padding.  The original overall
size before the 8-bit increase was 32 bytes.  However, due to
fragmentation, the overall struct size would be 40 bytes.  Since there
is no external fragmentation and 3 bytes + 5 bits of internal
fragmentation, reordering the fields does not reduce the overall size.
I keep the original field order.

gcc/c-family/ChangeLog:

	* c-indentation.h (struct token_indent_info): Make keyword 16 bits.

gcc/c/ChangeLog:

	* c-parser.cc (c_parse_init): Handle RID_MAX not to exceed the max
	value of 16 bits.
	* c-parser.h (struct c_token): Make keyword 16 bits.  Reorder the
	fields to minimize memory fragmentation.

gcc/cp/ChangeLog:

	* parser.h (struct cp_token): Make keyword 16 bits.
	(struct cp_lexer): Make saved_keyword 16 bits.

libcpp/ChangeLog:

	* include/cpplib.h (struct cpp_hashnode): Make rid_code 16 bits.

Signed-off-by: Ken Matsui
---
 gcc/c-family/c-indentation.h |  2 +-
 gcc/c/c-parser.cc            |  6 +++---
 gcc/c/c-parser.h             | 14 +++++++-------
 gcc/cp/parser.h              |  8 +++++---
 libcpp/include/cpplib.h      |  7 +++++--
 5 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/gcc/c-family/c-indentation.h b/gcc/c-family/c-indentation.h
index c0e07bf49f1..6d2b88f01a3 100644
--- a/gcc/c-family/c-indentation.h
+++ b/gcc/c-family/c-indentation.h
@@ -26,7 +26,7 @@ struct token_indent_info
 {
   location_t location;
   ENUM_BITFIELD (cpp_ttype) type : 8;
-  ENUM_BITFIELD (rid) keyword : 8;
+  ENUM_BITFIELD (rid) keyword : 16;
 };
 
 /* Extract token information from TOKEN, which ought to either be a
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index b9a1b75ca43..2086f253923 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -115,9 +115,9 @@ c_parse_init (void)
   tree id;
   int mask = 0;
 
-  /* Make sure RID_MAX hasn't grown past the 8 bits used to hold the keyword in
-     the c_token structure.  */
-  gcc_assert (RID_MAX <= 255);
+  /* Make sure RID_MAX hasn't grown past the 16 bits used to hold the keyword
+     in the c_token structure.  */
+  gcc_assert (RID_MAX <= 65535);
 
   mask |= D_CXXONLY;
   if (!flag_isoc99)
diff --git a/gcc/c/c-parser.h b/gcc/c/c-parser.h
index 545f0f4d9eb..6a9bd22a793 100644
--- a/gcc/c/c-parser.h
+++ b/gcc/c/c-parser.h
@@ -51,21 +51,21 @@ enum c_id_kind {
 /* A single C token after string literal concatenation and conversion
    of preprocessing tokens to tokens.  */
 struct GTY (()) c_token {
+  /* The value associated with this token, if any.  */
+  tree value;
+  /* The location at which this token was found.  */
+  location_t location;
+  /* If this token is a keyword, this value indicates which keyword.
+     Otherwise, this value is RID_MAX.  */
+  ENUM_BITFIELD (rid) keyword : 16;
   /* The kind of token.  */
   ENUM_BITFIELD (cpp_ttype) type : 8;
   /* If this token is a CPP_NAME, this value indicates whether also
      declared as some kind of type.  Otherwise, it is C_ID_NONE.  */
   ENUM_BITFIELD (c_id_kind) id_kind : 8;
-  /* If this token is a keyword, this value indicates which keyword.
-     Otherwise, this value is RID_MAX.  */
-  ENUM_BITFIELD (rid) keyword : 8;
   /* If this token is a CPP_PRAGMA, this indicates the pragma that
      was seen.  Otherwise it is PRAGMA_NONE.  */
   ENUM_BITFIELD (pragma_kind) pragma_kind : 8;
-  /* The location at which this token was found.  */
-  location_t location;
-  /* The value associated with this token, if any.  */
-  tree value;
   /* Token flags.  */
   unsigned char flags;
 
diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
index 6cbb9a8e031..7aa251d11b1 100644
--- a/gcc/cp/parser.h
+++ b/gcc/cp/parser.h
@@ -44,7 +44,7 @@ struct GTY (()) cp_token {
   enum cpp_ttype type : 8;
   /* If this token is a keyword, this value indicates which keyword.
      Otherwise, this value is RID_MAX.  */
-  enum rid keyword : 8;
+  enum rid keyword : 16;
   /* Token flags.  */
   unsigned char flags;
   /* True if this token is from a context where it is implicitly extern "C" */
@@ -59,7 +59,9 @@ struct GTY (()) cp_token {
   bool purged_p : 1;
   bool tree_check_p : 1;
   bool main_source_p : 1;
-  /* 3 unused bits.  */
+  /* These booleans use 5 bits within 1 byte, resulting in 3 unused bits.
+     Since there would be 3 bytes of internal fragmentation to the location
+     field, the total unused bits would be 27 (= 3 + 24).  */
 
   /* The location at which this token was found.  */
   location_t location;
@@ -102,7 +104,7 @@ struct GTY (()) cp_lexer {
 
   /* Saved pieces of end token we replaced with the eof token.  */
   enum cpp_ttype saved_type : 8;
-  enum rid saved_keyword : 8;
+  enum rid saved_keyword : 16;
 
   /* The next lexer in a linked list of lexers.  */
   struct cp_lexer *next;
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index fcdaf082b09..7c37b861a77 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -988,11 +988,14 @@ struct GTY(()) cpp_hashnode {
   unsigned int directive_index : 7;     /* If is_directive,
                                            then index into directive table.
                                            Otherwise, a NODE_OPERATOR.  */
-  unsigned int rid_code : 8;            /* Rid code - for front ends.  */
+  unsigned int rid_code : 16;           /* Rid code - for front ends.  */
   unsigned int flags : 9;               /* CPP flags.  */
   ENUM_BITFIELD(node_type) type : 2;    /* CPP node type.  */
 
-  /* 5 bits spare.  */
+  /* These bitfields use 35 bits (= 1 + 7 + 16 + 9 + 2).  The exceeded 3 bits
+     in terms of bytes leave 5 unused bits within 1 byte.  Since there would
+     be 3 bytes of internal fragmentation to the deferred field, the total
+     unused bits would be 29 (= 5 + 24).  */
 
   /* The deferred cookie is applicable to NT_USER_MACRO or NT_VOID.
      The latter for when a macro had a prevailing undef.