Message ID | 20230711215716.12980-1-david.faust@oracle.com |
---|---|
Headers |
From: David Faust via Gcc-patches <gcc-patches@gcc.gnu.org> Reply-To: David Faust <david.faust@oracle.com> To: gcc-patches@gcc.gnu.org Cc: jose.marchesi@oracle.com, yhs@meta.com Subject: [PATCH 0/9] Add btf_decl_tag C attribute Date: Tue, 11 Jul 2023 14:57:07 -0700 Message-Id: <20230711215716.12980-1-david.faust@oracle.com> X-Mailer: git-send-email 2.39.1 List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> |
Series |
Add btf_decl_tag C attribute
|
|
Message
David Faust
July 11, 2023, 9:57 p.m. UTC
Hello,

This series adds support for a new attribute, "btf_decl_tag" in GCC.
The same attribute is already supported in clang, and is used by various
components of the BPF ecosystem.

The purpose of the attribute is to associate (to "tag") declarations
with arbitrary string annotations, which are emitted into debugging
information (DWARF and/or BTF) to facilitate post-compilation analysis
(the motivating use case being the Linux kernel BPF verifier).
Multiple tags are allowed on the same declaration.

These strings are not interpreted by the compiler, and the attribute
itself has no effect on generated code, other than to produce additional
DWARF DIEs and/or BTF records conveying the annotations.

This entails:

- A new C-language-level attribute which associates ("tags")
  particular declarations with arbitrary strings.

- The conveyance of that information in DWARF in the form of a new DIE,
  DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
  that of the DW_TAG_LLVM_annotation extension supported in LLVM for
  the same purpose. These DIEs are already supported by BPF tooling,
  such as pahole.

- The conveyance of that information in BTF debug info in the form of
  BTF_KIND_DECL_TAG records. These records are already supported by
  LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
  eBPF verifier.


Background
==========

The purpose of these tags is to convey additional semantic information
to post-compilation consumers, in particular the Linux kernel eBPF
verifier. The verifier can make use of that information while analyzing
a BPF program to aid in determining whether to allow or reject the
program to be run. More background on these tags can be found in the
early support for them in the kernel here [1] and [2].

The "btf_decl_tag" attribute is half the story; the other half is a
sibling attribute "btf_type_tag" which serves the same purpose but
applies to types.
Support for btf_type_tag will come in a separate
patch series, since it is impacted by GCC bug 110439 which needs to be
addressed first.

I submitted an initial version of this work (including btf_type_tag)
last spring [3]; however, at the time there were some open questions
about the behavior of the btf_type_tag attribute and issues with its
implementation. Since then we have clarified these details and agreed
to solutions with the BPF community and LLVM BPF folks.

The main motivation for emitting the tags in DWARF is that the Linux
kernel generates its BTF information via pahole, using DWARF as a source:

    +--------+  BTF                  BTF   +----------+
    | pahole |-------> vmlinux.btf ------->| verifier |
    +--------+                             +----------+
        ^                                       ^
        |                                       |
  DWARF |                                   BTF |
        |                                       |
     vmlinux                             +-------------+
     module1.ko                          | BPF program |
     module2.ko                          +-------------+
       ...

This is because:

a) pahole adds additional kernel-specific information into the
   produced BTF based on additional analysis of kernel objects.

b) Unlike GCC, LLVM will only generate BTF for BPF programs.

c) GCC can generate BTF for any target with -gbtf, but there is no
   support for linking/deduplicating BTF in the linker.

In the scenario above, the verifier needs access to the pointer tags of
both the kernel types/declarations (conveyed in the DWARF and translated
to BTF by pahole) and those of the BPF program (available directly in BTF).


DWARF Representation
====================

As noted above, btf_decl_tag is represented in DWARF via a new DIE
DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
the following format:

  DW_TAG_GNU_annotation (0x6000)
    DW_AT_name: "btf_decl_tag"
    DW_AT_const_value: <string argument>

These DIEs are placed in the DWARF tree as children of the DIE for the
appropriate declaration, and one such DIE is created for each occurrence
of the btf_decl_tag attribute on a declaration.

For example:

  const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));

This declaration produces the following DWARF:

 <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
    <1f>   DW_AT_name        : c
    <24>   DW_AT_type        : <0x49>
    ...
 <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
    <37>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
    <3b>   DW_AT_const_value : (indirect string, offset: 0): devicemem
 <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
    <40>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
    <44>   DW_AT_const_value : __c
 <2><48>: Abbrev Number: 0
 <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
    ...

The DIEs for btf_decl_tag are placed as children of the DIE for
variable "c".


BTF Representation
==================

In BTF, BTF_KIND_DECL_TAG records convey the annotations. These records refer
to the annotated object by BTF type ID, as well as a component index which is
used for btf_decl_tags placed on struct/union members or function arguments.

For example, the BTF for the above declaration is:

  [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
  [2] CONST '(anon)' type_id=1
  [3] PTR '(anon)' type_id=2
  [4] DECL_TAG '__c' type_id=6 component_idx=-1
  [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
  [6] VAR 'c' type_id=3, linkage=global
  ...

The BTF format is documented here [4].
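
To make the role of type_id and component_idx a little more concrete, here
is a minimal C sketch of how a consumer might collect the tags attached to
a declaration. The "struct decl_tag" layout and the helpers
count_decl_tags() and has_decl_tag() are hypothetical simplifications made
up for this illustration; they are not the on-disk BTF encoding nor any
real library API. The two records mirror entries [4] and [5] of the dump
above, where component_idx=-1 refers to the declaration itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of a BTF_KIND_DECL_TAG record.  */
struct decl_tag {
  const char *name;   /* the annotation string */
  unsigned type_id;   /* BTF ID of the annotated object */
  int component_idx;  /* -1: the object itself; >= 0: member/argument index */
};

/* Records [4] and [5] from the dump above, both attached to VAR 'c' (ID 6).  */
static const struct decl_tag example_tags[] = {
  { "__c",       6, -1 },
  { "devicemem", 6, -1 },
};

/* Count the tags attached to the pair (TYPE_ID, COMPONENT_IDX).  */
static size_t
count_decl_tags (const struct decl_tag *tags, size_t n,
                 unsigned type_id, int component_idx)
{
  size_t count = 0;
  assert (tags != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (tags[i].type_id == type_id
        && tags[i].component_idx == component_idx)
      count++;
  return count;
}

/* Return nonzero if (TYPE_ID, COMPONENT_IDX) carries the tag NAME.  */
static int
has_decl_tag (const struct decl_tag *tags, size_t n,
              unsigned type_id, int component_idx, const char *name)
{
  assert (tags != NULL || n == 0);
  for (size_t i = 0; i < n; i++)
    if (tags[i].type_id == type_id
        && tags[i].component_idx == component_idx
        && strcmp (tags[i].name, name) == 0)
      return 1;
  return 0;
}
```

With these records, asking for the tags on (6, -1) finds both "__c" and
"devicemem", while a query for component index 0 of the same ID finds
nothing, since a variable has no members or arguments.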


References
==========

[1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
[2] https://lore.kernel.org/bpf/20211011040608.3031468-1-yhs@fb.com/
[3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
[4] https://www.kernel.org/doc/Documentation/bpf/btf.rst


David Faust (9):
  c-family: add btf_decl_tag attribute
  include: add BTF decl tag defines
  dwarf: create annotation DIEs for decl tags
  dwarf: expose get_die_parent
  ctf: add support to pass through BTF tags
  dwarf2ctf: convert annotation DIEs to CTF types
  btf: create and output BTF_KIND_DECL_TAG types
  testsuite: add tests for BTF decl tags
  doc: document btf_decl_tag attribute

 gcc/btfout.cc                               | 81 ++++++++++++++++++-
 gcc/c-family/c-attribs.cc                   | 23 ++++++
 gcc/ctf-int.h                               | 28 +++++++
 gcc/ctfc.cc                                 | 10 ++-
 gcc/ctfc.h                                  | 17 +++-
 gcc/doc/extend.texi                         | 47 +++++++++++
 gcc/dwarf2ctf.cc                            | 73 ++++++++++++++++-
 gcc/dwarf2out.cc                            | 37 ++++++++-
 gcc/dwarf2out.h                             |  1 +
 .../gcc.dg/debug/btf/btf-decltag-func.c     | 21 +++++
 .../gcc.dg/debug/btf/btf-decltag-sou.c      | 33 ++++++++
 .../gcc.dg/debug/btf/btf-decltag-var.c      | 19 +++++
 .../gcc.dg/debug/dwarf2/annotation-decl-1.c |  9 +++
 .../gcc.dg/debug/dwarf2/annotation-decl-2.c | 18 +++++
 .../gcc.dg/debug/dwarf2/annotation-decl-3.c | 17 ++++
 include/btf.h                               | 14 +++-
 include/dwarf2.def                          |  4 +
 17 files changed, 437 insertions(+), 15 deletions(-)
 create mode 100644 gcc/ctf-int.h
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c
Comments
On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
> DWARF Representation
> ====================
>
> As noted above, btf_decl_tag is represented in DWARF via a new DIE
> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
> the following format:
>
>   DW_TAG_GNU_annotation (0x6000)
>     DW_AT_name: "btf_decl_tag"
>     DW_AT_const_value: <string argument>
>
> These DIEs are placed in the DWARF tree as children of the DIE for the
> appropriate declaration, and one such DIE is created for each occurrence
> of the btf_decl_tag attribute on a declaration.
>
> For example:
>
>   const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));
>
> This declaration produces the following DWARF:
>
>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
>     <1f>   DW_AT_name        : c
>     <24>   DW_AT_type        : <0x49>
>     ...
>  <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
>     <37>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
>     <3b>   DW_AT_const_value : (indirect string, offset: 0): devicemem
>  <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
>     <40>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
>     <44>   DW_AT_const_value : __c
>  <2><48>: Abbrev Number: 0
>  <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
>     ...
>
> The DIEs for btf_decl_tag are placed as children of the DIE for
> variable "c".

It looks like a bit of overkill, and inefficient as well.  Why's the
tags not referenced via the existing DW_AT_description?

Iff you want new TAGs why require them as children for each DIE rather
than referencing (and sharing!) them via a DIE reference from a new
attribute?

That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.

But well ...

Richard.
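
The sharing suggested in the reply above could look roughly like the
following C sketch: each distinct annotation string is stored once in a
pool, and every tagged declaration keeps only an index (standing in for a
DIE reference) into that pool. Everything here is an assumption made for
illustration; "struct annotation_pool", intern_annotation() and the fixed
MAX_ANNOTATIONS limit are hypothetical and are not taken from GCC's
dwarf2out code or from the patch series.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ANNOTATIONS 16  /* arbitrary limit for this sketch */

/* Hypothetical pool of unique annotation entries, keyed by tag string.  */
struct annotation_pool {
  const char *values[MAX_ANNOTATIONS];
  size_t count;
};

/* Return the index of VALUE in POOL, adding it on first use.
   A tagged declaration would store this index as its reference.  */
static size_t
intern_annotation (struct annotation_pool *pool, const char *value)
{
  for (size_t i = 0; i < pool->count; i++)
    if (strcmp (pool->values[i], value) == 0)
      return i;
  assert (pool->count < MAX_ANNOTATIONS);
  pool->values[pool->count] = value;
  return pool->count++;
}
```

Under this scheme, tagging many declarations with "devicemem" adds one
pool entry plus one reference per declaration, whereas the child-DIE
layout adds one annotation DIE per tagged declaration.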
[Added Eduard Zingerman in CC, who is implementing this same feature in clang/llvm and also the consumer component in the kernel (pahole).] Hi Richard. > On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: >> >> Hello, >> >> This series adds support for a new attribute, "btf_decl_tag" in GCC. >> The same attribute is already supported in clang, and is used by various >> components of the BPF ecosystem. >> >> The purpose of the attribute is to allow to associate (to "tag") >> declarations with arbitrary string annotations, which are emitted into >> debugging information (DWARF and/or BTF) to facilitate post-compilation >> analysis (the motivating use case being the Linux kernel BPF verifier). >> Multiple tags are allowed on the same declaration. >> >> These strings are not interpreted by the compiler, and the attribute >> itself has no effect on generated code, other than to produce additional >> DWARF DIEs and/or BTF records conveying the annotations. >> >> This entails: >> >> - A new C-language-level attribute which allows to associate (to "tag") >> particular declarations with arbitrary strings. >> >> - The conveyance of that information in DWARF in the form of a new DIE, >> DW_TAG_GNU_annotation, with tag number (0x6000) and format matching >> that of the DW_TAG_LLVM_annotation extension supported in LLVM for >> the same purpose. These DIEs are already supported by BPF tooling, >> such as pahole. >> >> - The conveyance of that information in BTF debug info in the form of >> BTF_KIND_DECL_TAG records. These records are already supported by >> LLVM and other tools in the eBPF ecosystem, such as the Linux kernel >> eBPF verifier. >> >> >> Background >> ========== >> >> The purpose of these tags is to convey additional semantic information >> to post-compilation consumers, in particular the Linux kernel eBPF >> verifier. 
The verifier can make use of that information while analyzing >> a BPF program to aid in determining whether to allow or reject the >> program to be run. More background on these tags can be found in the >> early support for them in the kernel here [1] and [2]. >> >> The "btf_decl_tag" attribute is half the story; the other half is a >> sibling attribute "btf_type_tag" which serves the same purpose but >> applies to types. Support for btf_type_tag will come in a separate >> patch series, since it is impaced by GCC bug 110439 which needs to be >> addressed first. >> >> I submitted an initial version of this work (including btf_type_tag) >> last spring [3], however at the time there were some open questions >> about the behavior of the btf_type_tag attribute and issues with its >> implementation. Since then we have clarified these details and agreed >> to solutions with the BPF community and LLVM BPF folks. >> >> The main motivation for emitting the tags in DWARF is that the Linux >> kernel generates its BTF information via pahole, using DWARF as a source: >> >> +--------+ BTF BTF +----------+ >> | pahole |-------> vmlinux.btf ------->| verifier | >> +--------+ +----------+ >> ^ ^ >> | | >> DWARF | BTF | >> | | >> vmlinux +-------------+ >> module1.ko | BPF program | >> module2.ko +-------------+ >> ... >> >> This is because: >> >> a) pahole adds additional kernel-specific information into the >> produced BTF based on additional analysis of kernel objects. >> >> b) Unlike GCC, LLVM will only generate BTF for BPF programs. >> >> b) GCC can generate BTF for whatever target with -gbtf, but there is no >> support for linking/deduplicating BTF in the linker. >> >> In the scenario above, the verifier needs access to the pointer tags of >> both the kernel types/declarations (conveyed in the DWARF and translated >> to BTF by pahole) and those of the BPF program (available directly in BTF). 
>> >> >> DWARF Representation >> ==================== >> >> As noted above, btf_decl_tag is represented in DWARF via a new DIE >> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF >> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has >> the following format: >> >> DW_TAG_GNU_annotation (0x6000) >> DW_AT_name: "btf_decl_tag" >> DW_AT_const_value: <string argument> >> >> These DIEs are placed in the DWARF tree as children of the DIE for the >> appropriate declaration, and one such DIE is created for each occurrence >> of the btf_decl_tag attribute on a declaration. >> >> For example: >> >> const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem"))); >> >> This declaration produces the following DWARF: >> >> <1><1e>: Abbrev Number: 2 (DW_TAG_variable) >> <1f> DW_AT_name : c >> <24> DW_AT_type : <0x49> >> ... >> <2><36>: Abbrev Number: 3 (User TAG value: 0x6000) >> <37> DW_AT_name : (indirect string, offset: 0x4c): btf_decl_tag >> <3b> DW_AT_const_value : (indirect string, offset: 0): devicemem >> <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000) >> <40> DW_AT_name : (indirect string, offset: 0x4c): btf_decl_tag >> <44> DW_AT_const_value : __c >> <2><48>: Abbrev Number: 0 >> <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type) >> ... >> >> The DIEs for btf_decl_tag are placed as children of the DIE for >> variable "c". > > It looks like a bit of overkill, and inefficient as well. Why's the > tags not referenced via the existing DW_AT_description? The DWARF spec ("Entity Descriptions") seems to imply that the DW_AT_description attribute is intended to be used to hold alternative ways to denote the same "debugging information" (object, type, ...), i.e. alternative aliases to refer to the same entity than the DW_AT_name. For example, for a type name='foo' we could have description='aka. long int'. 
We don't think this is the case for the btf
tags, which are more like properties partially characterizing the tagged
"debugging information", but couldn't be used as an alias to the name.

Also, repurposing the DW_AT_description attribute to hold btf tag
information would require introducing a mini-language and subsequent
parsing by the clients: how to denote several tags, how to encode the
embedded string contents, etc. You kick the complexity out the door and
it comes back in through the window :)

Finally, for all we know, the existing attribute may already be used by
some language and handled by some debugger the way it is recommended in
the spec. That would be incompatible with having btf tags encoded
there.

> If you want new TAGs why require them as children for each DIE rather
> than referencing (and sharing!) them via a DIE reference from a new
> attribute?

Hmm, that's a very good question. The Linux kernel sources use both
declaration tags and type tags, and not sharing the DIEs may result in
serious bloating, since the tags are brought into declarations and type
specifiers via macros...

> That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.
>
> But well ...
>
> Richard.
>
>>
>> BTF Representation
>> ==================
>>
>> In BTF, BTF_KIND_DECL_TAG records convey the annotations. These records refer
>> to the annotated object by BTF type ID, as well as a component index which is
>> used for btf_decl_tags placed on struct/union members or function arguments.
>>
>> For example, the BTF for the above declaration is:
>>
>>   [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>   [2] CONST '(anon)' type_id=1
>>   [3] PTR '(anon)' type_id=2
>>   [4] DECL_TAG '__c' type_id=6 component_idx=-1
>>   [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
>>   [6] VAR 'c' type_id=3, linkage=global
>>   ...
>>
>> The BTF format is documented here [4].
>> >> >> References >> ========== >> >> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/ >> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-yhs@fb.com/ >> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html >> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst >> >> >> David Faust (9): >> c-family: add btf_decl_tag attribute >> include: add BTF decl tag defines >> dwarf: create annotation DIEs for decl tags >> dwarf: expose get_die_parent >> ctf: add support to pass through BTF tags >> dwarf2ctf: convert annotation DIEs to CTF types >> btf: create and output BTF_KIND_DECL_TAG types >> testsuite: add tests for BTF decl tags >> doc: document btf_decl_tag attribute >> >> gcc/btfout.cc | 81 ++++++++++++++++++- >> gcc/c-family/c-attribs.cc | 23 ++++++ >> gcc/ctf-int.h | 28 +++++++ >> gcc/ctfc.cc | 10 ++- >> gcc/ctfc.h | 17 +++- >> gcc/doc/extend.texi | 47 +++++++++++ >> gcc/dwarf2ctf.cc | 73 ++++++++++++++++- >> gcc/dwarf2out.cc | 37 ++++++++- >> gcc/dwarf2out.h | 1 + >> .../gcc.dg/debug/btf/btf-decltag-func.c | 21 +++++ >> .../gcc.dg/debug/btf/btf-decltag-sou.c | 33 ++++++++ >> .../gcc.dg/debug/btf/btf-decltag-var.c | 19 +++++ >> .../gcc.dg/debug/dwarf2/annotation-decl-1.c | 9 +++ >> .../gcc.dg/debug/dwarf2/annotation-decl-2.c | 18 +++++ >> .../gcc.dg/debug/dwarf2/annotation-decl-3.c | 17 ++++ >> include/btf.h | 14 +++- >> include/dwarf2.def | 4 + >> 17 files changed, 437 insertions(+), 15 deletions(-) >> create mode 100644 gcc/ctf-int.h >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c >> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c >> >> -- >> 2.40.1 >>
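For readers who have not looked at BTF before, the DECL_TAG records quoted in the "BTF Representation" section can be pictured with a small C sketch. This is only an illustration under stated assumptions: the names btf_type_sketch, btf_decl_tag_sketch, make_decl_tag and tags_whole_declaration are hypothetical local mirrors of the layout described in the kernel BTF documentation [4], not the definitions this series adds to include/btf.h.

```c
#include <assert.h>
#include <stdint.h>

/* Kind number for decl tags, as listed in the kernel BTF documentation.  */
#define BTF_KIND_DECL_TAG 17

/* Illustrative mirror of the common BTF record header (hypothetical name).
   In the kernel header the last field is a union of "size" and "type".  */
struct btf_type_sketch
{
  uint32_t name_off; /* Offset of the tag string in the BTF string table.  */
  uint32_t info;     /* Bits 24-28 hold the kind; bits 0-15 hold vlen.  */
  uint32_t type;     /* BTF type ID of the annotated object.  */
};

/* Illustrative mirror of the extra data that follows a DECL_TAG header.  */
struct btf_decl_tag_sketch
{
  int32_t component_idx; /* -1: the declaration itself;
                            >= 0: index of a member or argument.  */
};

/* Extract the kind from an info word.  */
static uint32_t
btf_kind (uint32_t info)
{
  return (info >> 24) & 0x1f;
}

/* Build the header of a DECL_TAG record (vlen and kind_flag are zero).  */
static struct btf_type_sketch
make_decl_tag (uint32_t name_off, uint32_t target_id)
{
  struct btf_type_sketch t;
  t.name_off = name_off;
  t.info = (uint32_t) BTF_KIND_DECL_TAG << 24;
  t.type = target_id;
  assert (btf_kind (t.info) == BTF_KIND_DECL_TAG);
  return t;
}

/* Nonzero when the tag annotates the whole declaration, as in the
   component_idx=-1 records of the example dump.  */
static int
tags_whole_declaration (const struct btf_decl_tag_sketch *extra)
{
  return extra->component_idx == -1;
}
```

In the example dump, both tag records would carry type 6 (the VAR 'c') and a component index of -1, since the attribute is attached to the variable itself rather than to a struct member or function argument.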
On Wed, Jul 12, 2023 at 2:44 PM Jose E. Marchesi <jose.marchesi@oracle.com> wrote: > > > [Added Eduard Zingerman in CC, who is implementing this same feature in > clang/llvm and also the consumer component in the kernel (pahole).] > > Hi Richard. > > > On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > >> > >> Hello, > >> > >> This series adds support for a new attribute, "btf_decl_tag" in GCC. > >> The same attribute is already supported in clang, and is used by various > >> components of the BPF ecosystem. > >> > >> The purpose of the attribute is to allow to associate (to "tag") > >> declarations with arbitrary string annotations, which are emitted into > >> debugging information (DWARF and/or BTF) to facilitate post-compilation > >> analysis (the motivating use case being the Linux kernel BPF verifier). > >> Multiple tags are allowed on the same declaration. > >> > >> These strings are not interpreted by the compiler, and the attribute > >> itself has no effect on generated code, other than to produce additional > >> DWARF DIEs and/or BTF records conveying the annotations. > >> > >> This entails: > >> > >> - A new C-language-level attribute which allows to associate (to "tag") > >> particular declarations with arbitrary strings. > >> > >> - The conveyance of that information in DWARF in the form of a new DIE, > >> DW_TAG_GNU_annotation, with tag number (0x6000) and format matching > >> that of the DW_TAG_LLVM_annotation extension supported in LLVM for > >> the same purpose. These DIEs are already supported by BPF tooling, > >> such as pahole. > >> > >> - The conveyance of that information in BTF debug info in the form of > >> BTF_KIND_DECL_TAG records. These records are already supported by > >> LLVM and other tools in the eBPF ecosystem, such as the Linux kernel > >> eBPF verifier. 
> >> > >> > >> Background > >> ========== > >> > >> The purpose of these tags is to convey additional semantic information > >> to post-compilation consumers, in particular the Linux kernel eBPF > >> verifier. The verifier can make use of that information while analyzing > >> a BPF program to aid in determining whether to allow or reject the > >> program to be run. More background on these tags can be found in the > >> early support for them in the kernel here [1] and [2]. > >> > >> The "btf_decl_tag" attribute is half the story; the other half is a > >> sibling attribute "btf_type_tag" which serves the same purpose but > >> applies to types. Support for btf_type_tag will come in a separate > >> patch series, since it is impaced by GCC bug 110439 which needs to be > >> addressed first. > >> > >> I submitted an initial version of this work (including btf_type_tag) > >> last spring [3], however at the time there were some open questions > >> about the behavior of the btf_type_tag attribute and issues with its > >> implementation. Since then we have clarified these details and agreed > >> to solutions with the BPF community and LLVM BPF folks. > >> > >> The main motivation for emitting the tags in DWARF is that the Linux > >> kernel generates its BTF information via pahole, using DWARF as a source: > >> > >> +--------+ BTF BTF +----------+ > >> | pahole |-------> vmlinux.btf ------->| verifier | > >> +--------+ +----------+ > >> ^ ^ > >> | | > >> DWARF | BTF | > >> | | > >> vmlinux +-------------+ > >> module1.ko | BPF program | > >> module2.ko +-------------+ > >> ... > >> > >> This is because: > >> > >> a) pahole adds additional kernel-specific information into the > >> produced BTF based on additional analysis of kernel objects. > >> > >> b) Unlike GCC, LLVM will only generate BTF for BPF programs. > >> > >> b) GCC can generate BTF for whatever target with -gbtf, but there is no > >> support for linking/deduplicating BTF in the linker. 
> >> > >> In the scenario above, the verifier needs access to the pointer tags of > >> both the kernel types/declarations (conveyed in the DWARF and translated > >> to BTF by pahole) and those of the BPF program (available directly in BTF). > >> > >> > >> DWARF Representation > >> ==================== > >> > >> As noted above, btf_decl_tag is represented in DWARF via a new DIE > >> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF > >> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has > >> the following format: > >> > >> DW_TAG_GNU_annotation (0x6000) > >> DW_AT_name: "btf_decl_tag" > >> DW_AT_const_value: <string argument> > >> > >> These DIEs are placed in the DWARF tree as children of the DIE for the > >> appropriate declaration, and one such DIE is created for each occurrence > >> of the btf_decl_tag attribute on a declaration. > >> > >> For example: > >> > >> const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem"))); > >> > >> This declaration produces the following DWARF: > >> > >> <1><1e>: Abbrev Number: 2 (DW_TAG_variable) > >> <1f> DW_AT_name : c > >> <24> DW_AT_type : <0x49> > >> ... > >> <2><36>: Abbrev Number: 3 (User TAG value: 0x6000) > >> <37> DW_AT_name : (indirect string, offset: 0x4c): btf_decl_tag > >> <3b> DW_AT_const_value : (indirect string, offset: 0): devicemem > >> <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000) > >> <40> DW_AT_name : (indirect string, offset: 0x4c): btf_decl_tag > >> <44> DW_AT_const_value : __c > >> <2><48>: Abbrev Number: 0 > >> <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type) > >> ... > >> > >> The DIEs for btf_decl_tag are placed as children of the DIE for > >> variable "c". > > > > It looks like a bit of overkill, and inefficient as well. Why's the > > tags not referenced via the existing DW_AT_description? 
>
> The DWARF spec ("Entity Descriptions") seems to imply that the
> DW_AT_description attribute is intended to be used to hold alternative
> ways to denote the same "debugging information" (object, type, ...),
> i.e. alternative aliases to refer to the same entity as the
> DW_AT_name. For example, for a type name='foo' we could have
> description='aka. long int'. We don't think this is the case for the btf
> tags, which are more like properties partially characterizing the tagged
> "debugging information", but couldn't be used as an alias to the name.
>
> Also, repurposing the DW_AT_description attribute to hold btf tag
> information would require introducing a mini-language and subsequent
> parsing by the clients: how to denote several tags, how to encode the
> embedded string contents, etc. You kick the complexity out the door and
> it comes back in through the window :)
>
> Finally, for all we know, the existing attribute may already be used by
> some language and handled by some debugger the way it is recommended in
> the spec. That would be incompatible with having btf tags encoded
> there.

How are the C/C++ standard attributes proposed to be encoded in DWARF?
I think adding a special encoding just for BTF tags looks wrong.

> > If you want new TAGs why require them as children for each DIE rather
> > than referencing (and sharing!) them via a DIE reference from a new
> > attribute?
>
> Hmm, that's a very good question. The Linux kernel sources use both
> declaration tags and type tags, and not sharing the DIEs may result in
> serious bloating, since the tags are brought into declarations and type
> specifiers via macros...
>
> > That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.
> >
> > But well ...
> >
> > Richard.
> >
> >>
> >> BTF Representation
> >> ==================
> >>
> >> In BTF, BTF_KIND_DECL_TAG records convey the annotations.
These records refer > >> to the annotated object by BTF type ID, as well as a component index which is > >> used for btf_decl_tags placed on struct/union members or function arguments. > >> > >> For example, the BTF for the above declaration is: > >> > >> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED > >> [2] CONST '(anon)' type_id=1 > >> [3] PTR '(anon)' type_id=2 > >> [4] DECL_TAG '__c' type_id=6 component_idx=-1 > >> [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1 > >> [6] VAR 'c' type_id=3, linkage=global > >> ... > >> > >> The BTF format is documented here [4]. > >> > >> > >> References > >> ========== > >> > >> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/ > >> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-yhs@fb.com/ > >> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html > >> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst > >> > >> > >> David Faust (9): > >> c-family: add btf_decl_tag attribute > >> include: add BTF decl tag defines > >> dwarf: create annotation DIEs for decl tags > >> dwarf: expose get_die_parent > >> ctf: add support to pass through BTF tags > >> dwarf2ctf: convert annotation DIEs to CTF types > >> btf: create and output BTF_KIND_DECL_TAG types > >> testsuite: add tests for BTF decl tags > >> doc: document btf_decl_tag attribute > >> > >> gcc/btfout.cc | 81 ++++++++++++++++++- > >> gcc/c-family/c-attribs.cc | 23 ++++++ > >> gcc/ctf-int.h | 28 +++++++ > >> gcc/ctfc.cc | 10 ++- > >> gcc/ctfc.h | 17 +++- > >> gcc/doc/extend.texi | 47 +++++++++++ > >> gcc/dwarf2ctf.cc | 73 ++++++++++++++++- > >> gcc/dwarf2out.cc | 37 ++++++++- > >> gcc/dwarf2out.h | 1 + > >> .../gcc.dg/debug/btf/btf-decltag-func.c | 21 +++++ > >> .../gcc.dg/debug/btf/btf-decltag-sou.c | 33 ++++++++ > >> .../gcc.dg/debug/btf/btf-decltag-var.c | 19 +++++ > >> .../gcc.dg/debug/dwarf2/annotation-decl-1.c | 9 +++ > >> .../gcc.dg/debug/dwarf2/annotation-decl-2.c | 18 +++++ > >> 
.../gcc.dg/debug/dwarf2/annotation-decl-3.c | 17 ++++ > >> include/btf.h | 14 +++- > >> include/dwarf2.def | 4 + > >> 17 files changed, 437 insertions(+), 15 deletions(-) > >> create mode 100644 gcc/ctf-int.h > >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c > >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c > >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c > >> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c > >> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c > >> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c > >> > >> -- > >> 2.40.1 > >>
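To make the "mini-language" concern from earlier in the thread concrete, here is a rough C sketch of the parsing a consumer might need if several tags were packed into one DW_AT_description string, in the style of the suggested 'btf_decl_tag ("devicemem")'. The packed format (comma-separated entries, backslash escapes) and the function next_packed_tag are purely hypothetical; nothing in GCC, clang or pahole is claimed to work this way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* HYPOTHETICAL packed encoding, used here only for illustration:
     btf_decl_tag ("__c"), btf_decl_tag ("devicemem")
   Copy the next tag value found in S into OUT (at most OUTSZ - 1
   characters, silently truncating) and return a pointer just past
   the closing parenthesis, or NULL if no well-formed tag remains.  */
static const char *
next_packed_tag (const char *s, char *out, size_t outsz)
{
  static const char key[] = "btf_decl_tag";
  const char *p = strstr (s, key);
  size_t n = 0;

  if (p == NULL || outsz == 0)
    return NULL;
  p += sizeof key - 1;
  while (*p == ' ')
    p++;
  if (*p++ != '(')
    return NULL;
  if (*p++ != '"')
    return NULL;

  while (*p != '\0' && *p != '"')
    {
      /* A backslash makes the following character literal, so that a
         tag value can itself contain a double quote.  */
      if (*p == '\\' && p[1] != '\0')
        p++;
      if (n + 1 < outsz)
        out[n++] = *p;
      p++;
    }
  if (*p != '"')
    return NULL; /* Unterminated string.  */
  p++;
  if (*p != ')')
    return NULL;

  out[n] = '\0';
  return p + 1;
}
```

Even this small sketch has to settle quoting, escaping, separators and truncation, and every consumer would need to agree on the same answers. With one child DIE per tag, each DW_AT_const_value already holds exactly one unparsed string.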
> On Wed, Jul 12, 2023 at 2:44 PM Jose E. Marchesi > <jose.marchesi@oracle.com> wrote: >> >> >> [Added Eduard Zingerman in CC, who is implementing this same feature in >> clang/llvm and also the consumer component in the kernel (pahole).] >> >> Hi Richard. >> >> > On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches >> > <gcc-patches@gcc.gnu.org> wrote: >> >> >> >> Hello, >> >> >> >> This series adds support for a new attribute, "btf_decl_tag" in GCC. >> >> The same attribute is already supported in clang, and is used by various >> >> components of the BPF ecosystem. >> >> >> >> The purpose of the attribute is to allow to associate (to "tag") >> >> declarations with arbitrary string annotations, which are emitted into >> >> debugging information (DWARF and/or BTF) to facilitate post-compilation >> >> analysis (the motivating use case being the Linux kernel BPF verifier). >> >> Multiple tags are allowed on the same declaration. >> >> >> >> These strings are not interpreted by the compiler, and the attribute >> >> itself has no effect on generated code, other than to produce additional >> >> DWARF DIEs and/or BTF records conveying the annotations. >> >> >> >> This entails: >> >> >> >> - A new C-language-level attribute which allows to associate (to "tag") >> >> particular declarations with arbitrary strings. >> >> >> >> - The conveyance of that information in DWARF in the form of a new DIE, >> >> DW_TAG_GNU_annotation, with tag number (0x6000) and format matching >> >> that of the DW_TAG_LLVM_annotation extension supported in LLVM for >> >> the same purpose. These DIEs are already supported by BPF tooling, >> >> such as pahole. >> >> >> >> - The conveyance of that information in BTF debug info in the form of >> >> BTF_KIND_DECL_TAG records. These records are already supported by >> >> LLVM and other tools in the eBPF ecosystem, such as the Linux kernel >> >> eBPF verifier. 
>> >> >> >> >> >> Background >> >> ========== >> >> >> >> The purpose of these tags is to convey additional semantic information >> >> to post-compilation consumers, in particular the Linux kernel eBPF >> >> verifier. The verifier can make use of that information while analyzing >> >> a BPF program to aid in determining whether to allow or reject the >> >> program to be run. More background on these tags can be found in the >> >> early support for them in the kernel here [1] and [2]. >> >> >> >> The "btf_decl_tag" attribute is half the story; the other half is a >> >> sibling attribute "btf_type_tag" which serves the same purpose but >> >> applies to types. Support for btf_type_tag will come in a separate >> >> patch series, since it is impaced by GCC bug 110439 which needs to be >> >> addressed first. >> >> >> >> I submitted an initial version of this work (including btf_type_tag) >> >> last spring [3], however at the time there were some open questions >> >> about the behavior of the btf_type_tag attribute and issues with its >> >> implementation. Since then we have clarified these details and agreed >> >> to solutions with the BPF community and LLVM BPF folks. >> >> >> >> The main motivation for emitting the tags in DWARF is that the Linux >> >> kernel generates its BTF information via pahole, using DWARF as a source: >> >> >> >> +--------+ BTF BTF +----------+ >> >> | pahole |-------> vmlinux.btf ------->| verifier | >> >> +--------+ +----------+ >> >> ^ ^ >> >> | | >> >> DWARF | BTF | >> >> | | >> >> vmlinux +-------------+ >> >> module1.ko | BPF program | >> >> module2.ko +-------------+ >> >> ... >> >> >> >> This is because: >> >> >> >> a) pahole adds additional kernel-specific information into the >> >> produced BTF based on additional analysis of kernel objects. >> >> >> >> b) Unlike GCC, LLVM will only generate BTF for BPF programs. 
>> >> >> >> b) GCC can generate BTF for whatever target with -gbtf, but there is no >> >> support for linking/deduplicating BTF in the linker. >> >> >> >> In the scenario above, the verifier needs access to the pointer tags of >> >> both the kernel types/declarations (conveyed in the DWARF and translated >> >> to BTF by pahole) and those of the BPF program (available directly in BTF). >> >> >> >> >> >> DWARF Representation >> >> ==================== >> >> >> >> As noted above, btf_decl_tag is represented in DWARF via a new DIE >> >> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF >> >> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has >> >> the following format: >> >> >> >> DW_TAG_GNU_annotation (0x6000) >> >> DW_AT_name: "btf_decl_tag" >> >> DW_AT_const_value: <string argument> >> >> >> >> These DIEs are placed in the DWARF tree as children of the DIE for the >> >> appropriate declaration, and one such DIE is created for each occurrence >> >> of the btf_decl_tag attribute on a declaration. >> >> >> >> For example: >> >> >> >> const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem"))); >> >> >> >> This declaration produces the following DWARF: >> >> >> >> <1><1e>: Abbrev Number: 2 (DW_TAG_variable) >> >> <1f> DW_AT_name : c >> >> <24> DW_AT_type : <0x49> >> >> ... >> >> <2><36>: Abbrev Number: 3 (User TAG value: 0x6000) >> >> <37> DW_AT_name : (indirect string, offset: 0x4c): btf_decl_tag >> >> <3b> DW_AT_const_value : (indirect string, offset: 0): devicemem >> >> <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000) >> >> <40> DW_AT_name : (indirect string, offset: 0x4c): btf_decl_tag >> >> <44> DW_AT_const_value : __c >> >> <2><48>: Abbrev Number: 0 >> >> <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type) >> >> ... >> >> >> >> The DIEs for btf_decl_tag are placed as children of the DIE for >> >> variable "c". >> > >> > It looks like a bit of overkill, and inefficient as well. 
Why are the
>> > tags not referenced via the existing DW_AT_description?
>>
>> The DWARF spec ("Entity Descriptions") seems to imply that the
>> DW_AT_description attribute is intended to be used to hold alternative
>> ways to denote the same "debugging information" (object, type, ...),
>> i.e. alternative aliases to refer to the same entity as the
>> DW_AT_name. For example, for a type name='foo' we could have
>> description='aka. long int'. We don't think this is the case for the btf
>> tags, which are more like properties partially characterizing the tagged
>> "debugging information", but couldn't be used as an alias to the name.
>>
>> Also, repurposing the DW_AT_description attribute to hold btf tag
>> information would require introducing a mini-language and subsequent
>> parsing by the clients: how to denote several tags, how to encode the
>> embedded string contents, etc. You kick the complexity out the door and
>> it comes back in through the window :)
>>
>> Finally, for all we know, the existing attribute may already be used by
>> some language and handled by some debugger the way it is recommended in
>> the spec. That would be incompatible with having btf tags encoded
>> there.
>
> How are the C/C++ standard attributes proposed to be encoded in DWARF?
> I think adding a special encoding just for BTF tags looks wrong.

To my knowledge, the impact that existing standard C/C++ attributes may
have on the debug info can already be encoded with the existing DWARF
mechanisms. For example, an attribute that results in some type being
considered volatile may result in the addition of a DW_TAG_volatile_type
tag in the corresponding DW_AT_type chain.

But for these "btf tags" attributes, whose impact on debug info (and in
fact only purpose) is to associate an arbitrary string with either a
declared object or a type, we couldn't find any way to convey the
information in DWARF without breaking backwards compatibility, other
than introducing the new DIE tag.

[What would have been perfect is to be able to chain the
DW_TAG_GNU_annotation in the DW_AT_type chains much like qualifiers are
handled. But that would break every DWARF reader in existence :( I
wish DWARF had a DW_TAG_type_nop.]

By the way, as I think Faust has already mentioned in the cover letter,
we (the BPF GCC folk) think the "btf_{type,decl}_tag" C attribute name
is sort of a misnomer, because these attributes are not necessarily
coupled with BTF. Granted, the main usage of this is to convey the
annotations (via DWARF) to pahole so they can be converted to BTF. But
other programs like the drgn debugger have expressed interest in accessing
the annotations directly from the DWARF, with BTF not involved at all.
After a (slightly heated ;P) discussion on the BPF kernel list, the kernel
people said they would be ok with GCC using a different name for the
attributes than clang uses. They can abstract it using a macro in the
kernel sources. My personal favorite would be {type,decl}_annotation.

>
>> > If you want new TAGs why require them as children for each DIE rather
>> > than referencing (and sharing!) them via a DIE reference from a new
>> > attribute?
>>
>> Hmm, that's a very good question. The Linux kernel sources use both
>> declaration tags and type tags, and not sharing the DIEs may result in
>> serious bloating, since the tags are brought into declarations and type
>> specifiers via macros...
>>
>> > That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.
>> >
>> > But well ...
>> >
>> > Richard.
>> >
>> >>
>> >> BTF Representation
>> >> ==================
>> >>
>> >> In BTF, BTF_KIND_DECL_TAG records convey the annotations. These records refer
>> >> to the annotated object by BTF type ID, as well as a component index which is
>> >> used for btf_decl_tags placed on struct/union members or function arguments.
>> >> >> >> For example, the BTF for the above declaration is: >> >> >> >> [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED >> >> [2] CONST '(anon)' type_id=1 >> >> [3] PTR '(anon)' type_id=2 >> >> [4] DECL_TAG '__c' type_id=6 component_idx=-1 >> >> [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1 >> >> [6] VAR 'c' type_id=3, linkage=global >> >> ... >> >> >> >> The BTF format is documented here [4]. >> >> >> >> >> >> References >> >> ========== >> >> >> >> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/ >> >> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-yhs@fb.com/ >> >> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html >> >> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst >> >> >> >> >> >> David Faust (9): >> >> c-family: add btf_decl_tag attribute >> >> include: add BTF decl tag defines >> >> dwarf: create annotation DIEs for decl tags >> >> dwarf: expose get_die_parent >> >> ctf: add support to pass through BTF tags >> >> dwarf2ctf: convert annotation DIEs to CTF types >> >> btf: create and output BTF_KIND_DECL_TAG types >> >> testsuite: add tests for BTF decl tags >> >> doc: document btf_decl_tag attribute >> >> >> >> gcc/btfout.cc | 81 ++++++++++++++++++- >> >> gcc/c-family/c-attribs.cc | 23 ++++++ >> >> gcc/ctf-int.h | 28 +++++++ >> >> gcc/ctfc.cc | 10 ++- >> >> gcc/ctfc.h | 17 +++- >> >> gcc/doc/extend.texi | 47 +++++++++++ >> >> gcc/dwarf2ctf.cc | 73 ++++++++++++++++- >> >> gcc/dwarf2out.cc | 37 ++++++++- >> >> gcc/dwarf2out.h | 1 + >> >> .../gcc.dg/debug/btf/btf-decltag-func.c | 21 +++++ >> >> .../gcc.dg/debug/btf/btf-decltag-sou.c | 33 ++++++++ >> >> .../gcc.dg/debug/btf/btf-decltag-var.c | 19 +++++ >> >> .../gcc.dg/debug/dwarf2/annotation-decl-1.c | 9 +++ >> >> .../gcc.dg/debug/dwarf2/annotation-decl-2.c | 18 +++++ >> >> .../gcc.dg/debug/dwarf2/annotation-decl-3.c | 17 ++++ >> >> include/btf.h | 14 +++- >> >> include/dwarf2.def | 4 + >> >> 17 files changed, 437 insertions(+), 15 
deletions(-) >> >> create mode 100644 gcc/ctf-int.h >> >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c >> >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c >> >> create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c >> >> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c >> >> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c >> >> create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c >> >> >> >> -- >> >> 2.40.1 >> >>
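The remark above that the kernel could hide the attribute spelling behind a macro can be sketched in C roughly as follows. This is an assumption-laden illustration, not kernel code: DECL_TAG and HAVE_DECL_TAG are hypothetical names, and decl_annotation is only the alternative spelling floated in this thread, which no compiler is known to implement.

```c
#include <assert.h>

/* __has_attribute is available in GCC 5 and later and in clang.
   Older compilers fall back to "attribute not supported".  */
#ifndef __has_attribute
# define __has_attribute(x) 0
#endif

/* DECL_TAG is a hypothetical wrapper name.  It expands to whichever
   spelling the compiler accepts, or to nothing at all.  */
#if __has_attribute (btf_decl_tag)
# define DECL_TAG(s) __attribute__ ((btf_decl_tag (s)))
# define HAVE_DECL_TAG 1
#elif __has_attribute (decl_annotation) /* Proposed name only.  */
# define DECL_TAG(s) __attribute__ ((decl_annotation (s)))
# define HAVE_DECL_TAG 1
#else
# define DECL_TAG(s)
# define HAVE_DECL_TAG 0
#endif

/* The declaration from the cover letter, written through the wrapper.
   The tags only affect debug info, so the object itself is unchanged.  */
const int *c DECL_TAG ("__c") DECL_TAG ("devicemem");

/* Report whether the tags were actually attached in this build.  */
static int
decl_tag_supported (void)
{
  return HAVE_DECL_TAG;
}
```

Because the wrapper expands to nothing on compilers without either spelling, the same sources still build there; they simply produce debug info without the annotations.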
On 7/12/23 06:49, Jose E. Marchesi wrote: > >> On Wed, Jul 12, 2023 at 2:44 PM Jose E. Marchesi >> <jose.marchesi@oracle.com> wrote: >>> >>> >>> [Added Eduard Zingerman in CC, who is implementing this same feature in >>> clang/llvm and also the consumer component in the kernel (pahole).] >>> >>> Hi Richard. >>> >>>> On Tue, Jul 11, 2023 at 11:58 PM David Faust via Gcc-patches >>>> <gcc-patches@gcc.gnu.org> wrote: >>>>> >>>>> Hello, >>>>> >>>>> This series adds support for a new attribute, "btf_decl_tag" in GCC. >>>>> The same attribute is already supported in clang, and is used by various >>>>> components of the BPF ecosystem. >>>>> >>>>> The purpose of the attribute is to allow to associate (to "tag") >>>>> declarations with arbitrary string annotations, which are emitted into >>>>> debugging information (DWARF and/or BTF) to facilitate post-compilation >>>>> analysis (the motivating use case being the Linux kernel BPF verifier). >>>>> Multiple tags are allowed on the same declaration. >>>>> >>>>> These strings are not interpreted by the compiler, and the attribute >>>>> itself has no effect on generated code, other than to produce additional >>>>> DWARF DIEs and/or BTF records conveying the annotations. >>>>> >>>>> This entails: >>>>> >>>>> - A new C-language-level attribute which allows to associate (to "tag") >>>>> particular declarations with arbitrary strings. >>>>> >>>>> - The conveyance of that information in DWARF in the form of a new DIE, >>>>> DW_TAG_GNU_annotation, with tag number (0x6000) and format matching >>>>> that of the DW_TAG_LLVM_annotation extension supported in LLVM for >>>>> the same purpose. These DIEs are already supported by BPF tooling, >>>>> such as pahole. >>>>> >>>>> - The conveyance of that information in BTF debug info in the form of >>>>> BTF_KIND_DECL_TAG records. These records are already supported by >>>>> LLVM and other tools in the eBPF ecosystem, such as the Linux kernel >>>>> eBPF verifier. 
>>>>> >>>>> >>>>> Background >>>>> ========== >>>>> >>>>> The purpose of these tags is to convey additional semantic information >>>>> to post-compilation consumers, in particular the Linux kernel eBPF >>>>> verifier. The verifier can make use of that information while analyzing >>>>> a BPF program to aid in determining whether to allow or reject the >>>>> program to be run. More background on these tags can be found in the >>>>> early support for them in the kernel here [1] and [2]. >>>>> >>>>> The "btf_decl_tag" attribute is half the story; the other half is a >>>>> sibling attribute "btf_type_tag" which serves the same purpose but >>>>> applies to types. Support for btf_type_tag will come in a separate >>>>> patch series, since it is impaced by GCC bug 110439 which needs to be >>>>> addressed first. >>>>> >>>>> I submitted an initial version of this work (including btf_type_tag) >>>>> last spring [3], however at the time there were some open questions >>>>> about the behavior of the btf_type_tag attribute and issues with its >>>>> implementation. Since then we have clarified these details and agreed >>>>> to solutions with the BPF community and LLVM BPF folks. >>>>> >>>>> The main motivation for emitting the tags in DWARF is that the Linux >>>>> kernel generates its BTF information via pahole, using DWARF as a source: >>>>> >>>>> +--------+ BTF BTF +----------+ >>>>> | pahole |-------> vmlinux.btf ------->| verifier | >>>>> +--------+ +----------+ >>>>> ^ ^ >>>>> | | >>>>> DWARF | BTF | >>>>> | | >>>>> vmlinux +-------------+ >>>>> module1.ko | BPF program | >>>>> module2.ko +-------------+ >>>>> ... >>>>> >>>>> This is because: >>>>> >>>>> a) pahole adds additional kernel-specific information into the >>>>> produced BTF based on additional analysis of kernel objects. >>>>> >>>>> b) Unlike GCC, LLVM will only generate BTF for BPF programs. 
>>>>> >>>>> b) GCC can generate BTF for whatever target with -gbtf, but there is no >>>>> support for linking/deduplicating BTF in the linker. >>>>> >>>>> In the scenario above, the verifier needs access to the pointer tags of >>>>> both the kernel types/declarations (conveyed in the DWARF and translated >>>>> to BTF by pahole) and those of the BPF program (available directly in BTF). >>>>> >>>>> >>>>> DWARF Representation >>>>> ==================== >>>>> >>>>> As noted above, btf_decl_tag is represented in DWARF via a new DIE >>>>> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF >>>>> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has >>>>> the following format: >>>>> >>>>> DW_TAG_GNU_annotation (0x6000) >>>>> DW_AT_name: "btf_decl_tag" >>>>> DW_AT_const_value: <string argument> >>>>> >>>>> These DIEs are placed in the DWARF tree as children of the DIE for the >>>>> appropriate declaration, and one such DIE is created for each occurrence >>>>> of the btf_decl_tag attribute on a declaration. >>>>> >>>>> For example: >>>>> >>>>> const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem"))); >>>>> >>>>> This declaration produces the following DWARF: >>>>> >>>>> <1><1e>: Abbrev Number: 2 (DW_TAG_variable) >>>>> <1f> DW_AT_name : c >>>>> <24> DW_AT_type : <0x49> >>>>> ... >>>>> <2><36>: Abbrev Number: 3 (User TAG value: 0x6000) >>>>> <37> DW_AT_name : (indirect string, offset: 0x4c): btf_decl_tag >>>>> <3b> DW_AT_const_value : (indirect string, offset: 0): devicemem >>>>> <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000) >>>>> <40> DW_AT_name : (indirect string, offset: 0x4c): btf_decl_tag >>>>> <44> DW_AT_const_value : __c >>>>> <2><48>: Abbrev Number: 0 >>>>> <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type) >>>>> ... >>>>> >>>>> The DIEs for btf_decl_tag are placed as children of the DIE for >>>>> variable "c". >>>> >>>> It looks like a bit of overkill, and inefficient as well. 
>>>> Why are the tags not referenced via the existing DW_AT_description?

The DWARF format here (with children DIEs) is the format already in use,
supported by the kernel/pahole and emitted by clang. So the simple answer
is just to be compatible with the existing tools.

That's not to say it couldn't be improved, so long as all the relevant
producers/consumers (clang, pahole, whatever else) agree.

>>>
>>> The DWARF spec ("Entity Descriptions") seems to imply that the
>>> DW_AT_description attribute is intended to be used to hold alternative
>>> ways to denote the same "debugging information" (object, type, ...),
>>> i.e. alternative aliases to refer to the same entity as the
>>> DW_AT_name. For example, for a type name='foo' we could have
>>> description='aka. long int'. We don't think this is the case of the btf
>>> tags, which are more like properties partially characterizing the tagged
>>> "debugging information", but couldn't be used as an alias to the name.
>>>
>>> Also, repurposing the DW_AT_description attribute to hold btf tag
>>> information would require introducing a mini-language and subsequent
>>> parsing by the clients: how to denote several tags, how to encode the
>>> embedded string contents, etc. You kick the complexity out the door and
>>> it comes back in through the window :)
>>>
>>> Finally, for all we know, the existing attribute may already be used by
>>> some language and handled by some debugger the way it is recommended in
>>> the spec. That would be incompatible with having btf tags encoded
>>> there.
>>
>> How are the C/C++ standard attributes proposed to be encoded in dwarf?
>> I think adding special encoding just for BTF tags looks wrong.
>
> To my knowledge the impact that existing standard C/C++ attributes may
> have in the debug info can already be encoded with the existing DWARF
> mechanisms. For example, an attribute that results in some type being
> considered volatile may result in the addition of a DW_TAG_volatile_type
> tag in the corresponding DW_AT_type chain.
>
> But for these "btf tags" attributes, whose impact in debug info (and in
> fact only purpose) is to associate an arbitrary string to either a
> declared object or a type, we couldn't find any way to convey the
> information in DWARF without breaking backwards compatibility, other
> than introducing the new DIE tag.
>
> [What would have been perfect is to be able to chain the
> DW_TAG_GNU_annotation in the DW_AT_type chains much like qualifiers are
> handled. But that would break every DWARF reader in existence :( I
> wish DWARF would have a DW_TAG_type_nop.]
>
> By the way, as I think Faust has already mentioned in the cover letter,
> we (the BPF GCC folk) think the "btf_{type,decl}_tag" C attribute name
> is sort of a misnomer, because these attributes are not necessarily
> coupled with BTF. Granted, the main usage of this is to convey the
> annotations (via DWARF) to pahole so it can be converted to BTF. But
> other programs like the drgn debugger have expressed interest in accessing
> the annotations directly from the DWARF, and BTF is not involved at all.
> After a (little heated ;P) discussion in the BPF kernel list, the kernel
> people said they would be ok with GCC using a different name for the
> attributes than clang uses. They can abstract it using a macro in the
> kernel sources. My personal favorite would be {type,decl}_annotation.
>
>>
>>>> If you want new TAGs why require them as children for each DIE rather
>>>> than referencing (and sharing!) them via a DIE reference from a new
>>>> attribute?
>>>
>>> Hmm, that's a very good question. The Linux kernel sources use both
>>> declaration tags and type tags and not sharing the DIEs may result in
>>> serious bloating, since the tags are brought in to declarations and type
>>> specifiers via macros...

I agree this is a good point and could certainly be an improvement to
the format.

But, I'm not sure how this would work for objects with multiple btf_tags.
As I understand, a given attribute can only be supplied once for any
given DIE. So if there are multiple tags on one object, it's not clear
to me how we would store the references to them. Maybe with some
chaining between the btf_tag DIEs...

I think a little duplication might be unavoidable but it still could
be a big improvement, since the usual use case iiuc is "relatively few
distinct tags, but a lot of objects with those tags".

FWIW and as some background, there was similar discussion in the llvm
list back when the tags were first proposed in this thread:
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151009.html

which led eventually to the discussion of using multiple DIEs as
children instead (later in the same thread):
https://lists.llvm.org/pipermail/llvm-dev/2021-June/151256.html

>>>
>>>> That said, I'd go with DW_AT_description 'btf_decl_tag ("devicemem")'.
>>>>
>>>> But well ...
>>>>
>>>> Richard.
>>>>
>>>>>
>>>>> BTF Representation
>>>>> ==================
>>>>>
>>>>> In BTF, BTF_KIND_DECL_TAG records convey the annotations. These records refer
>>>>> to the annotated object by BTF type ID, as well as a component index which is
>>>>> used for btf_decl_tags placed on struct/union members or function arguments.
>>>>>
>>>>> For example, the BTF for the above declaration is:
>>>>>
>>>>>   [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>>>>>   [2] CONST '(anon)' type_id=1
>>>>>   [3] PTR '(anon)' type_id=2
>>>>>   [4] DECL_TAG '__c' type_id=6 component_idx=-1
>>>>>   [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
>>>>>   [6] VAR 'c' type_id=3, linkage=global
>>>>>   ...
>>>>>
>>>>> The BTF format is documented here [4].
>>>>>
>>>>>
>>>>> References
>>>>> ==========
>>>>>
>>>>> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
>>>>> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-yhs@fb.com/
>>>>> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
>>>>> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst
>>>>>
>>>>>
>>>>> David Faust (9):
>>>>>   c-family: add btf_decl_tag attribute
>>>>>   include: add BTF decl tag defines
>>>>>   dwarf: create annotation DIEs for decl tags
>>>>>   dwarf: expose get_die_parent
>>>>>   ctf: add support to pass through BTF tags
>>>>>   dwarf2ctf: convert annotation DIEs to CTF types
>>>>>   btf: create and output BTF_KIND_DECL_TAG types
>>>>>   testsuite: add tests for BTF decl tags
>>>>>   doc: document btf_decl_tag attribute
>>>>>
>>>>>  gcc/btfout.cc                               | 81 ++++++++++++++++++-
>>>>>  gcc/c-family/c-attribs.cc                   | 23 ++++++
>>>>>  gcc/ctf-int.h                               | 28 +++++++
>>>>>  gcc/ctfc.cc                                 | 10 ++-
>>>>>  gcc/ctfc.h                                  | 17 +++-
>>>>>  gcc/doc/extend.texi                         | 47 +++++++++++
>>>>>  gcc/dwarf2ctf.cc                            | 73 ++++++++++++++++-
>>>>>  gcc/dwarf2out.cc                            | 37 ++++++++-
>>>>>  gcc/dwarf2out.h                             |  1 +
>>>>>  .../gcc.dg/debug/btf/btf-decltag-func.c     | 21 +++++
>>>>>  .../gcc.dg/debug/btf/btf-decltag-sou.c      | 33 ++++++++
>>>>>  .../gcc.dg/debug/btf/btf-decltag-var.c      | 19 +++++
>>>>>  .../gcc.dg/debug/dwarf2/annotation-decl-1.c |  9 +++
>>>>>  .../gcc.dg/debug/dwarf2/annotation-decl-2.c | 18 +++++
>>>>>  .../gcc.dg/debug/dwarf2/annotation-decl-3.c | 17 ++++
>>>>>  include/btf.h                               | 14 +++-
>>>>>  include/dwarf2.def                          |  4 +
>>>>>  17 files changed, 437 insertions(+), 15 deletions(-)
>>>>>  create mode 100644 gcc/ctf-int.h
>>>>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
>>>>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
>>>>>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
>>>>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
>>>>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
>>>>>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c
>>>>>
>>>>> --
>>>>> 2.40.1
>>>>>
Gentle ping.

https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624156.html

On 7/11/23 14:57, David Faust via Gcc-patches wrote:
> Hello,
>
> This series adds support for a new attribute, "btf_decl_tag" in GCC.
> The same attribute is already supported in clang, and is used by various
> components of the BPF ecosystem.
>
> The purpose of the attribute is to associate (to "tag")
> declarations with arbitrary string annotations, which are emitted into
> debugging information (DWARF and/or BTF) to facilitate post-compilation
> analysis (the motivating use case being the Linux kernel BPF verifier).
> Multiple tags are allowed on the same declaration.
>
> These strings are not interpreted by the compiler, and the attribute
> itself has no effect on generated code, other than to produce additional
> DWARF DIEs and/or BTF records conveying the annotations.
>
> This entails:
>
> - A new C-language-level attribute which associates (or "tags")
>   particular declarations with arbitrary strings.
>
> - The conveyance of that information in DWARF in the form of a new DIE,
>   DW_TAG_GNU_annotation, with tag number (0x6000) and format matching
>   that of the DW_TAG_LLVM_annotation extension supported in LLVM for
>   the same purpose. These DIEs are already supported by BPF tooling,
>   such as pahole.
>
> - The conveyance of that information in BTF debug info in the form of
>   BTF_KIND_DECL_TAG records. These records are already supported by
>   LLVM and other tools in the eBPF ecosystem, such as the Linux kernel
>   eBPF verifier.
>
>
> Background
> ==========
>
> The purpose of these tags is to convey additional semantic information
> to post-compilation consumers, in particular the Linux kernel eBPF
> verifier. The verifier can make use of that information while analyzing
> a BPF program to aid in determining whether to allow or reject the
> program to be run. More background on these tags can be found in the
> early support for them in the kernel here [1] and [2].
>
> The "btf_decl_tag" attribute is half the story; the other half is a
> sibling attribute "btf_type_tag" which serves the same purpose but
> applies to types. Support for btf_type_tag will come in a separate
> patch series, since it is impacted by GCC bug 110439 which needs to be
> addressed first.
>
> I submitted an initial version of this work (including btf_type_tag)
> last spring [3], however at the time there were some open questions
> about the behavior of the btf_type_tag attribute and issues with its
> implementation. Since then we have clarified these details and agreed
> to solutions with the BPF community and LLVM BPF folks.
>
> The main motivation for emitting the tags in DWARF is that the Linux
> kernel generates its BTF information via pahole, using DWARF as a source:
>
>   +--------+  BTF                  BTF   +----------+
>   | pahole |-------> vmlinux.btf ------->| verifier |
>   +--------+                             +----------+
>       ^                                       ^
>       |                                       |
> DWARF |                                   BTF |
>       |                                       |
>    vmlinux                             +-------------+
>    module1.ko                          | BPF program |
>    module2.ko                          +-------------+
>      ...
>
> This is because:
>
> a) pahole adds additional kernel-specific information into the
>    produced BTF based on additional analysis of kernel objects.
>
> b) Unlike GCC, LLVM will only generate BTF for BPF programs.
>
> c) GCC can generate BTF for whatever target with -gbtf, but there is no
>    support for linking/deduplicating BTF in the linker.
>
> In the scenario above, the verifier needs access to the pointer tags of
> both the kernel types/declarations (conveyed in the DWARF and translated
> to BTF by pahole) and those of the BPF program (available directly in BTF).
>
>
> DWARF Representation
> ====================
>
> As noted above, btf_decl_tag is represented in DWARF via a new DIE
> DW_TAG_GNU_annotation, with identical format to the LLVM DWARF
> extension DW_TAG_LLVM_annotation serving the same purpose. The DIE has
> the following format:
>
>   DW_TAG_GNU_annotation (0x6000)
>     DW_AT_name: "btf_decl_tag"
>     DW_AT_const_value: <string argument>
>
> These DIEs are placed in the DWARF tree as children of the DIE for the
> appropriate declaration, and one such DIE is created for each occurrence
> of the btf_decl_tag attribute on a declaration.
>
> For example:
>
>   const int * c __attribute__((btf_decl_tag ("__c"), btf_decl_tag ("devicemem")));
>
> This declaration produces the following DWARF:
>
>  <1><1e>: Abbrev Number: 2 (DW_TAG_variable)
>     <1f>   DW_AT_name        : c
>     <24>   DW_AT_type        : <0x49>
>     ...
>  <2><36>: Abbrev Number: 3 (User TAG value: 0x6000)
>     <37>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
>     <3b>   DW_AT_const_value : (indirect string, offset: 0): devicemem
>  <2><3f>: Abbrev Number: 4 (User TAG value: 0x6000)
>     <40>   DW_AT_name        : (indirect string, offset: 0x4c): btf_decl_tag
>     <44>   DW_AT_const_value : __c
>  <2><48>: Abbrev Number: 0
>  <1><49>: Abbrev Number: 5 (DW_TAG_pointer_type)
>     ...
>
> The DIEs for btf_decl_tag are placed as children of the DIE for
> variable "c".
>
> BTF Representation
> ==================
>
> In BTF, BTF_KIND_DECL_TAG records convey the annotations. These records refer
> to the annotated object by BTF type ID, as well as a component index which is
> used for btf_decl_tags placed on struct/union members or function arguments.
>
> For example, the BTF for the above declaration is:
>
>   [1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
>   [2] CONST '(anon)' type_id=1
>   [3] PTR '(anon)' type_id=2
>   [4] DECL_TAG '__c' type_id=6 component_idx=-1
>   [5] DECL_TAG 'devicemem' type_id=6 component_idx=-1
>   [6] VAR 'c' type_id=3, linkage=global
>   ...
>
> The BTF format is documented here [4].
>
>
> References
> ==========
>
> [1] https://lore.kernel.org/bpf/20210914223004.244411-1-yhs@fb.com/
> [2] https://lore.kernel.org/bpf/20211011040608.3031468-1-yhs@fb.com/
> [3] https://gcc.gnu.org/pipermail/gcc-patches/2022-May/593936.html
> [4] https://www.kernel.org/doc/Documentation/bpf/btf.rst
>
>
> David Faust (9):
>   c-family: add btf_decl_tag attribute
>   include: add BTF decl tag defines
>   dwarf: create annotation DIEs for decl tags
>   dwarf: expose get_die_parent
>   ctf: add support to pass through BTF tags
>   dwarf2ctf: convert annotation DIEs to CTF types
>   btf: create and output BTF_KIND_DECL_TAG types
>   testsuite: add tests for BTF decl tags
>   doc: document btf_decl_tag attribute
>
>  gcc/btfout.cc                               | 81 ++++++++++++++++++-
>  gcc/c-family/c-attribs.cc                   | 23 ++++++
>  gcc/ctf-int.h                               | 28 +++++++
>  gcc/ctfc.cc                                 | 10 ++-
>  gcc/ctfc.h                                  | 17 +++-
>  gcc/doc/extend.texi                         | 47 +++++++++++
>  gcc/dwarf2ctf.cc                            | 73 ++++++++++++++++-
>  gcc/dwarf2out.cc                            | 37 ++++++++-
>  gcc/dwarf2out.h                             |  1 +
>  .../gcc.dg/debug/btf/btf-decltag-func.c     | 21 +++++
>  .../gcc.dg/debug/btf/btf-decltag-sou.c      | 33 ++++++++
>  .../gcc.dg/debug/btf/btf-decltag-var.c      | 19 +++++
>  .../gcc.dg/debug/dwarf2/annotation-decl-1.c |  9 +++
>  .../gcc.dg/debug/dwarf2/annotation-decl-2.c | 18 +++++
>  .../gcc.dg/debug/dwarf2/annotation-decl-3.c | 17 ++++
>  include/btf.h                               | 14 +++-
>  include/dwarf2.def                          |  4 +
>  17 files changed, 437 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/ctf-int.h
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-func.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-sou.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-decltag-var.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/dwarf2/annotation-decl-3.c
>