From patchwork Mon Oct 23 00:55:29 2023
X-Patchwork-Submitter: Mark Harmstone
X-Patchwork-Id: 156619
From: Mark Harmstone
To: gcc-patches@gcc.gnu.org
Cc: Mark Harmstone
Subject: [PATCH 3/5] Output file checksums in CodeView section
Date: Mon, 23 Oct 2023 01:55:29 +0100
Message-ID: <20231023005531.19921-3-mark@harmstone.com>
In-Reply-To: <20231023005531.19921-1-mark@harmstone.com>
References: <20231023005531.19921-1-mark@harmstone.com>

Outputs the file name and MD5 hash of the main source file into the
CodeView .debug$S section, along with that of any #include'd files.
---
 gcc/dwarf2codeview.cc | 254 ++++++++++++++++++++++++++++++++++++++++++
 gcc/dwarf2codeview.h  |   1 +
 gcc/dwarf2out.cc      |   3 +
 3 files changed, 258 insertions(+)

diff --git a/gcc/dwarf2codeview.cc b/gcc/dwarf2codeview.cc
index e2bfdf8efeb..d93ba1ed668 100644
--- a/gcc/dwarf2codeview.cc
+++ b/gcc/dwarf2codeview.cc
@@ -37,6 +37,257 @@ along with GCC; see the file COPYING3.  If not see
 
 #define CV_SIGNATURE_C13 4
 
+#define DEBUG_S_STRINGTABLE 0xf3
+#define DEBUG_S_FILECHKSMS 0xf4
+
+#define CHKSUM_TYPE_MD5 1
+
+#define HASH_SIZE 16
+
+struct codeview_string
+{
+  codeview_string *next;
+  uint32_t offset;
+  char *string;
+};
+
+struct string_hasher : free_ptr_hash <struct codeview_string>
+{
+  typedef const char *compare_type;
+
+  static hashval_t hash (const codeview_string *x)
+  {
+    return htab_hash_string (x->string);
+  }
+
+  static bool equal (const codeview_string *x, const char *y)
+  {
+    return !strcmp (x->string, y);
+  }
+
+  static void mark_empty (codeview_string *x)
+  {
+    if (x->string)
+      {
+        free (x->string);
+        x->string = NULL;
+      }
+  }
+
+  static void remove (codeview_string *&x)
+  {
+    free (x->string);
+  }
+};
+
+struct codeview_source_file
+{
+  codeview_source_file *next;
+  unsigned int file_num;
+  uint32_t string_offset;
+  char *filename;
+  uint8_t hash[HASH_SIZE];
+};
+
+static codeview_source_file *files, *last_file;
+static unsigned int num_files;
+static uint32_t string_offset = 1;
+static hash_table<string_hasher> *strings_htab;
+static codeview_string *strings, *last_string;
+
+/* Adds string to the string table, returning its offset.  If already present,
+   this returns the offset of the existing string.  */
+
+static uint32_t
+add_string (const char *string)
+{
+  codeview_string **slot;
+  codeview_string *s;
+  size_t len;
+
+  if (!strings_htab)
+    strings_htab = new hash_table<string_hasher> (10);
+
+  slot = strings_htab->find_slot_with_hash (string, htab_hash_string (string),
+                                            INSERT);
+
+  if (*slot)
+    return (*slot)->offset;
+
+  s = (codeview_string *) xmalloc (sizeof (codeview_string));
+  len = strlen (string);
+
+  s->next = NULL;
+
+  s->offset = string_offset;
+  string_offset += len + 1;
+
+  s->string = xstrdup (string);
+
+  if (last_string)
+    last_string->next = s;
+  else
+    strings = s;
+
+  last_string = s;
+
+  *slot = s;
+
+  return s->offset;
+}
+
+/* A new source file has been encountered - record the details and calculate
+   its hash.  */
+
+void
+codeview_start_source_file (const char *filename)
+{
+  codeview_source_file *sf;
+  char *path;
+  uint32_t string_offset;
+  FILE *f;
+
+  path = lrealpath (filename);
+  string_offset = add_string (path);
+  free (path);
+
+  sf = files;
+  while (sf)
+    {
+      if (sf->string_offset == string_offset)
+        return;
+
+      sf = sf->next;
+    }
+
+  sf = (codeview_source_file *) xmalloc (sizeof (codeview_source_file));
+  sf->next = NULL;
+  sf->file_num = num_files;
+  sf->string_offset = string_offset;
+  sf->filename = xstrdup (filename);
+
+  f = fopen (filename, "r");
+  if (!f)
+    internal_error ("could not open %s for reading", filename);
+
+  if (md5_stream (f, sf->hash))
+    {
+      fclose (f);
+      internal_error ("md5_stream failed");
+    }
+
+  fclose (f);
+
+  if (last_file)
+    last_file->next = sf;
+  else
+    files = sf;
+
+  last_file = sf;
+  num_files++;
+}
+
+/* Write out the strings table into the .debug$S section.  The linker will
+   parse this, and handle the deduplication and hashing for all the object
+   files.  */
+
+static void
+write_strings_table (void)
+{
+  codeview_string *string;
+
+  fputs (integer_asm_op (4, false), asm_out_file);
+  fprint_whex (asm_out_file, DEBUG_S_STRINGTABLE);
+  putc ('\n', asm_out_file);
+
+  fputs (integer_asm_op (4, false), asm_out_file);
+  asm_fprintf (asm_out_file, "%LLcv_strings_end - %LLcv_strings_start\n");
+
+  asm_fprintf (asm_out_file, "%LLcv_strings_start:\n");
+
+  /* The first entry is always an empty string.  */
+  fputs (integer_asm_op (1, false), asm_out_file);
+  fprint_whex (asm_out_file, 0);
+  putc ('\n', asm_out_file);
+
+  string = strings;
+  while (string)
+    {
+      ASM_OUTPUT_ASCII (asm_out_file, string->string,
+                        strlen (string->string) + 1);
+
+      string = string->next;
+    }
+
+  delete strings_htab;
+
+  asm_fprintf (asm_out_file, "%LLcv_strings_end:\n");
+
+  ASM_OUTPUT_ALIGN (asm_out_file, 2);
+}
+
+/* Write out the file checksums data into the .debug$S section.  */
+
+static void
+write_source_files (void)
+{
+  fputs (integer_asm_op (4, false), asm_out_file);
+  fprint_whex (asm_out_file, DEBUG_S_FILECHKSMS);
+  putc ('\n', asm_out_file);
+
+  fputs (integer_asm_op (4, false), asm_out_file);
+  asm_fprintf (asm_out_file,
+               "%LLcv_filechksms_end - %LLcv_filechksms_start\n");
+
+  asm_fprintf (asm_out_file, "%LLcv_filechksms_start:\n");
+
+  while (files)
+    {
+      codeview_source_file *next = files->next;
+
+      /* This is struct file_checksum in binutils, or filedata in Microsoft's
+         dumpsym7.cpp:
+
+        struct file_checksum
+        {
+          uint32_t file_id;
+          uint8_t checksum_length;
+          uint8_t checksum_type;
+        } ATTRIBUTE_PACKED;
+
+         followed then by the bytes of the hash, padded to the next 4 bytes.
+         file_id here is actually the offset in the strings table.  */
+
+      fputs (integer_asm_op (4, false), asm_out_file);
+      fprint_whex (asm_out_file, files->string_offset);
+      putc ('\n', asm_out_file);
+
+      fputs (integer_asm_op (1, false), asm_out_file);
+      fprint_whex (asm_out_file, HASH_SIZE);
+      putc ('\n', asm_out_file);
+
+      fputs (integer_asm_op (1, false), asm_out_file);
+      fprint_whex (asm_out_file, CHKSUM_TYPE_MD5);
+      putc ('\n', asm_out_file);
+
+      for (unsigned int i = 0; i < HASH_SIZE; i++)
+        {
+          fputs (integer_asm_op (1, false), asm_out_file);
+          fprint_whex (asm_out_file, files->hash[i]);
+          putc ('\n', asm_out_file);
+        }
+
+      ASM_OUTPUT_ALIGN (asm_out_file, 2);
+
+      free (files->filename);
+      free (files);
+
+      files = next;
+    }
+
+  asm_fprintf (asm_out_file, "%LLcv_filechksms_end:\n");
+}
+
 /* Finish CodeView debug info emission.  */
 
 void
@@ -47,4 +298,7 @@ codeview_debug_finish (void)
   fputs (integer_asm_op (4, false), asm_out_file);
   fprint_whex (asm_out_file, CV_SIGNATURE_C13);
   putc ('\n', asm_out_file);
+
+  write_strings_table ();
+  write_source_files ();
 }
diff --git a/gcc/dwarf2codeview.h b/gcc/dwarf2codeview.h
index efda148eb49..e2d732bb9b6 100644
--- a/gcc/dwarf2codeview.h
+++ b/gcc/dwarf2codeview.h
@@ -26,5 +26,6 @@ along with GCC; see the file COPYING3.  If not see
 /* Debug Format Interface.  Used in dwarf2out.cc.  */
 
 extern void codeview_debug_finish (void);
+extern void codeview_start_source_file (const char *);
 
 #endif /* GCC_DWARF2CODEVIEW_H */
diff --git a/gcc/dwarf2out.cc b/gcc/dwarf2out.cc
index 557464c4c24..945176a91bc 100644
--- a/gcc/dwarf2out.cc
+++ b/gcc/dwarf2out.cc
@@ -28823,6 +28823,9 @@ dwarf2out_set_ignored_loc (unsigned int line, unsigned int column,
 static void
 dwarf2out_start_source_file (unsigned int lineno, const char *filename)
 {
+  if (codeview_debuginfo_p ())
+    codeview_start_source_file (filename);
+
   if (debug_info_level >= DINFO_LEVEL_VERBOSE)
     {
       macinfo_entry e;
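
Illustration only, not part of the patch: a minimal standalone C++ sketch of
the two layouts described in the comments above. It shows how string table
offsets are assigned (offset 0 is reserved for the empty string, and each
string occupies its length plus a NUL), and how one file checksum record is
laid out (4-byte strings-table offset, 1-byte hash length, 1-byte hash type,
the hash bytes, then padding to a 4-byte boundary). StringTable,
encode_checksum_record and the k* constants are hypothetical names. The
sketch assumes a little-endian target and zero-filled alignment padding,
neither of which the patch states explicitly.

```cpp
// Sketch under stated assumptions; it does not use any GCC internals.
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the patch's `strings` list plus `strings_htab`.
struct StringTable
{
  // Offset 0 holds the empty string, so real entries start at 1.
  uint32_t next_offset = 1;
  std::unordered_map<std::string, uint32_t> offsets;
  std::vector<std::string> order;  // emission order, like the linked list

  // Returns the offset of S, adding it only if it is not already present.
  uint32_t add (const std::string &s)
  {
    auto it = offsets.find (s);
    if (it != offsets.end ())
      return it->second;

    uint32_t off = next_offset;
    next_offset += static_cast<uint32_t> (s.size ()) + 1;  // +1 for the NUL
    offsets.emplace (s, off);
    order.push_back (s);
    return off;
  }
};

constexpr uint8_t kChecksumTypeMd5 = 1;  // mirrors CHKSUM_TYPE_MD5
constexpr std::size_t kHashSize = 16;    // mirrors HASH_SIZE

// Serialises one record: offset, length, type, hash, padding to 4 bytes.
// Assumption: little-endian byte order and zero padding bytes.
std::vector<uint8_t>
encode_checksum_record (uint32_t string_offset,
                        const std::vector<uint8_t> &hash)
{
  assert (hash.size () == kHashSize);

  std::vector<uint8_t> out;
  for (int i = 0; i < 4; i++)
    out.push_back (static_cast<uint8_t> (string_offset >> (8 * i)));
  out.push_back (static_cast<uint8_t> (hash.size ()));
  out.push_back (kChecksumTypeMd5);
  out.insert (out.end (), hash.begin (), hash.end ());
  while (out.size () % 4 != 0)
    out.push_back (0);
  return out;
}
```

With a 16-byte MD5 the unpadded record is 4 + 1 + 1 + 16 = 22 bytes, so the
alignment step brings each record to 24 bytes. Adding the same path twice
returns the first offset, which is the property codeview_start_source_file
relies on to detect a file it has already recorded.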