From patchwork Thu May 11 05:28:58 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tristan Gingold
X-Patchwork-Id: 92359
Subject: [PATCH] pe/coff - add support for base64 encoded long section names
Message-Id: <243D0799-E3D0-4938-A438-DD8725593F67@free.fr>
Date: Thu, 11 May 2023 07:28:58 +0200
To: binutils
From: Tristan Gingold
Reply-To: Tristan Gingold

Hello,

To overcome the 8-character limit on COFF section names, PE introduced a convention that stores long section names in the string table (strtab) and puts a special name of the form '/nnnnnnn' in the section header, where nnnnnnn is the decimal index of the real name in the strtab. LLVM added a further convention for very large string tables: names of the form '//xxxxxx', where xxxxxx is the index in the strtab encoded in base64.
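For illustration only, and separate from the bfd code in the patch itself, here is a minimal standalone sketch of how a '//xxxxxx' name maps to a strtab index. The function name decode_base64_section_offset is hypothetical, and the 32-bit range check is an assumption of this sketch (six base64 digits carry 36 bits); it uses the standard base64 alphabet (A-Z, a-z, 0-9, '+', '/').

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of bfd: decode the six base64 digits that
   follow "//" in an 8-byte COFF section name into a string-table index.
   Returns 1 on success and stores the index in *OFFSET, 0 otherwise.  */
int
decode_base64_section_offset (const char name[8], uint32_t *offset)
{
  uint64_t value = 0;
  size_t i;

  /* Only the "//xxxxxx" form is handled here.  */
  if (name[0] != '/' || name[1] != '/')
    return 0;

  for (i = 2; i < 8; i++)
    {
      char c = name[i];
      unsigned int d;

      if (c >= 'A' && c <= 'Z')
        d = (unsigned int) (c - 'A');
      else if (c >= 'a' && c <= 'z')
        d = (unsigned int) (c - 'a') + 26;
      else if (c >= '0' && c <= '9')
        d = (unsigned int) (c - '0') + 52;
      else if (c == '+')
        d = 62;
      else if (c == '/')
        d = 63;
      else
        return 0;               /* Not a base64 digit.  */

      /* Each digit contributes 6 bits, most significant digit first.  */
      value = (value << 6) | d;
    }

  /* Assumption of this sketch: reject values that do not fit in 32 bits.  */
  if (value > UINT32_MAX)
    return 0;

  *offset = (uint32_t) value;
  return 1;
}
```

With this sketch, the name '//AA1GKj' decodes to index 13918883, which is larger than 9999999, the largest index that fits in the seven decimal digits of the classic '/nnnnnnn' form.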
For the LLVM implementation, see:
https://github.com/llvm/llvm-project/blob/6311ab21474a0f3e0340515185cd1d6e33a9892a/llvm/lib/BinaryFormat/COFF.cpp#L20

Object files generated by LLVM using this convention cannot be handled by binutils, which can result in weird warnings like:

objdump: ../../unisim_retarget_VCOMP.o: warning: COMDAT symbol '.rdata$.refptr.ieee__numeric_std__ELABORATED' does not match section name '//AA1GKj'
objdump: ../../unisim_retarget_VCOMP.o: warning: COMDAT symbol '.rdata$.refptr.ieee__std_logic_1164__ELABORATED' does not match section name '//AA1GLQ'

This patch adds support in bfd to correctly decode those names.
There is no real need to support encoding, as the section names are usually put first in the strtab (issues could happen only if there are a huge number of sections with very long names).

Mostly manually checked, no regressions with 'make check' (when configured for x86_64-linux + mingw64 target enabled).

It has been a while since I last submitted a patch, so I hope I am following the procedure correctly!

Tristan.

bfd/
	* coffgen.c (extract_long_section_name): New function extracted from ...
	(make_a_section_from_file): ... here.  Add support for base64
	long section names.

diff --git a/bfd/coffgen.c b/bfd/coffgen.c
index ac936def566..c7ccfae59ed 100644
--- a/bfd/coffgen.c
+++ b/bfd/coffgen.c
@@ -43,6 +43,29 @@
 #include "coff/internal.h"
 #include "libcoff.h"
 
+/* Extract a long section name at STRINDEX and copy it to the bfd objstack.
+   Return NULL in case of error.  */
+
+static char *
+extract_long_section_name(bfd *abfd, unsigned long strindex)
+{
+  const char *strings;
+  char *name;
+
+  strings = _bfd_coff_read_string_table (abfd);
+  if (strings == NULL)
+    return NULL;
+  if ((bfd_size_type)(strindex + 2) >= obj_coff_strings_len (abfd))
+    return NULL;
+  strings += strindex;
+  name = (char *) bfd_alloc (abfd, (bfd_size_type) strlen (strings) + 1 + 1);
+  if (name == NULL)
+    return NULL;
+  strcpy (name, strings);
+
+  return name;
+}
+
 /* Take a section header read from a coff file (in HOST byte order),
    and make a BFD "section" out of it.  This is used by ECOFF.  */
 
@@ -67,32 +90,62 @@ make_a_section_from_file (bfd *abfd,
   if (bfd_coff_set_long_section_names (abfd, bfd_coff_long_section_names (abfd))
       && hdr->s_name[0] == '/')
     {
-      char buf[SCNNMLEN];
-      long strindex;
-      char *p;
-      const char *strings;
-
       /* Flag that this BFD uses long names, even though the format might
          expect them to be off by default.  This won't directly affect the
          format of any output BFD created from this one, but the information
          can be used to decide what to do.  */
       bfd_coff_set_long_section_names (abfd, true);
-      memcpy (buf, hdr->s_name + 1, SCNNMLEN - 1);
-      buf[SCNNMLEN - 1] = '\0';
-      strindex = strtol (buf, &p, 10);
-      if (*p == '\0' && strindex >= 0)
+
+      if (hdr->s_name[1] == '/')
         {
-          strings = _bfd_coff_read_string_table (abfd);
-          if (strings == NULL)
-            return false;
-          if ((bfd_size_type)(strindex + 2) >= obj_coff_strings_len (abfd))
-            return false;
-          strings += strindex;
-          name = (char *) bfd_alloc (abfd,
-                                     (bfd_size_type) strlen (strings) + 1 + 1);
+          /* LLVM extension: the '/' is followed by another '/' and then by
+             the index in the strtab encoded in base64 without NUL at the
+             end.  */
+          unsigned strindex;
+          unsigned i;
+
+          strindex = 0;
+          for (i = 2; i < SCNNMLEN; i++)
+            {
+              char c = hdr->s_name[i];
+              unsigned d;
+
+              if (c >= 'A' && c <= 'Z')
+                d = c - 'A';
+              else if (c >= 'a' && c <= 'z')
+                d = c - 'a' + 26;
+              else if (c >= '0' && c <= '9')
+                d = c - '0' + 52;
+              else if (c == '+')
+                d = 62;
+              else if (c == '/')
+                d = 63;
+              else
+                return false;
+              strindex = (strindex << 6) + d;
+            }
+
+          name = extract_long_section_name (abfd, strindex);
           if (name == NULL)
             return false;
-          strcpy (name, strings);
+        }
+      else
+        {
+          /* PE classic long section name.  The '/' is followed by the index
+             in the strtab.  The index is formatted as a decimal string.  */
+          char buf[SCNNMLEN];
+          long strindex;
+          char *p;
+
+          memcpy (buf, hdr->s_name + 1, SCNNMLEN - 1);
+          buf[SCNNMLEN - 1] = '\0';
+          strindex = strtol (buf, &p, 10);
+          if (*p == '\0' && strindex >= 0)
+            {
+              name = extract_long_section_name (abfd, strindex);
+              if (name == NULL)
+                return false;
+            }
         }
     }
 