From patchwork Thu Nov 2 08:39:05 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sam James
X-Patchwork-Id: 160807
From: Sam James
To: gcc-patches@gcc.gnu.org
Cc: Sam James
Subject: [PATCH 1/4] contrib: add generate_snapshot_index.py
Date: Thu, 2 Nov 2023 08:39:05 +0000
Message-ID: <20231102084058.1142941-1-sam@gentoo.org>
X-Mailer: git-send-email 2.42.0

Script to create a map between weekly snapshots and the commit they're based on
with space-separated format BRANCH-DATE COMMIT.

For example:
8-20210107 5114ee0676e432493ada968e34071f02fb08114f
8-20210114 f9267925c648f2ccd9e4680b699e581003125bcf
...

This is helpful for bisects and quickly looking up the information from
bug reports.

contrib/:
	* generate_snapshot_index.py: New file.

Signed-off-by: Sam James
---
 contrib/generate_snapshot_index.py | 79 ++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)
 create mode 100755 contrib/generate_snapshot_index.py

diff --git a/contrib/generate_snapshot_index.py b/contrib/generate_snapshot_index.py
new file mode 100755
index 000000000000..80fc14b2cf1e
--- /dev/null
+++ b/contrib/generate_snapshot_index.py
@@ -0,0 +1,79 @@
+#!/usr/bin/env python3
+#
+# Copyright (C) 2023 Free Software Foundation, Inc.
+# Contributed by Sam James.
+#
+# This script is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# Script to create a map between weekly snapshots and the commit they're based on.
+# Creates known_snapshots.txt with space-separated format: BRANCH-DATE COMMIT
+# For example:
+# 8-20210107 5114ee0676e432493ada968e34071f02fb08114f
+# 8-20210114 f9267925c648f2ccd9e4680b699e581003125bcf
+
+import os
+import re
+import urllib.request
+
+MIRROR = "https://mirrorservice.org/sites/sourceware.org/pub/gcc/snapshots/"
+
+
+def get_remote_snapshot_list() -> list[str]:
+    # Parse the HTML index for links to snapshots
+    with urllib.request.urlopen(MIRROR) as index_response:
+        html = index_response.read().decode("utf-8")
+        snapshots = re.findall(r'href="([0-9]+-.*)"', html)
+
+    return snapshots
+
+
+def load_cached_entries() -> dict[str, str]:
+    local_snapshots = {}
+
+    with open("known_snapshots.txt", encoding="utf-8") as local_entries:
+        for entry in local_entries.readlines():
+            if not entry:
+                continue
+
+            date, commit = entry.strip().split(" ")
+            local_snapshots[date] = commit
+
+    return local_snapshots
+
+
+remote_snapshots = get_remote_snapshot_list()
+try:
+    known_snapshots = load_cached_entries()
+except FileNotFoundError:
+    # No cache available
+    known_snapshots = {}
+
+# This would give us chronological order (as in by creation)
+# snapshots.sort(reverse=False, key=lambda x: x.split('-')[1])
+# snapshots.sort(reverse=True, key=lambda x: x.split('-')[0])
+
+for snapshot in remote_snapshots:
+    # 8-20210107/ -> 8-20210107
+    snapshot = snapshot.strip("/")
+
+    # Don't fetch entries we already have stored.
+    if snapshot in known_snapshots:
+        continue
+
+    # The READMEs are plain text with several lines, one of which is:
+    # "with the following options: git://gcc.gnu.org/git/gcc.git branch releases/gcc-8 revision e4e5ad2304db534957c4af612aa288cb6ef51f25""
+    # We match after 'revision ' to grab the commit used.
+    with urllib.request.urlopen(f"{MIRROR}/{snapshot}/README") as readme_response:
+        data = readme_response.read().decode("utf-8")
+        parsed_commit = re.findall(r"revision (.*)", data)[0]
+        known_snapshots[snapshot] = parsed_commit
+
+# Dump it all back out to disk.
+with open("known_snapshots.txt.tmp", "w", encoding="utf-8") as known_entries:
+    for name, stored_commit in known_snapshots.items():
+        known_entries.write(f"{name} {stored_commit}\n")
+
+os.rename("known_snapshots.txt.tmp", "known_snapshots.txt")
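
The two parsing steps in the script (snapshot links from the index page, commit hash from a README) can be tried without network access. What follows is a minimal sketch, not part of the patch: the helper names `parse_snapshot_links` and `parse_revision` are hypothetical, and the sample HTML is an assumed shape for the index page, not a capture of the real mirror. The href pattern uses `[^"/]*` instead of `.*`, because a greedy `.*` runs to the last double quote on the line whenever other quoted attributes follow the link. The README helper returns None rather than raising IndexError when no 'revision' line is present.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration; the patch inlines its own regexes.
# Stops at the first '/' or '"' so trailing attributes on the line are ignored.
SNAPSHOT_LINK = re.compile(r'href="([0-9]+-[^"/]*)/?"')
# A commit hash is 7 to 40 lowercase hex digits after the word 'revision'.
REVISION = re.compile(r"revision ([0-9a-f]{7,40})")


def parse_snapshot_links(html: str) -> list[str]:
    """Return snapshot names such as '8-20210107' found in an index page."""
    return SNAPSHOT_LINK.findall(html)


def parse_revision(readme: str) -> Optional[str]:
    """Return the commit hash named in a snapshot README, or None if absent."""
    match = REVISION.search(readme)
    return match.group(1) if match else None


# Assumed sample inputs, shaped like the comments in the patch describe.
SAMPLE_INDEX = (
    '<td><a href="8-20210107/">8-20210107/</a></td><td align="right">-</td>\n'
    '<td><a href="LATEST-8/">LATEST-8/</a></td>\n'
)
SAMPLE_README = (
    "with the following options: git://gcc.gnu.org/git/gcc.git "
    "branch releases/gcc-8 revision e4e5ad2304db534957c4af612aa288cb6ef51f25\n"
)

links = parse_snapshot_links(SAMPLE_INDEX)
commit = parse_revision(SAMPLE_README)
```

Links that do not start with a branch number, such as `LATEST-8/` in the sample, are skipped by both this pattern and the one in the patch.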
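
For the bisect use case mentioned in the commit message, the generated known_snapshots.txt can be read back and queried per branch. This is a sketch under assumptions: `load_index` and `snapshots_for_branch` are hypothetical names, and the input is the space-separated BRANCH-DATE COMMIT format described above. Unlike the loader in the patch, it strips each line before testing for emptiness, since a blank line read from a file is "\n" rather than "" and would otherwise fail to unpack into two fields.

```python
def load_index(text: str) -> dict[str, str]:
    """Parse 'BRANCH-DATE COMMIT' lines into a name -> commit mapping."""
    index: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Skip blank lines instead of failing on the two-field unpack.
            continue
        name, commit = line.split(" ", 1)
        index[name] = commit
    return index


def snapshots_for_branch(index: dict[str, str], branch: str) -> list[tuple[str, str]]:
    """Return (date, commit) pairs for one branch, oldest snapshot first."""
    pairs = []
    for name, commit in index.items():
        name_branch, _, date = name.partition("-")
        if name_branch == branch:
            pairs.append((date, commit))
    # Dates are YYYYMMDD strings, so lexical order is chronological order.
    return sorted(pairs)


# The two entries quoted in the commit message, with a blank line between them.
SAMPLE = (
    "8-20210107 5114ee0676e432493ada968e34071f02fb08114f\n"
    "\n"
    "8-20210114 f9267925c648f2ccd9e4680b699e581003125bcf\n"
)

index = load_index(SAMPLE)
branch_8 = snapshots_for_branch(index, "8")
```

With a bug report that names a good and a bad snapshot, the two commits returned here can serve as the endpoints for `git bisect start <bad> <good>`.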