From patchwork Fri Oct 21 19:15:07 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Antonio Borneo
X-Patchwork-Id: 6948
From: Antonio Borneo
To: Andy Whitcroft, Joe Perches, Dwaipayan Ray, Lukas Bulwahn
CC: Antonio Borneo
Subject: [PATCH] checkpatch: handle utf8 while computing length of commit msg lines
Date: Fri, 21 Oct 2022 21:15:07 +0200
Message-ID: <20221021191507.9026-1-antonio.borneo@foss.st.com>
X-Mailer: git-send-email 2.38.0
X-Mailing-List: linux-kernel@vger.kernel.org

The current check for the length of each line in the commit msg uses
length($line), which counts the bytes in the line.
If the line contains multibyte utf8 characters, the byte count can
exceed the 75-character cap even on quite short lines.

Count the utf8 characters, not the bytes, when checking line length.

Signed-off-by: Antonio Borneo
---
Actually, it is not fully clear to me whether utf8 characters in the
commit msg are acceptable/tolerated or should be avoided.
In the commit msg of 15662b3e8644 ("checkpatch: add a --strict check
for utf-8 in commit logs") it is stated:
	Some find using utf-8 in commit logs inappropriate.

 scripts/checkpatch.pl | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

base-commit: 9abf2313adc1ca1b6180c508c25f22f9395cc780

diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 1e5e66ae5a52..eaad5da50554 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -3220,7 +3220,7 @@ sub process {
 # Check for line lengths > 75 in commit log, warn once
 		if ($in_commit_log && !$commit_log_long_line &&
-		    length($line) > 75 &&
+		    length(decode("utf8", $line)) > 75 &&
 		    !($line =~ /^\s*[a-zA-Z0-9_\/\.]+\s+\|\s+\d+/ ||
 					# file delta changes
 		      $line =~ /^\s*(?:[\w\.\-\+]*\/)++[\w\.\-\+]+:/ ||