From patchwork Wed Apr 12 11:23:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tze-nan Wu
X-Patchwork-Id: 82418
From: Tze-nan Wu
To: ,
CC: , , , , , , AngeloGioacchino Del Regno , "Paul E. McKenney" , , , ,
Subject: [PATCH v4] ring-buffer: Ensure proper resetting of atomic variables in ring_buffer_reset_online_cpus
Date: Wed, 12 Apr 2023 19:23:56 +0800
Message-ID: <20230412112401.25081-1-Tze-nan.Wu@mediatek.com>
X-Mailer: git-send-email 2.18.0
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

If cpu_online_mask changes between the two for_each_online_buffer_cpu
loops in ring_buffer_reset_online_cpus(), writes to buffer_size_kb may
fail permanently. The increments and decrements of
cpu_buffer->resize_disabled and cpu_buffer->record_disabled can become
unbalanced, leaving these atomic variables non-zero on some CPUs after
the function returns.

The issue can be reproduced by running "echo 0 > trace" while
hotplugging a CPU. Once it has been triggered, buffer_size_kb no longer
works.

To avoid leaving 'resize_disabled' and 'record_disabled' non-zero after
ring_buffer_reset_online_cpus() returns, make sure each atomic variable
was set up in the first loop before it is decremented in the second.

Cc: stable@vger.kernel.org
Cc: npiggin@gmail.com
Fixes: b23d7a5f4a07 ("ring-buffer: speed up buffer resets by avoiding synchronize_rcu for each CPU")
Reviewed-by: Cheng-Jui Wang
Signed-off-by: Tze-nan Wu
---
Changes from v1 to v2:
  https://lore.kernel.org/all/20230408052226.25268-1-Tze-nan.Wu@mediatek.com/
  - Declare the cpumask variable statically rather than dynamically.

Changes from v2 to v3:
  https://lore.kernel.org/all/20230409024616.31099-1-Tze-nan.Wu@mediatek.com/
  - Holding cpu_hotplug_lock across synchronize_rcu() may keep it held
    for too long, so it is probably better to prevent the issue by
    copying cpu_online_mask at function entry, as v1 does, instead of
    using cpus_read_lock().

Changes from v3 to v4:
  https://lore.kernel.org/all/20230410073512.13362-1-Tze-nan.Wu@mediatek.com/
  - Considering the size of the cpumask on some machines, no longer
    copy the cpumask at the beginning of the function. Instead, ensure
    that the atomic variables have been set up before atomic_sub() is
    called.
  - Change the title of the patch.
---
 kernel/trace/ring_buffer.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 76a2d91eecad..8c647d8b5bb4 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -5361,20 +5361,28 @@ void ring_buffer_reset_online_cpus(struct trace_buffer *buffer)
 	for_each_online_buffer_cpu(buffer, cpu) {
 		cpu_buffer = buffer->buffers[cpu];
 
-		atomic_inc(&cpu_buffer->resize_disabled);
+#define RESET_BIT	(1 << 30)
+		atomic_add(RESET_BIT, &cpu_buffer->resize_disabled);
 		atomic_inc(&cpu_buffer->record_disabled);
 	}
 
 	/* Make sure all commits have finished */
 	synchronize_rcu();
 
-	for_each_online_buffer_cpu(buffer, cpu) {
+	for_each_buffer_cpu(buffer, cpu) {
 		cpu_buffer = buffer->buffers[cpu];
 
+		/*
+		 * If a CPU came online during the synchronize_rcu(), then
+		 * ignore it.
+		 */
+		if (!(atomic_read(&cpu_buffer->resize_disabled) & RESET_BIT))
+			continue;
+
 		reset_disabled_cpu_buffer(cpu_buffer);
 
 		atomic_dec(&cpu_buffer->record_disabled);
-		atomic_dec(&cpu_buffer->resize_disabled);
+		atomic_sub(RESET_BIT, &cpu_buffer->resize_disabled);
 	}
 
 	mutex_unlock(&buffer->mutex);
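
For readers who want to experiment outside the kernel, the marker-bit
idea in the hunk above can be modelled in user space. This is only a
minimal sketch under stated assumptions: it uses C11 <stdatomic.h>
instead of the kernel's atomic_t API, and struct cpu_buf, online[],
reset_begin(), reset_finish() and simulate_hotplug_during_reset() are
hypothetical names that do not exist in kernel/trace/ring_buffer.c.
Hotplug is simulated by flipping flags between the two phases, at the
point where the kernel code calls synchronize_rcu().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS   4
#define RESET_BIT (1 << 30)	/* same marker value as in the patch */

/* Hypothetical stand-in for the kernel's per-CPU ring buffer state. */
struct cpu_buf {
	atomic_int resize_disabled;
	atomic_int record_disabled;
	int entries;		/* pretend buffer contents */
};

static struct cpu_buf bufs[NR_CPUS];
static bool online[NR_CPUS];	/* simulated cpu_online_mask */

/* Phase 1: mark and disable every CPU that is online right now. */
static void reset_begin(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (!online[cpu])
			continue;
		atomic_fetch_add(&bufs[cpu].resize_disabled, RESET_BIT);
		atomic_fetch_add(&bufs[cpu].record_disabled, 1);
	}
}

/* Phase 2: walk all CPUs, but only undo what phase 1 actually did. */
static void reset_finish(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		/* Not marked in phase 1 (e.g. came online later): skip. */
		if (!(atomic_load(&bufs[cpu].resize_disabled) & RESET_BIT))
			continue;

		bufs[cpu].entries = 0;
		atomic_fetch_sub(&bufs[cpu].record_disabled, 1);
		atomic_fetch_sub(&bufs[cpu].resize_disabled, RESET_BIT);
	}
}

/* Returns how many CPUs are left with a non-zero counter. */
int simulate_hotplug_during_reset(void)
{
	int stuck = 0;

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		atomic_store(&bufs[cpu].resize_disabled, 0);
		atomic_store(&bufs[cpu].record_disabled, 0);
		bufs[cpu].entries = 10;
		online[cpu] = cpu < 2;	/* CPUs 0 and 1 start online */
	}

	reset_begin();
	online[0] = false;	/* CPU 0 goes offline between the phases */
	online[3] = true;	/* CPU 3 comes online between the phases */
	reset_finish();

	/* CPU 3 was never marked, so its contents must be untouched. */
	assert(bufs[3].entries == 10);

	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		if (atomic_load(&bufs[cpu].resize_disabled) ||
		    atomic_load(&bufs[cpu].record_disabled))
			stuck++;
	}
	return stuck;
}
```

In this model simulate_hotplug_during_reset() returns 0: CPU 0 is
still cleaned up even though it went offline, because phase 2 keys off
the marker bit rather than the online mask, and CPU 3 is skipped
because it was never marked.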