From patchwork Wed Feb 22 14:46:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frederic Weisbecker X-Patchwork-Id: 60563 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp618113wrd; Wed, 22 Feb 2023 06:52:05 -0800 (PST) X-Google-Smtp-Source: AK7set/QgpftJlzyuEisElYRbrMwyk0TTAtdIv2Fl1nWHfdQcZVlLH7hKqAuOF6bWat6gSblQDrf X-Received: by 2002:a05:6402:202:b0:4ad:7c6c:537d with SMTP id t2-20020a056402020200b004ad7c6c537dmr7966971edv.33.1677077525650; Wed, 22 Feb 2023 06:52:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677077525; cv=none; d=google.com; s=arc-20160816; b=Q1Wntn6liGF8JFSNTtcRsIRmkZfBZIl5xkI+Ea29oubZpYzk4Mr5LCx+bcUYowObaI /LdVe32pl8PJLsMe+gbiVagImto+3bzTRZGCiVX9grjquoYQJ1d8Aqi/5mxQ3WcN9/vy Don/sPOhSOdv9VhrbDOs7FLXO2AfjnEYQRoKpewsdFmFV9bKqoKU2f+5ydz5oNNAywYG dJvhuC2Gh/XMK9PoJT/zztjNZ8zhEk48EOTDPhSea1Cv735jfzjIfwv3QsOAHJHmwpE+ QtzsBm6GdzoSR4FXQhs+HHLMMbAeD4zpI1LWHIfBMVrI8Hbed5hLqGE8lRYzhN04yb88 A7qg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=j8S8KwqGSfvA+BWFE3awZzpfdbRcW5+HPxYAuXJJeCs=; b=cA8pkc1THubBDNIi+u4btbIzlZS+sEzqlKurmK4aFJ6IQArsJgIblu5CHDTCKa+nSa pBHa0mcZ2dOLSQ2CXtndU1j2Cec678MXoEAbKhGZ9QMePamO8DsMYRVVtTsfIcNuHK5G grmvChSgEZp6u7vWS8kGPf2uozLFi8dOtuIOlJiiiCR6l6ztqRi/dTJSqXM33PG/W282 aM0681pNasJj0g5IaOREAetChw3iLVlmEBrRGFsO5pT2ODyhOYM6KLksm5dNHlX0sr1u x8G/+q1egXLkHIOvYN7ORtnWevJhAA4P1kN0/8kKqqHyyNZnedIiT04SNeUeN9HVnh36 JQPA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Sc66U+Je; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE 
dis=NONE) header.from=kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id m21-20020aa7c495000000b004acb24be0d2si8721134edq.315.2023.02.22.06.51.42; Wed, 22 Feb 2023 06:52:05 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=Sc66U+Je; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231174AbjBVOrq (ORCPT + 99 others); Wed, 22 Feb 2023 09:47:46 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59704 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232196AbjBVOrf (ORCPT ); Wed, 22 Feb 2023 09:47:35 -0500 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D3AD222A2F for ; Wed, 22 Feb 2023 06:47:19 -0800 (PST) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id DD4DD61486 for ; Wed, 22 Feb 2023 14:47:04 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2C18EC433D2; Wed, 22 Feb 2023 14:47:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1677077224; bh=5sxyMTW09Kl3ct1nkcDaj+gcruIUS7UwkGWllRn6QWo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Sc66U+JeBvcIan2oTweypNsCy2L8Q2/rPUz6v+bXrFziNS6zAxmMtjoUctKljuRW5 
	 59q297z8DODDV+4UtOB4+eFCzY4DcJO7esf/AF4bPFIy9gxqqrFX64DtVFKz2Hi73L
	 n+uuZcaGU40P4h0HRfuKyBmN4WO6ylbAmBn6d6vwVZ2dGLjgwYfpnbr3phSDVbTIf9
	 P9JJM92Z7sgM9Mk7/fOrst3iDJtN+i2ImK7gpXdihpHHdy1YIovXunR3SkSKR/odZn
	 pYDPMZqntFhTXNPrx41WL2rAOI+OlnhnEeGnjI9vbbChhMDztY2XoZwqdkBbG8/cX2
	 XTv0y2jKriNaQ==
From: Frederic Weisbecker
To: Thomas Gleixner
Cc: LKML, Frederic Weisbecker, Alexey Dobriyan, Wei Li, Peter Zijlstra,
 Mirsad Goran Todorovac, Yu Liao, Hillf Danton, Ingo Molnar
Subject: [PATCH 3/8] timers/nohz: Protect idle/iowait sleep time under seqcount
Date: Wed, 22 Feb 2023 15:46:44 +0100
Message-Id: <20230222144649.624380-4-frederic@kernel.org>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230222144649.624380-1-frederic@kernel.org>
References: <20230222144649.624380-1-frederic@kernel.org>
MIME-Version: 1.0
X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH,
 DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI,
 SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1758543243476744223?=
X-GMAIL-MSGID: =?utf-8?q?1758543243476744223?=

Reading idle/io sleep time (e.g. from /proc/stat) can race with idle exit
updates because the state machine handling the stats is not atomic and
requires a coherent read batch.

As a result, reading the sleep time may report irrelevant or backward
values.

Fix this by protecting the simple state machine within a seqcount.
This is expected to be cheap enough not to add measurable performance
impact on the idle path.

Note this only partially fixes the reader vs. writer race. A race
remains that involves remote updates of the CPU iowait task counter. It
can hardly be fixed.

Reported-by: Yu Liao
Acked-by: Peter Zijlstra (Intel)
Cc: Hillf Danton
Cc: Yu Liao
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Wei Li
Cc: Alexey Dobriyan
Cc: Mirsad Goran Todorovac
Signed-off-by: Frederic Weisbecker
---
 kernel/time/tick-sched.c | 22 ++++++++++++++++------
 kernel/time/tick-sched.h |  1 +
 2 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 9058b9eb8bc1..90d9b7b29875 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -646,6 +646,7 @@ static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now)
 
 	delta = ktime_sub(now, ts->idle_entrytime);
 
+	write_seqcount_begin(&ts->idle_sleeptime_seq);
 	if (nr_iowait_cpu(smp_processor_id()) > 0)
 		ts->iowait_sleeptime = ktime_add(ts->iowait_sleeptime, delta);
 	else
@@ -653,14 +654,18 @@ static void tick_nohz_stop_idle(struct tick_sched *ts, ktime_t now)
 
 	ts->idle_entrytime = now;
 	ts->idle_active = 0;
+	write_seqcount_end(&ts->idle_sleeptime_seq);
 
 	sched_clock_idle_wakeup_event();
 }
 
 static void tick_nohz_start_idle(struct tick_sched *ts)
 {
+	write_seqcount_begin(&ts->idle_sleeptime_seq);
 	ts->idle_entrytime = ktime_get();
 	ts->idle_active = 1;
+	write_seqcount_end(&ts->idle_sleeptime_seq);
+
 	sched_clock_idle_sleep_event();
 }
 
@@ -668,6 +673,7 @@ static u64 get_cpu_sleep_time_us(struct tick_sched *ts, ktime_t *sleeptime,
 				 bool compute_delta, u64 *last_update_time)
 {
 	ktime_t now, idle;
+	unsigned int seq;
 
 	if (!tick_nohz_active)
 		return -1;
@@ -676,13 +682,17 @@ static u64 get_cpu_sleep_time_us(struct tick_sched *ts, ktime_t *sleeptime,
 	if (last_update_time)
 		*last_update_time = ktime_to_us(now);
 
-	if (ts->idle_active && compute_delta) {
-		ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+	do {
+		seq = read_seqcount_begin(&ts->idle_sleeptime_seq);
 
-		idle = ktime_add(*sleeptime, delta);
-	} else {
-		idle = *sleeptime;
-	}
+		if (ts->idle_active && compute_delta) {
+			ktime_t delta = ktime_sub(now, ts->idle_entrytime);
+
+			idle = ktime_add(*sleeptime, delta);
+		} else {
+			idle = *sleeptime;
+		}
+	} while (read_seqcount_retry(&ts->idle_sleeptime_seq, seq));
 
 	return ktime_to_us(idle);
 
diff --git a/kernel/time/tick-sched.h b/kernel/time/tick-sched.h
index c6663254d17d..5ed5a9d41d5a 100644
--- a/kernel/time/tick-sched.h
+++ b/kernel/time/tick-sched.h
@@ -75,6 +75,7 @@ struct tick_sched {
 	ktime_t				idle_waketime;
 
 	/* Idle entry */
+	seqcount_t			idle_sleeptime_seq;
 	ktime_t				idle_entrytime;
 
 	/* Tick stop */
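
For readers unfamiliar with the pattern, the seqcount scheme this patch relies on can be sketched in userspace roughly as follows. This is only an illustration built on C11 atomics, not the kernel implementation: `seq_sketch`, `idle_stats` and every helper name are hypothetical, timestamps are plain integers supplied by the caller, the iowait branch is left out, and a single writer is assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for seqcount_t: an odd value means "write in progress". */
struct seq_sketch {
	atomic_uint sequence;
};

/* Hypothetical miniature of the tick_sched fields guarded by the patch. */
struct idle_stats {
	struct seq_sketch seq;
	_Atomic int64_t idle_entrytime;
	_Atomic int64_t idle_sleeptime;
	atomic_int idle_active;
};

static void idle_stats_init(struct idle_stats *st)
{
	atomic_init(&st->seq.sequence, 0u);
	atomic_init(&st->idle_entrytime, 0);
	atomic_init(&st->idle_sleeptime, 0);
	atomic_init(&st->idle_active, 0);
}

static void sketch_write_begin(struct seq_sketch *s)
{
	unsigned int v = atomic_load_explicit(&s->sequence, memory_order_relaxed);

	atomic_store_explicit(&s->sequence, v + 1, memory_order_relaxed);
	/* Order the odd sequence store before the data stores that follow. */
	atomic_thread_fence(memory_order_release);
}

static void sketch_write_end(struct seq_sketch *s)
{
	unsigned int v = atomic_load_explicit(&s->sequence, memory_order_relaxed);

	/* Release: the data stores become visible before the even sequence. */
	atomic_store_explicit(&s->sequence, v + 1, memory_order_release);
}

static unsigned int sketch_read_begin(struct seq_sketch *s)
{
	unsigned int v;

	/* Spin while a writer is inside its critical section. */
	do {
		v = atomic_load_explicit(&s->sequence, memory_order_acquire);
	} while (v & 1u);
	return v;
}

static int sketch_read_retry(struct seq_sketch *s, unsigned int start)
{
	/* Order the data loads before re-reading the sequence. */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&s->sequence, memory_order_relaxed) != start;
}

/* Writer side, loosely mirroring tick_nohz_start_idle(). */
static void idle_enter(struct idle_stats *st, int64_t now)
{
	sketch_write_begin(&st->seq);
	atomic_store_explicit(&st->idle_entrytime, now, memory_order_relaxed);
	atomic_store_explicit(&st->idle_active, 1, memory_order_relaxed);
	sketch_write_end(&st->seq);
}

/* Writer side, loosely mirroring tick_nohz_stop_idle() without iowait. */
static void idle_exit(struct idle_stats *st, int64_t now)
{
	int64_t entry = atomic_load_explicit(&st->idle_entrytime, memory_order_relaxed);
	int64_t slept = atomic_load_explicit(&st->idle_sleeptime, memory_order_relaxed);

	sketch_write_begin(&st->seq);
	atomic_store_explicit(&st->idle_sleeptime, slept + (now - entry), memory_order_relaxed);
	atomic_store_explicit(&st->idle_entrytime, now, memory_order_relaxed);
	atomic_store_explicit(&st->idle_active, 0, memory_order_relaxed);
	sketch_write_end(&st->seq);
}

/* Reader side: retry until all three fields come from one coherent snapshot. */
static int64_t read_idle_sleeptime(struct idle_stats *st, int64_t now)
{
	unsigned int seq;
	int64_t idle;

	do {
		seq = sketch_read_begin(&st->seq);
		idle = atomic_load_explicit(&st->idle_sleeptime, memory_order_relaxed);
		if (atomic_load_explicit(&st->idle_active, memory_order_relaxed))
			idle += now - atomic_load_explicit(&st->idle_entrytime, memory_order_relaxed);
	} while (sketch_read_retry(&st->seq, seq));
	return idle;
}
```

The reader never blocks the writer; it simply recomputes when the sequence changed underneath it, which is why the cost on the idle path is expected to stay small. Like the kernel's plain seqcount, the sketch does not serialize writers against each other.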