From patchwork Tue Oct 10 21:47:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: tip-bot2 for Thomas Gleixner
X-Patchwork-Id: 150983
Date: Tue, 10 Oct 2023 21:47:54 -0000
From: "tip-bot2 for Mel Gorman"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: sched/core] sched/numa: Complete scanning of partial VMAs regardless of PID activity
Cc: Mel Gorman, Ingo Molnar, Raghavendra K T, x86@kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20231010083143.19593-6-mgorman@techsingularity.net>
References: <20231010083143.19593-6-mgorman@techsingularity.net>
Message-ID: <169697447487.3135.12299717344024056092.tip-bot2@tip-bot2>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the sched/core branch of tip:

Commit-ID:     b7a5b537c55c088d891ae554103d1b281abef781
Gitweb:        https://git.kernel.org/tip/b7a5b537c55c088d891ae554103d1b281abef781
Author:        Mel Gorman
AuthorDate:    Tue, 10 Oct 2023 09:31:42 +01:00
Committer:     Ingo Molnar
CommitterDate: Tue, 10 Oct 2023 23:41:47 +02:00

sched/numa: Complete scanning of partial VMAs regardless of PID activity

NUMA Balancing skips VMAs when the current task has not trapped a NUMA
fault within the VMA. If the VMA is skipped then mm->numa_scan_offset
advances and a task that is trapping faults within the VMA may never
fully update PTEs within the VMA.

Force tasks to update PTEs for partially scanned VMAs. The VMA will
be tagged for NUMA hints by some task but this removes some of the
benefit of tracking PID activity within a VMA. A follow-on patch
will mitigate this problem.

The test cases and machines evaluated did not trigger the corner case so
the performance results are neutral with only small changes within the
noise from normal test-to-test variance. However, the next patch makes
the corner case easier to trigger.

Signed-off-by: Mel Gorman
Signed-off-by: Ingo Molnar
Tested-by: Raghavendra K T
Link: https://lore.kernel.org/r/20231010083143.19593-6-mgorman@techsingularity.net
---
 include/linux/sched/numa_balancing.h |  1 +
 include/trace/events/sched.h         |  3 ++-
 kernel/sched/fair.c                  | 18 +++++++++++++++---
 3 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched/numa_balancing.h b/include/linux/sched/numa_balancing.h
index c127a15..7dcc0bd 100644
--- a/include/linux/sched/numa_balancing.h
+++ b/include/linux/sched/numa_balancing.h
@@ -21,6 +21,7 @@ enum numa_vmaskip_reason {
 	NUMAB_SKIP_INACCESSIBLE,
 	NUMAB_SKIP_SCAN_DELAY,
 	NUMAB_SKIP_PID_INACTIVE,
+	NUMAB_SKIP_IGNORE_PID,
 };
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
index d82a04d..bfc07c1 100644
--- a/include/trace/events/sched.h
+++ b/include/trace/events/sched.h
@@ -670,7 +670,8 @@ DEFINE_EVENT(sched_numa_pair_template, sched_swap_numa,
 	EM( NUMAB_SKIP_SHARED_RO,		"shared_ro" )	\
 	EM( NUMAB_SKIP_INACCESSIBLE,		"inaccessible" )	\
 	EM( NUMAB_SKIP_SCAN_DELAY,		"scan_delay" )	\
-	EMe(NUMAB_SKIP_PID_INACTIVE,		"pid_inactive" )
+	EM( NUMAB_SKIP_PID_INACTIVE,		"pid_inactive" )	\
+	EMe(NUMAB_SKIP_IGNORE_PID,		"ignore_pid_inactive" )
 
 /* Redefine for export. */
 #undef EM
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index ce36969..ab79013 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3113,7 +3113,7 @@ static void reset_ptenuma_scan(struct task_struct *p)
 	p->mm->numa_scan_offset = 0;
 }
 
-static bool vma_is_accessed(struct vm_area_struct *vma)
+static bool vma_is_accessed(struct mm_struct *mm, struct vm_area_struct *vma)
 {
 	unsigned long pids;
 	/*
@@ -3126,7 +3126,19 @@ static bool vma_is_accessed(struct vm_area_struct *vma)
 		return true;
 
 	pids = vma->numab_state->pids_active[0] | vma->numab_state->pids_active[1];
-	return test_bit(hash_32(current->pid, ilog2(BITS_PER_LONG)), &pids);
+	if (test_bit(hash_32(current->pid, ilog2(BITS_PER_LONG)), &pids))
+		return true;
+
+	/*
+	 * Complete a scan that has already started regardless of PID access, or
+	 * some VMAs may never be scanned in multi-threaded applications:
+	 */
+	if (mm->numa_scan_offset > vma->vm_start) {
+		trace_sched_skip_vma_numa(mm, vma, NUMAB_SKIP_IGNORE_PID);
+		return true;
+	}
+
+	return false;
 }
 
 #define VMA_PID_RESET_PERIOD (4 * sysctl_numa_balancing_scan_delay)
@@ -3270,7 +3282,7 @@ static void task_numa_work(struct callback_head *work)
 		}
 
 		/* Do not scan the VMA if task has not accessed */
-		if (!vma_is_accessed(vma)) {
+		if (!vma_is_accessed(mm, vma)) {
 			trace_sched_skip_vma_numa(mm, vma, NUMAB_SKIP_PID_INACTIVE);
 			continue;
 		}
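
Reader's note (not part of the commit): the decision made by the patched
vma_is_accessed() can be sketched in userspace C as below. This is only an
illustration under stated assumptions. Every sketch_* name is hypothetical,
the struct models just the fields the diff touches, the early "return true"
whose condition lies outside the hunk context is left out, and the hash is a
stand-in for hash_32(pid, ilog2(BITS_PER_LONG)) that assumes 64-bit longs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the VMA fields consulted by the check. */
struct sketch_vma {
	unsigned long vm_start;		/* first address covered by the VMA */
	uint64_t pids_active[2];	/* hashed-PID bitmaps, two windows */
};

/* Possible outcomes; the kernel function itself only returns a bool. */
enum sketch_decision {
	SKETCH_SKIP,			/* VMA is skipped for this task */
	SKETCH_SCAN_PID_ACTIVE,		/* task recently faulted in the VMA */
	SKETCH_SCAN_IGNORE_PID,		/* partial scan completed regardless */
};

/*
 * Assumed stand-in for hash_32(pid, 6): a multiplicative hash folded to
 * 6 bits, giving a bit index in the range 0..63.
 */
static unsigned int sketch_pid_bit(uint32_t pid)
{
	return (uint32_t)(pid * 0x61C88647u) >> (32 - 6);
}

/* Mark pid as having trapped a NUMA hinting fault in the VMA. */
static void sketch_record_fault(struct sketch_vma *vma, uint32_t pid)
{
	vma->pids_active[1] |= UINT64_C(1) << sketch_pid_bit(pid);
}

/* Mirrors the order of the two checks added or kept by the patch. */
static enum sketch_decision
sketch_vma_is_accessed(unsigned long numa_scan_offset,
		       const struct sketch_vma *vma, uint32_t pid)
{
	uint64_t pids;

	assert(vma);
	pids = vma->pids_active[0] | vma->pids_active[1];

	if (pids & (UINT64_C(1) << sketch_pid_bit(pid)))
		return SKETCH_SCAN_PID_ACTIVE;

	/* New in the patch: the scan offset already lies inside this VMA. */
	if (numa_scan_offset > vma->vm_start)
		return SKETCH_SCAN_IGNORE_PID;

	return SKETCH_SKIP;
}
```

With a VMA starting at 0x1000 and no recorded faults, an offset of 0x1000
yields SKETCH_SKIP while an offset of 0x3000 yields SKETCH_SCAN_IGNORE_PID.
Once sketch_record_fault() has been called for the task's PID, the result
is SKETCH_SCAN_PID_ACTIVE at any offset.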