From patchwork Fri Feb 24 16:49:59 2023
X-Patchwork-Submitter: Josh Poimboeuf
X-Patchwork-Id: 61402
From: Josh Poimboeuf
To: live-patching@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Seth Forshee, Peter Zijlstra, Song Liu, Mark Rutland, Petr Mladek, Joe Lawrence, Miroslav Benes, Jiri Kosina, Ingo Molnar, Rik van Riel
Subject: [PATCH v3 1/3] livepatch: Skip task_call_func() for current task
Date: Fri, 24 Feb 2023 08:49:59 -0800
Message-Id: <4b92e793462d532a05f03767151fa29db3e68e13.1677257135.git.jpoimboe@kernel.org>

The current task doesn't need the scheduler's protection to unwind its
own stack.
Tested-by: Seth Forshee (DigitalOcean)
Reviewed-by: Petr Mladek
Signed-off-by: Josh Poimboeuf
---
 kernel/livepatch/transition.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index f1b25ec581e0..4d1f443778f7 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -307,7 +307,11 @@ static bool klp_try_switch_task(struct task_struct *task)
 	 * functions.  If all goes well, switch the task to the target patch
 	 * state.
 	 */
-	ret = task_call_func(task, klp_check_and_switch_task, &old_name);
+	if (task == current)
+		ret = klp_check_and_switch_task(current, &old_name);
+	else
+		ret = task_call_func(task, klp_check_and_switch_task, &old_name);
+
 	switch (ret) {
 	case 0:		/* success */
 		break;
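For background on why the shortcut is safe (a summary, not part of the
changelog): task_call_func() exists to pin a *remote* task — it takes the
task's scheduler locks so the callback runs while the task is guaranteed
not to be running anywhere — which is the protection a remote stack walk
needs. When task == current, the stack being examined belongs to the very
context doing the examining, so it is stable by construction and the
direct call is equivalent, minus the locking overhead.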
From patchwork Fri Feb 24 16:50:00 2023
X-Patchwork-Submitter: Josh Poimboeuf
X-Patchwork-Id: 61404
From: Josh Poimboeuf
To: live-patching@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Seth Forshee, Peter Zijlstra, Song Liu, Mark Rutland, Petr Mladek, Joe Lawrence, Miroslav Benes, Jiri Kosina, Ingo Molnar, Rik van Riel
Subject: [PATCH v3 2/3] livepatch,sched: Add livepatch task switching to cond_resched()
Date: Fri, 24 Feb 2023 08:50:00 -0800
Message-Id: <4ae981466b7814ec221014fc2554b2f86f3fb70b.1677257135.git.jpoimboe@kernel.org>

There have been reports [1][2] of live patches failing to complete
within a reasonable amount of time due to CPU-bound kthreads.

Fix it by patching tasks in cond_resched().

There are four different flavors of cond_resched(), depending on the
kernel configuration. Hook into all of them.

A more elegant solution might be to use a preempt notifier. However,
non-ORC unwinders can't unwind a preempted task reliably.
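For orientation, the four flavors mentioned above map to the hook as
follows (summarizing the diff below):

 1) CONFIG_PREEMPT_DYNAMIC with static calls
    (CONFIG_HAVE_PREEMPT_DYNAMIC_CALL): cond_resched() is a static call;
    during a transition the livepatch core redirects it to
    klp_cond_resched(), which runs __klp_sched_try_switch() before
    __cond_resched().

 2) CONFIG_PREEMPT_DYNAMIC with static keys
    (CONFIG_HAVE_PREEMPT_DYNAMIC_KEY): dynamic_cond_resched() calls
    klp_sched_try_switch() before testing its own static key.

 3) Neither CONFIG_PREEMPTION nor CONFIG_PREEMPT_DYNAMIC: the inline
    _cond_resched() calls klp_sched_try_switch() and then
    __cond_resched().

 4) CONFIG_PREEMPTION without CONFIG_PREEMPT_DYNAMIC: cond_resched() is
    normally a no-op; the inline now consists solely of
    klp_sched_try_switch().

In flavors 2-4 the hook is gated behind the klp_sched_try_switch_key
static key, so it costs one patched-out branch when no transition is in
progress.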
[1] https://lore.kernel.org/lkml/20220507174628.2086373-1-song@kernel.org/
[2] https://lkml.kernel.org/lkml/20230120-vhost-klp-switching-v1-0-7c2b65519c43@kernel.org

Tested-by: Seth Forshee (DigitalOcean)
Reviewed-by: Petr Mladek
Signed-off-by: Josh Poimboeuf
---
 include/linux/livepatch.h       |   1 +
 include/linux/livepatch_sched.h |  29 ++++++++++
 include/linux/sched.h           |  20 ++++--
 kernel/livepatch/core.c         |   1 +
 kernel/livepatch/transition.c   | 107 ++++++++++++++++++++++++++-----
 kernel/sched/core.c             |  64 +++++++++++++++---
 6 files changed, 194 insertions(+), 28 deletions(-)
 create mode 100644 include/linux/livepatch_sched.h

diff --git a/include/linux/livepatch.h b/include/linux/livepatch.h
index 293e29960c6e..9b9b38e89563 100644
--- a/include/linux/livepatch.h
+++ b/include/linux/livepatch.h
@@ -13,6 +13,7 @@
 #include <…>
 #include <…>
 #include <…>
+#include <linux/livepatch_sched.h>

 #if IS_ENABLED(CONFIG_LIVEPATCH)

diff --git a/include/linux/livepatch_sched.h b/include/linux/livepatch_sched.h
new file mode 100644
index 000000000000..013794fb5da0
--- /dev/null
+++ b/include/linux/livepatch_sched.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _LINUX_LIVEPATCH_SCHED_H_
+#define _LINUX_LIVEPATCH_SCHED_H_
+
+#include <linux/jump_label.h>
+#include <linux/static_call_types.h>
+
+#ifdef CONFIG_LIVEPATCH
+
+void __klp_sched_try_switch(void);
+
+#if !defined(CONFIG_PREEMPT_DYNAMIC) || !defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
+
+DECLARE_STATIC_KEY_FALSE(klp_sched_try_switch_key);
+
+static __always_inline void klp_sched_try_switch(void)
+{
+	if (static_branch_unlikely(&klp_sched_try_switch_key))
+		__klp_sched_try_switch();
+}
+
+#endif /* !CONFIG_PREEMPT_DYNAMIC || !CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
+
+#else /* !CONFIG_LIVEPATCH */
+static inline void klp_sched_try_switch(void) {}
+static inline void __klp_sched_try_switch(void) {}
+#endif /* CONFIG_LIVEPATCH */
+
+#endif /* _LINUX_LIVEPATCH_SCHED_H_ */

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 853d08f7562b..bd1e6f02facb 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -36,6 +36,7 @@
 #include <…>
 #include <…>
 #include <…>
+#include <linux/livepatch_sched.h>
 #include <…>

 /* task_struct member predeclarations (sorted alphabetically): */
@@ -2064,6 +2065,9 @@ extern int __cond_resched(void);

 #if defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)

+void sched_dynamic_klp_enable(void);
+void sched_dynamic_klp_disable(void);
+
 DECLARE_STATIC_CALL(cond_resched, __cond_resched);

 static __always_inline int _cond_resched(void)
@@ -2072,6 +2076,7 @@ static __always_inline int _cond_resched(void)
 }

 #elif defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_KEY)
+
 extern int dynamic_cond_resched(void);

 static __always_inline int _cond_resched(void)
@@ -2079,20 +2084,25 @@ static __always_inline int _cond_resched(void)
 	return dynamic_cond_resched();
 }

-#else
+#else /* !CONFIG_PREEMPTION */

 static inline int _cond_resched(void)
 {
+	klp_sched_try_switch();
 	return __cond_resched();
 }

-#endif /* CONFIG_PREEMPT_DYNAMIC */
+#endif /* PREEMPT_DYNAMIC && CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */

-#else
+#else /* CONFIG_PREEMPTION && !CONFIG_PREEMPT_DYNAMIC */

-static inline int _cond_resched(void) { return 0; }
+static inline int _cond_resched(void)
+{
+	klp_sched_try_switch();
+	return 0;
+}

-#endif /* !defined(CONFIG_PREEMPTION) || defined(CONFIG_PREEMPT_DYNAMIC) */
+#endif /* !CONFIG_PREEMPTION || CONFIG_PREEMPT_DYNAMIC */

 #define cond_resched() ({			\
 	__might_resched(__FILE__, __LINE__, 0);	\

diff --git a/kernel/livepatch/core.c b/kernel/livepatch/core.c
index 201f0c0482fb..3f79265dd6e5 100644
--- a/kernel/livepatch/core.c
+++ b/kernel/livepatch/core.c
@@ -33,6 +33,7 @@
  *
  * - klp_ftrace_handler()
  * - klp_update_patch_state()
+ * - __klp_sched_try_switch()
  */
 DEFINE_MUTEX(klp_mutex);

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index 4d1f443778f7..2662f2efb164 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -9,6 +9,7 @@

 #include <…>
 #include <…>
+#include <…>
 #include "core.h"
 #include "patch.h"
 #include "transition.h"
@@ -24,6 +25,25 @@ static int klp_target_state = KLP_UNDEFINED;

 static unsigned int klp_signals_cnt;

+/*
+ * When a livepatch is in progress, enable klp stack checking in
+ * cond_resched().  This helps CPU-bound kthreads get patched.
+ */
+#if defined(CONFIG_PREEMPT_DYNAMIC) && defined(CONFIG_HAVE_PREEMPT_DYNAMIC_CALL)
+
+#define klp_cond_resched_enable() sched_dynamic_klp_enable()
+#define klp_cond_resched_disable() sched_dynamic_klp_disable()
+
+#else /* !CONFIG_PREEMPT_DYNAMIC || !CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
+
+DEFINE_STATIC_KEY_FALSE(klp_sched_try_switch_key);
+EXPORT_SYMBOL(klp_sched_try_switch_key);
+
+#define klp_cond_resched_enable() static_branch_enable(&klp_sched_try_switch_key)
+#define klp_cond_resched_disable() static_branch_disable(&klp_sched_try_switch_key)
+
+#endif /* CONFIG_PREEMPT_DYNAMIC && CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
+
 /*
  * This work can be performed periodically to finish patching or unpatching any
  * "straggler" tasks which failed to transition in the first attempt.
@@ -172,8 +192,8 @@ void klp_update_patch_state(struct task_struct *task)
 	 * barrier (smp_rmb) for two cases:
 	 *
 	 * 1) Enforce the order of the TIF_PATCH_PENDING read and the
-	 *    klp_target_state read.  The corresponding write barrier is in
-	 *    klp_init_transition().
+	 *    klp_target_state read.  The corresponding write barriers are in
+	 *    klp_init_transition() and klp_reverse_transition().
 	 *
 	 * 2) Enforce the order of the TIF_PATCH_PENDING read and a future read
 	 *    of func->transition, if klp_ftrace_handler() is called later on
@@ -338,6 +358,44 @@ static bool klp_try_switch_task(struct task_struct *task)
 	return !ret;
 }

+void __klp_sched_try_switch(void)
+{
+	if (likely(!klp_patch_pending(current)))
+		return;
+
+	/*
+	 * This function is called from cond_resched() which is called in many
+	 * places throughout the kernel.  Using the klp_mutex here might
+	 * deadlock.
+	 *
+	 * Instead, disable preemption to prevent racing with other callers of
+	 * klp_try_switch_task().  Thanks to task_call_func() they won't be
+	 * able to switch this task while it's running.
+	 */
+	preempt_disable();
+
+	/*
+	 * Make sure current didn't get patched between the above check and
+	 * preempt_disable().
+	 */
+	if (unlikely(!klp_patch_pending(current)))
+		goto out;
+
+	/*
+	 * Enforce the order of the TIF_PATCH_PENDING read above and the
+	 * klp_target_state read in klp_try_switch_task().  The corresponding
+	 * write barriers are in klp_init_transition() and
+	 * klp_reverse_transition().
+	 */
+	smp_rmb();
+
+	klp_try_switch_task(current);
+
+out:
+	preempt_enable();
+}
+EXPORT_SYMBOL(__klp_sched_try_switch);
+
 /*
  * Sends a fake signal to all non-kthread tasks with TIF_PATCH_PENDING set.
  * Kthreads with TIF_PATCH_PENDING set are woken up.
@@ -444,7 +502,8 @@ void klp_try_complete_transition(void)
 		return;
 	}

-	/* we're done, now cleanup the data structures */
+	/* Done!  Now cleanup the data structures. */
+	klp_cond_resched_disable();
 	patch = klp_transition_patch;
 	klp_complete_transition();

@@ -496,6 +555,8 @@ void klp_start_transition(void)
 		set_tsk_thread_flag(task, TIF_PATCH_PENDING);
 	}

+	klp_cond_resched_enable();
+
 	klp_signals_cnt = 0;
 }

@@ -551,8 +612,9 @@ void klp_init_transition(struct klp_patch *patch, int state)
 	 * see a func in transition with a task->patch_state of KLP_UNDEFINED.
 	 *
 	 * Also enforce the order of the klp_target_state write and future
-	 * TIF_PATCH_PENDING writes to ensure klp_update_patch_state() doesn't
-	 * set a task->patch_state to KLP_UNDEFINED.
+	 * TIF_PATCH_PENDING writes to ensure klp_update_patch_state() and
+	 * __klp_sched_try_switch() don't set a task->patch_state to
+	 * KLP_UNDEFINED.
 	 */
 	smp_wmb();

@@ -588,14 +650,10 @@ void klp_reverse_transition(void)
 		 klp_target_state == KLP_PATCHED ? "patching to unpatching" :
 						   "unpatching to patching");

-	klp_transition_patch->enabled = !klp_transition_patch->enabled;
-
-	klp_target_state = !klp_target_state;
-
 	/*
 	 * Clear all TIF_PATCH_PENDING flags to prevent races caused by
-	 * klp_update_patch_state() running in parallel with
-	 * klp_start_transition().
+	 * klp_update_patch_state() or __klp_sched_try_switch() running in
+	 * parallel with the reverse transition.
 	 */
 	read_lock(&tasklist_lock);
 	for_each_process_thread(g, task)
@@ -605,9 +663,28 @@ void klp_reverse_transition(void)
 	for_each_possible_cpu(cpu)
 		clear_tsk_thread_flag(idle_task(cpu), TIF_PATCH_PENDING);

-	/* Let any remaining calls to klp_update_patch_state() complete */
+	/*
+	 * Make sure all existing invocations of klp_update_patch_state() and
+	 * __klp_sched_try_switch() see the cleared TIF_PATCH_PENDING before
+	 * starting the reverse transition.
+	 */
 	klp_synchronize_transition();

+	/*
+	 * All patching has stopped, now re-initialize the global variables to
+	 * prepare for the reverse transition.
+	 */
+	klp_transition_patch->enabled = !klp_transition_patch->enabled;
+	klp_target_state = !klp_target_state;
+
+	/*
+	 * Enforce the order of the klp_target_state write and the
+	 * TIF_PATCH_PENDING writes in klp_start_transition() to ensure
+	 * klp_update_patch_state() and __klp_sched_try_switch() don't set
+	 * task->patch_state to the wrong value.
+	 */
+	smp_wmb();
+
 	klp_start_transition();
 }

@@ -621,9 +698,9 @@ void klp_copy_process(struct task_struct *child)
 	 * the task flag up to date with the parent here.
 	 *
 	 * The operation is serialized against all klp_*_transition()
-	 * operations by the tasklist_lock.  The only exception is
-	 * klp_update_patch_state(current), but we cannot race with
-	 * that because we are current.
+	 * operations by the tasklist_lock.  The only exceptions are
+	 * klp_update_patch_state(current) and __klp_sched_try_switch(), but we
+	 * cannot race with them because we are current.
 	 */
 	if (test_tsk_thread_flag(current, TIF_PATCH_PENDING))
 		set_tsk_thread_flag(child, TIF_PATCH_PENDING);

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e838feb6adc5..895d2a1fdcb3 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -8487,6 +8487,7 @@ EXPORT_STATIC_CALL_TRAMP(might_resched);
 static DEFINE_STATIC_KEY_FALSE(sk_dynamic_cond_resched);
 int __sched dynamic_cond_resched(void)
 {
+	klp_sched_try_switch();
 	if (!static_branch_unlikely(&sk_dynamic_cond_resched))
 		return 0;
 	return __cond_resched();
@@ -8635,13 +8636,17 @@ int sched_dynamic_mode(const char *str)
 #error "Unsupported PREEMPT_DYNAMIC mechanism"
 #endif

-void sched_dynamic_update(int mode)
+DEFINE_MUTEX(sched_dynamic_mutex);
+static bool klp_override;
+
+static void __sched_dynamic_update(int mode)
 {
 	/*
 	 * Avoid {NONE,VOLUNTARY} -> FULL transitions from ever ending up in
 	 * the ZERO state, which is invalid.
 	 */
-	preempt_dynamic_enable(cond_resched);
+	if (!klp_override)
+		preempt_dynamic_enable(cond_resched);
 	preempt_dynamic_enable(might_resched);
 	preempt_dynamic_enable(preempt_schedule);
 	preempt_dynamic_enable(preempt_schedule_notrace);
@@ -8649,36 +8654,79 @@ void sched_dynamic_update(int mode)

 	switch (mode) {
 	case preempt_dynamic_none:
-		preempt_dynamic_enable(cond_resched);
+		if (!klp_override)
+			preempt_dynamic_enable(cond_resched);
 		preempt_dynamic_disable(might_resched);
 		preempt_dynamic_disable(preempt_schedule);
 		preempt_dynamic_disable(preempt_schedule_notrace);
 		preempt_dynamic_disable(irqentry_exit_cond_resched);
-		pr_info("Dynamic Preempt: none\n");
+		if (mode != preempt_dynamic_mode)
+			pr_info("Dynamic Preempt: none\n");
 		break;

 	case preempt_dynamic_voluntary:
-		preempt_dynamic_enable(cond_resched);
+		if (!klp_override)
+			preempt_dynamic_enable(cond_resched);
 		preempt_dynamic_enable(might_resched);
 		preempt_dynamic_disable(preempt_schedule);
 		preempt_dynamic_disable(preempt_schedule_notrace);
 		preempt_dynamic_disable(irqentry_exit_cond_resched);
-		pr_info("Dynamic Preempt: voluntary\n");
+		if (mode != preempt_dynamic_mode)
+			pr_info("Dynamic Preempt: voluntary\n");
 		break;

 	case preempt_dynamic_full:
-		preempt_dynamic_disable(cond_resched);
+		if (!klp_override)
+			preempt_dynamic_disable(cond_resched);
 		preempt_dynamic_disable(might_resched);
 		preempt_dynamic_enable(preempt_schedule);
 		preempt_dynamic_enable(preempt_schedule_notrace);
 		preempt_dynamic_enable(irqentry_exit_cond_resched);
-		pr_info("Dynamic Preempt: full\n");
+		if (mode != preempt_dynamic_mode)
+			pr_info("Dynamic Preempt: full\n");
 		break;
 	}

 	preempt_dynamic_mode = mode;
 }

+void sched_dynamic_update(int mode)
+{
+	mutex_lock(&sched_dynamic_mutex);
+	__sched_dynamic_update(mode);
+	mutex_unlock(&sched_dynamic_mutex);
+}
+
+#ifdef CONFIG_HAVE_PREEMPT_DYNAMIC_CALL
+
+static int klp_cond_resched(void)
+{
+	__klp_sched_try_switch();
+	return __cond_resched();
+}
+
+void sched_dynamic_klp_enable(void)
+{
+	mutex_lock(&sched_dynamic_mutex);
+
+	klp_override = true;
+	static_call_update(cond_resched, klp_cond_resched);
+
+	mutex_unlock(&sched_dynamic_mutex);
+}
+
+void sched_dynamic_klp_disable(void)
+{
+	mutex_lock(&sched_dynamic_mutex);
+
+	klp_override = false;
+	__sched_dynamic_update(preempt_dynamic_mode);
+
+	mutex_unlock(&sched_dynamic_mutex);
+}
+
+#endif /* CONFIG_HAVE_PREEMPT_DYNAMIC_CALL */
+
 static int __init setup_preempt_mode(char *str)
 {
 	int mode = sched_dynamic_mode(str);
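A note on the CONFIG_HAVE_PREEMPT_DYNAMIC_CALL path above, in the patch's
own terms: while a transition is in flight, sched_dynamic_klp_enable()
points the cond_resched static call at klp_cond_resched(), which
unconditionally runs __klp_sched_try_switch() before __cond_resched().
The klp_override flag keeps a concurrent sched_dynamic_update() from
repointing cond_resched to its mode-specific target and silently dropping
the hook; both paths serialize on sched_dynamic_mutex. The upshot is that
even preempt=full, where cond_resched() is normally patched out, still
performs the livepatch check during a transition.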
From patchwork Fri Feb 24 16:50:01 2023
X-Patchwork-Submitter: Josh Poimboeuf
X-Patchwork-Id: 61403
From: Josh Poimboeuf
To: live-patching@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Seth Forshee, Peter Zijlstra, Song Liu, Mark Rutland, Petr Mladek, Joe Lawrence, Miroslav Benes, Jiri Kosina, Ingo Molnar, Rik van Riel
Subject: [PATCH v3 3/3] vhost: Fix livepatch timeouts in vhost_worker()
Date: Fri, 24 Feb 2023 08:50:01 -0800
Message-Id: <509f6ea6fe6505f0a75a66026ba531c765ef922f.1677257135.git.jpoimboe@kernel.org>

Livepatch timeouts were reported due to busy vhost_worker() kthreads.
Now that cond_resched() can do livepatch task switching, use
cond_resched() in vhost_worker().  That's the better way to
conditionally call schedule() anyway.
Reported-by: Seth Forshee (DigitalOcean)
Link: https://lkml.kernel.org/lkml/20230120-vhost-klp-switching-v1-0-7c2b65519c43@kernel.org
Tested-by: Seth Forshee (DigitalOcean)
Reviewed-by: Petr Mladek
Signed-off-by: Josh Poimboeuf
---
 drivers/vhost/vhost.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 43c9770b86e5..87e3cf12da1c 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -363,8 +363,7 @@ static int vhost_worker(void *data)
 			kcov_remote_start_common(dev->kcov_handle);
 			work->fn(work);
 			kcov_remote_stop();
-			if (need_resched())
-				schedule();
+			cond_resched();
 		}
 	}
 	kthread_unuse_mm(dev->mm);
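Worth noting: cond_resched() already performs the need_resched()/
schedule() dance internally where that's meaningful, and under full
preemption it normally compiles down to (almost) nothing, with the
kthread relying on ordinary preemption instead. With the previous patch
applied it additionally runs the livepatch hook in every preemption
mode, which is what fixes the reported timeouts.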
From patchwork Mon Mar 13 23:33:46 2023
X-Patchwork-Submitter: Josh Poimboeuf
X-Patchwork-Id: 69201
Date: Mon, 13 Mar 2023 16:33:46 -0700
From: Josh Poimboeuf
To: Petr Mladek
Cc: live-patching@vger.kernel.org, linux-kernel@vger.kernel.org, Seth Forshee, Peter Zijlstra, Song Liu, Mark Rutland, Joe Lawrence, Miroslav Benes, Jiri Kosina, Ingo Molnar, Rik van Riel
Subject: [PATCH 0.5/3] livepatch: Convert stack entries array to percpu
Message-ID: <20230313233346.kayh4t2lpicjkpsv@treble>
References: <4ae981466b7814ec221014fc2554b2f86f3fb70b.1677257135.git.jpoimboe@kernel.org> <20230228165608.kumgxziaietsjaz3@treble>

On Fri, Mar 03, 2023 at 03:00:13PM +0100, Petr Mladek wrote:
> > MAX_STACK_ENTRIES is 100, which seems excessive.  If we halved that, the
> > array would be "only" 400 bytes, which is *almost* reasonable to
> > allocate on the stack?
>
> It is just for the stack in the process context. Right?
>
> I think that I have never seen a stack with over 50 entries. And in
> the worst case, a bigger amount of entries would "just" result in
> a non-reliable stack which might be acceptable.
>
> It looks acceptable to me.
>
> > Alternatively we could have a percpu entries array... :-/
>
> That said, percpu entries would be fine as well. It sounds like
> a good price for the livepatching feature. I think that livepatching
> is used on big systems anyway.
>
> I slightly prefer the per-cpu solution.

Booting a kernel with PREEMPT+LOCKDEP gave me a high-water mark of 60+
stack entries, seen when probing a device.  I decided not to mess with
MAX_STACK_ENTRIES, and instead just convert the entries to percpu.

This patch could be inserted at the beginning of the set.

---8<---

Subject: [PATCH 0.5/3] livepatch: Convert stack entries array to percpu

The entries array in klp_check_stack() is static local because it's too
big to be reasonably allocated on the stack.  Serialized access is
enforced by the klp_mutex.

In preparation for calling klp_check_stack() without the mutex (from
cond_resched), convert it to a percpu variable.

Signed-off-by: Josh Poimboeuf
---
 kernel/livepatch/transition.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/kernel/livepatch/transition.c b/kernel/livepatch/transition.c
index f1b25ec581e0..135fc73e2e5d 100644
--- a/kernel/livepatch/transition.c
+++ b/kernel/livepatch/transition.c
@@ -14,6 +14,8 @@
 #include "transition.h"

 #define MAX_STACK_ENTRIES 100
+DEFINE_PER_CPU(unsigned long[MAX_STACK_ENTRIES], klp_stack_entries);
+
 #define STACK_ERR_BUF_SIZE 128

 #define SIGNALS_TIMEOUT 15
@@ -240,12 +242,15 @@ static int klp_check_stack_func(struct klp_func *func, unsigned long *entries,
  */
 static int klp_check_stack(struct task_struct *task, const char **oldname)
 {
-	static unsigned long entries[MAX_STACK_ENTRIES];
+	unsigned long *entries = this_cpu_ptr(klp_stack_entries);
 	struct klp_object *obj;
 	struct klp_func *func;
 	int ret, nr_entries;

-	ret = stack_trace_save_tsk_reliable(task, entries, ARRAY_SIZE(entries));
+	/* Protect 'klp_stack_entries' */
+	lockdep_assert_preemption_disabled();
+
+	ret = stack_trace_save_tsk_reliable(task, entries, MAX_STACK_ENTRIES);
 	if (ret < 0)
 		return -EINVAL;
 	nr_entries = ret;
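As an aside, the shape used here — a preallocated per-CPU scratch buffer
that is only touched with preemption disabled — is a generally useful
alternative to large on-stack or singleton-static arrays. A minimal
sketch of the idiom with made-up names (scratch_buf and use_scratch are
for illustration only, not the klp code):

#include <linux/percpu.h>
#include <linux/preempt.h>

#define SCRATCH_ENTRIES 100

/* One buffer per CPU, usable wherever preemption is disabled. */
static DEFINE_PER_CPU(unsigned long[SCRATCH_ENTRIES], scratch_buf);

static void use_scratch(void)
{
	unsigned long *buf;

	preempt_disable();		/* pin this CPU: the buffer is ours */
	buf = this_cpu_ptr(scratch_buf);

	/* ... fill and consume buf; this section must not sleep ... */

	preempt_enable();
}

Because each CPU owns its own copy and the user cannot migrate or be
preempted mid-use, no lock is needed — which is exactly what lets
klp_check_stack() run from cond_resched() without taking the klp_mutex.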