Message ID | 1694773288-15755-1-git-send-email-quic_mojha@quicinc.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp1221100vqi; Fri, 15 Sep 2023 10:51:22 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFBdn6L8aMwIFRDTki7zMnWQTDVdh66BUl8AB/lMjXgd4WvR2SHi7wJP1uXA+atgEFcbJIn X-Received: by 2002:a17:903:2311:b0:1c3:a4f2:7ca3 with SMTP id d17-20020a170903231100b001c3a4f27ca3mr2932420plh.66.1694800281988; Fri, 15 Sep 2023 10:51:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694800281; cv=none; d=google.com; s=arc-20160816; b=0qEp3eLGUxWbNZBXvE47L0Px4w/69grvG/MpYzBUN+dMAqejGSRLSt53iCKN7zwBL7 RM4OWumoqTxQf3dYNDPhXZJg93SrWZMqN9iLOenzgzEGdn54SbLrfWOt60jsmlutljuX WWCpvHiG6ful8OZpPbQiG9eNnQDgFN4Juc8tjpZK8GeyqYigy1GZxDqQYGGnFgZU3wOb Fns/72FBR3OR79xuoaL5seARjRjIyb5xkBJ/DJ82dCmPUGbMi5E7iDNDDb8F5RQDVpb8 7hEAw3D3uGAaJjC5r2V8TeVlChJune3XsdyytnF29H7U7CNYBt+QcCuUBG++s5P02h82 /MXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:subject:cc:to:from :dkim-signature; bh=tiWOyH+/M7saEqjNewqDhkz34LSgnEcc+FVODI6f1Ug=; fh=2+lvJt/dCAbsF6PMV3rfhQ/+WJSa+MReowFQL/ereY0=; b=ezJu7m8usQLwBBCLepWxolW4PRqtaFDNxNLMnfzEahA4sxsANhGumS9mscCO7EGera er2SHFkHJ87BqYW59ldv5Cq9bFHojmL5f2QnEG4lmJJu9WXwa0zrhUOO5x7VQOM/Ds3Z ZNispclYmEYl7vOrCrFK6nPO7DVK/brxjNrktNsm03N+5DKSjxhFbA3jGSIYx32f70od 0iA8x6byKvCQ4TBw+y2uk13tYn6sIOBeMXG26jjsZn9tQ4qHmaBbozknSXksmhiN3fgi NDI7kAc0tmIgJ+X6KwMLC2kUV+k7Kd7hpjVlIFUOvyQdGuxzu+8whmGFGSd1tOHzaTvV jBOw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@quicinc.com header.s=qcppdkim1 header.b=bF3KPETU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=quicinc.com Received: from howler.vger.email (howler.vger.email. [2620:137:e000::3:4]) by mx.google.com with ESMTPS id u11-20020a170902e80b00b001bbbbb61c71si3948378plg.399.2023.09.15.10.51.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Sep 2023 10:51:21 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) client-ip=2620:137:e000::3:4; Authentication-Results: mx.google.com; dkim=pass header.i=@quicinc.com header.s=qcppdkim1 header.b=bF3KPETU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=quicinc.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by howler.vger.email (Postfix) with ESMTP id 9204F8310D24; Fri, 15 Sep 2023 03:22:27 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at howler.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234222AbjIOKWZ (ORCPT <rfc822;ruipengqi7@gmail.com> + 32 others); Fri, 15 Sep 2023 06:22:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46156 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231341AbjIOKWY (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 15 Sep 2023 06:22:24 -0400 Received: from mx0a-0031df01.pphosted.com (mx0a-0031df01.pphosted.com [205.220.168.131]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8476C1AE for <linux-kernel@vger.kernel.org>; Fri, 15 Sep 2023 03:22:19 -0700 (PDT) Received: from pps.filterd (m0279867.ppops.net [127.0.0.1]) by mx0a-0031df01.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 38F3HUE2009289; Fri, 15 Sep 2023 10:22:06 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=quicinc.com; h=from : to : cc : subject : date : message-id : mime-version : content-type; s=qcppdkim1; bh=tiWOyH+/M7saEqjNewqDhkz34LSgnEcc+FVODI6f1Ug=; b=bF3KPETUFFwuw30VOWOys7LLs0Xy0b5mI7bjZ5Nn6s+NxtzpRA0tsUgCC/aoQw8UAX6Z TTjtI7n3RGB+qFKfV2I2L/94wgh40OBo9eIm2Xq9lpc8MFu8XGl156mSWuJLjtyzmUyJ azdvbxnhmwyT07b0WpRPZECSbUuUexs8/abNz8mw3edGenhP1htp9nvzgq7jweiPej6p yCjMSUlvaPPwN1RWUw08IABNxrLkJj6qJTc8zJAgVB7iwvkHalRyx5T9GCxJhV6eO/R1 3NgKKI3/lpOHUaNIro4h3+xEZDBICR5DJhpAkU5f6gy9zMCsIylPKyg6monxE7qKIsFT 6A== Received: from nasanppmta03.qualcomm.com (i-global254.qualcomm.com [199.106.103.254]) by mx0a-0031df01.pphosted.com (PPS) with ESMTPS id 3t4f6v0wuq-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 15 Sep 2023 10:22:05 +0000 Received: from nasanex01c.na.qualcomm.com (nasanex01c.na.qualcomm.com [10.45.79.139]) by NASANPPMTA03.qualcomm.com (8.17.1.5/8.17.1.5) with ESMTPS id 38FAM58i019642 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Fri, 15 Sep 2023 10:22:05 GMT Received: from hu-mojha-hyd.qualcomm.com (10.80.80.8) by nasanex01c.na.qualcomm.com (10.45.79.139) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.36; Fri, 15 Sep 2023 03:22:02 -0700 From: Mukesh Ojha <quic_mojha@quicinc.com> To: <mcgrof@kernel.org>, <russell.h.weight@intel.com>, <gregkh@linuxfoundation.org>, <rafael@kernel.org> CC: <linux-kernel@vger.kernel.org>, Mukesh Ojha <quic_mojha@quicinc.com> Subject: [PATCH] firmware_loader: Add reboot_in_progress for user helper path Date: Fri, 15 Sep 2023 15:51:28 +0530 Message-ID: <1694773288-15755-1-git-send-email-quic_mojha@quicinc.com> X-Mailer: git-send-email 2.7.4 MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.80.80.8] X-ClientProxiedBy: nasanex01b.na.qualcomm.com (10.46.141.250) To nasanex01c.na.qualcomm.com (10.45.79.139) X-QCInternal: smtphost X-Proofpoint-Virus-Version: vendor=nai engine=6200 definitions=5800 signatures=585085 X-Proofpoint-GUID: b1kqsY2SZzaAt5osDISF_O-luDjlHeSG X-Proofpoint-ORIG-GUID: b1kqsY2SZzaAt5osDISF_O-luDjlHeSG X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.267,Aquarius:18.0.980,Hydra:6.0.601,FMLib:17.11.176.26 definitions=2023-09-15_06,2023-09-14_01,2023-05-22_02 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 adultscore=0 lowpriorityscore=0 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 clxscore=1011 mlxlogscore=999 spamscore=0 mlxscore=0 priorityscore=1501 impostorscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2308100000 definitions=main-2309150091 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (howler.vger.email [0.0.0.0]); Fri, 15 Sep 2023 03:22:27 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777126900463170817 X-GMAIL-MSGID: 1777126900463170817 |
Series |
firmware_loader: Add reboot_in_progress for user helper path
|
|
Commit Message
Mukesh Ojha
Sept. 15, 2023, 10:21 a.m. UTC
There could be following scenario where there is a ongoing reboot
is going from processA which tries to call all the reboot notifier
callback and one of them is firmware reboot call which tries to
abort all the ongoing firmware userspace request under fw_lock
but there could be another processB which tries to do request
firmware, which came just after abort done from ProcessA and
ask for userspace to load the firmware and this can stop the
ongoing reboot ProcessA to stall for next 60s(default timeout)
which may be expected behaviour everyone like to see, instead
we should abort every request which came after once firmware
marks reboot notification.
ProcessA ProcessB
kernel_restart_prepare
blocking_notifier_call_chain
fw_shutdown_notify
kill_pending_fw_fallback_reqs
__fw_load_abort
fw_state_aborted request_firmware
__fw_state_set firmware_fallback_sysfs
... fw_load_from_user_helper
.. ...
. ..
usermodehelper_read_trylock
fw_load_sysfs_fallback
fw_sysfs_wait_timeout
usermodehelper_disable
__usermodehelper_disable
down_write()
Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com>
---
drivers/base/firmware_loader/fallback.c | 2 +-
drivers/base/firmware_loader/firmware.h | 1 +
drivers/base/firmware_loader/main.c | 2 ++
3 files changed, 4 insertions(+), 1 deletion(-)
Comments
On Fri, Sep 15, 2023 at 03:51:28PM +0530, Mukesh Ojha wrote: > There could be following scenario where there is a ongoing reboot > is going from processA which tries to call all the reboot notifier > callback and one of them is firmware reboot call which tries to > abort all the ongoing firmware userspace request under fw_lock > but there could be another processB which tries to do request > firmware, which came just after abort done from ProcessA and > ask for userspace to load the firmware and this can stop the > ongoing reboot ProcessA to stall for next 60s(default timeout) > which may be expected behaviour everyone like to see, instead > we should abort every request which came after once firmware > marks reboot notification. > > ProcessA ProcessB > > kernel_restart_prepare > blocking_notifier_call_chain > fw_shutdown_notify > kill_pending_fw_fallback_reqs > __fw_load_abort > fw_state_aborted request_firmware > __fw_state_set firmware_fallback_sysfs > ... fw_load_from_user_helper > .. ... > . .. > usermodehelper_read_trylock > fw_load_sysfs_fallback > fw_sysfs_wait_timeout > usermodehelper_disable > __usermodehelper_disable > down_write() > > Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> > --- > drivers/base/firmware_loader/fallback.c | 2 +- > drivers/base/firmware_loader/firmware.h | 1 + > drivers/base/firmware_loader/main.c | 2 ++ > 3 files changed, 4 insertions(+), 1 deletion(-) > > diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c > index bf68e3947814..a5546aeea91f 100644 > --- a/drivers/base/firmware_loader/fallback.c > +++ b/drivers/base/firmware_loader/fallback.c > @@ -86,7 +86,7 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout) > } > > mutex_lock(&fw_lock); > - if (fw_state_is_aborted(fw_priv)) { > + if (reboot_in_progress || fw_state_is_aborted(fw_priv)) { > mutex_unlock(&fw_lock); > retval = -EINTR; > goto out; What prevents reboot_in_progress to change right after you check it here? And what kernel driver is trying to call the reboot notifier that gets mixed up in this? Why not fix that driver to not need the reboot notifier at all (hint, I really doubt it needs it...) > --- a/drivers/base/firmware_loader/main.c > +++ b/drivers/base/firmware_loader/main.c > @@ -93,6 +93,7 @@ static inline struct fw_priv *to_fw_priv(struct kref *ref) > DEFINE_MUTEX(fw_lock); > > struct firmware_cache fw_cache; > +bool reboot_in_progress; Bad global name for a variable in the firmware_loader core. thanks, greg k-h
On 9/17/2023 3:42 PM, Greg KH wrote: > On Fri, Sep 15, 2023 at 03:51:28PM +0530, Mukesh Ojha wrote: >> There could be following scenario where there is a ongoing reboot >> is going from processA which tries to call all the reboot notifier >> callback and one of them is firmware reboot call which tries to >> abort all the ongoing firmware userspace request under fw_lock >> but there could be another processB which tries to do request >> firmware, which came just after abort done from ProcessA and >> ask for userspace to load the firmware and this can stop the >> ongoing reboot ProcessA to stall for next 60s(default timeout) >> which may be expected behaviour everyone like to see, instead >> we should abort every request which came after once firmware >> marks reboot notification. >> >> ProcessA ProcessB >> >> kernel_restart_prepare >> blocking_notifier_call_chain >> fw_shutdown_notify >> kill_pending_fw_fallback_reqs >> __fw_load_abort >> fw_state_aborted request_firmware >> __fw_state_set firmware_fallback_sysfs >> ... fw_load_from_user_helper >> .. ... >> . .. >> usermodehelper_read_trylock >> fw_load_sysfs_fallback >> fw_sysfs_wait_timeout >> usermodehelper_disable >> __usermodehelper_disable >> down_write() >> >> Signed-off-by: Mukesh Ojha <quic_mojha@quicinc.com> >> --- >> drivers/base/firmware_loader/fallback.c | 2 +- >> drivers/base/firmware_loader/firmware.h | 1 + >> drivers/base/firmware_loader/main.c | 2 ++ >> 3 files changed, 4 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c >> index bf68e3947814..a5546aeea91f 100644 >> --- a/drivers/base/firmware_loader/fallback.c >> +++ b/drivers/base/firmware_loader/fallback.c >> @@ -86,7 +86,7 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout) >> } >> >> mutex_lock(&fw_lock); >> - if (fw_state_is_aborted(fw_priv)) { >> + if (reboot_in_progress || fw_state_is_aborted(fw_priv)) { >> mutex_unlock(&fw_lock); >> retval = -EINTR; >> goto out; > > What prevents reboot_in_progress to change right after you check it > here? e.g, reboot_in_progress was false, it gets added to the pending list under fw_lock list_add(&fw_priv->pending_list, &pending_fw_head); reboot_in_progress = true, when all the outstanding fw request are aborted during from reboot thread from below path. However, I realize my mistake, reboot_in_progress should be modified under fw_lock, will fix in v2. >> fw_shutdown_notify >> kill_pending_fw_fallback_reqs So, idea is to revoke any fw request once firmware knows about ongoing reboot and not delay the reboot process further for next default 60s . > > And what kernel driver is trying to call the reboot notifier that gets > mixed up in this? Why not fix that driver to not need the reboot > notifier at all (hint, I really doubt it needs it...) 'drivers/base/firmware_loader/main.c' has reboot notifier which aborts the ongoing firmware requests and but can race with other parallel ongoing request which are not yet added to pending list. fw_load_sysfs_fallback-> stuck waiting for fw_lock which was held by kill_pending_fw_fallback_reqs=> mutex_lock(&fw_lock); __fw_load_abort > >> --- a/drivers/base/firmware_loader/main.c >> +++ b/drivers/base/firmware_loader/main.c >> @@ -93,6 +93,7 @@ static inline struct fw_priv *to_fw_priv(struct kref *ref) >> DEFINE_MUTEX(fw_lock); >> >> struct firmware_cache fw_cache; >> +bool reboot_in_progress; > > Bad global name for a variable in the firmware_loader core. bool abort_fw_load_req ?? -Mukesh > > thanks, > > greg k-h
On Tue, Sep 19, 2023 at 01:27:05PM +0530, Mukesh Ojha wrote: > > > +bool reboot_in_progress; > > > > Bad global name for a variable in the firmware_loader core. > > bool abort_fw_load_req ?? Use "noun_verb" please for global symbols so it's more obvious where the symbol is, and it's a bit easier to manage the namespace. thanks, greg k-h
diff --git a/drivers/base/firmware_loader/fallback.c b/drivers/base/firmware_loader/fallback.c index bf68e3947814..a5546aeea91f 100644 --- a/drivers/base/firmware_loader/fallback.c +++ b/drivers/base/firmware_loader/fallback.c @@ -86,7 +86,7 @@ static int fw_load_sysfs_fallback(struct fw_sysfs *fw_sysfs, long timeout) } mutex_lock(&fw_lock); - if (fw_state_is_aborted(fw_priv)) { + if (reboot_in_progress || fw_state_is_aborted(fw_priv)) { mutex_unlock(&fw_lock); retval = -EINTR; goto out; diff --git a/drivers/base/firmware_loader/firmware.h b/drivers/base/firmware_loader/firmware.h index bf549d6500d7..6f44248e2e44 100644 --- a/drivers/base/firmware_loader/firmware.h +++ b/drivers/base/firmware_loader/firmware.h @@ -86,6 +86,7 @@ struct fw_priv { extern struct mutex fw_lock; extern struct firmware_cache fw_cache; +extern bool reboot_in_progress; static inline bool __fw_state_check(struct fw_priv *fw_priv, enum fw_status status) diff --git a/drivers/base/firmware_loader/main.c b/drivers/base/firmware_loader/main.c index b58c42f1b1ce..d72b7950edf0 100644 --- a/drivers/base/firmware_loader/main.c +++ b/drivers/base/firmware_loader/main.c @@ -93,6 +93,7 @@ static inline struct fw_priv *to_fw_priv(struct kref *ref) DEFINE_MUTEX(fw_lock); struct firmware_cache fw_cache; +bool reboot_in_progress; void fw_state_init(struct fw_priv *fw_priv) { @@ -1613,6 +1614,7 @@ static int fw_shutdown_notify(struct notifier_block *unused1, * and avoid a deadlock with the usermode_lock. */ kill_pending_fw_fallback_reqs(false); + reboot_in_progress = true; return NOTIFY_DONE; }