Message ID | 20221127-snd-freeze-v3-0-a2eda731ca14@chromium.org |
---|---|
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp5529436wrr; Mon, 28 Nov 2022 01:12:38 -0800 (PST) X-Google-Smtp-Source: AA0mqf7lQeuKutyMCJxvgNABXTP66n9IqZW+DN8AH+iz1HyLFEzYnzS6TzFvkehwbuLzXxr5EKgg X-Received: by 2002:a05:6a00:338c:b0:56b:a319:7b52 with SMTP id cm12-20020a056a00338c00b0056ba3197b52mr31563966pfb.21.1669626758062; Mon, 28 Nov 2022 01:12:38 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669626758; cv=none; d=google.com; s=arc-20160816; b=N+jY8bxa9sEw7rM+xSe9r025nRVGAwYLsCikatSDE857w0sXUnRPy9fk179Bq/4at2 myODfgRDuioOAF2Oj+7vwhI824BTw5DXLc+Pp2bU/nEZ+ZJVQrgkkNe7SIGxBVU5bmFs K91ElZhSnE6+hqU5JVZmF7gnXxC8W5Sfb+96/5Pbljgsw0F+yU9F7oxCzoNB550T+cle 7hSXXH+XVZpRb2Co86AJulye0X1WkcpKOSGMR02q21CujoWkJcxYpaEXpmz4KNgClRnA 8NYXdfR37Q2734gNJyuBER+hUMU0qAsyRZOIKQjyjhaW4cDRLXas0EMT8hVOtHPP5Zmm ja3g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:message-id:date:from :content-transfer-encoding:mime-version:subject:dkim-signature; bh=5cvqz+ywjjARQOy84/P5RN8vJ8kenww9UeNxB2Tv1zU=; b=Co+mlOBd38ZWslFQT1cXpfGWXtbzL2pXduYnuemNzpmwgPlRTKKcKwAbgjEyORr4Bl jPtEwaDwuqSHdJNytRZByiJV/o22HsH0UTxB5LKx736SMGvd08uKToAMgAjzXNXouivN hg3ucYJfe3Bo4/SobKVV93EerE4PF+UKylcFrw2sQUqUwBLV9eOmmNix8nMOcmoRz4BQ V+GkCRD7xEmTTkOEksBy3JHXJ/ozpXYWvaSF9oWHrJri4Z9TiS29vKRhx3HeO+DN+XtP wCfppGGdkir36f0fWUxUQNhHbf3ySNq+iB8G/btt3H3QqjpXIbX1cw07ILEgYulIqwv9 mRIg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=KdYLxwpX; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id b22-20020a631b56000000b0043cb43f584fsi11025889pgm.482.2022.11.28.01.12.25; Mon, 28 Nov 2022 01:12:38 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=KdYLxwpX; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229959AbiK1JKm (ORCPT <rfc822;gah0developer@gmail.com> + 99 others); Mon, 28 Nov 2022 04:10:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229720AbiK1JKk (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 28 Nov 2022 04:10:40 -0500 Received: from mail-ej1-x62d.google.com (mail-ej1-x62d.google.com [IPv6:2a00:1450:4864:20::62d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F132EE50 for <linux-kernel@vger.kernel.org>; Mon, 28 Nov 2022 01:10:37 -0800 (PST) Received: by mail-ej1-x62d.google.com with SMTP id n21so24080229ejb.9 for <linux-kernel@vger.kernel.org>; Mon, 28 Nov 2022 01:10:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=cc:to:message-id:date:from:content-transfer-encoding:mime-version :subject:from:to:cc:subject:date:message-id:reply-to; bh=5cvqz+ywjjARQOy84/P5RN8vJ8kenww9UeNxB2Tv1zU=; b=KdYLxwpXdB8uvJ4n2TshHf2Pu21znKgV7JScKdk87ufOcEbTMcNSjmEMiM31oG2lUt jd2mh7eczTri25hzT4q4SqsLkp4ka+0Bayy5j0iiQupXZAVZXMpU9Y1FjunSiuH/X99B MroTL0ZZ1orEDi/OJDbO34fcVNaP42LHfhTPA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:message-id:date:from:content-transfer-encoding:mime-version :subject:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=5cvqz+ywjjARQOy84/P5RN8vJ8kenww9UeNxB2Tv1zU=; b=d96NEYOS1m81whlYwlnJEw3ozt7koBH9KK3aYT1zhDeGDwLHVEyv7U5po09RFeV3/P JeqdjRFJ4kIiimbSXBtFURIzKpX8UhjEqcDGQJeatvZPP8gfteiybhkkvKkkH530ZgKC z/8K5h0LV7j4CflIIFEtt657aCBzH7Dg5gRnv/gruHf/ByivOKxR4f0YXdJKlrX3Z8qr smeBMl0wDkBjB44C0Nd3N8FP2CHcE6oGCw9RPoKycI25QCn4OnceU8UWtS8/hEOxthkU /81yRQmJsenSni93KydMwGSZJFvGjxbHtvkmcqca5YpdBYMixTNKZEyHDcxyCUIuY3tc NCcA== X-Gm-Message-State: ANoB5pmxsZtP+WasxcudePhbUkjsaZkx7xx3WSE1o8bD7cx9LkmtHSrh Jp4SGyq9vDL1NmFhKSpTT0VlWQ== X-Received: by 2002:a17:906:583:b0:78d:9e18:b8f7 with SMTP id 3-20020a170906058300b0078d9e18b8f7mr40928913ejn.657.1669626636589; Mon, 28 Nov 2022 01:10:36 -0800 (PST) Received: from alco.roam.corp.google.com (80.71.134.83.ipv4.parknet.dk. [80.71.134.83]) by smtp.gmail.com with ESMTPSA id p35-20020a056402502300b00463b9d47e1fsm4932346eda.71.2022.11.28.01.10.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 28 Nov 2022 01:10:36 -0800 (PST) Subject: [PATCH v3 0/2] ALSA: core: Fix deadlock when shutdown a frozen userspace MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit X-B4-Tracking: v=1; b=H4sIAPR6hGMC/3XNQQrCMBAF0KuUrI10Jm1aXXkPcZGmYxNoE0hsQE vv7uBSdDX8D+/PJjIlT1mcq00kKj77GDioQyWsM2Ei6UfOAmtEAOxkDqO8J6IXSSCCRiH2CrVgMJ hMckgmWMckrPPMpfP5EdPz86AAn+vPrQKylm3XaDBKa7J4sS7Fxa/LMaZJ3Hip4H+NrMfeNNiSOW mtvvS+729I71yB6gAAAA== From: Ricardo Ribalda <ribalda@chromium.org> Date: Mon, 28 Nov 2022 10:10:12 +0100 Message-Id: <20221127-snd-freeze-v3-0-a2eda731ca14@chromium.org> To: Takashi Iwai <tiwai@suse.com>, Len Brown <len.brown@intel.com>, Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>, Kai Vehmanen <kai.vehmanen@linux.intel.com>, Ranjani Sridharan <ranjani.sridharan@linux.intel.com>, Mark Brown <broonie@kernel.org>, Jaroslav Kysela <perex@perex.cz>, Pavel Machek <pavel@ucw.cz>, "Rafael J. Wysocki" <rafael@kernel.org> Cc: alsa-devel@alsa-project.org, "Joel Fernandes (Google)" <joel@joelfernandes.org>, Ricardo Ribalda <ribalda@chromium.org>, linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org X-Mailer: b4 0.11.0-dev-696ae X-Developer-Signature: v=1; a=openpgp-sha256; l=1634; i=ribalda@chromium.org; h=from:subject:message-id; bh=UZTGU/fsXJEA3ltnq4F2WvDw+A4ygx9FH7eOzblo+pg=; b=owEBbQKS/ZANAwAKAdE30T7POsSIAcsmYgBjhHr/j1G48WlxdIEEOXaXpHgUksIaanU3PEzs5LxT VFYlDcSJAjMEAAEKAB0WIQREDzjr+/4oCDLSsx7RN9E+zzrEiAUCY4R6/wAKCRDRN9E+zzrEiDSND/ 9bYOi0gOQKQUB1Ke/4A1WAaxC+iY9PHPZmeRlE3w3S+DePSWJocIg9/hq+TqgAi2pdJaB4aKXJMsRF QHXNNgVQuagnDLL8x/aXINyN0DDcRt2XRJUEc7RcKym5rNFVNm+OnKqq4lINr8mzkoEFZQwHbgMllP R2/gr6hy0fUAp9ey/+CgymaNMPc3opkp3bExJH8l1uYcjpPEotNZveVmD13BBBNsseRdJqnBoPdbsD GatYAsLiPW1ahZVyqy9O+i1uSUYNlhwTqeR3m+ARMQVMnVR3mGeegtwU82B1IS3Hn77KLe5jShtayi aEikqMHEoVNjx0MwkvzIwl+DMPlFc0zHo32W2vmqgKojydVa0Jcz1ZXB52bLjmZTnm4d3H1ua+shv9 5n0BNFpvpcxSFrcIKSWQsdQOplU88nw0qLKQ009fS1wvKk44vv5TkO8faZd5xspqYh7zSI4xmHHzeo 76Qak6EHRlI8u0P7uLRxBAcOUSGdgwZTlzZxTsHrIHe7aO27Bc46iLHmBT6cYccRIH1w8322jXlOze TZVSTQNw7NsgV2PcQBi2xHd+waL6wUyEwDIzHjXzv7jOhL7iuXEpSSz9C87P/yaUdDaCtpPPNYgmpA uWZpfP286KbmC0MIgQSFj0eCZ0BnMwVWxScheBjIrE/XViLZ1nZ5d7RB7JmA== X-Developer-Key: i=ribalda@chromium.org; a=openpgp; fpr=9EC3BB66E2FC129A6F90B39556A0D81F9F782DA9 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1750730547255576410?= X-GMAIL-MSGID: =?utf-8?q?1750730547255576410?= |
Series |
ALSA: core: Fix deadlock when shutdown a frozen userspace
|
|
Message
Ricardo Ribalda
Nov. 28, 2022, 9:10 a.m. UTC
Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown")
we wait for userspace to close its fds.
But that will never occur with a frozen userspace (like during kexec()).
Lets detect the frozen userpace and act accordingly.
To: Jaroslav Kysela <perex@perex.cz>
To: Takashi Iwai <tiwai@suse.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
To: Pavel Machek <pavel@ucw.cz>
To: Len Brown <len.brown@intel.com>
To: Kai Vehmanen <kai.vehmanen@linux.intel.com>
To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com>
To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
To: Mark Brown <broonie@kernel.org>
Cc: alsa-devel@alsa-project.org
Cc: linux-kernel@vger.kernel.org
Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Cc: linux-pm@vger.kernel.org
Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
Changes in v3:
- Wrap pm_freezing in a function
- Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org
Changes in v2:
- Only use pm_freezing if CONFIG_FREEZER
- Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org
---
Ricardo Ribalda (2):
freezer: Add processes_frozen()
ALSA: core: Fix deadlock when shutdown a frozen userspace
include/linux/freezer.h | 2 ++
kernel/freezer.c | 11 +++++++++++
sound/core/init.c | 13 +++++++++++++
3 files changed, 26 insertions(+)
---
base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4
change-id: 20221127-snd-freeze-1ee143228326
Best regards,
Comments
On Mon, 28 Nov 2022 10:10:12 +0100, Ricardo Ribalda wrote: > > Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") > we wait for userspace to close its fds. IMO, the fix above brought more problem. If you'd need to want to avoid later accesses during shutdown, the driver should rather just disconnect devices without waiting for the user-space completion. And, for that, a simple call of snd_card_disconnect() should suffice. > But that will never occur with a frozen userspace (like during kexec()). > > Lets detect the frozen userpace and act accordingly. ... and skipping the user-space sync at snd_card_disconnect_sync() as of this patch set is a dangerous move, I'm afraid. The user-space gets frozen also at the normal suspend/resume, and it implies that the sync will be lost even for the normal PM, too (although it must be a very corner case). thanks, Takashi > > To: Jaroslav Kysela <perex@perex.cz> > To: Takashi Iwai <tiwai@suse.com> > To: "Rafael J. Wysocki" <rafael@kernel.org> > To: Pavel Machek <pavel@ucw.cz> > To: Len Brown <len.brown@intel.com> > To: Kai Vehmanen <kai.vehmanen@linux.intel.com> > To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> > To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> > To: Mark Brown <broonie@kernel.org> > Cc: alsa-devel@alsa-project.org > Cc: linux-kernel@vger.kernel.org > Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> > Cc: linux-pm@vger.kernel.org > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> > --- > Changes in v3: > - Wrap pm_freezing in a function > - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org > > Changes in v2: > - Only use pm_freezing if CONFIG_FREEZER > - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org > > --- > Ricardo Ribalda (2): > freezer: Add processes_frozen() > ALSA: core: Fix deadlock when shutdown a frozen userspace > > include/linux/freezer.h | 2 ++ > kernel/freezer.c | 11 +++++++++++ > sound/core/init.c | 13 +++++++++++++ > 3 files changed, 26 insertions(+) > --- > base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4 > change-id: 20221127-snd-freeze-1ee143228326 > > Best regards, > -- > Ricardo Ribalda <ribalda@chromium.org> >
Hi Takashi Thanks for your prompt reply On Mon, 28 Nov 2022 at 10:24, Takashi Iwai <tiwai@suse.de> wrote: > > On Mon, 28 Nov 2022 10:10:12 +0100, > Ricardo Ribalda wrote: > > > > Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") > > we wait for userspace to close its fds. > > IMO, the fix above brought more problem. If you'd need to want to > avoid later accesses during shutdown, the driver should rather just > disconnect devices without waiting for the user-space completion. > And, for that, a simple call of snd_card_disconnect() should suffice. > > > But that will never occur with a frozen userspace (like during kexec()). > > > > Lets detect the frozen userpace and act accordingly. > > ... and skipping the user-space sync at snd_card_disconnect_sync() as > of this patch set is a dangerous move, I'm afraid. The user-space > gets frozen also at the normal suspend/resume, and it implies that the > sync will be lost even for the normal PM, too (although it must be a > very corner case). > And what about checking kexec_in_progress instead? Thanks! > > thanks, > > Takashi > > > > > To: Jaroslav Kysela <perex@perex.cz> > > To: Takashi Iwai <tiwai@suse.com> > > To: "Rafael J. Wysocki" <rafael@kernel.org> > > To: Pavel Machek <pavel@ucw.cz> > > To: Len Brown <len.brown@intel.com> > > To: Kai Vehmanen <kai.vehmanen@linux.intel.com> > > To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> > > To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> > > To: Mark Brown <broonie@kernel.org> > > Cc: alsa-devel@alsa-project.org > > Cc: linux-kernel@vger.kernel.org > > Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> > > Cc: linux-pm@vger.kernel.org > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> > > --- > > Changes in v3: > > - Wrap pm_freezing in a function > > - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org > > > > Changes in v2: > > - Only use pm_freezing if CONFIG_FREEZER > > - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org > > > > --- > > Ricardo Ribalda (2): > > freezer: Add processes_frozen() > > ALSA: core: Fix deadlock when shutdown a frozen userspace > > > > include/linux/freezer.h | 2 ++ > > kernel/freezer.c | 11 +++++++++++ > > sound/core/init.c | 13 +++++++++++++ > > 3 files changed, 26 insertions(+) > > --- > > base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4 > > change-id: 20221127-snd-freeze-1ee143228326 > > > > Best regards, > > -- > > Ricardo Ribalda <ribalda@chromium.org> > >
On 28.11.22 10:10, Ricardo Ribalda wrote: > Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") > we wait for userspace to close its fds. > > But that will never occur with a frozen userspace (like during kexec()). Hi, that means that user space can delay an unplugging process for an indefinite time. And that is a problem. I am afraid there is a problem with the basic approach, even outside of times user space is frozen. Regards Oliver
On Mon, 28 Nov 2022 10:26:36 +0100, Ricardo Ribalda wrote: > > Hi Takashi > > Thanks for your prompt reply > > On Mon, 28 Nov 2022 at 10:24, Takashi Iwai <tiwai@suse.de> wrote: > > > > On Mon, 28 Nov 2022 10:10:12 +0100, > > Ricardo Ribalda wrote: > > > > > > Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") > > > we wait for userspace to close its fds. > > > > IMO, the fix above brought more problem. If you'd need to want to > > avoid later accesses during shutdown, the driver should rather just > > disconnect devices without waiting for the user-space completion. > > And, for that, a simple call of snd_card_disconnect() should suffice. > > > > > But that will never occur with a frozen userspace (like during kexec()). > > > > > > Lets detect the frozen userpace and act accordingly. > > > > ... and skipping the user-space sync at snd_card_disconnect_sync() as > > of this patch set is a dangerous move, I'm afraid. The user-space > > gets frozen also at the normal suspend/resume, and it implies that the > > sync will be lost even for the normal PM, too (although it must be a > > very corner case). > > > > And what about checking kexec_in_progress instead? I still think that the call of snd_card_disconnect_sync() itself at shutdown is somehow wrong. If this only comes from the SOF code path above, we should address in that code path instead. OTOH, you showed two code paths: one is [ 84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done. [ 246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds. [ 246.819035] Call Trace: [ 246.821782] <TASK> [ 246.824186] __schedule+0x5f9/0x1263 [ 246.828231] schedule+0x87/0xc5 [ 246.831779] snd_card_disconnect_sync+0xb5/0x127 ... [ 246.889249] snd_sof_device_shutdown+0xb4/0x150 [ 246.899317] pci_device_shutdown+0x37/0x61 [ 246.903990] device_shutdown+0x14c/0x1d6 [ 246.908391] kernel_kexec+0x45/0xb9 and another is [ 246.893222] INFO: task kexec-lite:4891 blocked for more than 122 seconds. [ 246.927709] Call Trace: [ 246.930461] <TASK> [ 246.932819] __schedule+0x5f9/0x1263 [ 246.936855] ? fsnotify_grab_connector+0x5c/0x70 [ 246.942045] schedule+0x87/0xc5 [ 246.945567] schedule_timeout+0x49/0xf3 [ 246.949877] wait_for_completion+0x86/0xe8 [ 246.954463] snd_card_free+0x68/0x89 ... [ 247.001080] platform_device_unregister+0x12/0x35 The former is likely the SOF code path by the commit you mentioned (but it's not 100% clear because you trimmed the stack trace), and this should be reconsidered. But, the latter seems to be independent from that. If that's the code path where the unbind is triggered before kexec, your fix might not work, either; it could be already at the wait_event*() when kexec starts. Maybe a simpler workaround would be to replace it with wait_event_killable*() stuff. But whether we can discontinue the sync there is still another thing to consider... Takashi > > Thanks! > > > > > thanks, > > > > Takashi > > > > > > > > To: Jaroslav Kysela <perex@perex.cz> > > > To: Takashi Iwai <tiwai@suse.com> > > > To: "Rafael J. Wysocki" <rafael@kernel.org> > > > To: Pavel Machek <pavel@ucw.cz> > > > To: Len Brown <len.brown@intel.com> > > > To: Kai Vehmanen <kai.vehmanen@linux.intel.com> > > > To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> > > > To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> > > > To: Mark Brown <broonie@kernel.org> > > > Cc: alsa-devel@alsa-project.org > > > Cc: linux-kernel@vger.kernel.org > > > Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> > > > Cc: linux-pm@vger.kernel.org > > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> > > > --- > > > Changes in v3: > > > - Wrap pm_freezing in a function > > > - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org > > > > > > Changes in v2: > > > - Only use pm_freezing if CONFIG_FREEZER > > > - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org > > > > > > --- > > > Ricardo Ribalda (2): > > > freezer: Add processes_frozen() > > > ALSA: core: Fix deadlock when shutdown a frozen userspace > > > > > > include/linux/freezer.h | 2 ++ > > > kernel/freezer.c | 11 +++++++++++ > > > sound/core/init.c | 13 +++++++++++++ > > > 3 files changed, 26 insertions(+) > > > --- > > > base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4 > > > change-id: 20221127-snd-freeze-1ee143228326 > > > > > > Best regards, > > > -- > > > Ricardo Ribalda <ribalda@chromium.org> > > > > > > > -- > Ricardo Ribalda >
Hi Takashi Thanks for your response On Mon, 28 Nov 2022 at 10:53, Takashi Iwai <tiwai@suse.de> wrote: > > On Mon, 28 Nov 2022 10:26:36 +0100, > Ricardo Ribalda wrote: > > > > Hi Takashi > > > > Thanks for your prompt reply > > > > On Mon, 28 Nov 2022 at 10:24, Takashi Iwai <tiwai@suse.de> wrote: > > > > > > On Mon, 28 Nov 2022 10:10:12 +0100, > > > Ricardo Ribalda wrote: > > > > > > > > Since 83bfc7e793b5 ("ASoC: SOF: core: unregister clients and machine drivers in .shutdown") > > > > we wait for userspace to close its fds. > > > > > > IMO, the fix above brought more problem. If you'd need to want to > > > avoid later accesses during shutdown, the driver should rather just > > > disconnect devices without waiting for the user-space completion. > > > And, for that, a simple call of snd_card_disconnect() should suffice. > > > > > > > But that will never occur with a frozen userspace (like during kexec()). > > > > > > > > Lets detect the frozen userpace and act accordingly. > > > > > > ... and skipping the user-space sync at snd_card_disconnect_sync() as > > > of this patch set is a dangerous move, I'm afraid. The user-space > > > gets frozen also at the normal suspend/resume, and it implies that the > > > sync will be lost even for the normal PM, too (although it must be a > > > very corner case). > > > > > > > And what about checking kexec_in_progress instead? > > I still think that the call of snd_card_disconnect_sync() itself at > shutdown is somehow wrong. If this only comes from the SOF code path > above, we should address in that code path instead. > > OTOH, you showed two code paths: one is > > [ 84.943749] Freezing user space processes ... (elapsed 0.111 seconds) done. > [ 246.784446] INFO: task kexec-lite:5123 blocked for more than 122 seconds. > [ 246.819035] Call Trace: > [ 246.821782] <TASK> > [ 246.824186] __schedule+0x5f9/0x1263 > [ 246.828231] schedule+0x87/0xc5 > [ 246.831779] snd_card_disconnect_sync+0xb5/0x127 > ... > [ 246.889249] snd_sof_device_shutdown+0xb4/0x150 > [ 246.899317] pci_device_shutdown+0x37/0x61 > [ 246.903990] device_shutdown+0x14c/0x1d6 > [ 246.908391] kernel_kexec+0x45/0xb9 > > and another is > > [ 246.893222] INFO: task kexec-lite:4891 blocked for more than 122 seconds. > [ 246.927709] Call Trace: > [ 246.930461] <TASK> > [ 246.932819] __schedule+0x5f9/0x1263 > [ 246.936855] ? fsnotify_grab_connector+0x5c/0x70 > [ 246.942045] schedule+0x87/0xc5 > [ 246.945567] schedule_timeout+0x49/0xf3 > [ 246.949877] wait_for_completion+0x86/0xe8 > [ 246.954463] snd_card_free+0x68/0x89 > ... > [ 247.001080] platform_device_unregister+0x12/0x35 > > The former is likely the SOF code path by the commit you mentioned > (but it's not 100% clear because you trimmed the stack trace), and > this should be reconsidered. Let me add both traces completely: [ 247.080494] INFO: task kexec-lite:5441 blocked for more than 122 seconds. [ 247.088100] Tainted: G U 5.15.79-14932-g6eecac25030c #5 [ 247.095984] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 247.104745] task:kexec-lite state:D stack: 0 pid: 5441 ppid: 5390 flags:0x00004002 [ 247.114083] Call Trace: [ 247.116833] <TASK> [ 247.119185] __schedule+0x5f9/0x1263 [ 247.123193] schedule+0x8e/0xdb [ 247.126698] snd_card_disconnect_sync+0xb5/0x127 [ 247.131877] ? init_wait_entry+0x31/0x31 [ 247.136270] soc_cleanup_card_resources+0x27/0x1d3 [ 247.141636] snd_soc_unbind_card+0xa6/0xfc [ 247.146218] snd_soc_unregister_card+0x26/0x34 [ 247.151192] release_nodes+0x43/0x62 [ 247.155200] devres_release_all+0x8b/0xc4 [ 247.159695] device_release_driver_internal+0x11a/0x1be [ 247.165546] bus_remove_device+0xd9/0x103 [ 247.170038] device_del+0x1f7/0x355 [ 247.173946] platform_device_del+0x28/0x8e [ 247.178539] platform_device_unregister+0x12/0x35 [ 247.183807] snd_sof_device_shutdown+0x57/0x94 [snd_sof 7d15170beb0e48032711b3230201744d5e4e590d] [ 247.193742] pci_device_shutdown+0x37/0x61 [ 247.198335] device_shutdown+0x14c/0x1d6 [ 247.202735] kernel_kexec+0x45/0xb9 [ 247.206646] __se_sys_reboot+0x173/0x1f6 [ 247.211041] ? syscall_enter_from_user_mode+0x1a6/0x1ab [ 247.216885] do_syscall_64+0x55/0x9d [ 247.220879] ? exit_to_user_mode_prepare+0x3c/0x8b [ 247.226245] entry_SYSCALL_64_after_hwframe+0x61/0xcb [ 247.231889] RIP: 0033:0x79b6a8563693 [ 247.235897] RSP: 002b:00007fff9ab771b8 EFLAGS: 00000202 ORIG_RAX: 00000000000000a9 [ 247.244375] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 000079b6a8563693 [ 247.252363] RDX: 0000000045584543 RSI: 0000000028121969 RDI: 00000000fee1dead [ 247.260337] RBP: 00007fff9ab77200 R08: 0000000000000004 R09: 00007fff9ab7930b [ 247.268323] R10: 00007fff9ab7936a R11: 0000000000000202 R12: 00000000ffffffff [ 247.276297] R13: 0000000000000004 R14: 00007fff9ab7936a R15: 0000000000000003 [ 247.284272] </TASK> [ 247.286729] task:init state:D stack: 0 pid: 1 ppid: 0 flags:0x00004006 [ 247.296079] Call Trace: [ 247.298810] <TASK> [ 247.301155] __schedule+0x5f9/0x1263 [ 247.305163] ? core_sys_select+0x278/0x343 [ 247.309754] schedule+0x8e/0xdb [ 247.313261] __refrigerator+0x5e/0x97 [ 247.317364] get_signal+0x5e4/0x5e9 [ 247.321280] arch_do_signal_or_restart+0x4a/0x27a [ 247.326550] exit_to_user_mode_loop+0x76/0xde [ 247.331433] exit_to_user_mode_prepare+0x61/0x8b [ 247.336606] syscall_exit_to_user_mode+0x26/0x168 [ 247.341877] do_syscall_64+0x63/0x9d [ 247.345871] ? exit_to_user_mode_prepare+0x3c/0x8b [ 247.351227] entry_SYSCALL_64_after_hwframe+0x61/0xcb Then, after adding the (pm_freezing hack) in card_disconnect_sync [ 247.081334] INFO: task kexec-lite:5568 blocked for more than 122 seconds. [ 247.088940] Tainted: G U 5.15.79-14932-g6eecac25030c-dirty #6 [ 247.097428] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 247.106198] task:kexec-lite state:D stack: 0 pid: 5568 ppid: 5389 flags:0x00004002 [ 247.115552] Call Trace: [ 247.118297] <TASK> [ 247.120651] __schedule+0x5f9/0x1263 [ 247.124655] ? fsnotify_grab_connector+0x5c/0x70 [ 247.129829] schedule+0x8e/0xdb [ 247.133354] schedule_timeout+0x49/0xf3 [ 247.137655] wait_for_completion+0x86/0xe8 [ 247.142249] snd_card_free+0x68/0x89 [ 247.146258] soc_cleanup_card_resources+0x1b5/0x1d3 [ 247.151725] snd_soc_unbind_card+0xa6/0xfc [ 247.156314] snd_soc_unregister_card+0x26/0x34 [ 247.161293] release_nodes+0x43/0x62 [ 247.165303] devres_release_all+0x8b/0xc4 [ 247.169797] device_release_driver_internal+0x11a/0x1be [ 247.175654] bus_remove_device+0xd9/0x103 [ 247.180147] device_del+0x1f7/0x355 [ 247.184058] platform_device_del+0x28/0x8e [ 247.188649] platform_device_unregister+0x12/0x35 [ 247.193923] snd_sof_device_shutdown+0x57/0x94 [snd_sof e88f872f6695f09a149f1a182cec1fe8e07d5eea] [ 247.203858] pci_device_shutdown+0x37/0x61 [ 247.208455] device_shutdown+0x14c/0x1d6 [ 247.212858] kernel_kexec+0x45/0xb9 [ 247.216759] __se_sys_reboot+0x173/0x1f6 [ 247.221157] ? syscall_enter_from_user_mode+0x1a6/0x1ab [ 247.227016] do_syscall_64+0x55/0x9d [ 247.231020] ? exit_to_user_mode_prepare+0x3c/0x8b [ 247.236386] entry_SYSCALL_64_after_hwframe+0x61/0xcb [ 247.242048] RIP: 0033:0x7d2d2634a693 [ 247.246056] RSP: 002b:00007ffdceeb6338 EFLAGS: 00000202 ORIG_RAX: 00000000000000a9 [ 247.254532] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007d2d2634a693 [ 247.262520] RDX: 0000000045584543 RSI: 0000000028121969 RDI: 00000000fee1dead [ 247.270500] RBP: 00007ffdceeb6380 R08: 0000000000000004 R09: 00007ffdceeb8305 [ 247.278489] R10: 00007ffdceeb836a R11: 0000000000000202 R12: 00000000ffffffff [ 247.286468] R13: 0000000000000004 R14: 00007ffdceeb836a R15: 0000000000000003 [ 247.294458] </TASK> [ 247.296909] task:init state:D stack: 0 pid: 1 ppid: 0 flags:0x00004006 [ 247.306253] Call Trace: [ 247.308994] <TASK> [ 247.311337] __schedule+0x5f9/0x1263 [ 247.315330] ? core_sys_select+0x278/0x343 [ 247.319924] schedule+0x8e/0xdb [ 247.323445] __refrigerator+0x5e/0x97 [ 247.327537] get_signal+0x5e4/0x5e9 [ 247.331449] arch_do_signal_or_restart+0x4a/0x27a [ 247.336721] exit_to_user_mode_loop+0x76/0xde [ 247.341602] exit_to_user_mode_prepare+0x61/0x8b [ 247.346763] syscall_exit_to_user_mode+0x26/0x168 [ 247.352035] do_syscall_64+0x63/0x9d [ 247.356043] ? exit_to_user_mode_prepare+0x3c/0x8b [ 247.361410] entry_SYSCALL_64_after_hwframe+0x61/0xcb Simply commenting out snd_sof_machine_unregister(sdev, pdata); does the trick though... will send a new version with just that. > > But, the latter seems to be independent from that. If that's the code > path where the unbind is triggered before kexec, your fix might not > work, either; it could be already at the wait_event*() when kexec > starts. > > Maybe a simpler workaround would be to replace it with > wait_event_killable*() stuff. But whether we can discontinue the sync > there is still another thing to consider... > > > Takashi > > > > > Thanks! > > > > > > > > thanks, > > > > > > Takashi > > > > > > > > > > > To: Jaroslav Kysela <perex@perex.cz> > > > > To: Takashi Iwai <tiwai@suse.com> > > > > To: "Rafael J. Wysocki" <rafael@kernel.org> > > > > To: Pavel Machek <pavel@ucw.cz> > > > > To: Len Brown <len.brown@intel.com> > > > > To: Kai Vehmanen <kai.vehmanen@linux.intel.com> > > > > To: Ranjani Sridharan <ranjani.sridharan@linux.intel.com> > > > > To: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com> > > > > To: Mark Brown <broonie@kernel.org> > > > > Cc: alsa-devel@alsa-project.org > > > > Cc: linux-kernel@vger.kernel.org > > > > Cc: "Joel Fernandes (Google)" <joel@joelfernandes.org> > > > > Cc: linux-pm@vger.kernel.org > > > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> > > > > --- > > > > Changes in v3: > > > > - Wrap pm_freezing in a function > > > > - Link to v2: https://lore.kernel.org/r/20221127-snd-freeze-v2-0-d8a425ea9663@chromium.org > > > > > > > > Changes in v2: > > > > - Only use pm_freezing if CONFIG_FREEZER > > > > - Link to v1: https://lore.kernel.org/r/20221127-snd-freeze-v1-0-57461a366ec2@chromium.org > > > > > > > > --- > > > > Ricardo Ribalda (2): > > > > freezer: Add processes_frozen() > > > > ALSA: core: Fix deadlock when shutdown a frozen userspace > > > > > > > > include/linux/freezer.h | 2 ++ > > > > kernel/freezer.c | 11 +++++++++++ > > > > sound/core/init.c | 13 +++++++++++++ > > > > 3 files changed, 26 insertions(+) > > > > --- > > > > base-commit: 4312098baf37ee17a8350725e6e0d0e8590252d4 > > > > change-id: 20221127-snd-freeze-1ee143228326 > > > > > > > > Best regards, > > > > -- > > > > Ricardo Ribalda <ribalda@chromium.org> > > > > > > > > > > > > -- > > Ricardo Ribalda > >