Message ID | 20231102142643.75288-4-angelogioacchino.delregno@collabora.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:8f47:0:b0:403:3b70:6f57 with SMTP id j7csp393376vqu; Thu, 2 Nov 2023 07:27:14 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGKuMnuMs3rYrb+MXE+JSvyJjcNvH7bS0GdE5M+63H80YVHCNHNw0kEmehMVqxjJhlV9eN5 X-Received: by 2002:a05:6a21:71c7:b0:181:261f:f347 with SMTP id ay7-20020a056a2171c700b00181261ff347mr6588152pzc.32.1698935233783; Thu, 02 Nov 2023 07:27:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698935233; cv=none; d=google.com; s=arc-20160816; b=MoDPyaziP/QBfxQ5UKKo3KlFTTbvUSBs/xtnefBI4JaZy54soZAXeGsvPvCw1SQtta rjMQONK87GTAEmqkBhz8YaKKV/AdSVMkhHOEd3VCGzIhF3vJN6ML1ffEJ3BKNicNbGrQ dH3lk1YiBo4C/dk5oOoIFYW5MUG2vdUGYuViGLirr0ka4wjb06j0gQLhmqRLzapj6g+I WIViCItqwMH1PbPCpsR5vjDA1Y5XNkX92yyPaINY40X85P6nzvu9SyHDV9Fu2yFOSu9q QcFKisDBCFSeNQxF9KWkRSrAOyBC/3mGrDy2FhAtQdm70bOL+lEAlCHS48cTVjj0cVBW pDdg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=IQlY+Ogct50SulhYBO0Ht8k2i9eNXTRvLhpGLGyd9SQ=; fh=tAWvjMAEmVD3R8FzmIRYzQIb+yW9J+gWZJ2R8deHkVw=; b=dKFMXXO5/HfAFjQkn76Ou2jDcVhp8DXUsFTSexej3XkS8NdhTQ/H2EHBKltO7+uyM9 rhvA33wxhqB5zWE0CtrKmXkIW+0RbcKskD9rcVAc9F4s2tbAXYobkCnmZ9kQC/Yu3JpX UqVw+Au4gFPlFE7ybw7YjZMIAPdX5ea0ltlzbTZTt0AUv2TXAP73THUazHTbkqMWCbF4 V1PZHVk6h+mrTeooJr3uwPjSg9ADoPfjYikmHzmyjlAQji35V5TOfajETJ9SyGOXqjw0 zlqgWKolLKklL+s7mYWHCrGkss7M880XEiW4BZAyuAi0asmQtc6y7KEJBUFoOewdpuBF aDiQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b=lvCYDqjf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Received: from snail.vger.email (snail.vger.email. [2620:137:e000::3:7]) by mx.google.com with ESMTPS id i18-20020a633c52000000b005b92cefe42csi1980840pgn.97.2023.11.02.07.27.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 02 Nov 2023 07:27:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b=lvCYDqjf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id E5EE1822D11A; Thu, 2 Nov 2023 07:27:12 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1376834AbjKBO1J (ORCPT <rfc822;lhua1029@gmail.com> + 36 others); Thu, 2 Nov 2023 10:27:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58912 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1376810AbjKBO1C (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 2 Nov 2023 10:27:02 -0400 Received: from madras.collabora.co.uk (madras.collabora.co.uk [IPv6:2a00:1098:0:82:1000:25:2eeb:e5ab]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8DC061A6 for <linux-kernel@vger.kernel.org>; Thu, 2 Nov 2023 07:26:55 -0700 (PDT) Received: from IcarusMOD.eternityproject.eu (cola.collaboradmins.com [195.201.22.229]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: kholk11) by madras.collabora.co.uk (Postfix) with ESMTPSA id 9461A66071A3; Thu, 2 Nov 2023 14:26:53 +0000 (GMT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1698935214; bh=/LtYZX88iauWNQZJUo4rCxgbBpDr9xze3AUhEtbVTX4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=lvCYDqjfsnlTfsTUF/eGRTtWeN06O3QFIIgwOJE09mRbW8MjPHG1/i43A+8/nblLF p+E3VMrHJD5qpmnaxqxCumtS1z2BrlDmtmX8sk/GycpwBkjj+DglrTVqL4sFFaVcIA mDY8v7I6SxgubAArRmtIzOscjqMEmow1sr9XbJfDrmYhrw5bd/8a25/isvw4Vd+EB2 PnGNLBcJVqguD8Dtf+wt87G6W3LOzTvN+WdZgVQ0+DSpDYcbGjeqlj7xJZzGvIhT/D yi2c43PAITxkqG1psxvhQyXW1jUSq2TUp4v7ohJYSBW0qvrUY40AqbmX11RD9UlA1A sJI57ISmuQnLQ== From: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> To: boris.brezillon@collabora.com Cc: robh@kernel.org, steven.price@arm.com, maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, wenst@chromium.org, AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>, kernel@collabora.com Subject: [PATCH v2 3/6] drm/panfrost: Implement ability to turn on/off GPU clocks in suspend Date: Thu, 2 Nov 2023 15:26:40 +0100 Message-ID: <20231102142643.75288-4-angelogioacchino.delregno@collabora.com> X-Mailer: git-send-email 2.42.0 In-Reply-To: <20231102142643.75288-1-angelogioacchino.delregno@collabora.com> References: <20231102142643.75288-1-angelogioacchino.delregno@collabora.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Thu, 02 Nov 2023 07:27:13 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781462711847174821 X-GMAIL-MSGID: 1781462711847174821 |
Series |
drm/panfrost: Turn off clocks and regulators in PM
|
|
Commit Message
AngeloGioacchino Del Regno
Nov. 2, 2023, 2:26 p.m. UTC
Currently, the GPU is being internally powered off for runtime suspend
and turned back on for runtime resume through commands sent to it, but
note that the GPU doesn't need to be clocked during the poweroff state,
hence it is possible to save some power on selected platforms.
Add suspend and resume handlers for full system sleep and then add
a new panfrost_gpu_pm enumeration and a pm_features variable in the
panfrost_compatible structure: BIT(GPU_PM_CLK_DIS) will be used to
enable this power saving technique only on SoCs that are able to
safely use it.
Note that this was implemented only for the system sleep case and not
for runtime PM because testing on one of my MediaTek platforms showed
issues when turning on and off clocks aggressively (in PM runtime)
resulting in a full system lockup.
Doing this only for full system sleep never showed issues during my
testing by suspending and resuming the system continuously for more
than 100 cycles.
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
---
Note: Even after fixing the panfrost_power_off() function, I'm still
getting issues with turning off the clocks at .runtime_suspend() but
this time, instead of getting a GPU lockup, the entire SoC will deadlock
bringing down the entire system with it (so it's even worst!) :-)
drivers/gpu/drm/panfrost/panfrost_device.c | 61 ++++++++++++++++++++--
drivers/gpu/drm/panfrost/panfrost_device.h | 11 ++++
2 files changed, 68 insertions(+), 4 deletions(-)
Comments
On Thu, Nov 2, 2023 at 10:26 PM AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> wrote: > > Currently, the GPU is being internally powered off for runtime suspend > and turned back on for runtime resume through commands sent to it, but > note that the GPU doesn't need to be clocked during the poweroff state, > hence it is possible to save some power on selected platforms. > > Add suspend and resume handlers for full system sleep and then add > a new panfrost_gpu_pm enumeration and a pm_features variable in the > panfrost_compatible structure: BIT(GPU_PM_CLK_DIS) will be used to > enable this power saving technique only on SoCs that are able to > safely use it. > > Note that this was implemented only for the system sleep case and not > for runtime PM because testing on one of my MediaTek platforms showed > issues when turning on and off clocks aggressively (in PM runtime) > resulting in a full system lockup. > > Doing this only for full system sleep never showed issues during my > testing by suspending and resuming the system continuously for more > than 100 cycles. > > Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> > --- > > Note: Even after fixing the panfrost_power_off() function, I'm still > getting issues with turning off the clocks at .runtime_suspend() but > this time, instead of getting a GPU lockup, the entire SoC will deadlock > bringing down the entire system with it (so it's even worst!) :-) IIRC the power domain controller also manages some bus isolation bits that prevent SoC lockup when the clock is disabled. Would reversing the runtime PM calls and the clock calls help? ChenYu > drivers/gpu/drm/panfrost/panfrost_device.c | 61 ++++++++++++++++++++-- > drivers/gpu/drm/panfrost/panfrost_device.h | 11 ++++ > 2 files changed, 68 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c > index 28f7046e1b1a..2022ed76a620 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_device.c > +++ b/drivers/gpu/drm/panfrost/panfrost_device.c > @@ -403,7 +403,7 @@ void panfrost_device_reset(struct panfrost_device *pfdev) > panfrost_job_enable_interrupts(pfdev); > } > > -static int panfrost_device_resume(struct device *dev) > +static int panfrost_device_runtime_resume(struct device *dev) > { > struct panfrost_device *pfdev = dev_get_drvdata(dev); > > @@ -413,7 +413,7 @@ static int panfrost_device_resume(struct device *dev) > return 0; > } > > -static int panfrost_device_suspend(struct device *dev) > +static int panfrost_device_runtime_suspend(struct device *dev) > { > struct panfrost_device *pfdev = dev_get_drvdata(dev); > > @@ -426,5 +426,58 @@ static int panfrost_device_suspend(struct device *dev) > return 0; > } > > -EXPORT_GPL_RUNTIME_DEV_PM_OPS(panfrost_pm_ops, panfrost_device_suspend, > - panfrost_device_resume, NULL); > +static int panfrost_device_resume(struct device *dev) > +{ > + struct panfrost_device *pfdev = dev_get_drvdata(dev); > + int ret; > + > + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { > + ret = clk_enable(pfdev->clock); > + if (ret) > + return ret; > + > + if (pfdev->bus_clock) { > + ret = clk_enable(pfdev->bus_clock); > + if (ret) > + goto err_bus_clk; > + } > + } > + > + ret = pm_runtime_force_resume(dev); > + if (ret) > + goto err_resume; > + > + return 0; > + > +err_resume: > + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS) && pfdev->bus_clock) > + clk_disable(pfdev->bus_clock); > +err_bus_clk: > + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) > + clk_disable(pfdev->clock); > + return ret; > +} > + > +static int panfrost_device_suspend(struct device *dev) > +{ > + struct panfrost_device *pfdev = dev_get_drvdata(dev); > + int ret; > + > + ret = pm_runtime_force_suspend(dev); > + if (ret) > + return ret; > + > + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { > + clk_disable(pfdev->clock); > + > + if (pfdev->bus_clock) > + clk_disable(pfdev->bus_clock); > + } > + > + return 0; > +} > + > +EXPORT_GPL_DEV_PM_OPS(panfrost_pm_ops) = { > + RUNTIME_PM_OPS(panfrost_device_runtime_suspend, panfrost_device_runtime_resume, NULL) > + SYSTEM_SLEEP_PM_OPS(panfrost_device_suspend, panfrost_device_resume) > +}; > diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h > index 1ef38f60d5dc..d7f179eb8ea3 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_device.h > +++ b/drivers/gpu/drm/panfrost/panfrost_device.h > @@ -25,6 +25,14 @@ struct panfrost_perfcnt; > #define NUM_JOB_SLOTS 3 > #define MAX_PM_DOMAINS 5 > > +/** > + * enum panfrost_gpu_pm - Supported kernel power management features > + * @GPU_PM_CLK_DIS: Allow disabling clocks during system suspend > + */ > +enum panfrost_gpu_pm { > + GPU_PM_CLK_DIS, > +}; > + > struct panfrost_features { > u16 id; > u16 revision; > @@ -75,6 +83,9 @@ struct panfrost_compatible { > > /* Vendor implementation quirks callback */ > void (*vendor_quirk)(struct panfrost_device *pfdev); > + > + /* Allowed PM features */ > + u8 pm_features; > }; > > struct panfrost_device { > -- > 2.42.0 >
Il 03/11/23 06:12, Chen-Yu Tsai ha scritto: > On Thu, Nov 2, 2023 at 10:26 PM AngeloGioacchino Del Regno > <angelogioacchino.delregno@collabora.com> wrote: >> >> Currently, the GPU is being internally powered off for runtime suspend >> and turned back on for runtime resume through commands sent to it, but >> note that the GPU doesn't need to be clocked during the poweroff state, >> hence it is possible to save some power on selected platforms. >> >> Add suspend and resume handlers for full system sleep and then add >> a new panfrost_gpu_pm enumeration and a pm_features variable in the >> panfrost_compatible structure: BIT(GPU_PM_CLK_DIS) will be used to >> enable this power saving technique only on SoCs that are able to >> safely use it. >> >> Note that this was implemented only for the system sleep case and not >> for runtime PM because testing on one of my MediaTek platforms showed >> issues when turning on and off clocks aggressively (in PM runtime) >> resulting in a full system lockup. >> >> Doing this only for full system sleep never showed issues during my >> testing by suspending and resuming the system continuously for more >> than 100 cycles. >> >> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> >> --- >> >> Note: Even after fixing the panfrost_power_off() function, I'm still >> getting issues with turning off the clocks at .runtime_suspend() but >> this time, instead of getting a GPU lockup, the entire SoC will deadlock >> bringing down the entire system with it (so it's even worst!) :-) > > IIRC the power domain controller also manages some bus isolation bits > that prevent SoC lockup when the clock is disabled. Would reversing > the runtime PM calls and the clock calls help? > Thanks for the reminder, but I tested that already... that doesn't work. There's one more thing I tried: on the MFG iospace, there are debug registers that you can poll to check if all bus transactions are finished (so, if the bus is idle). During local testing, I even hacked in that, and even with the actual bus being completely idle, it still freezes... and also checked some more in downstream code (for Dimensity 9200, kernel 5.10) if there was any other "trick" that I could make use of, but to no avail. I'd propose to get at least this power saving upstreamed, then perhaps in the future we can somehow revisit this to implement some more aggressive power management code? We're still getting a generous power saving with this one, I'd say... Anyway, I expect us to be effectively able to be more aggressive here, but I also expect that to take quite a bit of time (and probably some help from MediaTek as well)... Angelo > ChenYu > >> drivers/gpu/drm/panfrost/panfrost_device.c | 61 ++++++++++++++++++++-- >> drivers/gpu/drm/panfrost/panfrost_device.h | 11 ++++ >> 2 files changed, 68 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c >> index 28f7046e1b1a..2022ed76a620 100644 >> --- a/drivers/gpu/drm/panfrost/panfrost_device.c >> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c >> @@ -403,7 +403,7 @@ void panfrost_device_reset(struct panfrost_device *pfdev) >> panfrost_job_enable_interrupts(pfdev); >> } >> >> -static int panfrost_device_resume(struct device *dev) >> +static int panfrost_device_runtime_resume(struct device *dev) >> { >> struct panfrost_device *pfdev = dev_get_drvdata(dev); >> >> @@ -413,7 +413,7 @@ static int panfrost_device_resume(struct device *dev) >> return 0; >> } >> >> -static int panfrost_device_suspend(struct device *dev) >> +static int panfrost_device_runtime_suspend(struct device *dev) >> { >> struct panfrost_device *pfdev = dev_get_drvdata(dev); >> >> @@ -426,5 +426,58 @@ static int panfrost_device_suspend(struct device *dev) >> return 0; >> } >> >> -EXPORT_GPL_RUNTIME_DEV_PM_OPS(panfrost_pm_ops, panfrost_device_suspend, >> - panfrost_device_resume, NULL); >> +static int panfrost_device_resume(struct device *dev) >> +{ >> + struct panfrost_device *pfdev = dev_get_drvdata(dev); >> + int ret; >> + >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { >> + ret = clk_enable(pfdev->clock); >> + if (ret) >> + return ret; >> + >> + if (pfdev->bus_clock) { >> + ret = clk_enable(pfdev->bus_clock); >> + if (ret) >> + goto err_bus_clk; >> + } >> + } >> + >> + ret = pm_runtime_force_resume(dev); >> + if (ret) >> + goto err_resume; >> + >> + return 0; >> + >> +err_resume: >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS) && pfdev->bus_clock) >> + clk_disable(pfdev->bus_clock); >> +err_bus_clk: >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) >> + clk_disable(pfdev->clock); >> + return ret; >> +} >> + >> +static int panfrost_device_suspend(struct device *dev) >> +{ >> + struct panfrost_device *pfdev = dev_get_drvdata(dev); >> + int ret; >> + >> + ret = pm_runtime_force_suspend(dev); >> + if (ret) >> + return ret; >> + >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { >> + clk_disable(pfdev->clock); >> + >> + if (pfdev->bus_clock) >> + clk_disable(pfdev->bus_clock); >> + } >> + >> + return 0; >> +} >> + >> +EXPORT_GPL_DEV_PM_OPS(panfrost_pm_ops) = { >> + RUNTIME_PM_OPS(panfrost_device_runtime_suspend, panfrost_device_runtime_resume, NULL) >> + SYSTEM_SLEEP_PM_OPS(panfrost_device_suspend, panfrost_device_resume) >> +}; >> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h >> index 1ef38f60d5dc..d7f179eb8ea3 100644 >> --- a/drivers/gpu/drm/panfrost/panfrost_device.h >> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h >> @@ -25,6 +25,14 @@ struct panfrost_perfcnt; >> #define NUM_JOB_SLOTS 3 >> #define MAX_PM_DOMAINS 5 >> >> +/** >> + * enum panfrost_gpu_pm - Supported kernel power management features >> + * @GPU_PM_CLK_DIS: Allow disabling clocks during system suspend >> + */ >> +enum panfrost_gpu_pm { >> + GPU_PM_CLK_DIS, >> +}; >> + >> struct panfrost_features { >> u16 id; >> u16 revision; >> @@ -75,6 +83,9 @@ struct panfrost_compatible { >> >> /* Vendor implementation quirks callback */ >> void (*vendor_quirk)(struct panfrost_device *pfdev); >> + >> + /* Allowed PM features */ >> + u8 pm_features; >> }; >> >> struct panfrost_device { >> -- >> 2.42.0 >>
On Fri, Nov 3, 2023 at 4:54 PM AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> wrote: > > Il 03/11/23 06:12, Chen-Yu Tsai ha scritto: > > On Thu, Nov 2, 2023 at 10:26 PM AngeloGioacchino Del Regno > > <angelogioacchino.delregno@collabora.com> wrote: > >> > >> Currently, the GPU is being internally powered off for runtime suspend > >> and turned back on for runtime resume through commands sent to it, but > >> note that the GPU doesn't need to be clocked during the poweroff state, > >> hence it is possible to save some power on selected platforms. > >> > >> Add suspend and resume handlers for full system sleep and then add > >> a new panfrost_gpu_pm enumeration and a pm_features variable in the > >> panfrost_compatible structure: BIT(GPU_PM_CLK_DIS) will be used to > >> enable this power saving technique only on SoCs that are able to > >> safely use it. > >> > >> Note that this was implemented only for the system sleep case and not > >> for runtime PM because testing on one of my MediaTek platforms showed > >> issues when turning on and off clocks aggressively (in PM runtime) > >> resulting in a full system lockup. > >> > >> Doing this only for full system sleep never showed issues during my > >> testing by suspending and resuming the system continuously for more > >> than 100 cycles. > >> > >> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> > >> --- > >> > >> Note: Even after fixing the panfrost_power_off() function, I'm still > >> getting issues with turning off the clocks at .runtime_suspend() but > >> this time, instead of getting a GPU lockup, the entire SoC will deadlock > >> bringing down the entire system with it (so it's even worst!) :-) > > > > IIRC the power domain controller also manages some bus isolation bits > > that prevent SoC lockup when the clock is disabled. Would reversing > > the runtime PM calls and the clock calls help? > > > > Thanks for the reminder, but I tested that already... that doesn't work. > > There's one more thing I tried: on the MFG iospace, there are debug registers > that you can poll to check if all bus transactions are finished (so, if the bus > is idle). > During local testing, I even hacked in that, and even with the actual bus being > completely idle, it still freezes... and also checked some more in downstream > code (for Dimensity 9200, kernel 5.10) if there was any other "trick" that I > could make use of, but to no avail. > > I'd propose to get at least this power saving upstreamed, then perhaps in the > future we can somehow revisit this to implement some more aggressive power > management code? > We're still getting a generous power saving with this one, I'd say... > > Anyway, I expect us to be effectively able to be more aggressive here, but I > also expect that to take quite a bit of time (and probably some help from > MediaTek as well)... Sounds good to me. > Angelo > > > ChenYu > > > >> drivers/gpu/drm/panfrost/panfrost_device.c | 61 ++++++++++++++++++++-- > >> drivers/gpu/drm/panfrost/panfrost_device.h | 11 ++++ > >> 2 files changed, 68 insertions(+), 4 deletions(-) > >> > >> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c > >> index 28f7046e1b1a..2022ed76a620 100644 > >> --- a/drivers/gpu/drm/panfrost/panfrost_device.c > >> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c > >> @@ -403,7 +403,7 @@ void panfrost_device_reset(struct panfrost_device *pfdev) > >> panfrost_job_enable_interrupts(pfdev); > >> } > >> > >> -static int panfrost_device_resume(struct device *dev) > >> +static int panfrost_device_runtime_resume(struct device *dev) > >> { > >> struct panfrost_device *pfdev = dev_get_drvdata(dev); > >> > >> @@ -413,7 +413,7 @@ static int panfrost_device_resume(struct device *dev) > >> return 0; > >> } > >> > >> -static int panfrost_device_suspend(struct device *dev) > >> +static int panfrost_device_runtime_suspend(struct device *dev) > >> { > >> struct panfrost_device *pfdev = dev_get_drvdata(dev); > >> > >> @@ -426,5 +426,58 @@ static int panfrost_device_suspend(struct device *dev) > >> return 0; > >> } > >> > >> -EXPORT_GPL_RUNTIME_DEV_PM_OPS(panfrost_pm_ops, panfrost_device_suspend, > >> - panfrost_device_resume, NULL); > >> +static int panfrost_device_resume(struct device *dev) > >> +{ > >> + struct panfrost_device *pfdev = dev_get_drvdata(dev); > >> + int ret; > >> + > >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { > >> + ret = clk_enable(pfdev->clock); > >> + if (ret) > >> + return ret; > >> + > >> + if (pfdev->bus_clock) { > >> + ret = clk_enable(pfdev->bus_clock); > >> + if (ret) > >> + goto err_bus_clk; > >> + } > >> + } > >> + > >> + ret = pm_runtime_force_resume(dev); > >> + if (ret) > >> + goto err_resume; > >> + > >> + return 0; > >> + > >> +err_resume: > >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS) && pfdev->bus_clock) > >> + clk_disable(pfdev->bus_clock); > >> +err_bus_clk: > >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) > >> + clk_disable(pfdev->clock); > >> + return ret; > >> +} > >> + > >> +static int panfrost_device_suspend(struct device *dev) > >> +{ > >> + struct panfrost_device *pfdev = dev_get_drvdata(dev); > >> + int ret; > >> + > >> + ret = pm_runtime_force_suspend(dev); > >> + if (ret) > >> + return ret; > >> + > >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { > >> + clk_disable(pfdev->clock); > >> + > >> + if (pfdev->bus_clock) > >> + clk_disable(pfdev->bus_clock); > >> + } > >> + > >> + return 0; > >> +} > >> + > >> +EXPORT_GPL_DEV_PM_OPS(panfrost_pm_ops) = { > >> + RUNTIME_PM_OPS(panfrost_device_runtime_suspend, panfrost_device_runtime_resume, NULL) > >> + SYSTEM_SLEEP_PM_OPS(panfrost_device_suspend, panfrost_device_resume) > >> +}; > >> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h > >> index 1ef38f60d5dc..d7f179eb8ea3 100644 > >> --- a/drivers/gpu/drm/panfrost/panfrost_device.h > >> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h > >> @@ -25,6 +25,14 @@ struct panfrost_perfcnt; > >> #define NUM_JOB_SLOTS 3 > >> #define MAX_PM_DOMAINS 5 > >> > >> +/** > >> + * enum panfrost_gpu_pm - Supported kernel power management features > >> + * @GPU_PM_CLK_DIS: Allow disabling clocks during system suspend > >> + */ > >> +enum panfrost_gpu_pm { > >> + GPU_PM_CLK_DIS, > >> +}; > >> + > >> struct panfrost_features { > >> u16 id; > >> u16 revision; > >> @@ -75,6 +83,9 @@ struct panfrost_compatible { > >> > >> /* Vendor implementation quirks callback */ > >> void (*vendor_quirk)(struct panfrost_device *pfdev); > >> + > >> + /* Allowed PM features */ > >> + u8 pm_features; > >> }; > >> > >> struct panfrost_device { > >> -- > >> 2.42.0 > >> >
On 02/11/2023 14:26, AngeloGioacchino Del Regno wrote: > Currently, the GPU is being internally powered off for runtime suspend > and turned back on for runtime resume through commands sent to it, but > note that the GPU doesn't need to be clocked during the poweroff state, > hence it is possible to save some power on selected platforms. > > Add suspend and resume handlers for full system sleep and then add > a new panfrost_gpu_pm enumeration and a pm_features variable in the > panfrost_compatible structure: BIT(GPU_PM_CLK_DIS) will be used to > enable this power saving technique only on SoCs that are able to > safely use it. > > Note that this was implemented only for the system sleep case and not > for runtime PM because testing on one of my MediaTek platforms showed > issues when turning on and off clocks aggressively (in PM runtime) > resulting in a full system lockup. > > Doing this only for full system sleep never showed issues during my > testing by suspending and resuming the system continuously for more > than 100 cycles. > > Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> > --- > > Note: Even after fixing the panfrost_power_off() function, I'm still > getting issues with turning off the clocks at .runtime_suspend() but > this time, instead of getting a GPU lockup, the entire SoC will deadlock > bringing down the entire system with it (so it's even worst!) :-) Ouch! Hopefully that's a SoC issue as I can't see anything that should cause problems. But note that if the GPU is powered down during a bus transaction that can lock up the entire bus. > > > drivers/gpu/drm/panfrost/panfrost_device.c | 61 ++++++++++++++++++++-- > drivers/gpu/drm/panfrost/panfrost_device.h | 11 ++++ > 2 files changed, 68 insertions(+), 4 deletions(-) > > diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c > index 28f7046e1b1a..2022ed76a620 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_device.c > +++ b/drivers/gpu/drm/panfrost/panfrost_device.c > @@ -403,7 +403,7 @@ void panfrost_device_reset(struct panfrost_device *pfdev) > panfrost_job_enable_interrupts(pfdev); > } > > -static int panfrost_device_resume(struct device *dev) > +static int panfrost_device_runtime_resume(struct device *dev) > { > struct panfrost_device *pfdev = dev_get_drvdata(dev); > > @@ -413,7 +413,7 @@ static int panfrost_device_resume(struct device *dev) > return 0; > } > > -static int panfrost_device_suspend(struct device *dev) > +static int panfrost_device_runtime_suspend(struct device *dev) > { > struct panfrost_device *pfdev = dev_get_drvdata(dev); > > @@ -426,5 +426,58 @@ static int panfrost_device_suspend(struct device *dev) > return 0; > } > > -EXPORT_GPL_RUNTIME_DEV_PM_OPS(panfrost_pm_ops, panfrost_device_suspend, > - panfrost_device_resume, NULL); > +static int panfrost_device_resume(struct device *dev) > +{ > + struct panfrost_device *pfdev = dev_get_drvdata(dev); > + int ret; > + > + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { > + ret = clk_enable(pfdev->clock); > + if (ret) > + return ret; > + > + if (pfdev->bus_clock) { > + ret = clk_enable(pfdev->bus_clock); > + if (ret) > + goto err_bus_clk; > + } > + } > + > + ret = pm_runtime_force_resume(dev); > + if (ret) > + goto err_resume; > + > + return 0; > + > +err_resume: > + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS) && pfdev->bus_clock) > + clk_disable(pfdev->bus_clock); > +err_bus_clk: > + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) > + clk_disable(pfdev->clock); > + return ret; > +} > + > +static int panfrost_device_suspend(struct device *dev) > +{ > + struct panfrost_device *pfdev = dev_get_drvdata(dev); > + int ret; > + > + ret = pm_runtime_force_suspend(dev); > + if (ret) > + return ret; > + > + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { > + clk_disable(pfdev->clock); > + > + if (pfdev->bus_clock) > + clk_disable(pfdev->bus_clock); NIT: I would normally expect panfrost_device_resume() to have the opposite order. I'm not sure if there's an expected order here but I feel like the bus should be enabled before core - so _resume() would need to be swapped round. Other than that: Reviewed-by: Steven Price <steven.price@arm.com> Thanks, Steve > + } > + > + return 0; > +} > + > +EXPORT_GPL_DEV_PM_OPS(panfrost_pm_ops) = { > + RUNTIME_PM_OPS(panfrost_device_runtime_suspend, panfrost_device_runtime_resume, NULL) > + SYSTEM_SLEEP_PM_OPS(panfrost_device_suspend, panfrost_device_resume) > +}; > diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h > index 1ef38f60d5dc..d7f179eb8ea3 100644 > --- a/drivers/gpu/drm/panfrost/panfrost_device.h > +++ b/drivers/gpu/drm/panfrost/panfrost_device.h > @@ -25,6 +25,14 @@ struct panfrost_perfcnt; > #define NUM_JOB_SLOTS 3 > #define MAX_PM_DOMAINS 5 > > +/** > + * enum panfrost_gpu_pm - Supported kernel power management features > + * @GPU_PM_CLK_DIS: Allow disabling clocks during system suspend > + */ > +enum panfrost_gpu_pm { > + GPU_PM_CLK_DIS, > +}; > + > struct panfrost_features { > u16 id; > u16 revision; > @@ -75,6 +83,9 @@ struct panfrost_compatible { > > /* Vendor implementation quirks callback */ > void (*vendor_quirk)(struct panfrost_device *pfdev); > + > + /* Allowed PM features */ > + u8 pm_features; > }; > > struct panfrost_device {
Il 08/11/23 16:44, Steven Price ha scritto: > On 02/11/2023 14:26, AngeloGioacchino Del Regno wrote: >> Currently, the GPU is being internally powered off for runtime suspend >> and turned back on for runtime resume through commands sent to it, but >> note that the GPU doesn't need to be clocked during the poweroff state, >> hence it is possible to save some power on selected platforms. >> >> Add suspend and resume handlers for full system sleep and then add >> a new panfrost_gpu_pm enumeration and a pm_features variable in the >> panfrost_compatible structure: BIT(GPU_PM_CLK_DIS) will be used to >> enable this power saving technique only on SoCs that are able to >> safely use it. >> >> Note that this was implemented only for the system sleep case and not >> for runtime PM because testing on one of my MediaTek platforms showed >> issues when turning on and off clocks aggressively (in PM runtime) >> resulting in a full system lockup. >> >> Doing this only for full system sleep never showed issues during my >> testing by suspending and resuming the system continuously for more >> than 100 cycles. >> >> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com> >> --- >> >> Note: Even after fixing the panfrost_power_off() function, I'm still >> getting issues with turning off the clocks at .runtime_suspend() but >> this time, instead of getting a GPU lockup, the entire SoC will deadlock >> bringing down the entire system with it (so it's even worst!) :-) > > Ouch! Hopefully that's a SoC issue as I can't see anything that should > cause problems. But note that if the GPU is powered down during a bus > transaction that can lock up the entire bus. >> >> >> drivers/gpu/drm/panfrost/panfrost_device.c | 61 ++++++++++++++++++++-- >> drivers/gpu/drm/panfrost/panfrost_device.h | 11 ++++ >> 2 files changed, 68 insertions(+), 4 deletions(-) >> >> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c >> index 28f7046e1b1a..2022ed76a620 100644 >> --- a/drivers/gpu/drm/panfrost/panfrost_device.c >> +++ b/drivers/gpu/drm/panfrost/panfrost_device.c >> @@ -403,7 +403,7 @@ void panfrost_device_reset(struct panfrost_device *pfdev) >> panfrost_job_enable_interrupts(pfdev); >> } >> >> -static int panfrost_device_resume(struct device *dev) >> +static int panfrost_device_runtime_resume(struct device *dev) >> { >> struct panfrost_device *pfdev = dev_get_drvdata(dev); >> >> @@ -413,7 +413,7 @@ static int panfrost_device_resume(struct device *dev) >> return 0; >> } >> >> -static int panfrost_device_suspend(struct device *dev) >> +static int panfrost_device_runtime_suspend(struct device *dev) >> { >> struct panfrost_device *pfdev = dev_get_drvdata(dev); >> >> @@ -426,5 +426,58 @@ static int panfrost_device_suspend(struct device *dev) >> return 0; >> } >> >> -EXPORT_GPL_RUNTIME_DEV_PM_OPS(panfrost_pm_ops, panfrost_device_suspend, >> - panfrost_device_resume, NULL); >> +static int panfrost_device_resume(struct device *dev) >> +{ >> + struct panfrost_device *pfdev = dev_get_drvdata(dev); >> + int ret; >> + >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { >> + ret = clk_enable(pfdev->clock); >> + if (ret) >> + return ret; >> + >> + if (pfdev->bus_clock) { >> + ret = clk_enable(pfdev->bus_clock); >> + if (ret) >> + goto err_bus_clk; >> + } >> + } >> + >> + ret = pm_runtime_force_resume(dev); >> + if (ret) >> + goto err_resume; >> + >> + return 0; >> + >> +err_resume: >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS) && pfdev->bus_clock) >> + clk_disable(pfdev->bus_clock); >> +err_bus_clk: >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) >> + clk_disable(pfdev->clock); >> + return ret; >> +} >> + >> +static int panfrost_device_suspend(struct device *dev) >> +{ >> + struct panfrost_device *pfdev = dev_get_drvdata(dev); >> + int ret; >> + >> + ret = pm_runtime_force_suspend(dev); >> + if (ret) >> + return ret; >> + >> + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { >> + clk_disable(pfdev->clock); >> + >> + if (pfdev->bus_clock) >> + clk_disable(pfdev->bus_clock); > > NIT: I would normally expect panfrost_device_resume() to have the > opposite order. I'm not sure if there's an expected order here but I > feel like the bus should be enabled before core - so _resume() would > need to be swapped round. > Actually, in panfrost_clk_init(), "bus" gets enabled after core... I'm not sure whether this was intentional or not either - but for consistency I will swap them in suspend (turning off `bus_clock` first, 'clock` after) as that's how it's done in panfrost_clk_fini() as well (except there the clocks are also unprepared). Though, I would agree on the logical fact that bus should get disabled after core... Cheers, Angelo > Other than that: > > Reviewed-by: Steven Price <steven.price@arm.com> > > Thanks, > > Steve > >> + } >> + >> + return 0; >> +} >> + >> +EXPORT_GPL_DEV_PM_OPS(panfrost_pm_ops) = { >> + RUNTIME_PM_OPS(panfrost_device_runtime_suspend, panfrost_device_runtime_resume, NULL) >> + SYSTEM_SLEEP_PM_OPS(panfrost_device_suspend, panfrost_device_resume) >> +}; >> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h >> index 1ef38f60d5dc..d7f179eb8ea3 100644 >> --- a/drivers/gpu/drm/panfrost/panfrost_device.h >> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h >> @@ -25,6 +25,14 @@ struct panfrost_perfcnt; >> #define NUM_JOB_SLOTS 3 >> #define MAX_PM_DOMAINS 5 >> >> +/** >> + * enum panfrost_gpu_pm - Supported kernel power management features >> + * @GPU_PM_CLK_DIS: Allow disabling clocks during system suspend >> + */ >> +enum panfrost_gpu_pm { >> + GPU_PM_CLK_DIS, >> +}; >> + >> struct panfrost_features { >> u16 id; >> u16 revision; >> @@ -75,6 +83,9 @@ struct panfrost_compatible { >> >> /* Vendor implementation quirks callback */ >> void (*vendor_quirk)(struct panfrost_device *pfdev); >> + >> + /* Allowed PM features */ >> + u8 pm_features; >> }; >> >> struct panfrost_device { > > _______________________________________________ > Kernel mailing list -- kernel@mailman.collabora.com > To unsubscribe send an email to kernel-leave@mailman.collabora.com
diff --git a/drivers/gpu/drm/panfrost/panfrost_device.c b/drivers/gpu/drm/panfrost/panfrost_device.c index 28f7046e1b1a..2022ed76a620 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.c +++ b/drivers/gpu/drm/panfrost/panfrost_device.c @@ -403,7 +403,7 @@ void panfrost_device_reset(struct panfrost_device *pfdev) panfrost_job_enable_interrupts(pfdev); } -static int panfrost_device_resume(struct device *dev) +static int panfrost_device_runtime_resume(struct device *dev) { struct panfrost_device *pfdev = dev_get_drvdata(dev); @@ -413,7 +413,7 @@ static int panfrost_device_resume(struct device *dev) return 0; } -static int panfrost_device_suspend(struct device *dev) +static int panfrost_device_runtime_suspend(struct device *dev) { struct panfrost_device *pfdev = dev_get_drvdata(dev); @@ -426,5 +426,58 @@ static int panfrost_device_suspend(struct device *dev) return 0; } -EXPORT_GPL_RUNTIME_DEV_PM_OPS(panfrost_pm_ops, panfrost_device_suspend, - panfrost_device_resume, NULL); +static int panfrost_device_resume(struct device *dev) +{ + struct panfrost_device *pfdev = dev_get_drvdata(dev); + int ret; + + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { + ret = clk_enable(pfdev->clock); + if (ret) + return ret; + + if (pfdev->bus_clock) { + ret = clk_enable(pfdev->bus_clock); + if (ret) + goto err_bus_clk; + } + } + + ret = pm_runtime_force_resume(dev); + if (ret) + goto err_resume; + + return 0; + +err_resume: + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS) && pfdev->bus_clock) + clk_disable(pfdev->bus_clock); +err_bus_clk: + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) + clk_disable(pfdev->clock); + return ret; +} + +static int panfrost_device_suspend(struct device *dev) +{ + struct panfrost_device *pfdev = dev_get_drvdata(dev); + int ret; + + ret = pm_runtime_force_suspend(dev); + if (ret) + return ret; + + if (pfdev->comp->pm_features & BIT(GPU_PM_CLK_DIS)) { + clk_disable(pfdev->clock); + + if (pfdev->bus_clock) + clk_disable(pfdev->bus_clock); + } + + return 0; +} + +EXPORT_GPL_DEV_PM_OPS(panfrost_pm_ops) = { + RUNTIME_PM_OPS(panfrost_device_runtime_suspend, panfrost_device_runtime_resume, NULL) + SYSTEM_SLEEP_PM_OPS(panfrost_device_suspend, panfrost_device_resume) +}; diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h index 1ef38f60d5dc..d7f179eb8ea3 100644 --- a/drivers/gpu/drm/panfrost/panfrost_device.h +++ b/drivers/gpu/drm/panfrost/panfrost_device.h @@ -25,6 +25,14 @@ struct panfrost_perfcnt; #define NUM_JOB_SLOTS 3 #define MAX_PM_DOMAINS 5 +/** + * enum panfrost_gpu_pm - Supported kernel power management features + * @GPU_PM_CLK_DIS: Allow disabling clocks during system suspend + */ +enum panfrost_gpu_pm { + GPU_PM_CLK_DIS, +}; + struct panfrost_features { u16 id; u16 revision; @@ -75,6 +83,9 @@ struct panfrost_compatible { /* Vendor implementation quirks callback */ void (*vendor_quirk)(struct panfrost_device *pfdev); + + /* Allowed PM features */ + u8 pm_features; }; struct panfrost_device {