Message ID | 20230919233556.1458793-7-adrian.larumbe@collabora.com |
---|---|
State | New |
Headers |
From: Adrián Larumbe <adrian.larumbe@collabora.com>
To: maarten.lankhorst@linux.intel.com, mripard@kernel.org, tzimmermann@suse.de, airlied@gmail.com, daniel@ffwll.ch, robdclark@gmail.com, quic_abhinavk@quicinc.com, dmitry.baryshkov@linaro.org, sean@poorly.run, marijn.suijten@somainline.org, robh@kernel.org, steven.price@arm.com
Cc: adrian.larumbe@collabora.com, dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org, linux-arm-msm@vger.kernel.org, freedreno@lists.freedesktop.org, healych@amazon.com, kernel@collabora.com, Boris Brezillon <boris.brezillon@collabora.com>
Subject: [PATCH v6 6/6] drm/drm-file: Show finer-grained BO sizes in drm_show_memory_stats
Date: Wed, 20 Sep 2023 00:34:54 +0100
Message-ID: <20230919233556.1458793-7-adrian.larumbe@collabora.com>
In-Reply-To: <20230919233556.1458793-1-adrian.larumbe@collabora.com>
References: <20230919233556.1458793-1-adrian.larumbe@collabora.com>
X-Mailer: git-send-email 2.42.0
List-ID: <linux-kernel.vger.kernel.org> |
Series |
Add fdinfo support to Panfrost
|
|
Commit Message
Adrián Larumbe
Sept. 19, 2023, 11:34 p.m. UTC
The current implementation will try to pick the highest available size
display unit as soon as the BO size exceeds that of the previous
multiplier. That can lead to loss of precision in contexts of low memory
usage.

The new selection criteria try to preserve precision, whilst also
increasing the display unit selection threshold to render more accurate
values.

Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price <steven.price@arm.com>
---
 drivers/gpu/drm/drm_file.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
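To make the change in the selection rule concrete, here is a minimal userspace sketch, not kernel code. It assumes that print_size() scales through the units "", " KiB" and " MiB" (the unit table is not shown in this patch), replaces div_u64() with plain division, and uses the hypothetical names format_size_old() and format_size_new().

```c
#include <stdint.h>
#include <stdio.h>

#define SZ_1K 1024ULL
#define UPPER_UNIT_THRESHOLD 100

/* Assumed unit table; the patch itself does not show it. */
static const char *const units[] = { "", " KiB", " MiB" };
#define NUM_UNITS (sizeof(units) / sizeof(units[0]))

/* Old rule: move to the next unit as soon as the value reaches 1024. */
static void format_size_old(char *buf, size_t len, uint64_t sz)
{
        unsigned int u;

        for (u = 0; u < NUM_UNITS - 1; u++) {
                if (sz < SZ_1K)
                        break;
                sz /= SZ_1K;
        }
        snprintf(buf, len, "%llu%s", (unsigned long long)sz, units[u]);
}

/*
 * New rule: stay in the current unit when the value is not a multiple
 * of 1024 and is still below UPPER_UNIT_THRESHOLD * 1024.
 */
static void format_size_new(char *buf, size_t len, uint64_t sz)
{
        unsigned int u;

        for (u = 0; u < NUM_UNITS - 1; u++) {
                if ((sz & (SZ_1K - 1)) &&
                    sz < UPPER_UNIT_THRESHOLD * SZ_1K)
                        break;
                sz /= SZ_1K;
        }
        snprintf(buf, len, "%llu%s", (unsigned long long)sz, units[u]);
}
```

With these definitions a 1536-byte total is reported as `1 KiB` by the old rule and as `1536` by the new one, while an unaligned total of 102400 bytes or more still moves up a unit under both rules.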
Comments
On 20/09/2023 00:34, Adrián Larumbe wrote:
> The current implementation will try to pick the highest available size
> display unit as soon as the BO size exceeds that of the previous
> multiplier. That can lead to loss of precision in contexts of low memory
> usage.
>
> The new selection criteria try to preserve precision, whilst also
> increasing the display unit selection threshold to render more accurate
> values.
>
> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
> Reviewed-by: Steven Price <steven.price@arm.com>
> ---
>  drivers/gpu/drm/drm_file.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index 762965e3d503..34cfa128ffe5 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -872,6 +872,8 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
>  }
>  EXPORT_SYMBOL(drm_send_event);
>
> +#define UPPER_UNIT_THRESHOLD 100
> +
>  static void print_size(struct drm_printer *p, const char *stat,
>                         const char *region, u64 sz)
>  {
> @@ -879,7 +881,8 @@ static void print_size(struct drm_printer *p, const char *stat,
>          unsigned u;
>
>          for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
> -                if (sz < SZ_1K)
> +                if ((sz & (SZ_1K - 1)) &&

IS_ALIGNED worth it at all?

> +                    sz < UPPER_UNIT_THRESHOLD * SZ_1K)
>                          break;

Excuse me for a late comment (I was away). I did not get what is special about a ~10% threshold? Sounds to me just going with the lower unit, when size is not aligned to the higher one, would be better than sometimes precision-sometimes-not.

Regards,

Tvrtko

>                  sz = div_u64(sz, SZ_1K);
>          }
On 20/09/2023 16:32, Tvrtko Ursulin wrote:
>
> On 20/09/2023 00:34, Adrián Larumbe wrote:
>> The current implementation will try to pick the highest available size
>> display unit as soon as the BO size exceeds that of the previous
>> multiplier. That can lead to loss of precision in contexts of low memory
>> usage.
>>
>> The new selection criteria try to preserve precision, whilst also
>> increasing the display unit selection threshold to render more accurate
>> values.
>>
>> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
>> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
>> Reviewed-by: Steven Price <steven.price@arm.com>
>> ---
>>  drivers/gpu/drm/drm_file.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>> index 762965e3d503..34cfa128ffe5 100644
>> --- a/drivers/gpu/drm/drm_file.c
>> +++ b/drivers/gpu/drm/drm_file.c
>> @@ -872,6 +872,8 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
>>  }
>>  EXPORT_SYMBOL(drm_send_event);
>> +#define UPPER_UNIT_THRESHOLD 100
>> +
>>  static void print_size(struct drm_printer *p, const char *stat,
>>                         const char *region, u64 sz)
>>  {
>> @@ -879,7 +881,8 @@ static void print_size(struct drm_printer *p, const char *stat,
>>          unsigned u;
>>          for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
>> -                if (sz < SZ_1K)
>> +                if ((sz & (SZ_1K - 1)) &&
>
> IS_ALIGNED worth it at all?
>
>> +                    sz < UPPER_UNIT_THRESHOLD * SZ_1K)
>>                          break;
>
> Excuse me for a late comment (I was away). I did not get what is
> special about a ~10% threshold? Sounds to me just going with the lower
> unit, when size is not aligned to the higher one, would be better than
> sometimes precision-sometimes-not.

FWIW both the current and the threshold option make testing the feature very annoying.

So I'd really propose we simply use the smaller unit when unaligned.

Regards,

Tvrtko
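The alignment-only rule proposed above can be sketched in userspace as follows. This is an illustration of the idea as described in the thread, not code that anyone posted: IS_ALIGNED is a simplified stand-in for the kernel macro, the unit table is an assumption, and format_size_aligned() is a hypothetical name.

```c
#include <stdint.h>
#include <stdio.h>

#define SZ_1K 1024ULL
/* Simplified stand-in for the kernel's IS_ALIGNED(); power-of-two 'a' only. */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Assumed unit table, mirroring what print_size() appears to use. */
static const char *const units[] = { "", " KiB", " MiB" };
#define NUM_UNITS (sizeof(units) / sizeof(units[0]))

/*
 * Move to the next unit only while the value divides evenly by 1024,
 * so the printed number times its unit always equals the input.
 */
static void format_size_aligned(char *buf, size_t len, uint64_t sz)
{
        unsigned int u;

        for (u = 0; u < NUM_UNITS - 1; u++) {
                if (!IS_ALIGNED(sz, SZ_1K))
                        break;
                sz /= SZ_1K;
        }
        snprintf(buf, len, "%llu%s", (unsigned long long)sz, units[u]);
}
```

Under this rule no precision is ever lost, at the cost of longer numbers: a total of 3 MiB plus 1 KiB is printed as `3073 KiB`, and 153601 bytes is printed as `153601`.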
On 20.09.2023 16:32, Tvrtko Ursulin wrote:
>
>On 20/09/2023 00:34, Adrián Larumbe wrote:
>> The current implementation will try to pick the highest available size
>> display unit as soon as the BO size exceeds that of the previous
>> multiplier. That can lead to loss of precision in contexts of low memory
>> usage.
>>
>> The new selection criteria try to preserve precision, whilst also
>> increasing the display unit selection threshold to render more accurate
>> values.
>>
>> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
>> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
>> Reviewed-by: Steven Price <steven.price@arm.com>
>> ---
>>  drivers/gpu/drm/drm_file.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>> index 762965e3d503..34cfa128ffe5 100644
>> --- a/drivers/gpu/drm/drm_file.c
>> +++ b/drivers/gpu/drm/drm_file.c
>> @@ -872,6 +872,8 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
>>  }
>>  EXPORT_SYMBOL(drm_send_event);
>> +#define UPPER_UNIT_THRESHOLD 100
>> +
>>  static void print_size(struct drm_printer *p, const char *stat,
>>                         const char *region, u64 sz)
>>  {
>> @@ -879,7 +881,8 @@ static void print_size(struct drm_printer *p, const char *stat,
>>          unsigned u;
>>          for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
>> -                if (sz < SZ_1K)
>> +                if ((sz & (SZ_1K - 1)) &&
>
>IS_ALIGNED worth it at all?

This could look better, yeah.

>> +                    sz < UPPER_UNIT_THRESHOLD * SZ_1K)
>>                          break;
>
>Excuse me for a late comment (I was away). I did not get what is special
>about a ~10% threshold? Sounds to me just going with the lower unit, when size
>is not aligned to the higher one, would be better than sometimes
>precision-sometimes-not.

We had a bit of a debate over this in previous revisions of the patch. It all began when a Panfrost user complained that for relatively small BOs, they were losing precision in the fdinfo file because the sum of the sizes of all BOs for a DRM file was on the order of MiBs, but not big enough to warrant losing accuracy when plotting them on nvtop or gputop.

At first I thought of letting drivers pick their own preferred unit, but this would lead to inconsistency in the units presented in the fdinfo file across different DRM devices. Rob then suggested imposing a unit multiple threshold, while Boris made the suggestion of checking for unit size alignment to lessen precision loss.

In the end Rob thought that minding both constraints was a good compromise.

The unit threshold was picked somewhat arbitrarily, and was suggested by Rob himself. The point of having it is to avoid huge number representations for BO size tallies that aren't aligned to the next unit. In addition, BO size sums are scaled when plotting them on a Y axis, so complete accuracy isn't a requirement.

>Regards,
>
>Tvrtko
>
>>                  sz = div_u64(sz, SZ_1K);
>>          }

Adrian Larumbe
On 21.09.2023 11:14, Tvrtko Ursulin wrote:
>
>On 20/09/2023 16:32, Tvrtko Ursulin wrote:
>>
>> On 20/09/2023 00:34, Adrián Larumbe wrote:
>> > The current implementation will try to pick the highest available size
>> > display unit as soon as the BO size exceeds that of the previous
>> > multiplier. That can lead to loss of precision in contexts of low memory
>> > usage.
>> >
>> > The new selection criteria try to preserve precision, whilst also
>> > increasing the display unit selection threshold to render more accurate
>> > values.
>> >
>> > Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
>> > Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
>> > Reviewed-by: Steven Price <steven.price@arm.com>
>> > ---
>> >  drivers/gpu/drm/drm_file.c | 5 ++++-
>> >  1 file changed, 4 insertions(+), 1 deletion(-)
>> >
>> > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>> > index 762965e3d503..34cfa128ffe5 100644
>> > --- a/drivers/gpu/drm/drm_file.c
>> > +++ b/drivers/gpu/drm/drm_file.c
>> > @@ -872,6 +872,8 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
>> >  }
>> >  EXPORT_SYMBOL(drm_send_event);
>> > +#define UPPER_UNIT_THRESHOLD 100
>> > +
>> >  static void print_size(struct drm_printer *p, const char *stat,
>> >                         const char *region, u64 sz)
>> >  {
>> > @@ -879,7 +881,8 @@ static void print_size(struct drm_printer *p, const char *stat,
>> >          unsigned u;
>> >          for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
>> > -                if (sz < SZ_1K)
>> > +                if ((sz & (SZ_1K - 1)) &&
>>
>> IS_ALIGNED worth it at all?
>>
>> > +                    sz < UPPER_UNIT_THRESHOLD * SZ_1K)
>> >                          break;
>>
>> Excuse me for a late comment (I was away). I did not get what is
>> special about a ~10% threshold? Sounds to me just going with the lower
>> unit, when size is not aligned to the higher one, would be better than
>> sometimes precision-sometimes-not.
>
>FWIW both current and the threshold option make testing the feature very
>annoying.

How so?

>So I'd really propose we simply use smaller unit when unaligned.

Like I said in the previous reply, for drm files whose overall BO size sum is enormous but not a multiple of a MiB, this would render huge number representations in KiB. I don't find this particularly comfortable to read, and then this extra precision would mean nothing to nvtop or gputop, which would have to scale the size to their available screen dimensions when plotting them.

>Regards,
>
>Tvrtko
On 22/09/2023 12:03, Adrián Larumbe wrote:
> On 21.09.2023 11:14, Tvrtko Ursulin wrote:
>>
>> On 20/09/2023 16:32, Tvrtko Ursulin wrote:
>>>
>>> On 20/09/2023 00:34, Adrián Larumbe wrote:
>>>> The current implementation will try to pick the highest available size
>>>> display unit as soon as the BO size exceeds that of the previous
>>>> multiplier. That can lead to loss of precision in contexts of low memory
>>>> usage.
>>>>
>>>> The new selection criteria try to preserve precision, whilst also
>>>> increasing the display unit selection threshold to render more accurate
>>>> values.
>>>>
>>>> Signed-off-by: Adrián Larumbe <adrian.larumbe@collabora.com>
>>>> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
>>>> Reviewed-by: Steven Price <steven.price@arm.com>
>>>> ---
>>>>  drivers/gpu/drm/drm_file.c | 5 ++++-
>>>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>>>> index 762965e3d503..34cfa128ffe5 100644
>>>> --- a/drivers/gpu/drm/drm_file.c
>>>> +++ b/drivers/gpu/drm/drm_file.c
>>>> @@ -872,6 +872,8 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
>>>>  }
>>>>  EXPORT_SYMBOL(drm_send_event);
>>>> +#define UPPER_UNIT_THRESHOLD 100
>>>> +
>>>>  static void print_size(struct drm_printer *p, const char *stat,
>>>>                         const char *region, u64 sz)
>>>>  {
>>>> @@ -879,7 +881,8 @@ static void print_size(struct drm_printer *p, const char *stat,
>>>>          unsigned u;
>>>>          for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
>>>> -                if (sz < SZ_1K)
>>>> +                if ((sz & (SZ_1K - 1)) &&
>>>
>>> IS_ALIGNED worth it at all?
>>>
>>>> +                    sz < UPPER_UNIT_THRESHOLD * SZ_1K)
>>>>                          break;
>>>
>>> Excuse me for a late comment (I was away). I did not get what is
>>> special about a ~10% threshold? Sounds to me just going with the lower
>>> unit, when size is not aligned to the higher one, would be better than
>>> sometimes precision-sometimes-not.
>>
>> FWIW both current and the threshold option make testing the feature very
>> annoying.
>
> How so?

I have to build knowledge of the implementation details of print_size() into my IGT in order to use the right size BOs, so that the test is able to verify stats move as expected. It just feels wrong.

>> So I'd really propose we simply use smaller unit when unaligned.
>
> Like I said in the previous reply, for drm files whose overall BO size sum is enormous
> but not a multiple of a MiB, this would render huge number representations in KiB.
> I don't find this particularly comfortable to read, and then this extra precision
> would mean nothing to nvtop or gputop, which would have to scale the size to their
> available screen dimensions when plotting them.

I don't think numbers in KiB are so huge. And I don't think people will end up reading them manually a lot anyway, since you have to hunt the pid, and fd, etc. It is much more realistic that some tool like gputop will be used.

And I don't think consistency of units across drivers or whatever matters. Even better to keep userspace parsers on their toes and make them follow drm-usage-stats.rst and not any implementations, at some point in time.

Regards,

Tvrtko
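A userspace parser that follows the documented format rather than any one implementation might look like the sketch below. It assumes memory values take the form `<uint> [KiB|MiB]`, as drm-usage-stats.rst describes for the memory keys; parse_drm_size() is a hypothetical helper name, and overflow when scaling very large values is deliberately not handled.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse the value part of an fdinfo memory key, e.g. "3584 KiB",
 * into bytes. Returns 0 on success and -1 on malformed input.
 */
static int parse_drm_size(const char *value, uint64_t *bytes)
{
        unsigned long long n;
        unsigned int shift = 0;
        char *end;

        while (isspace((unsigned char)*value))
                value++;
        if (!isdigit((unsigned char)*value))
                return -1;

        errno = 0;
        n = strtoull(value, &end, 10);
        if (errno)
                return -1;

        while (isspace((unsigned char)*end))
                end++;

        /* The unit suffix is optional; no suffix means plain bytes. */
        if (strncmp(end, "KiB", 3) == 0) {
                shift = 10;
                end += 3;
        } else if (strncmp(end, "MiB", 3) == 0) {
                shift = 20;
                end += 3;
        }

        /* Allow trailing whitespace such as a newline, nothing else. */
        while (isspace((unsigned char)*end))
                end++;
        if (*end != '\0')
                return -1;

        *bytes = (uint64_t)n << shift;
        return 0;
}
```

Because the parser normalizes every value to bytes, a test written against it does not need to know which unit the kernel chose for a given BO size.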
diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index 762965e3d503..34cfa128ffe5 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -872,6 +872,8 @@ void drm_send_event(struct drm_device *dev, struct drm_pending_event *e)
 }
 EXPORT_SYMBOL(drm_send_event);
 
+#define UPPER_UNIT_THRESHOLD 100
+
 static void print_size(struct drm_printer *p, const char *stat,
                        const char *region, u64 sz)
 {
@@ -879,7 +881,8 @@ static void print_size(struct drm_printer *p, const char *stat,
         unsigned u;
 
         for (u = 0; u < ARRAY_SIZE(units) - 1; u++) {
-                if (sz < SZ_1K)
+                if ((sz & (SZ_1K - 1)) &&
+                    sz < UPPER_UNIT_THRESHOLD * SZ_1K)
                         break;
                 sz = div_u64(sz, SZ_1K);
         }
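Following the IS_ALIGNED suggestion raised in review, the break condition could be spelled as in the sketch below. This is not a posted revision of the patch: IS_ALIGNED here is a simplified userspace stand-in for the kernel macro, and stop_mask() and stop_aligned() are hypothetical names used only to compare the two spellings.

```c
#include <stdint.h>

#define SZ_1K 1024ULL
#define UPPER_UNIT_THRESHOLD 100
/* Simplified stand-in for the kernel's IS_ALIGNED(); power-of-two 'a' only. */
#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

/* Condition as written in the patch: nonzero means "stop at this unit". */
static int stop_mask(uint64_t sz)
{
        return (sz & (SZ_1K - 1)) && sz < UPPER_UNIT_THRESHOLD * SZ_1K;
}

/* Same condition expressed with IS_ALIGNED. */
static int stop_aligned(uint64_t sz)
{
        return !IS_ALIGNED(sz, SZ_1K) && sz < UPPER_UNIT_THRESHOLD * SZ_1K;
}
```

Both functions return the same result for every input; the second only makes the alignment intent explicit.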