Message ID | 20230922062558.1739642-1-max.kellermann@ionos.com |
---|---|
State | New |
Headers |
From: Max Kellermann <max.kellermann@ionos.com> To: Xiubo Li <xiubli@redhat.com>, Ilya Dryomov <idryomov@gmail.com>, Jeff Layton <jlayton@kernel.org> Cc: Max Kellermann <max.kellermann@ionos.com>, ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 1/2] fs/ceph/debugfs: make all files world-readable Date: Fri, 22 Sep 2023 08:25:57 +0200 Message-Id: <20230922062558.1739642-1-max.kellermann@ionos.com> X-Mailer: git-send-email 2.39.2 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org |
Series |
[1/2] fs/ceph/debugfs: make all files world-readable
|
|
Commit Message
Max Kellermann
Sept. 22, 2023, 6:25 a.m. UTC
I'd like to be able to run metrics collector processes without special
privileges.

In the kernel, there is a mix of debugfs files being world-readable
and not world-readable; with a naive "git grep", I found 723
world-readable debugfs_create_file() calls and 582 calls which were
only accessible to privileged processes.

From the code, I cannot derive a consistent policy for that, but the
ceph statistics seem harmless (and useful) enough.

Signed-off-by: Max Kellermann <max.kellermann@ionos.com>
---
fs/ceph/debugfs.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
Comments
On 9/22/23 14:25, Max Kellermann wrote:
> I'd like to be able to run metrics collector processes without special
> privileges.
>
> In the kernel, there is a mix of debugfs files being world-readable
> and not world-readable; with a naive "git grep", I found 723
> world-readable debugfs_create_file() calls and 582 calls which were
> only accessible to privileged processes.
>
> From the code, I cannot derive a consistent policy for that, but the
> ceph statistics seem harmless (and useful) enough.

I am not sure whether will this make sense. Because the 'debug' under
'/sys/kernel/' is also only accessible by privileged process.

Ilya, Jeff

Any idea ?

Thanks

- Xiubo

On Mon, 2023-09-25 at 13:18 +0800, Xiubo Li wrote:
> On 9/22/23 14:25, Max Kellermann wrote:
> > I'd like to be able to run metrics collector processes without special
> > privileges.
> >
> > In the kernel, there is a mix of debugfs files being world-readable
> > and not world-readable; with a naive "git grep", I found 723
> > world-readable debugfs_create_file() calls and 582 calls which were
> > only accessible to privileged processes.
> >
> > From the code, I cannot derive a consistent policy for that, but the
> > ceph statistics seem harmless (and useful) enough.
>
> I am not sure whether will this make sense. Because the 'debug' under
> '/sys/kernel/' is also only accessible by privileged process.
>
> Ilya, Jeff
>
> Any idea ?
>

Yeah, I don't think this makes much sense. At least on my machine:

    # stat -c '%A' /sys/kernel/debug
    drwx------

Without at least x permissions, an unprivileged user can't pathwalk
through there. Max, how are you testing this?

On Fri, Sep 22, 2023 at 8:26 AM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> I'd like to be able to run metrics collector processes without special
> privileges

Hi Max,

A word of caution about building metrics collectors based on debugfs
output: there are no stability guarantees. While the format won't be
changed just for the sake of change of course, expect zero effort to
preserve backwards compatibility.

The latency metrics in particular are sent to the MDS in binary form
and are intended to be consumed through commands like "ceph fs top".
debugfs stuff is there just for an occasional sneak peek (apart from
actual debugging).

Thanks,

Ilya

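Given that the format carries no stability guarantee, a collector that does read these files may want to verify the layout before trusting any column. The following is only a sketch of that idea: the expected header is the one printed by `cat /sys/kernel/debug/ceph/*/metrics/latency` later in this thread, while the row layout (a name followed by five integers), the `latency_row` struct and both function names are assumptions made for illustration, not something the thread or the kernel promises.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for one row; the field names are illustrative. */
struct latency_row {
    char item[32];
    long long total, avg_us, min_us, max_us, stdev_us;
};

/* Return 1 only if the header has exactly the six expected column names. */
static int header_matches(const char *line)
{
    static const char *const expected[6] = {
        "item", "total", "avg_lat(us)", "min_lat(us)", "max_lat(us)", "stdev(us)"
    };
    char col[7][32];
    /* A seventh token is scanned so that an added column is detected too. */
    int n = sscanf(line, "%31s %31s %31s %31s %31s %31s %31s",
                   col[0], col[1], col[2], col[3], col[4], col[5], col[6]);

    if (n != 6)
        return 0;
    for (int i = 0; i < 6; i++)
        if (strcmp(col[i], expected[i]) != 0)
            return 0;
    return 1;
}

/* Parse one data row (ASSUMED layout: name + five integers); 1 on success. */
static int parse_row(const char *line, struct latency_row *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, "%31s %lld %lld %lld %lld %lld",
                  out->item, &out->total, &out->avg_us,
                  &out->min_us, &out->max_us, &out->stdev_us) == 6;
}
```

A collector built this way would stop exporting, rather than export wrong numbers, as soon as a kernel update renames, reorders or adds a column.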
On Mon, 2023-09-25 at 12:41 +0200, Ilya Dryomov wrote:
> On Fri, Sep 22, 2023 at 8:26 AM Max Kellermann <max.kellermann@ionos.com> wrote:
> >
> > I'd like to be able to run metrics collector processes without special
> > privileges
>
> Hi Max,
>
> A word of caution about building metrics collectors based on debugfs
> output: there are no stability guarantees. While the format won't be
> changed just for the sake of change of course, expect zero effort to
> preserve backwards compatibility.
>
> The latency metrics in particular are sent to the MDS in binary form
> and are intended to be consumed through commands like "ceph fs top".
> debugfs stuff is there just for an occasional sneak peek (apart from
> actual debugging).
>

FWIW, I wish we had gone with netlink for this functionality instead of
a seqfile. Lorenzo has been working with netlink for some similar
functionality with nfsd[1], and it's much nicer for this sort of thing.

[1]: https://lore.kernel.org/linux-nfs/ZQTM6l7NrsVHFoR5@lore-desk/T/#t

On Mon, Sep 25, 2023 at 7:18 AM Xiubo Li <xiubli@redhat.com> wrote:
> I am not sure whether will this make sense. Because the 'debug' under
> '/sys/kernel/' is also only accessible by privileged process.

Not exactly correct. It is by default accessible to processes who have
CAP_DAC_OVERRIDE and additionally it is accessible to (unprivileged)
processes running as uid=0 (those two traits usually overlap).

But we don't want to run kernel-exporter as uid=0 and neither do we
want to give it CAP_DAC_OVERRIDE; both would be too much, it would
affect much more than just (read) access to debugfs.

Instead, we mount debugfs with "gid=X,mode=0710". That way, we can
give (unprivileged) processes which are member of a certain group
access to debugfs, and we put our kernel-exporter process in that
group.

We can use these mount options to change debugfs defaults, but if a
debugfs implementor (such as cephfs) decides to override these global
debugfs settings by passing stricter file permissions, we can't easily
override that. And that is what my patch is about: restore the ability
to override debugfs permissions with a mount option, as debugfs was
designed.

Max

On Mon, Sep 25, 2023 at 12:42 PM Ilya Dryomov <idryomov@gmail.com> wrote:
> A word of caution about building metrics collectors based on debugfs
> output: there are no stability guarantees. While the format won't be
> changed just for the sake of change of course, expect zero effort to
> preserve backwards compatibility.

Agree, but there's nothing else. We have been using my patch for quite
some time, and it has been very useful.

Maybe we can discuss promoting these statistics to sysfs/proc? (the
raw numbers, not the existing aggregates which are useless for any
practical purpose)

> The latency metrics in particular are sent to the MDS in binary form
> and are intended to be consumed through commands like "ceph fs top".
> debugfs stuff is there just for an occasional sneak peek (apart from
> actual debugging).

I don't know the whole Ceph ecosystem so well, but "ceph" is a command
that is supposed to run on a Ceph server, and not on a machine that
mounts a cephfs, right? If that's right, then this command is useless
for me.

Max

On Tue, Sep 26, 2023 at 8:16 AM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> On Mon, Sep 25, 2023 at 12:42 PM Ilya Dryomov <idryomov@gmail.com> wrote:
> > A word of caution about building metrics collectors based on debugfs
> > output: there are no stability guarantees. While the format won't be
> > changed just for the sake of change of course, expect zero effort to
> > preserve backwards compatibility.
>
> Agree, but there's nothing else. We have been using my patch for quite
> some time, and it has been very useful.
>
> Maybe we can discuss promoting these statistics to sysfs/proc? (the
> raw numbers, not the existing aggregates which are useless for any
> practical purpose)
>
> > The latency metrics in particular are sent to the MDS in binary form
> > and are intended to be consumed through commands like "ceph fs top".
> > debugfs stuff is there just for an occasional sneak peek (apart from
> > actual debugging).
>
> I don't know the whole Ceph ecosystem so well, but "ceph" is a command
> that is supposed to run on a Ceph server, and not on a machine that
> mounts a cephfs, right? If that's right, then this command is useless
> for me.

No, "ceph" command (as well as "rbd", "rados", etc) can be run from
anywhere -- it's just a matter of installing a package which is likely
already installed unless you are mounting CephFS manually without using
/sbin/mount.ceph mount helper.

Thanks,

Ilya

On Tue, Sep 26, 2023 at 10:46 AM Ilya Dryomov <idryomov@gmail.com> wrote:
> No, "ceph" command (as well as "rbd", "rados", etc) can be run from
> anywhere -- it's just a matter of installing a package which is likely
> already installed unless you are mounting CephFS manually without using
> /sbin/mount.ceph mount helper.

I have never heard of that helper, so no, we're not using it - should we?

This "ceph" tool requires installing 90 MB of additional Debian
packages, which I just tried on a test cluster, and "ceph fs top"
fails with "Error initializing cluster client: ObjectNotFound('RADOS
object not found (error calling conf_read_file)')". Okay, so I have to
configure something.... but .... I don't get why I would want to do
that, when I can get the same information from the kernel without
installing or configuring anything. This sounds like overcomplexifying
the thing for no reason.

Max

On Tue, Sep 26, 2023 at 11:09 AM Max Kellermann <max.kellermann@ionos.com> wrote:
>
> On Tue, Sep 26, 2023 at 10:46 AM Ilya Dryomov <idryomov@gmail.com> wrote:
> > No, "ceph" command (as well as "rbd", "rados", etc) can be run from
> > anywhere -- it's just a matter of installing a package which is likely
> > already installed unless you are mounting CephFS manually without using
> > /sbin/mount.ceph mount helper.
>
> I have never heard of that helper, so no, we're not using it - should we?

If you have figured out the right mount options, you might as well not.
The helper does things like determine whether v1 or v2 addresses should
be used, fetch the key and pass it via the kernel keyring (whereas you
are probably passing it verbatim on the command line), etc. It's the
same syscall in the end, so the helper is certainly not required.

>
> This "ceph" tool requires installing 90 MB of additional Debian
> packages, which I just tried on a test cluster, and "ceph fs top"
> fails with "Error initializing cluster client: ObjectNotFound('RADOS
> object not found (error calling conf_read_file)')". Okay, so I have to
> configure something.... but .... I don't get why I would want to do
> that, when I can get the same information from the kernel without
> installing or configuring anything. This sounds like overcomplexifying
> the thing for no reason.

I have relayed my understanding of this feature (or rather how it was
presented to me). I see where you are coming from, so adding more
CephFS folks to chime in.

Thanks,

Ilya

On Wed, Sep 27, 2023 at 12:53 PM Ilya Dryomov <idryomov@gmail.com> wrote:
> > This "ceph" tool requires installing 90 MB of additional Debian
> > packages, which I just tried on a test cluster, and "ceph fs top"
> > fails with "Error initializing cluster client: ObjectNotFound('RADOS
> > object not found (error calling conf_read_file)')". Okay, so I have to
> > configure something.... but .... I don't get why I would want to do
> > that, when I can get the same information from the kernel without
> > installing or configuring anything. This sounds like overcomplexifying
> > the thing for no reason.
>
> I have relayed my understanding of this feature (or rather how it was
> presented to me). I see where you are coming from, so adding more
> CephFS folks to chime in.

Let me show these folks how badly "ceph fs stats" performs:

    # time ceph fs perf stats
    {"version": 2, "global_counters": ["cap_hit", "read_latency", "write_latency"[...]

    real 0m0.502s
    user 0m0.393s
    sys 0m0.053s

Now my debugfs-based solution:

    # time cat /sys/kernel/debug/ceph/*/metrics/latency
    item total avg_lat(us) min_lat(us) max_lat(us) stdev(us)
    [...]

    real 0m0.002s
    user 0m0.002s
    sys 0m0.001s

debugfs is more than 200 times faster. It is so fast, it can hardly be
measured by "time" - and most of these 2ms is the overhead for
executing /bin/cat, not for actually reading the debugfs file.

Our kernel-exporter is a daemon process, it only needs a single
pread() system call in each iteration, it has even less overhead.
Integrating the "ceph" tool instead would require forking the process
each time, starting a new Python VM, and so on...

For obtaining real-time latency statistics, the "ceph" script is the
wrong tool for the job.

Max

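The "single pread() per iteration" pattern could look roughly like the sketch below. It is not the kernel-exporter code, which is not shown in this thread. The helper name `read_snapshot` is hypothetical, and the sketch assumes the whole file fits into the caller's buffer in one call; a production collector would have to handle short reads and larger files.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Re-read a file from offset 0 through an fd that stays open across
 * iterations, so each poll costs one system call (no open/close, no fork).
 * Returns the number of bytes read, or -1 on error; buf is NUL-terminated.
 */
static ssize_t read_snapshot(int fd, char *buf, size_t cap)
{
    assert(buf != NULL && cap > 0);

    /* pread() takes an explicit offset, so no lseek() is needed between polls. */
    ssize_t n = pread(fd, buf, cap - 1, 0);
    if (n < 0)
        return -1;
    buf[n] = '\0';
    return n;
}
```

The daemon would open the metrics file once at startup, for example with `open(path, O_RDONLY)`, and then call `read_snapshot()` on that descriptor in every collection interval.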
    diff --git a/fs/ceph/debugfs.c b/fs/ceph/debugfs.c
    index 3904333fa6c3..2abee7e18144 100644
    --- a/fs/ceph/debugfs.c
    +++ b/fs/ceph/debugfs.c
    @@ -429,31 +429,31 @@ void ceph_fs_debugfs_init(struct ceph_fs_client *fsc)
     			name);
     
     	fsc->debugfs_mdsmap = debugfs_create_file("mdsmap",
    -					0400,
    +					0444,
     					fsc->client->debugfs_dir,
     					fsc,
     					&mdsmap_fops);
     
     	fsc->debugfs_mds_sessions = debugfs_create_file("mds_sessions",
    -					0400,
    +					0444,
     					fsc->client->debugfs_dir,
     					fsc,
     					&mds_sessions_fops);
     
     	fsc->debugfs_mdsc = debugfs_create_file("mdsc",
    -					0400,
    +					0444,
     					fsc->client->debugfs_dir,
     					fsc,
     					&mdsc_fops);
     
     	fsc->debugfs_caps = debugfs_create_file("caps",
    -					0400,
    +					0444,
     					fsc->client->debugfs_dir,
     					fsc,
     					&caps_fops);
     
     	fsc->debugfs_status = debugfs_create_file("status",
    -					0400,
    +					0444,
     					fsc->client->debugfs_dir,
     					fsc,
     					&status_fops);
    @@ -461,13 +461,13 @@ void ceph_fs_debugfs_init(struct ceph_fs_client *fsc)
     	fsc->debugfs_metrics_dir = debugfs_create_dir("metrics",
     						      fsc->client->debugfs_dir);
     
    -	debugfs_create_file("file", 0400, fsc->debugfs_metrics_dir, fsc,
    +	debugfs_create_file("file", 0444, fsc->debugfs_metrics_dir, fsc,
     			    &metrics_file_fops);
    -	debugfs_create_file("latency", 0400, fsc->debugfs_metrics_dir, fsc,
    +	debugfs_create_file("latency", 0444, fsc->debugfs_metrics_dir, fsc,
     			    &metrics_latency_fops);
    -	debugfs_create_file("size", 0400, fsc->debugfs_metrics_dir, fsc,
    +	debugfs_create_file("size", 0444, fsc->debugfs_metrics_dir, fsc,
     			    &metrics_size_fops);
    -	debugfs_create_file("caps", 0400, fsc->debugfs_metrics_dir, fsc,
    +	debugfs_create_file("caps", 0444, fsc->debugfs_metrics_dir, fsc,
     			    &metrics_caps_fops);
     }