Message ID | 20230705070427.92579-2-jefflexu@linux.alibaba.com |
---|---|
State | New |
Headers |
From: Jingbo Xu <jefflexu@linux.alibaba.com>
To: hsiangkao@linux.alibaba.com, chao@kernel.org, huyue2@coolpad.com, linux-erofs@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org, alexl@redhat.com
Subject: [PATCH v2 1/2] erofs: update on-disk format for xattr name filter
Date: Wed, 5 Jul 2023 15:04:26 +0800
Message-Id: <20230705070427.92579-2-jefflexu@linux.alibaba.com>
In-Reply-To: <20230705070427.92579-1-jefflexu@linux.alibaba.com>
References: <20230705070427.92579-1-jefflexu@linux.alibaba.com>
List-ID: <linux-kernel.vger.kernel.org> |
Series |
erofs: introduce xattr name bloom filter
|
|
Commit Message
Jingbo Xu
July 5, 2023, 7:04 a.m. UTC
The xattr name bloom filter feature is introduced to speed up negative
xattr lookups, e.g. the system.posix_acl_[access|default] lookups
triggered by an "ls -lR" workload.
The number of commonly used extended attributes (n) is approximately 30:
trusted.overlay.opaque
trusted.overlay.redirect
trusted.overlay.origin
trusted.overlay.impure
trusted.overlay.nlink
trusted.overlay.upper
trusted.overlay.metacopy
trusted.overlay.protattr
user.overlay.opaque
user.overlay.redirect
user.overlay.origin
user.overlay.impure
user.overlay.nlink
user.overlay.upper
user.overlay.metacopy
user.overlay.protattr
security.evm
security.ima
security.selinux
security.SMACK64
security.SMACK64IPIN
security.SMACK64IPOUT
security.SMACK64EXEC
security.SMACK64TRANSMUTE
security.SMACK64MMAP
security.apparmor
security.capability
system.posix_acl_access
system.posix_acl_default
user.mime_type
Given that the bloom filter has 32 bits (m), the optimal number of hash
functions (k) is 1 (ln2 * m/n = 0.74, rounded to the nearest positive
integer).
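The arithmetic can be checked with a small sketch. It only illustrates the standard Bloom filter sizing formula k = ln2 * m/n; the helper names bloom_optimal_k and bloom_round_k are hypothetical and not part of this patch.

```c
#include <assert.h>

/* Natural logarithm of 2, written out so that no libm call is needed. */
#define LN2 0.6931471805599453

/*
 * Hypothetical helper (not part of the patch): ideal, non-integer number
 * of hash functions for a Bloom filter with m_bits bits and n_items names.
 */
static double bloom_optimal_k(unsigned int m_bits, unsigned int n_items)
{
	assert(n_items > 0);
	return LN2 * (double)m_bits / (double)n_items;
}

/* Round to the nearest integer, but always use at least one hash function. */
static unsigned int bloom_round_k(double k)
{
	unsigned int rounded = (unsigned int)(k + 0.5);

	return rounded ? rounded : 1;
}
```

With m = 32 and n = 30, bloom_optimal_k() yields roughly 0.74, which bloom_round_k() turns into a single hash function.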
The single hash function is implemented as:
xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index)
where index is the index of the corresponding predefined short name
prefix, and name is the name string after stripping that predefined
name prefix.
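A minimal sketch of how the bit position could be derived from that hash. It assumes the 32-bit hash value is reduced with EROFS_XATTR_FILTER_MASK; this patch defines the mask, but the lookup code itself is not part of it. The hash32_fn typedef and the helper names are hypothetical. In the kernel the hash would be xxh32(); toy_hash below is only a stand-in so that the sketch builds without an xxHash implementation, and it will not reproduce the bit assignments listed in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EROFS_XATTR_FILTER_BITS	32
#define EROFS_XATTR_FILTER_MASK	(EROFS_XATTR_FILTER_BITS - 1)
#define EROFS_XATTR_FILTER_SEED	0x25BBE08F

/* Same shape as xxh32(input, length, seed); hypothetical typedef. */
typedef uint32_t (*hash32_fn)(const void *input, size_t len, uint32_t seed);

/* Stand-in hash (FNV-1a style), NOT xxh32. For illustration only. */
static uint32_t toy_hash(const void *input, size_t len, uint32_t seed)
{
	const unsigned char *p = input;
	uint32_t h = seed ^ 2166136261u;

	for (size_t i = 0; i < len; i++) {
		h ^= p[i];
		h *= 16777619u;
	}
	return h;
}

/*
 * prefix_index: index of the predefined short name prefix.
 * suffix: the xattr name with that prefix stripped.
 * Returns the filter bit (0..31) the name maps to.
 */
static unsigned int xattr_filter_bit(hash32_fn hash, uint8_t prefix_index,
				     const char *suffix)
{
	uint32_t seed = EROFS_XATTR_FILTER_SEED + prefix_index;

	assert(hash && suffix);
	return hash(suffix, strlen(suffix), seed) & EROFS_XATTR_FILTER_MASK;
}
```

Folding the prefix index into the seed means that two names sharing a suffix under different prefixes are hashed with different seeds, so they do not have to land on the same bit.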
The magic constant EROFS_XATTR_FILTER_SEED, i.e. 0x25BBE08F, is chosen
to give a better spread when mapping these 30 extended attributes into
the 32-bit bloom filter:
bit 0: security.ima
bit 1:
bit 2: trusted.overlay.nlink
bit 3:
bit 4: user.overlay.nlink
bit 5: trusted.overlay.upper
bit 6: user.overlay.origin
bit 7: trusted.overlay.protattr
bit 8: security.apparmor
bit 9: user.overlay.protattr
bit 10: user.overlay.opaque
bit 11: security.selinux
bit 12: security.SMACK64TRANSMUTE
bit 13: security.SMACK64
bit 14: security.SMACK64MMAP
bit 15: user.overlay.impure
bit 16: security.SMACK64IPIN
bit 17: trusted.overlay.redirect
bit 18: trusted.overlay.origin
bit 19: security.SMACK64IPOUT
bit 20: trusted.overlay.opaque
bit 21: system.posix_acl_default
bit 22:
bit 23: user.mime_type
bit 24: trusted.overlay.impure
bit 25: security.SMACK64EXEC
bit 26: user.overlay.redirect
bit 27: user.overlay.upper
bit 28: security.evm
bit 29: security.capability
bit 30: system.posix_acl_access
bit 31: trusted.overlay.metacopy, user.overlay.metacopy
The h_name_filter field is added to the on-disk per-inode xattr header
(replacing h_reserved) to hold the xattr name filter. For compatibility,
a bit value of 1 indicates non-existence.
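The inverted bit sense can be illustrated with a short sketch. The helper names are hypothetical, and the all-ones starting value mirrors EROFS_XATTR_FILTER_DEFAULT from the diff. A set bit means that no xattr mapping to that bit exists, so a reader could skip walking the xattrs; a cleared bit only means that such an xattr may exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EROFS_XATTR_FILTER_BITS		32
#define EROFS_XATTR_FILTER_DEFAULT	UINT32_MAX

/* Image-builder side: start from "nothing present", clear one bit per xattr. */
static uint32_t filter_mark_present(uint32_t filter, unsigned int bit)
{
	assert(bit < EROFS_XATTR_FILTER_BITS);
	return filter & ~(UINT32_C(1) << bit);
}

/* Reader side: true means a name mapping to this bit is definitely absent. */
static bool filter_definitely_absent(uint32_t filter, unsigned int bit)
{
	assert(bit < EROFS_XATTR_FILTER_BITS);
	return (filter >> bit) & 1;
}
```

For example, after clearing bit 11 (security.selinux in the table above), a lookup that maps to bit 30 (system.posix_acl_access) is still reported as absent. A filter value of zero never reports absence, which is presumably why this polarity was picked: an image whose former h_reserved field holds zero, assuming that is what existing images contain, simply falls back to the normal lookup.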
This feature is indicated by the EROFS_FEATURE_COMPAT_XATTR_FILTER
compatible feature bit.
Suggested-by: Alexander Larsson <alexl@redhat.com>
Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com>
---
fs/erofs/erofs_fs.h | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
Comments
On 2023/7/5 15:04, Jingbo Xu wrote:
> The xattr name bloom filter feature is going to be introduced to speed
> up the negative xattr lookup, e.g. system.posix_acl_[access|default]
> lookup when running "ls -lR" workload.
>
> The number of common used extended attributes (n) is approximately 30.

There are some commonly used extended attributes (n) and the total number
of these is 31:

[...]

> where index represents the index of corresponding predefined short name

where `index`...

[...]

> #define EROFS_FEATURE_COMPAT_SB_CHKSUM 0x00000001
> #define EROFS_FEATURE_COMPAT_MTIME 0x00000002
> +#define EROFS_FEATURE_COMPAT_XATTR_FILTER 0x00000004

I'd suggest that if we could leave one reserved byte in the
superblock for now (and checking if it's 0) since
1) xattr filter feature is a compatible feature;
2) I'm not sure if the implementation could be changed.

so that later implementation changes won't bother compat bits
again.

[...]

> +#define EROFS_XATTR_FILTER_BITS 32
> +#define EROFS_XATTR_FILTER_MASK (EROFS_XATTR_FILTER_BITS - 1)

Is this useful, we could just replace it to
(EROFS_XATTR_FILTER_BITS - 1) directly.

Otherwise it looks good to me,

Thanks,
Gao Xiang
On Wed, Jul 5, 2023 at 9:25 AM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
> [...]
> I'd suggest that if we could leave one reserved byte in the
> superblock for now (and checking if it's 0) since
> 1) xattr filter feature is a compatible feature;
> 2) I'm not sure if the implementation could be changed.
>
> so that later implementation changes won't bother compat bits
> again.

I would very much like to generate these bloom filters in composefs
right now, before the composefs v1 format is completely locked down,
and this should be fully possible given that this is a backwards
compat change. But this is only possible if it doesn't require a
feature flag like this that makes old erofs versions not mount the
image.
On 2023/7/5 15:43, Alexander Larsson wrote:
> [...]
> I would very much like to generate these bloom filters in composefs
> right now, before the composefs v1 format is completely locked down,
> and this should be fully possible given that this is a backwards
> compat change. But this is only possible if it doesn't require a
> feature flag like this that makes old erofs versions not mount the
> image.

EROFS has two types of feature bits:

1) compat flags, which doesn't block mounting on old kernels;
2) incompat flags, which will block mounting on old kernels.

here bloom filter use a new compat flag, so old kernels will just
ignore this and mount. compat flags just indicates that "an image
with a feature, and you could use it or not".

Here I just meant the bloom filter internals are fixed for now,
so that we might reserve a byte in the on-disk super block for
later potential changes (if any). And don't need to bother another
new compat flag.

Thanks,
Gao Xiang
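The difference between the two kinds of feature bits described in this reply can be sketched as follows. This is only an illustration under stated assumptions: the function names and the known_incompat parameter are hypothetical and do not quote the kernel's mount path. Only the three EROFS_FEATURE_COMPAT_* values are taken from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define EROFS_FEATURE_COMPAT_SB_CHKSUM		0x00000001
#define EROFS_FEATURE_COMPAT_MTIME		0x00000002
#define EROFS_FEATURE_COMPAT_XATTR_FILTER	0x00000004

/*
 * Hypothetical mount-time check: any incompat bit that the reader does
 * not know about blocks the mount.
 */
static bool sb_can_mount(uint32_t feature_incompat, uint32_t known_incompat)
{
	return (feature_incompat & ~known_incompat) == 0;
}

/*
 * Compat bits never block mounting. A reader that knows the bit may use
 * the feature; a reader that does not know it just ignores it.
 */
static bool sb_has_xattr_filter(uint32_t feature_compat)
{
	return (feature_compat & EROFS_FEATURE_COMPAT_XATTR_FILTER) != 0;
}
```

Under this model, an old reader never evaluates the XATTR_FILTER bit at all, so an image carrying filters still mounts and every xattr lookup takes the regular path.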
On Wed, Jul 5, 2023 at 9:51 AM Gao Xiang <hsiangkao@linux.alibaba.com> wrote:
> [...]
> Here I just meant the bloom filter internals are fixed for now,
> so that we might reserve a byte in the on-disk super block for
> later potential changes (if any). And don't need to bother another
> new compat flag.

Cool. Then we're all good!
On 2023/7/5 16:12, Alexander Larsson wrote:
> [...]
> Cool. Then we're all good!

:)
diff --git a/fs/erofs/erofs_fs.h b/fs/erofs/erofs_fs.h
index 2c7b16e340fe..b4b6235fd720 100644
--- a/fs/erofs/erofs_fs.h
+++ b/fs/erofs/erofs_fs.h
@@ -13,6 +13,7 @@
 
 #define EROFS_FEATURE_COMPAT_SB_CHKSUM          0x00000001
 #define EROFS_FEATURE_COMPAT_MTIME              0x00000002
+#define EROFS_FEATURE_COMPAT_XATTR_FILTER	0x00000004
 
 /*
  * Any bits that aren't in EROFS_ALL_FEATURE_INCOMPAT should
@@ -200,7 +201,7 @@ struct erofs_inode_extended {
  * for read-only fs, no need to introduce h_refcount
  */
 struct erofs_xattr_ibody_header {
-	__le32 h_reserved;
+	__le32 h_name_filter;		/* bit value 1 indicates not-present */
 	__u8   h_shared_count;
 	__u8   h_reserved2[7];
 	__le32 h_shared_xattrs[];       /* shared xattr id array */
@@ -221,6 +222,11 @@ struct erofs_xattr_ibody_header {
 #define EROFS_XATTR_LONG_PREFIX		0x80
 #define EROFS_XATTR_LONG_PREFIX_MASK	0x7f
 
+#define EROFS_XATTR_FILTER_BITS		32
+#define EROFS_XATTR_FILTER_MASK		(EROFS_XATTR_FILTER_BITS - 1)
+#define EROFS_XATTR_FILTER_DEFAULT	UINT32_MAX
+#define EROFS_XATTR_FILTER_SEED		0x25BBE08F
+
 /* xattr entry (for both inline & shared xattrs) */
 struct erofs_xattr_entry {
 	__u8   e_name_len;      /* length of name */
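Since the change only renames h_reserved to h_name_filter, the on-disk layout of the header should stay the same. The sketch below checks that in userspace; the struct name is hypothetical, and uint32_t / uint8_t stand in for the kernel's __le32 / __u8, so endianness handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace mirror of the on-disk header; not the kernel definition. */
struct xattr_ibody_header {
	uint32_t h_name_filter;		/* was h_reserved; bit value 1 means not-present */
	uint8_t  h_shared_count;
	uint8_t  h_reserved2[7];
	uint32_t h_shared_xattrs[];	/* shared xattr id array */
};

/* The filter occupies the first four bytes, where h_reserved used to be. */
static_assert(offsetof(struct xattr_ibody_header, h_name_filter) == 0,
	      "filter must stay at offset 0");
static_assert(offsetof(struct xattr_ibody_header, h_shared_count) == 4,
	      "shared count must follow the 32-bit filter");
/* The fixed part of the header is 12 bytes, followed by the id array. */
static_assert(sizeof(struct xattr_ibody_header) == 12,
	      "fixed header size must not change");
```

Compile-time assertions are used here so that a layout mistake fails the build instead of surfacing at run time.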