From patchwork Fri Jul 14 02:53:28 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jingbo Xu
X-Patchwork-Id: 120202
From: Jingbo Xu
To: hsiangkao@linux.alibaba.com, chao@kernel.org, huyue2@coolpad.com, linux-erofs@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org, alexl@redhat.com
Subject: [PATCH v4 1/3] erofs-utils: add xxh32 library
Date: Fri, 14 Jul 2023 10:53:28 +0800
Message-Id: <20230714025330.42950-2-jefflexu@linux.alibaba.com>
X-Mailer: git-send-email 2.19.1.6.gb485710b
In-Reply-To: <20230714025330.42950-1-jefflexu@linux.alibaba.com>
References: <20230714025330.42950-1-jefflexu@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Add the xxh32 library, which will be used by the following xattr bloom
filter feature.

Signed-off-by: Jingbo Xu
---
 include/erofs/xxhash.h | 35 +++++++++++++++++
 lib/Makefile.am        |  3 +-
 lib/xxhash.c           | 85 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 122 insertions(+), 1 deletion(-)
 create mode 100644 include/erofs/xxhash.h
 create mode 100644 lib/xxhash.c

diff --git a/include/erofs/xxhash.h b/include/erofs/xxhash.h
new file mode 100644
index 0000000..fd9384e
--- /dev/null
+++ b/include/erofs/xxhash.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __EROFS_XXHASH_H
+#define __EROFS_XXHASH_H
+
+#ifdef __cplusplus
+extern "C"
+{
+#endif
+
+#include "defs.h"
+
+/*
+ * Copied from
+ * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
+ * xxHash - Extremely Fast Hash algorithm
+ */
+
+/**
+ * xxh32() - calculate the 32-bit hash of the input with a given seed.
+ *
+ * @input: The data to hash.
+ * @length: The length of the data to hash.
+ * @seed: The seed can be used to alter the result predictably.
+ *
+ * Speed on Core 2 Duo @ 3 GHz (single thread, SMHasher benchmark) : 5.4 GB/s
+ *
+ * Return: The 32-bit hash of the data.
+ */
+uint32_t xxh32(const void *input, size_t length, uint32_t seed);
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
diff --git a/lib/Makefile.am b/lib/Makefile.am
index faa7311..a049af6 100644
--- a/lib/Makefile.am
+++ b/lib/Makefile.am
@@ -23,13 +23,14 @@ noinst_HEADERS = $(top_srcdir)/include/erofs_fs.h \
       $(top_srcdir)/include/erofs/xattr.h \
       $(top_srcdir)/include/erofs/compress_hints.h \
       $(top_srcdir)/include/erofs/fragments.h \
+      $(top_srcdir)/include/erofs/xxhash.h \
       $(top_srcdir)/lib/liberofs_private.h
 
 noinst_HEADERS += compressor.h
 liberofs_la_SOURCES = config.c io.c cache.c super.c inode.c xattr.c exclude.c \
 		      namei.c data.c compress.c compressor.c zmap.c decompress.c \
 		      compress_hints.c hashmap.c sha256.c blobchunk.c dir.c \
-		      fragments.c rb_tree.c dedupe.c
+		      fragments.c rb_tree.c dedupe.c xxhash.c
 
 liberofs_la_CFLAGS = -Wall -I$(top_srcdir)/include
 if ENABLE_LZ4
diff --git a/lib/xxhash.c b/lib/xxhash.c
new file mode 100644
index 0000000..e5f511c
--- /dev/null
+++ b/lib/xxhash.c
@@ -0,0 +1,85 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copied from
+ * https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
+ * xxHash - Extremely Fast Hash algorithm
+ */
+#include "erofs/xxhash.h"
+
+/*-*************************************
+ * Macros
+ **************************************/
+#define xxh_rotl32(x, r) ((x << r) | (x >> (32 - r)))
+
+/*-*************************************
+ * Constants
+ **************************************/
+static const uint32_t PRIME32_1 = 2654435761U;
+static const uint32_t PRIME32_2 = 2246822519U;
+static const uint32_t PRIME32_3 = 3266489917U;
+static const uint32_t PRIME32_4 = 668265263U;
+static const uint32_t PRIME32_5 = 374761393U;
+
+/*-***************************
+ * Simple Hash Functions
+ ****************************/
+static uint32_t xxh32_round(uint32_t seed, const uint32_t input)
+{
+	seed += input * PRIME32_2;
+	seed = xxh_rotl32(seed, 13);
+	seed *= PRIME32_1;
+	return seed;
+}
+
+uint32_t xxh32(const void *input, const size_t len, const uint32_t seed)
+{
+	const uint8_t *p = (const uint8_t *)input;
+	const uint8_t *b_end = p + len;
+	uint32_t h32;
+
+	if (len >= 16) {
+		const uint8_t *const limit = b_end - 16;
+		uint32_t v1 = seed + PRIME32_1 + PRIME32_2;
+		uint32_t v2 = seed + PRIME32_2;
+		uint32_t v3 = seed + 0;
+		uint32_t v4 = seed - PRIME32_1;
+
+		do {
+			v1 = xxh32_round(v1, get_unaligned_le32(p));
+			p += 4;
+			v2 = xxh32_round(v2, get_unaligned_le32(p));
+			p += 4;
+			v3 = xxh32_round(v3, get_unaligned_le32(p));
+			p += 4;
+			v4 = xxh32_round(v4, get_unaligned_le32(p));
+			p += 4;
+		} while (p <= limit);
+
+		h32 = xxh_rotl32(v1, 1) + xxh_rotl32(v2, 7) +
+			xxh_rotl32(v3, 12) + xxh_rotl32(v4, 18);
+	} else {
+		h32 = seed + PRIME32_5;
+	}
+
+	h32 += (uint32_t)len;
+
+	while (p + 4 <= b_end) {
+		h32 += get_unaligned_le32(p) * PRIME32_3;
+		h32 = xxh_rotl32(h32, 17) * PRIME32_4;
+		p += 4;
+	}
+
+	while (p < b_end) {
+		h32 += (*p) * PRIME32_5;
+		h32 = xxh_rotl32(h32, 11) * PRIME32_1;
+		p++;
+	}
+
+	h32 ^= h32 >> 15;
+	h32 *= PRIME32_2;
+	h32 ^= h32 >> 13;
+	h32 *= PRIME32_3;
+	h32 ^= h32 >> 16;
+
+	return h32;
+}

From patchwork Fri Jul 14 02:53:29 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jingbo Xu
X-Patchwork-Id: 120209
From: Jingbo Xu
To: hsiangkao@linux.alibaba.com, chao@kernel.org, huyue2@coolpad.com, linux-erofs@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org, alexl@redhat.com
Subject: [PATCH v4 2/3] erofs-utils: update on-disk format for xattr name filter
Date: Fri, 14 Jul 2023 10:53:29 +0800
Message-Id: <20230714025330.42950-3-jefflexu@linux.alibaba.com>
X-Mailer: git-send-email 2.19.1.6.gb485710b
In-Reply-To: <20230714025330.42950-1-jefflexu@linux.alibaba.com>
References: <20230714025330.42950-1-jefflexu@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The xattr name bloom filter feature is going to be introduced to speed
up negative xattr lookups, e.g. system.posix_acl_[access|default]
lookups when running an "ls -lR" workload.

There are some commonly used extended attributes (n); their total number
is approximately 30.

	trusted.overlay.opaque
	trusted.overlay.redirect
	trusted.overlay.origin
	trusted.overlay.impure
	trusted.overlay.nlink
	trusted.overlay.upper
	trusted.overlay.metacopy
	trusted.overlay.protattr
	user.overlay.opaque
	user.overlay.redirect
	user.overlay.origin
	user.overlay.impure
	user.overlay.nlink
	user.overlay.upper
	user.overlay.metacopy
	user.overlay.protattr
	security.evm
	security.ima
	security.selinux
	security.SMACK64
	security.SMACK64IPIN
	security.SMACK64IPOUT
	security.SMACK64EXEC
	security.SMACK64TRANSMUTE
	security.SMACK64MMAP
	security.apparmor
	security.capability
	system.posix_acl_access
	system.posix_acl_default
	user.mime_type

Given that the number of bits in the bloom filter (m) is 32, the optimal
number of hash functions (k) is 1 (ln2 * m/n = 0.74).

The single hash function is implemented as:

	xxh32(name, strlen(name), EROFS_XATTR_FILTER_SEED + index)

where `index` is the index of the corresponding predefined short name
prefix, and `name` is the name string after stripping that predefined
name prefix.

The constant magic number EROFS_XATTR_FILTER_SEED, i.e.
0x25BBE08F, is used to give a better spread when mapping these 30
extended attributes into the 32-bit bloom filter as:

	bit  0: security.ima
	bit  1:
	bit  2: trusted.overlay.nlink
	bit  3:
	bit  4: user.overlay.nlink
	bit  5: trusted.overlay.upper
	bit  6: user.overlay.origin
	bit  7: trusted.overlay.protattr
	bit  8: security.apparmor
	bit  9: user.overlay.protattr
	bit 10: user.overlay.opaque
	bit 11: security.selinux
	bit 12: security.SMACK64TRANSMUTE
	bit 13: security.SMACK64
	bit 14: security.SMACK64MMAP
	bit 15: user.overlay.impure
	bit 16: security.SMACK64IPIN
	bit 17: trusted.overlay.redirect
	bit 18: trusted.overlay.origin
	bit 19: security.SMACK64IPOUT
	bit 20: trusted.overlay.opaque
	bit 21: system.posix_acl_default
	bit 22:
	bit 23: user.mime_type
	bit 24: trusted.overlay.impure
	bit 25: security.SMACK64EXEC
	bit 26: user.overlay.redirect
	bit 27: user.overlay.upper
	bit 28: security.evm
	bit 29: security.capability
	bit 30: system.posix_acl_access
	bit 31: trusted.overlay.metacopy, user.overlay.metacopy

h_name_filter is introduced to the on-disk per-inode xattr header to
hold the corresponding xattr name filter, where a bit value of 1
indicates non-existence. This polarity is chosen for compatibility: an
all-zero field never rules an xattr out.

This feature is indicated by the EROFS_FEATURE_COMPAT_XATTR_FILTER
compatible feature bit.

Reserve one byte in the on-disk superblock, as the on-disk format for
the xattr name filter may change in the future. With this reserved byte,
we won't need to bother with these compatible bits again at that time.
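The scheme described above can be sketched as follows. This is only an
illustration under stated assumptions, not code from this series: xxh32
is restated compactly from patch 1/3 so the snippet stands alone, and the
helper names (sketch_xxh32, name_filter_bit, build_filter,
definitely_absent) are hypothetical. The short-prefix index values
mentioned in the note below are likewise assumed from erofs_fs.h rather
than taken from this patch.

```c
/* Illustrative sketch only; helper names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FILTER_BITS    32u          /* EROFS_XATTR_FILTER_BITS */
#define FILTER_DEFAULT UINT32_MAX   /* EROFS_XATTR_FILTER_DEFAULT */
#define FILTER_SEED    0x25BBE08Fu  /* EROFS_XATTR_FILTER_SEED */

static uint32_t rotl32(uint32_t x, unsigned int r)
{
	return (x << r) | (x >> (32 - r));
}

/* Read 32 bits in little-endian order from a possibly unaligned pointer. */
static uint32_t le32(const uint8_t *p)
{
	return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
	       (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

static uint32_t round32(uint32_t acc, uint32_t in)
{
	return rotl32(acc + in * 2246822519u, 13) * 2654435761u;
}

/* Compact restatement of xxh32() from patch 1/3. */
static uint32_t sketch_xxh32(const void *input, size_t len, uint32_t seed)
{
	const uint8_t *p = input, *end = p + len;
	uint32_t h;

	if (len >= 16) {
		uint32_t v1 = seed + 2654435761u + 2246822519u;
		uint32_t v2 = seed + 2246822519u, v3 = seed;
		uint32_t v4 = seed - 2654435761u;

		do {
			v1 = round32(v1, le32(p));
			v2 = round32(v2, le32(p + 4));
			v3 = round32(v3, le32(p + 8));
			v4 = round32(v4, le32(p + 12));
			p += 16;
		} while (p + 16 <= end);
		h = rotl32(v1, 1) + rotl32(v2, 7) +
		    rotl32(v3, 12) + rotl32(v4, 18);
	} else {
		h = seed + 374761393u;
	}
	h += (uint32_t)len;
	for (; p + 4 <= end; p += 4)
		h = rotl32(h + le32(p) * 3266489917u, 17) * 668265263u;
	for (; p < end; p++)
		h = rotl32(h + *p * 374761393u, 11) * 2654435761u;
	h ^= h >> 15;
	h *= 2246822519u;
	h ^= h >> 13;
	h *= 3266489917u;
	h ^= h >> 16;
	return h;
}

/*
 * Bit index for one xattr: `index` is the short prefix index and
 * `suffix` is the name with that prefix stripped.
 */
static unsigned int name_filter_bit(uint8_t index, const char *suffix)
{
	uint32_t bit = sketch_xxh32(suffix, strlen(suffix),
				    FILTER_SEED + index) & (FILTER_BITS - 1);

	assert(bit < FILTER_BITS);
	return bit;
}

/* Writer side: start from all ones, clear the bit of each xattr present. */
static uint32_t build_filter(const uint8_t *idx, const char *const *suffix,
			     size_t n)
{
	uint32_t present = 0;

	for (size_t i = 0; i < n; i++)
		present |= UINT32_C(1) << name_filter_bit(idx[i], suffix[i]);
	return FILTER_DEFAULT & ~present;
}

/* Reader side: a set bit proves absence; a clear bit only means "maybe". */
static int definitely_absent(uint32_t filter, uint8_t index,
			     const char *suffix)
{
	return (filter >> name_filter_bit(index, suffix)) & 1;
}
```

For instance, assuming index 6 for the "security." prefix, an inode
carrying only security.selinux would get a filter with exactly one bit
cleared. A lookup of system.posix_acl_access (assumed index 2, with an
empty remaining name) could then be answered negatively without walking
the xattr list whenever its bit is still set.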

Suggested-by: Alexander Larsson
Signed-off-by: Jingbo Xu
---
 include/erofs_fs.h | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/include/erofs_fs.h b/include/erofs_fs.h
index 3697882..1789a37 100644
--- a/include/erofs_fs.h
+++ b/include/erofs_fs.h
@@ -14,6 +14,7 @@
 
 #define EROFS_FEATURE_COMPAT_SB_CHKSUM		0x00000001
 #define EROFS_FEATURE_COMPAT_MTIME		0x00000002
+#define EROFS_FEATURE_COMPAT_XATTR_FILTER	0x00000004
 
 /*
  * Any bits that aren't in EROFS_ALL_FEATURE_INCOMPAT should
@@ -82,7 +83,8 @@ struct erofs_super_block {
 	__u8 xattr_prefix_count;	/* # of long xattr name prefixes */
 	__le32 xattr_prefix_start;	/* start of long xattr prefixes */
 	__le64 packed_nid;	/* nid of the special packed inode */
-	__u8 reserved2[24];
+	__u8 xattr_filter_reserved; /* reserved for xattr name filter */
+	__u8 reserved2[23];
 };
 
 /*
@@ -201,7 +203,7 @@ struct erofs_inode_extended {
  * for read-only fs, no need to introduce h_refcount
  */
 struct erofs_xattr_ibody_header {
-	__le32 h_reserved;
+	__le32 h_name_filter;		/* bit value 1 indicates not-present */
 	__u8 h_shared_count;
 	__u8 h_reserved2[7];
 	__le32 h_shared_xattrs[0];	/* shared xattr id array */
@@ -222,6 +224,12 @@ struct erofs_xattr_ibody_header {
 #define EROFS_XATTR_LONG_PREFIX		0x80
 #define EROFS_XATTR_LONG_PREFIX_MASK	0x7f
 
+#define EROFS_XATTR_NAME_LEN_MAX	UCHAR_MAX
+
+#define EROFS_XATTR_FILTER_BITS		32
+#define EROFS_XATTR_FILTER_DEFAULT	UINT32_MAX
+#define EROFS_XATTR_FILTER_SEED		0x25BBE08F
+
 /* xattr entry (for both inline & shared xattrs) */
 struct erofs_xattr_entry {
 	__u8 e_name_len;	/* length of name */

From patchwork Fri Jul 14 02:53:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jingbo Xu
X-Patchwork-Id: 120203
From: Jingbo Xu
To: hsiangkao@linux.alibaba.com, chao@kernel.org, huyue2@coolpad.com, linux-erofs@lists.ozlabs.org
Cc: linux-kernel@vger.kernel.org, alexl@redhat.com
Subject: [PATCH v4 3/3] erofs-utils: mkfs: enable xattr name filter
Date: Fri, 14 Jul 2023 10:53:30 +0800
Message-Id: <20230714025330.42950-4-jefflexu@linux.alibaba.com>
X-Mailer: git-send-email 2.19.1.6.gb485710b
In-Reply-To: <20230714025330.42950-1-jefflexu@linux.alibaba.com>
References: <20230714025330.42950-1-jefflexu@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Introduce "-Exattr-name-filter" option to enable the xattr name bloom
filter feature.

Signed-off-by: Jingbo Xu
---
 include/erofs/config.h   |  1 +
 include/erofs/internal.h |  1 +
 lib/xattr.c              | 74 +++++++++++++++++++++++++++++++---------
 mkfs/main.c              |  7 ++++
 4 files changed, 67 insertions(+), 16 deletions(-)

diff --git a/include/erofs/config.h b/include/erofs/config.h
index 8f52d2c..c51f0cd 100644
--- a/include/erofs/config.h
+++ b/include/erofs/config.h
@@ -53,6 +53,7 @@ struct erofs_configure {
 	bool c_ignore_mtime;
 	bool c_showprogress;
 	bool c_extra_ea_name_prefixes;
+	bool c_xattr_name_filter;
 
 #ifdef HAVE_LIBSELINUX
 	struct selabel_handle *sehnd;
diff --git a/include/erofs/internal.h b/include/erofs/internal.h
index ab964d4..1d7ef73 100644
--- a/include/erofs/internal.h
+++ b/include/erofs/internal.h
@@ -133,6 +133,7 @@ EROFS_FEATURE_FUNCS(fragments, incompat, INCOMPAT_FRAGMENTS)
 EROFS_FEATURE_FUNCS(dedupe, incompat, INCOMPAT_DEDUPE)
 EROFS_FEATURE_FUNCS(xattr_prefixes, incompat, INCOMPAT_XATTR_PREFIXES)
 EROFS_FEATURE_FUNCS(sb_chksum, compat, COMPAT_SB_CHKSUM)
+EROFS_FEATURE_FUNCS(xattr_filter, compat, COMPAT_XATTR_FILTER)
 
 #define EROFS_I_EA_INITED	(1 << 0)
 #define EROFS_I_Z_INITED	(1 << 1)
diff --git a/lib/xattr.c b/lib/xattr.c
index 7d7dc54..a5d9fc5 100644
--- a/lib/xattr.c
+++ b/lib/xattr.c
@@ -18,6 +18,7 @@
 #include "erofs/cache.h"
 #include "erofs/io.h"
 #include "erofs/fragments.h"
+#include "erofs/xxhash.h"
 #include "liberofs_private.h"
 
 #define EA_HASHTABLE_BITS 16
@@ -26,6 +27,7 @@ struct xattr_item {
 	struct xattr_item *next_shared_xattr;
 	const char *kvbuf;
 	unsigned int hash[2], len[2], count;
+	unsigned int name_filter_bit;
 	int shared_xattr_id;
 	u8 prefix;
 	struct hlist_node node;
@@ -101,7 +103,8 @@ static unsigned int put_xattritem(struct xattr_item *item)
 }
 
 static struct xattr_item *get_xattritem(u8 prefix, char *kvbuf,
-					unsigned int len[2])
+					unsigned int len[2],
+					unsigned int name_filter_bit)
 {
 	struct xattr_item *item;
 	unsigned int hash[2], hkey;
@@ -133,40 +136,59 @@ static struct xattr_item *get_xattritem(u8 prefix, char *kvbuf,
 	item->hash[1] = hash[1];
 	item->shared_xattr_id = -1;
 	item->prefix = prefix;
+	item->name_filter_bit = name_filter_bit;
 	hash_add(ea_hashtable, &item->node, hkey);
 	return item;
 }
 
-static bool match_prefix(const char *key, u8 *index, u16 *len)
+static unsigned int erofs_xattr_calc_name_filter_bit(u8 prefix, const char *key,
+						      unsigned int len)
+{
+	if (!cfg.c_xattr_name_filter)
+		return 0;
+	return xxh32(key, len, EROFS_XATTR_FILTER_SEED + prefix) &
+		(EROFS_XATTR_FILTER_BITS - 1);
+}
+
+static bool match_short_prefix(const char *key, u8 *index, u16 *len)
 {
 	struct xattr_prefix *p;
-	struct ea_type_node *tnode;
 
-	list_for_each_entry(tnode, &ea_name_prefixes, list) {
-		p = &tnode->type;
+	for (p = xattr_types; p < xattr_types + ARRAY_SIZE(xattr_types); ++p) {
 		if (p->prefix && !strncmp(p->prefix, key, p->prefix_len)) {
 			*len = p->prefix_len;
-			*index = tnode->index;
+			*index = p - xattr_types;
 			return true;
 		}
 	}
-	for (p = xattr_types; p < xattr_types + ARRAY_SIZE(xattr_types); ++p) {
+	return false;
+}
+
+static bool match_prefix(const char *key, u8 *index, u16 *len)
+{
+	struct xattr_prefix *p;
+	struct ea_type_node *tnode;
+
+	list_for_each_entry(tnode, &ea_name_prefixes, list) {
+		p = &tnode->type;
 		if (p->prefix && !strncmp(p->prefix, key, p->prefix_len)) {
 			*len = p->prefix_len;
-			*index = p - xattr_types;
+			*index = tnode->index;
 			return true;
 		}
 	}
-	return false;
+
+	return match_short_prefix(key, index, len);
 }
 
 static struct xattr_item *parse_one_xattr(const char *path, const char *key,
 					  unsigned int keylen)
 {
 	ssize_t ret;
-	u8 prefix;
-	u16 prefixlen;
+	u8 prefix, o_prefix;
+	u16 prefixlen, o_prefixlen;
 	unsigned int len[2];
+	unsigned int bit = 0;
 	char *kvbuf;
 
 	erofs_dbg("parse xattr [%s] of %s", path, key);
@@ -176,6 +198,13 @@ static struct xattr_item *parse_one_xattr(const char *path, const char *key,
 
 	DBG_BUGON(keylen < prefixlen);
 
+	if (cfg.c_xattr_name_filter) {
+		if (!match_short_prefix(key, &o_prefix, &o_prefixlen))
+			return ERR_PTR(-ENODATA);
+		bit = erofs_xattr_calc_name_filter_bit(o_prefix,
+				key + o_prefixlen, keylen - o_prefixlen);
+	}
+
 	/* determine length of the value */
 #ifdef HAVE_LGETXATTR
 	ret = lgetxattr(path, key, NULL, 0);
@@ -216,7 +245,7 @@ static struct xattr_item *parse_one_xattr(const char *path, const char *key,
 			len[1] = ret;
 		}
 	}
-	return get_xattritem(prefix, kvbuf, len);
+	return get_xattritem(prefix, kvbuf, len, bit);
 }
 
 static struct xattr_item *erofs_get_selabel_xattr(const char *srcpath,
@@ -226,7 +255,7 @@ static struct xattr_item *erofs_get_selabel_xattr(const char *srcpath,
 	if (cfg.sehnd) {
 		char *secontext;
 		int ret;
-		unsigned int len[2];
+		unsigned int bit, len[2];
 		char *kvbuf, *fspath;
 
 		if (cfg.mount_point)
@@ -260,7 +289,8 @@ static struct xattr_item *erofs_get_selabel_xattr(const char *srcpath,
 		}
 		sprintf(kvbuf, "selinux%s", secontext);
 		freecon(secontext);
-		return get_xattritem(EROFS_XATTR_INDEX_SECURITY, kvbuf, len);
+		bit = erofs_xattr_calc_name_filter_bit(EROFS_XATTR_INDEX_SECURITY, "selinux", len[0]);
+		return get_xattritem(EROFS_XATTR_INDEX_SECURITY, kvbuf, len, bit);
 	}
 #endif
 	return NULL;
@@ -408,7 +438,7 @@ static int erofs_droid_xattr_set_caps(struct erofs_inode *inode)
 {
 	const u64 capabilities = inode->capabilities;
 	char *kvbuf;
-	unsigned int len[2];
+	unsigned int bit, len[2];
 	struct vfs_cap_data caps;
 	struct xattr_item *item;
 
@@ -430,7 +460,8 @@ static int erofs_droid_xattr_set_caps(struct erofs_inode *inode)
 	caps.data[1].inheritable = 0;
 	memcpy(kvbuf + len[0], &caps, len[1]);
 
-	item = get_xattritem(EROFS_XATTR_INDEX_SECURITY, kvbuf, len);
+	bit = erofs_xattr_calc_name_filter_bit(EROFS_XATTR_INDEX_SECURITY, "capability", len[0]);
+	item = get_xattritem(EROFS_XATTR_INDEX_SECURITY, kvbuf, len, bit);
 	if (IS_ERR(item))
 		return PTR_ERR(item);
 	if (!item)
@@ -754,6 +785,17 @@ char *erofs_export_xattr_ibody(struct list_head *ixattrs, unsigned int size)
 	header = (struct erofs_xattr_ibody_header *)buf;
 	header->h_shared_count = 0;
 
+	if (cfg.c_xattr_name_filter) {
+		u32 name_filter = 0;
+
+		list_for_each_entry_safe(node, n, ixattrs, list) {
+			struct xattr_item *const item = node->item;
+			name_filter |= 1UL << item->name_filter_bit;
+		}
+		name_filter = EROFS_XATTR_FILTER_DEFAULT & ~name_filter;
+		header->h_name_filter = cpu_to_le32(name_filter);
+	}
+
 	p = sizeof(struct erofs_xattr_ibody_header);
 	list_for_each_entry_safe(node, n, ixattrs, list) {
 		struct xattr_item *const item = node->item;
diff --git a/mkfs/main.c b/mkfs/main.c
index ac208e5..7db7847 100644
--- a/mkfs/main.c
+++ b/mkfs/main.c
@@ -241,6 +241,13 @@ handle_fragment:
 				return -EINVAL;
 			cfg.c_dedupe = true;
 		}
+
+		if (MATCH_EXTENTED_OPT("xattr-name-filter", token, keylen)) {
+			if (vallen)
+				return -EINVAL;
+			cfg.c_xattr_name_filter = true;
+			erofs_sb_set_xattr_filter();
+		}
 	}
 	return 0;
 }