Message ID: 20231204-slub-cleanup-hooks-v1-0-88b65f7cd9d5@suse.cz
Headers:
From: Vlastimil Babka <vbabka@suse.cz>
Subject: [PATCH 0/4] SLUB: cleanup hook processing
Date: Mon, 04 Dec 2023 20:34:39 +0100
Message-Id: <20231204-slub-cleanup-hooks-v1-0-88b65f7cd9d5@suse.cz>
To: Christoph Lameter <cl@linux.com>, Pekka Enberg <penberg@kernel.org>, David Rientjes <rientjes@google.com>, Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Andrew Morton <akpm@linux-foundation.org>, Roman Gushchin <roman.gushchin@linux.dev>, Hyeonggon Yoo <42.hyeyoo@gmail.com>, Alexander Potapenko <glider@google.com>, Marco Elver <elver@google.com>, Dmitry Vyukov <dvyukov@google.com>, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kasan-dev@googlegroups.com
Series: SLUB: cleanup hook processing
Message
Vlastimil Babka
Dec. 4, 2023, 7:34 p.m. UTC
This is a spin-off of preparatory patches from the percpu array series [1],
as they are IMHO useful on their own and simple enough to target 6.8, while
the percpu array is still an RFC. To avoid non-trivial conflicts, the
series is also rebased on top of the SLAB removal branch [2].

[1] https://lore.kernel.org/all/20231129-slub-percpu-caches-v3-0-6bcf536772bc@suse.cz/
[2] https://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab.git/log/?h=slab/for-6.8/slab-removal

---
Vlastimil Babka (4):
      mm/slub: fix bulk alloc and free stats
      mm/slub: introduce __kmem_cache_free_bulk() without free hooks
      mm/slub: handle bulk and single object freeing separately
      mm/slub: free KFENCE objects in slab_free_hook()

 mm/slub.c | 109 +++++++++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 73 insertions(+), 36 deletions(-)
---
base-commit: 4a38e93b3a7e6669c44929fed918b1494e902dd7
change-id: 20231204-slub-cleanup-hooks-f5f54132a61c

Best regards,
Comments
On 12/5/23 14:27, Chengming Zhou wrote:
> On 2023/12/5 03:34, Vlastimil Babka wrote:
>> When freeing an object that was allocated from KFENCE, we do that in the
>> slowpath __slab_free(), relying on the fact that KFENCE "slab" cannot be
>> the cpu slab, so the fastpath has to fallback to the slowpath.
>>
>> This optimization doesn't help much though, because is_kfence_address()
>> is checked earlier anyway during the free hook processing or detached
>> freelist building. Thus we can simplify the code by making the
>> slab_free_hook() free the KFENCE object immediately, similarly to KASAN
>> quarantine.
>>
>> In slab_free_hook() we can place kfence_free() above init processing, as
>> callers have been making sure to set init to false for KFENCE objects.
>> This simplifies slab_free(). This places it also above kasan_slab_free()
>> which is ok as that skips KFENCE objects anyway.
>>
>> While at it also determine the init value in slab_free_freelist_hook()
>> outside of the loop.
>>
>> This change will also make introducing per cpu array caches easier.
>>
>> Tested-by: Marco Elver <elver@google.com>
>> Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
>> ---
>>  mm/slub.c | 22 ++++++++++------------
>>  1 file changed, 10 insertions(+), 12 deletions(-)
>>
>> diff --git a/mm/slub.c b/mm/slub.c
>> index ed2fa92e914c..e38c2b712f6c 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -2039,7 +2039,7 @@ static inline void memcg_slab_free_hook(struct kmem_cache *s, struct slab *slab,
>>   * production configuration these hooks all should produce no code at all.
>>   *
>>   * Returns true if freeing of the object can proceed, false if its reuse
>> - * was delayed by KASAN quarantine.
>> + * was delayed by KASAN quarantine, or it was returned to KFENCE.
>>   */
>>  static __always_inline
>>  bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>> @@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
>>  	__kcsan_check_access(x, s->object_size,
>>  			     KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
>>
>> +	if (kfence_free(kasan_reset_tag(x)))
>
> I'm wondering if "kasan_reset_tag()" is needed here?

I think so, because AFAICS the is_kfence_address() check in kfence_free()
could be a false negative otherwise. In fact now I even question some of the
other is_kfence_address() checks in mm/slub.c, mainly
build_detached_freelist() which starts from pointers coming directly from
slab users. Insight from KASAN/KFENCE folks appreciated :)

> The patch looks good to me!
>
> Reviewed-by: Chengming Zhou <zhouchengming@bytedance.com>

Thanks!

> Thanks.
>
>> +		return false;
>> +
>>  	/*
>>  	 * As memory initialization might be integrated into KASAN,
>>  	 * kasan_slab_free and initialization memset's must be
>> @@ -2086,23 +2089,25 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
>>  	void *object;
>>  	void *next = *head;
>>  	void *old_tail = *tail;
>> +	bool init;
>>
>>  	if (is_kfence_address(next)) {
>>  		slab_free_hook(s, next, false);
>> -		return true;
>> +		return false;
>>  	}
>>
>>  	/* Head and tail of the reconstructed freelist */
>>  	*head = NULL;
>>  	*tail = NULL;
>>
>> +	init = slab_want_init_on_free(s);
>> +
>>  	do {
>>  		object = next;
>>  		next = get_freepointer(s, object);
>>
>>  		/* If object's reuse doesn't have to be delayed */
>> -		if (likely(slab_free_hook(s, object,
>> -					  slab_want_init_on_free(s)))) {
>> +		if (likely(slab_free_hook(s, object, init))) {
>>  			/* Move object to the new freelist */
>>  			set_freepointer(s, object, *head);
>>  			*head = object;
>> @@ -4103,9 +4108,6 @@ static void __slab_free(struct kmem_cache *s, struct slab *slab,
>>
>>  	stat(s, FREE_SLOWPATH);
>>
>> -	if (kfence_free(head))
>> -		return;
>> -
>>  	if (IS_ENABLED(CONFIG_SLUB_TINY) || kmem_cache_debug(s)) {
>>  		free_to_partial_list(s, slab, head, tail, cnt, addr);
>>  		return;
>> @@ -4290,13 +4292,9 @@ static __fastpath_inline
>>  void slab_free(struct kmem_cache *s, struct slab *slab, void *object,
>>  	       unsigned long addr)
>>  {
>> -	bool init;
>> -
>>  	memcg_slab_free_hook(s, slab, &object, 1);
>>
>> -	init = !is_kfence_address(object) && slab_want_init_on_free(s);
>> -
>> -	if (likely(slab_free_hook(s, object, init)))
>> +	if (likely(slab_free_hook(s, object, slab_want_init_on_free(s))))
>>  		do_slab_free(s, slab, object, object, 1, addr);
>>  }
On Wed, 6 Dec 2023 at 14:02, Chengming Zhou <chengming.zhou@linux.dev> wrote:
>
> On 2023/12/6 17:58, Vlastimil Babka wrote:
> > On 12/5/23 14:27, Chengming Zhou wrote:
> >> On 2023/12/5 03:34, Vlastimil Babka wrote:
> >>> [...]
> >>> @@ -2057,6 +2057,9 @@ bool slab_free_hook(struct kmem_cache *s, void *x, bool init)
> >>>  	__kcsan_check_access(x, s->object_size,
> >>>  			     KCSAN_ACCESS_WRITE | KCSAN_ACCESS_ASSERT);
> >>>
> >>> +	if (kfence_free(kasan_reset_tag(x)))
> >>
> >> I'm wondering if "kasan_reset_tag()" is needed here?
> >
> > I think so, because AFAICS the is_kfence_address() check in kfence_free()
> > could be a false negative otherwise. In fact now I even question some of the
>
> Ok.
>
> > other is_kfence_address() checks in mm/slub.c, mainly
> > build_detached_freelist() which starts from pointers coming directly from
> > slab users. Insight from KASAN/KFENCE folks appreciated :)
>
> I know very little about KASAN/KFENCE, looking forward to their insight. :)
>
> Just saw a check in __kasan_slab_alloc():
>
> 	if (is_kfence_address(object))
> 		return (void *)object;
>
> So thought it seems that a kfence object would be skipped by KASAN.

The is_kfence_address() implementation tolerates tagged addresses,
i.e. if it receives a tagged non-kfence address, it will never return
true.

The KASAN_HW_TAGS patches and KFENCE patches were in development
concurrently, and at the time there was some conflict resolution that
happened when both were merged. The
is_kfence_address(kasan_reset_tag(..)) initially came from [1] but was
squashed into 2b8305260fb.

[1] https://lore.kernel.org/all/9dc196006921b191d25d10f6e611316db7da2efc.1611946152.git.andreyknvl@google.com/

Andrey, do you recall what issue you encountered that needed kasan_reset_tag()?
On Wed, Dec 6, 2023 at 3:45 PM Marco Elver <elver@google.com> wrote:
>
> The is_kfence_address() implementation tolerates tagged addresses,
> i.e. if it receives a tagged non-kfence address, it will never return
> true.
>
> The KASAN_HW_TAGS patches and KFENCE patches were in development
> concurrently, and at the time there was some conflict resolution that
> happened when both were merged. The
> is_kfence_address(kasan_reset_tag(..)) initially came from [1] but was
> squashed into 2b8305260fb.
>
> [1] https://lore.kernel.org/all/9dc196006921b191d25d10f6e611316db7da2efc.1611946152.git.andreyknvl@google.com/
>
> Andrey, do you recall what issue you encountered that needed kasan_reset_tag()?

I don't remember at this point, but this could have been just a safety measure.

If is_kfence_address tolerates tagged addresses, we should be able to
drop these kasan_reset_tag calls.
On 12/11/23 23:11, Andrey Konovalov wrote:
> On Wed, Dec 6, 2023 at 3:45 PM Marco Elver <elver@google.com> wrote:
>>
>> The is_kfence_address() implementation tolerates tagged addresses,
>> i.e. if it receives a tagged non-kfence address, it will never return
>> true.

So just to be sure, it can't happen that a genuine kfence address would
then become KASAN tagged and handed out, and thus when tested by
is_kfence_address() it would be a false negative?

>> The KASAN_HW_TAGS patches and KFENCE patches were in development
>> concurrently, and at the time there was some conflict resolution that
>> happened when both were merged. The
>> is_kfence_address(kasan_reset_tag(..)) initially came from [1] but was
>> squashed into 2b8305260fb.
>>
>> [1] https://lore.kernel.org/all/9dc196006921b191d25d10f6e611316db7da2efc.1611946152.git.andreyknvl@google.com/
>>
>> Andrey, do you recall what issue you encountered that needed kasan_reset_tag()?
>
> I don't remember at this point, but this could have been just a safety measure.
>
> If is_kfence_address tolerates tagged addresses, we should be able to
> drop these kasan_reset_tag calls.

Will drop it once the above is confirmed. Thanks!