Message ID | 20230208184203.2260394-1-elver@google.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp3625542wrn; Wed, 8 Feb 2023 10:51:09 -0800 (PST) X-Google-Smtp-Source: AK7set+HaxHfBTtscOMoNC/FnYZ0ZfGLH3l+FsFNgf2FPvmXS3vdR+EGsBjFoVrXsVZkOoIdce3w X-Received: by 2002:a17:906:283:b0:878:4bda:2011 with SMTP id 3-20020a170906028300b008784bda2011mr7668659ejf.4.1675882269073; Wed, 08 Feb 2023 10:51:09 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1675882269; cv=none; d=google.com; s=arc-20160816; b=JafjSan2JkISJ/L32CzGngFBfh7SMUbnGRHRj4D9lh6KgnsEcSOGUOC6m/9uEriKg0 aAGUInAttpt7RZDcq3GQq5wb7VWxZ27JoEDqaYjwWt5lPz76zBUgdhYioYNsuPKci7YP /v2cb3eyVEKcyLdrgowziTlXeYx26Rpir7S+x5k1DSyg3dqpTrMBWbJSIu4hD8BnkdoK W39ndq8AZjZqmr8ighjs0SDBmPY9IaAa7EmSlVscIwuE4WlERVOgYi9jYzR1SrQJhwdD 6nWVJIQuAvKV8eFen/l9F1oIx9sIR3D5Be896k8z1UYGKHbbVUK6Kbng20wejbPiIbdw kUAg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:mime-version:date :dkim-signature; bh=muGKsBQpDMjO2rXBhVjlNe2+4h4S1u+JGQBqWajmnMY=; b=Y1sn7oMyRXLCuu0CLVhUVX/B04g+hT57V4u8afch9yahfeLwyuxqouJB1I8oMjbVNj Sf3UvV9LAZ9+AQ8GDLdptrTphjTPDhJDm1GuY9DysRhvyx/OVf1fHf3yjsF099ke/8V8 0PlmVEaj4DehhTJDUM/E8uabke/b/AOBJB1EuZcLnNW7kcChtBQReXuyxsGAss5n+TAY YZL3mlbizd2FXRn6D923GK9r9YhQKZYwkp1UMo2nTDovBBeEEJ3kYJppNoFwQlvj08Jn OW0zt0JlYnt21NMW+ARvVtwKYJtQWioLzkdTlHaCP7OgD+S5Wr2VSIiqwiBLO+0QlRus 4Xzw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=HxBkYsCy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 15-20020a17090602cf00b0088b3b09e143si3643747ejk.593.2023.02.08.10.50.46; Wed, 08 Feb 2023 10:51:09 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=HxBkYsCy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230335AbjBHSmo (ORCPT <rfc822;ivan.orlov0322@gmail.com> + 99 others); Wed, 8 Feb 2023 13:42:44 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42800 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229740AbjBHSmn (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 8 Feb 2023 13:42:43 -0500 Received: from mail-yb1-xb4a.google.com (mail-yb1-xb4a.google.com [IPv6:2607:f8b0:4864:20::b4a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EE12944B6 for <linux-kernel@vger.kernel.org>; Wed, 8 Feb 2023 10:42:41 -0800 (PST) Received: by mail-yb1-xb4a.google.com with SMTP id k15-20020a5b0a0f000000b007eba3f8e3baso17974076ybq.4 for <linux-kernel@vger.kernel.org>; Wed, 08 Feb 2023 10:42:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:mime-version:date:from:to:cc:subject :date:message-id:reply-to; bh=muGKsBQpDMjO2rXBhVjlNe2+4h4S1u+JGQBqWajmnMY=; b=HxBkYsCyEckUoHuvqVZ5uRlBgYs41lGpzh9bJv5RycbHZDeZvkaOSzuAVFVt1ocvW2 GUtHG6k3mrzIxEGxYLtddoREAd1jMYdlSfDEJ+H9rP1xQ3pUllwwQPEUZH1vYPa5ulra 2/+qrCz0H5FEJKh2E1cpdMySNo/1cFVZG7fO8PhEVzQPZ3sEBVYBhs0rgDlHIWhU0+kO Yf9vVPbuo9u+AY1zPiE/WEBlftJL/MvXqqHw8ylsVXWkRAYZDDqRFx62tiV63uOOqLZe kMWoCoa+EkILLVIvTVknwJXc+UnFkKSqYWNUe0meeh0pNPjtvUMwge+IQDK+CpWpGHW4 OFDw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:mime-version:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=muGKsBQpDMjO2rXBhVjlNe2+4h4S1u+JGQBqWajmnMY=; b=SBCWr+hKE61NcKuSYgHh9eUA0ZtRuULwLuoxA/YEWvs5YAHybqvML/pEjdrolByOY7 fHw13F42yDH09rJrWQYaihRWHWUwBeDOC/Au8HLrjix4XYSNHQcztJ0/KdNEh7+iYtYa iKTMrAI1lbjzxNbO2rxK7zjKCszHCBMAorVQVktdoTzSq91WFksXFLtuMcD1XhhVGoNv YT39ucFv77KzYlZqCpxlOxyNLb3L2GfZo2Hs6s1lyJwP/HCg3+MOyEPXUV4rKuF+YIA8 VSzy+Fcyt1sWzClXicJzGRTdReGb4MGrgQ2Kxz4Cwjl43G/rjSZlBQxu7Xc2RTNW2ZwQ DkDQ== X-Gm-Message-State: AO0yUKUVpRFiwqRLOpsHj8kAOqjhudGAVZRU2Ol2IylLU7bnH8EAUy8L S4yYOCXmpDwNIl48tDMe9v/BUk5oJw== X-Received: from elver.muc.corp.google.com ([2a00:79e0:9c:201:a0ba:1bc4:8b19:5b22]) (user=elver job=sendgmr) by 2002:a81:1312:0:b0:528:37bd:c122 with SMTP id 18-20020a811312000000b0052837bdc122mr10ywt.5.1675881760758; Wed, 08 Feb 2023 10:42:40 -0800 (PST) Date: Wed, 8 Feb 2023 19:42:03 +0100 Mime-Version: 1.0 X-Mailer: git-send-email 2.39.1.519.gcb327c4b5f-goog Message-ID: <20230208184203.2260394-1-elver@google.com> Subject: [PATCH -tip] kasan: Emit different calls for instrumentable memintrinsics From: Marco Elver <elver@google.com> To: elver@google.com, Peter Zijlstra <peterz@infradead.org> Cc: Masahiro Yamada <masahiroy@kernel.org>, Nathan Chancellor <nathan@kernel.org>, Nick Desaulniers <ndesaulniers@google.com>, Nicolas Schier <nicolas@fjasle.eu>, Andrey Ryabinin <ryabinin.a.a@gmail.com>, Alexander Potapenko <glider@google.com>, Andrey Konovalov <andreyknvl@gmail.com>, Dmitry Vyukov <dvyukov@google.com>, Vincenzo Frascino <vincenzo.frascino@arm.com>, linux-kbuild@vger.kernel.org, kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org, Ingo Molnar <mingo@kernel.org>, Tony Lindgren <tony@atomide.com>, Ulf Hansson <ulf.hansson@linaro.org>, linux-toolchains@vger.kernel.org Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1757289925766776554?= X-GMAIL-MSGID: =?utf-8?q?1757289925766776554?= |
Series |
[-tip] kasan: Emit different calls for instrumentable memintrinsics
|
|
Commit Message
Marco Elver
Feb. 8, 2023, 6:42 p.m. UTC
Clang 15 will provide an option to prefix calls to memcpy/memset/memmove
with __asan_ in instrumented functions: https://reviews.llvm.org/D122724
GCC does not yet have similar support.
Use it to regain KASAN instrumentation of memcpy/memset/memmove on
architectures that require noinstr to be really free from instrumented
mem*() functions (all GENERIC_ENTRY architectures).
Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions")
Signed-off-by: Marco Elver <elver@google.com>
---
The Fixes tag is just there to show the dependency, and that people
shouldn't apply this patch without 69d4c0d32186.
---
scripts/Makefile.kasan | 7 +++++++
1 file changed, 7 insertions(+)
Comments
On Wed, Feb 8, 2023 at 7:42 PM Marco Elver <elver@google.com> wrote: > > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724 Hi Marco, Does this option affect all functions or only the ones that are marked with no_sanitize? Based on the LLVM patch description, should we also change the normal memcpy/memset/memmove to be noninstrumented? These __asan_mem* functions are not defined in the kernel AFAICS. Should we add them? Or maybe we should just use "__" as the prefix, as right now __mem* functions are the ones that are not instrumented? Thanks! > GCC does not yet have similar support. > > Use it to regain KASAN instrumentation of memcpy/memset/memmove on > architectures that require noinstr to be really free from instrumented > mem*() functions (all GENERIC_ENTRY architectures). > > Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions") > Signed-off-by: Marco Elver <elver@google.com> > --- > > The Fixes tag is just there to show the dependency, and that people > shouldn't apply this patch without 69d4c0d32186. > > --- > scripts/Makefile.kasan | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/scripts/Makefile.kasan b/scripts/Makefile.kasan > index b9e94c5e7097..78336b04c077 100644 > --- a/scripts/Makefile.kasan > +++ b/scripts/Makefile.kasan > @@ -38,6 +38,13 @@ endif > > CFLAGS_KASAN += $(call cc-param,asan-stack=$(stack_enable)) > > +ifdef CONFIG_GENERIC_ENTRY > +# Instrument memcpy/memset/memmove calls by using instrumented __asan_mem*() > +# instead. With compilers that don't support this option, compiler-inserted > +# memintrinsics won't be checked by KASAN. > +CFLAGS_KASAN += $(call cc-param,asan-kernel-mem-intrinsic-prefix) > +endif > + > endif # CONFIG_KASAN_GENERIC > > ifdef CONFIG_KASAN_SW_TAGS > -- > 2.39.1.519.gcb327c4b5f-goog >
On Thu, 9 Feb 2023 at 23:43, Andrey Konovalov <andreyknvl@gmail.com> wrote: > > On Wed, Feb 8, 2023 at 7:42 PM Marco Elver <elver@google.com> wrote: > > > > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove > > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724 > > Hi Marco, > > Does this option affect all functions or only the ones that are marked > with no_sanitize? Only functions that are instrumented, i.e. wherever fsanitize=kernel-address inserts instrumentation. > Based on the LLVM patch description, should we also change the normal > memcpy/memset/memmove to be noninstrumented? They are no longer instrumented as of 69d4c0d32186 (for CONFIG_GENERIC_ENTRY arches). > These __asan_mem* functions are not defined in the kernel AFAICS. > Should we add them? Peter introduced them in 69d4c0d32186, and we effectively have no mem*() instrumentation on x86 w/o the compiler-enablement patch here. > Or maybe we should just use "__" as the prefix, as right now __mem* > functions are the ones that are not instrumented? __asan_mem* is for instrumented code, just like ASan userspace does (actually ASan userspace has been doing it like this forever, just the kernel was somehow special). [...] > > Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions") > > Signed-off-by: Marco Elver <elver@google.com> > > --- > > > > The Fixes tag is just there to show the dependency, and that people > > shouldn't apply this patch without 69d4c0d32186. ^^^ Depends on this commit, which is only in -tip. > > +ifdef CONFIG_GENERIC_ENTRY It also only affects GENERIC_ENTRY arches. > > +# Instrument memcpy/memset/memmove calls by using instrumented __asan_mem*() > > +# instead. With compilers that don't support this option, compiler-inserted > > +# memintrinsics won't be checked by KASAN. > > +CFLAGS_KASAN += $(call cc-param,asan-kernel-mem-intrinsic-prefix) > > +endif Probably the same should be done for SW_TAGS, because arm64 will be GENERIC_ENTRY at one point or another as well. KASAN + GCC on x86 will have no mem*() instrumentation after 69d4c0d32186, which is sad, so somebody ought to teach it the same param as above. Thanks, -- Marco
On Fri, Feb 10, 2023 at 12:35 AM Marco Elver <elver@google.com> wrote: > > On Thu, 9 Feb 2023 at 23:43, Andrey Konovalov <andreyknvl@gmail.com> wrote: > > > > On Wed, Feb 8, 2023 at 7:42 PM Marco Elver <elver@google.com> wrote: > > > > > > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove > > > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724 > > > > Hi Marco, > > > > Does this option affect all functions or only the ones that are marked > > with no_sanitize? > > Only functions that are instrumented, i.e. wherever > fsanitize=kernel-address inserts instrumentation. Ack. > > Based on the LLVM patch description, should we also change the normal > > memcpy/memset/memmove to be noninstrumented? > > They are no longer instrumented as of 69d4c0d32186 (for > CONFIG_GENERIC_ENTRY arches). Ah, sorry, overlooked that. > > These __asan_mem* functions are not defined in the kernel AFAICS. > > Should we add them? > > Peter introduced them in 69d4c0d32186, and we effectively have no > mem*() instrumentation on x86 w/o the compiler-enablement patch here. > > > Or maybe we should just use "__" as the prefix, as right now __mem* > > functions are the ones that are not instrumented? > > __asan_mem* is for instrumented code, just like ASan userspace does > (actually ASan userspace has been doing it like this forever, just the > kernel was somehow special). > > [...] > > > Fixes: 69d4c0d32186 ("entry, kasan, x86: Disallow overriding mem*() functions") > > > Signed-off-by: Marco Elver <elver@google.com> > > > --- > > > > > > The Fixes tag is just there to show the dependency, and that people > > > shouldn't apply this patch without 69d4c0d32186. > > ^^^ Depends on this commit, which is only in -tip. Got it. Missed that patch. > > > +ifdef CONFIG_GENERIC_ENTRY > > It also only affects GENERIC_ENTRY arches. > > > > +# Instrument memcpy/memset/memmove calls by using instrumented __asan_mem*() > > > +# instead. With compilers that don't support this option, compiler-inserted > > > +# memintrinsics won't be checked by KASAN. > > > +CFLAGS_KASAN += $(call cc-param,asan-kernel-mem-intrinsic-prefix) > > > +endif > > Probably the same should be done for SW_TAGS, because arm64 will be > GENERIC_ENTRY at one point or another as well. Yes, makes sense. I'll file a bug for this once I fully understand the consequences of these changes. > KASAN + GCC on x86 will have no mem*() instrumentation after > 69d4c0d32186, which is sad, so somebody ought to teach it the same > param as above. Hm, with that patch we would have no KASAN checking within normal mem* functions (not the ones embedded by the compiler) on GENERIC_ENTRY arches even with Clang, right?
On Fri, 10 Feb 2023 at 17:13, Andrey Konovalov <andreyknvl@gmail.com> wrote: [...] > > Probably the same should be done for SW_TAGS, because arm64 will be > > GENERIC_ENTRY at one point or another as well. > > Yes, makes sense. I'll file a bug for this once I fully understand the > consequences of these changes. > > > KASAN + GCC on x86 will have no mem*() instrumentation after > > 69d4c0d32186, which is sad, so somebody ought to teach it the same > > param as above. > > Hm, with that patch we would have no KASAN checking within normal mem* > functions (not the ones embedded by the compiler) on GENERIC_ENTRY > arches even with Clang, right? Yes, that's the point - normal mem*() functions cannot be instrumented with GENERIC_ENTRY within noinstr functions, because the compiler sometimes decides to transform normal assignments into memcpy()/memset(). And if mem*() were instrumented (as it was before 69d4c0d32186), that'd break things for these architectures. But since most code is normally instrumented, with the right compiler support (which the patch here enables), we just turn mem*() in instrumented functions into __asan_mem*(), and get the instrumentation as before. 69d4c0d32186 already added those __asan functions. The fact that KASAN used to override mem*() is just the wrong choice in a world where compilers decide to inline or outline these. From an instrumentation point of view at the compiler level, we need to treat them like any other instrumentable instruction (loads, stores, atomics, etc.): transform each instrumentable instruction into something that does the right checks. Only then can we be sure that we don't accidentally instrument something that shouldn't be (noinstr functions), because instead of relying on the compiler, we forced instrumentation on every mem*().
On Wed, Feb 08, 2023 at 07:42:03PM +0100, Marco Elver wrote: > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724 > > GCC does not yet have similar support. GCC has support to rename memcpy/memset etc. for years, say on following compiled with -fsanitize=kernel-address -O2 -mstringop-strategy=libcall (the last option just to make sure the compiler doesn't prefer to emit rep mov*/stos* or loop or something similar, of course kernel can keep whatever it uses) you'll get just __asan_memcpy/__asan_memset calls, no memcpy/memset, while without -fsanitize=kernel-address you get normally memcpy/memset. Or do you need the __asan_* functions only in asan instrumented functions and normal ones in non-instrumented functions in the same TU? #ifdef __SANITIZE_ADDRESS__ extern __typeof (__builtin_memcpy) memcpy __asm ("__asan_memcpy"); extern __typeof (__builtin_memset) memset __asm ("__asan_memset"); #endif struct S { char a[2048]; } a, b; void foo (void) { a = b; b = (struct S) {}; } void bar (void *p, void *q, int s) { memcpy (p, q, s); } void baz (void *p, int c, int s) { memset (p, c, s); } void qux (void *p, void *q, int s) { __builtin_memcpy (p, q, s); } void quux (void *p, int c, int s) { __builtin_memset (p, c, s); } Jakub
On Fri, 10 Feb 2023 at 20:25, Jakub Jelinek <jakub@redhat.com> wrote: > > On Wed, Feb 08, 2023 at 07:42:03PM +0100, Marco Elver wrote: > > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove > > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724 > > > > GCC does not yet have similar support. > > GCC has support to rename memcpy/memset etc. for years, say on > following compiled with > -fsanitize=kernel-address -O2 -mstringop-strategy=libcall > (the last option just to make sure the compiler doesn't prefer to emit > rep mov*/stos* or loop or something similar, of course kernel can keep > whatever it uses) you'll get just __asan_memcpy/__asan_memset calls, > no memcpy/memset, while without -fsanitize=kernel-address you get > normally memcpy/memset. > Or do you need the __asan_* functions only in asan instrumented functions > and normal ones in non-instrumented functions in the same TU? Yes, exactly that: __asan_ in instrumented, and normal ones in no_sanitize functions; they can be mixed in the same TU. We can't rename normal mem*() functions everywhere. In no_sanitize functions (in particular noinstr), normal mem*() should be used. But in instrumented code, it should be __asan_mem*(). Another longer explanation I also just replied here: https://lore.kernel.org/all/CANpmjNNH-O+38U6zRWJUCU-eJTfMhUosy==GWEOn1vcu=J2dcw@mail.gmail.com/ At least clang has had this behaviour for user space ASan forever: https://godbolt.org/z/h5sWExzef - so it was easy to just add the flag to make it behave like in user space for mem*() in the kernel. It might also be worthwhile for GCC to emit __asan_ for user space, given that the runtimes are shared and the user space runtime definitely has __asan_. The kernel needs the param (asan-kernel-mem-intrinsic-prefix) though, to not break older kernels. Thanks, -- Marco
On Fri, Feb 10, 2023 at 7:41 PM Marco Elver <elver@google.com> wrote: > > On Fri, 10 Feb 2023 at 17:13, Andrey Konovalov <andreyknvl@gmail.com> wrote: > [...] > > > Probably the same should be done for SW_TAGS, because arm64 will be > > > GENERIC_ENTRY at one point or another as well. > > > > Yes, makes sense. I'll file a bug for this once I fully understand the > > consequences of these changes. > > > > > KASAN + GCC on x86 will have no mem*() instrumentation after > > > 69d4c0d32186, which is sad, so somebody ought to teach it the same > > > param as above. > > > > Hm, with that patch we would have no KASAN checking within normal mem* > > functions (not the ones embedded by the compiler) on GENERIC_ENTRY > > arches even with Clang, right? > > Yes, that's the point - normal mem*() functions cannot be instrumented > with GENERIC_ENTRY within noinstr functions, because the compiler > sometimes decides to transform normal assignments into > memcpy()/memset(). And if mem*() were instrumented (as it was before > 69d4c0d32186), that'd break things for these architectures. > > But since most code is normally instrumented, with the right compiler > support (which the patch here enables), we just turn mem*() in > instrumented functions into __asan_mem*(), and get the instrumentation > as before. 69d4c0d32186 already added those __asan functions. The fact > that KASAN used to override mem*() is just the wrong choice in a world > where compilers decide to inline or outline these. From an > instrumentation point of view at the compiler level, we need to treat > them like any other instrumentable instruction (loads, stores, > atomics, etc.): transform each instrumentable instruction into > something that does the right checks. Only then can we be sure that we > don't accidentally instrument something that shouldn't be (noinstr > functions), because instead of relying on the compiler, we forced > instrumentation on every mem*(). I meant to ask whether the normal mem* calls from instrumented functions will also be transformed to __asan_mem*() by the compiler. But following the godbolt link you shared, I see that this is true. Thank you for the explanation! So the overall negative impact of these changes is that we don't get KASAN checking in both normal mem* calls and the ones formed by transforming assignments for GENERIC_ENTRY architectures with GCC and with older Clang. This is not great. I wonder if we then need to print some kind of warning when the kernel is built with these compilers. If these changes move forward, AFAIU, we can also drop these custom mem* definitions for non-instrumented files for x86: https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/string_64.h#L88 Thanks!
On Fri, 10 Feb 2023 at 22:37, Andrey Konovalov <andreyknvl@gmail.com> wrote: > > On Fri, Feb 10, 2023 at 7:41 PM Marco Elver <elver@google.com> wrote: > > > > On Fri, 10 Feb 2023 at 17:13, Andrey Konovalov <andreyknvl@gmail.com> wrote: > > [...] > > > > Probably the same should be done for SW_TAGS, because arm64 will be > > > > GENERIC_ENTRY at one point or another as well. > > > > > > Yes, makes sense. I'll file a bug for this once I fully understand the > > > consequences of these changes. > > > > > > > KASAN + GCC on x86 will have no mem*() instrumentation after > > > > 69d4c0d32186, which is sad, so somebody ought to teach it the same > > > > param as above. > > > > > > Hm, with that patch we would have no KASAN checking within normal mem* > > > functions (not the ones embedded by the compiler) on GENERIC_ENTRY > > > arches even with Clang, right? > > > > Yes, that's the point - normal mem*() functions cannot be instrumented > > with GENERIC_ENTRY within noinstr functions, because the compiler > > sometimes decides to transform normal assignments into > > memcpy()/memset(). And if mem*() were instrumented (as it was before > > 69d4c0d32186), that'd break things for these architectures. > > > > But since most code is normally instrumented, with the right compiler > > support (which the patch here enables), we just turn mem*() in > > instrumented functions into __asan_mem*(), and get the instrumentation > > as before. 69d4c0d32186 already added those __asan functions. The fact > > that KASAN used to override mem*() is just the wrong choice in a world > > where compilers decide to inline or outline these. From an > > instrumentation point of view at the compiler level, we need to treat > > them like any other instrumentable instruction (loads, stores, > > atomics, etc.): transform each instrumentable instruction into > > something that does the right checks. Only then can we be sure that we > > don't accidentally instrument something that shouldn't be (noinstr > > functions), because instead of relying on the compiler, we forced > > instrumentation on every mem*(). > > I meant to ask whether the normal mem* calls from instrumented > functions will also be transformed to __asan_mem*() by the compiler. > But following the godbolt link you shared, I see that this is true. > > Thank you for the explanation! > > So the overall negative impact of these changes is that we don't get > KASAN checking in both normal mem* calls and the ones formed by > transforming assignments for GENERIC_ENTRY architectures with GCC and > with older Clang. This is not great. I wonder if we then need to print > some kind of warning when the kernel is built with these compilers. Since these changes are already in -tip, and by judging from [1], there really is no other way. As-is, KASAN on x86 is already broken per [1] (though we got lucky thus far). Printing a warning wouldn't hurt, but I think nobody would notice the warning, and if somebody notices, they wouldn't care. Sooner or later, we just need to make sure that test robots (syzbot, etc.) have new compilers. > If these changes move forward, AFAIU, we can also drop these custom > mem* definitions for non-instrumented files for x86: > > https://elixir.bootlin.com/linux/latest/source/arch/x86/include/asm/string_64.h#L88 Yes, I think so. [1] https://lore.kernel.org/all/20230112194314.845371875@infradead.org/ Last but not least are you ok with this patch? This patch ought to be applied to the same tree as 69d4c0d32186 anyway, so this patch lives or dies by that change.
On Fri, Feb 10, 2023 at 09:07:14PM +0100, Marco Elver wrote: > On Fri, 10 Feb 2023 at 20:25, Jakub Jelinek <jakub@redhat.com> wrote: > > > > On Wed, Feb 08, 2023 at 07:42:03PM +0100, Marco Elver wrote: > > > Clang 15 will provide an option to prefix calls to memcpy/memset/memmove > > > with __asan_ in instrumented functions: https://reviews.llvm.org/D122724 > > > > > > GCC does not yet have similar support. > > > > GCC has support to rename memcpy/memset etc. for years, say on > > following compiled with > > -fsanitize=kernel-address -O2 -mstringop-strategy=libcall > > (the last option just to make sure the compiler doesn't prefer to emit > > rep mov*/stos* or loop or something similar, of course kernel can keep > > whatever it uses) you'll get just __asan_memcpy/__asan_memset calls, > > no memcpy/memset, while without -fsanitize=kernel-address you get > > normally memcpy/memset. > > > Or do you need the __asan_* functions only in asan instrumented functions > > and normal ones in non-instrumented functions in the same TU? > > Yes, exactly that: __asan_ in instrumented, and normal ones in > no_sanitize functions; they can be mixed in the same TU. We can't > rename normal mem*() functions everywhere. In no_sanitize functions > (in particular noinstr), normal mem*() should be used. But in > instrumented code, it should be __asan_mem*(). Another longer > explanation I also just replied here: > https://lore.kernel.org/all/CANpmjNNH-O+38U6zRWJUCU-eJTfMhUosy==GWEOn1vcu=J2dcw@mail.gmail.com/ > > At least clang has had this behaviour for user space ASan forever: > https://godbolt.org/z/h5sWExzef - so it was easy to just add the flag > to make it behave like in user space for mem*() in the kernel. It > might also be worthwhile for GCC to emit __asan_ for user space, given > that the runtimes are shared and the user space runtime definitely has > __asan_. The kernel needs the param (asan-kernel-mem-intrinsic-prefix) > though, to not break older kernels. So, what exactly you want for gcc to do with --param asan-kernel-mem-intrinsic-prefix=1 (note, in GCC case it can't be without the =1 at the end)? The current gcc behavior is that operations like aggregate copies, or clearing which might or might not need memcpy/memset/memmove under the hood later are asan instrumented before the operation (in order not to limit the choices on how it will be expanded), uses of builtins (__builtin_ prefixed or not) are also instrumented before the calls unless they are one of the calls that is recognized as always instrumented. None for hwasan, for asan: index, memchr, memcmp, memcpy, memmove, memset, strcasecmp, strcat, strchr, strcmp, strcpy, strdup, strlen, strncasecmp, strncat, strncmp, strcspn, strpbrk, strspn, strstr, strncpy and for those builtins gcc disables inline expansion and enforces a library call (but until the expansion they are treated in optimizations like normal builtins and so could be say DCEd, or their aliasing behavior is considered etc.). kasan behaves the same I think. Now, I think libasan only has __asan_ prefixed __asan_memmove, __asan_memset and __asan_memcpy, nothing else, so most of the calls from the above list even can't be prefixed. So, do you want for --param asan-kernel-mem-intrinsic-prefix=1 to __asan_ prefix just memcpy/memmove/memset and nothing else? Is it ok to emit memcpy/memset/memmove from aggregate operations which are instrumented already at the caller (and similarly is it ok to handle those operations inline)? Jakub
On Mon, Feb 13, 2023 at 12:01:40PM +0100, Jakub Jelinek wrote: > The current gcc behavior is that operations like aggregate copies, or > clearing which might or might not need memcpy/memset/memmove under the hood > later are asan instrumented before the operation (in order not to limit the > choices on how it will be expanded), uses of builtins (__builtin_ prefixed > or not) are also instrumented before the calls unless they are one of the > calls that is recognized as always instrumented. None for hwasan, > for asan: > index, memchr, memcmp, memcpy, memmove, memset, strcasecmp, strcat, strchr, > strcmp, strcpy, strdup, strlen, strncasecmp, strncat, strncmp, strcspn, > strpbrk, strspn, strstr, strncpy > and for those builtins gcc disables inline expansion and enforces a library > call (but until the expansion they are treated in optimizations like normal > builtins and so could be say DCEd, or their aliasing behavior is considered > etc.). kasan behaves the same I think. > > Now, I think libasan only has __asan_ prefixed > __asan_memmove, __asan_memset and __asan_memcpy, nothing else, so most of > the calls from the above list even can't be prefixed. > > So, do you want for --param asan-kernel-mem-intrinsic-prefix=1 to __asan_ > prefix just memcpy/memmove/memset and nothing else? Is it ok to emit > memcpy/memset/memmove from aggregate operations which are instrumented > already at the caller (and similarly is it ok to handle those operations > inline)? I'm thinking it is trivial to add more __asan prefixed functions as needed, while trying to untangle the trainwreck created by assuming the normal functions are instrumented is much more work.
On Mon, 13 Feb 2023 at 13:36, Peter Zijlstra <peterz@infradead.org> wrote: > > On Mon, Feb 13, 2023 at 12:01:40PM +0100, Jakub Jelinek wrote: > > > The current gcc behavior is that operations like aggregate copies, or > > clearing which might or might not need memcpy/memset/memmove under the hood > > later are asan instrumented before the operation (in order not to limit the > > choices on how it will be expanded), uses of builtins (__builtin_ prefixed > > or not) are also instrumented before the calls unless they are one of the > > calls that is recognized as always instrumented. None for hwasan, > > for asan: > > index, memchr, memcmp, memcpy, memmove, memset, strcasecmp, strcat, strchr, > > strcmp, strcpy, strdup, strlen, strncasecmp, strncat, strncmp, strcspn, > > strpbrk, strspn, strstr, strncpy > > and for those builtins gcc disables inline expansion and enforces a library > > call (but until the expansion they are treated in optimizations like normal > > builtins and so could be say DCEd, or their aliasing behavior is considered > > etc.). kasan behaves the same I think. > > > > Now, I think libasan only has __asan_ prefixed > > __asan_memmove, __asan_memset and __asan_memcpy, nothing else, so most of > > the calls from the above list even can't be prefixed. Correct, right now libasan only does memmove, memset, and memcpy. I don't think it'll ever do more, at least not in the near future. > > So, do you want for --param asan-kernel-mem-intrinsic-prefix=1 to __asan_ > > prefix just memcpy/memmove/memset and nothing else? Yes. > > Is it ok to emit > > memcpy/memset/memmove from aggregate operations which are instrumented > > already at the caller (and similarly is it ok to handle those operations > > inline)? Yes, I think that's fair. > I'm thinking it is trivial to add more __asan prefixed functions as > needed, while trying to untangle the trainwreck created by assuming the > normal functions are instrumented is much more work. For the kernel param, I'd only do memcpy/memmove/memset, as those are the most brittle ones. The string functions are instrumented on most architectures through lib/string.c being instrumented. Thanks, -- Marco
diff --git a/scripts/Makefile.kasan b/scripts/Makefile.kasan index b9e94c5e7097..78336b04c077 100644 --- a/scripts/Makefile.kasan +++ b/scripts/Makefile.kasan @@ -38,6 +38,13 @@ endif CFLAGS_KASAN += $(call cc-param,asan-stack=$(stack_enable)) +ifdef CONFIG_GENERIC_ENTRY +# Instrument memcpy/memset/memmove calls by using instrumented __asan_mem*() +# instead. With compilers that don't support this option, compiler-inserted +# memintrinsics won't be checked by KASAN. +CFLAGS_KASAN += $(call cc-param,asan-kernel-mem-intrinsic-prefix) +endif + endif # CONFIG_KASAN_GENERIC ifdef CONFIG_KASAN_SW_TAGS