Message ID | 20221109024155.2810410-1-connoro@google.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp100477wru; Tue, 8 Nov 2022 18:44:40 -0800 (PST) X-Google-Smtp-Source: AMsMyM5xdWjNuurCg81N8nF9HM1g+a0Tp0XL06+dxAcgTgmPyLnWZDQZtQwt4+E0GTqMPJ+cXnrB X-Received: by 2002:a17:906:b897:b0:755:6595:cd34 with SMTP id hb23-20020a170906b89700b007556595cd34mr1071929ejb.70.1667961880547; Tue, 08 Nov 2022 18:44:40 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1667961880; cv=none; d=google.com; s=arc-20160816; b=JYQCwPZCQq+PgntUgTuyg5ZAG/vl/Qp5gdRNHSytN/9XYDFyaclbYlQkvv3pg9ayEd 6Yav2yNsthbTCDhsGpRJnY0AqJqHtLSvsBSpA+A87iLhTRwpJvA2GRtTLrp14BS70wEE 6Ri8GIQq4y9scNjwOSJvO9Yi28tyFPSGXXjpLH79lyDxSV/Rwx3X3dhw++GL4EWTxSyP gK9HkHGTNdsxlavyu8e6k/in0wwxcPmh5h2uPyI+Ve70I06RA+HtB9YsgFDpBeGACitc LGtolC0B9DUTK8fb0g3dLv9CZIKqkLaRobcotEm+a5wK6XL/CQi+ped0d8coxbZFzy8E rXTQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:mime-version:date :dkim-signature; bh=4Mz4zEcIKBrlDyAmKkP8D55UhnpWD1j4J0ZUJ0geBMw=; b=i7EiQFVcBgkgSXKVTIZIL5sglWgAvAqWXTe5C4xzhVH8+qDIlYb88EYyHaUR1Uc3NS MwsBs/fHUuH7ocFhoRnMwvU4voaHYxygIulPBiV3c50l4tel00UgqZPx8Jb2GEINiuLL UmLmWWRyX3T4WH8M9YgmEBZ1C5N5ZEousbezhNZA3c9AKDiH7clo/P+Qi0ZkgcmGfkub fbNNe2B4YdRoYe0/dthVKcrM7D81+woAnIgJwGx/x7MO7bmNo0vR9SACRGHHlJHu8CO+ eIEl5feyE1Z2unP01A231jWTxi2cU3CKRMDd4OJ7KDYQ55TG8ve5FUcj/3V3BUxD6TO9 tqrg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=JhcYXGD2; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id k5-20020a50cb85000000b004571a669230si12624599edi.469.2022.11.08.18.44.17; Tue, 08 Nov 2022 18:44:40 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=JhcYXGD2; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229900AbiKICmW (ORCPT <rfc822;dexuan.linux@gmail.com> + 99 others); Tue, 8 Nov 2022 21:42:22 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33918 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229492AbiKICmU (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 8 Nov 2022 21:42:20 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 404DE1A221 for <linux-kernel@vger.kernel.org>; Tue, 8 Nov 2022 18:42:20 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id om10-20020a17090b3a8a00b002108b078ab1so430873pjb.9 for <linux-kernel@vger.kernel.org>; Tue, 08 Nov 2022 18:42:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:mime-version:date:from:to:cc:subject :date:message-id:reply-to; bh=4Mz4zEcIKBrlDyAmKkP8D55UhnpWD1j4J0ZUJ0geBMw=; b=JhcYXGD2COekAUcZzHL8U9BIh7stDBFD7YenyGw4kjTwzCbPWYUbVkDexYLlSQBXN5 bJT/6/x+DHJY7IRVw4FCcb4ZmLORBnMo6V6FJpc2BnaU8kYwKO5IZmJuRKVwG+9n7ULS pk1WClMef+rhKaEf+wtDtwIz/Pa2lH643l7E+gUnJn4iC1waoHqWkFzIJ5X3w+TMeDoX dpaGqib14Den7EzSKBSSlJ7ikjcvCd8ymDqxpfgDuDhoEH11OnBbbhLJd3kWdb+sJfI1 nPA1qi02HFS5VHxaSMdVApxfsemedy/mSyn4Vh7L1PEfrP4eja5/0z6oZQ9xSLgxUwXs zrsA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:mime-version:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=4Mz4zEcIKBrlDyAmKkP8D55UhnpWD1j4J0ZUJ0geBMw=; b=spUaCZwSYgeoaHUE/I82swC0bWEtekUY2DyVb4D9KDPBa1XaRpGRvwKCU2Sm6BIqST Jmrs3H3kqjinbMw/+qbTenL2r9TwltTGEv82hhNHMf6/jCU2+WRtJ9LAB2o5j8G4QjDc tXlPD0/whaP0FfGaWXoRqovh8FPdHX2jZQTbd+XA6XUhe5mm3DxoKO+Nl0ez2WZf4cwt XXw5Dc1HAlwWuSpTqs35HY1hOWQFwmEkKwbBMXT3tGIBPi5+vxXi0SCZd/bJblGI1TiJ +8g2tj8DC+79uJfK/Djc1dBLoCiDkIBWyYIrZUQpEB4s+nDHUAbTZe7AbgD9iqcG58h+ Tg0Q== X-Gm-Message-State: ACrzQf2f6pY14QMq+Yo7sEdro1mbQsZQVcU/MEEZ7FC4oMMyMFAjvhM9 jN0iAmtHdfbfKmsicazzaz1Aag3vSEtL X-Received: from connoro.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:a99]) (user=connoro job=sendgmr) by 2002:aa7:820f:0:b0:562:83c1:59e0 with SMTP id k15-20020aa7820f000000b0056283c159e0mr1116355pfi.34.1667961739606; Tue, 08 Nov 2022 18:42:19 -0800 (PST) Date: Wed, 9 Nov 2022 02:41:55 +0000 Mime-Version: 1.0 X-Mailer: git-send-email 2.38.1.431.g37b22c650d-goog Message-ID: <20221109024155.2810410-1-connoro@google.com> Subject: [PATCH bpf-next] bpf: btf: don't log ignored BTF mismatches From: "Connor O'Brien" <connoro@google.com> To: bpf@vger.kernel.org Cc: Martin KaFai Lau <martin.lau@linux.dev>, Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Andrii Nakryiko <andrii@kernel.org>, Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>, John Fastabend <john.fastabend@gmail.com>, KP Singh <kpsingh@kernel.org>, Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>, Jiri Olsa <jolsa@kernel.org>, linux-kernel@vger.kernel.org, "Connor O'Brien" <connoro@google.com> Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748984796388194466?= X-GMAIL-MSGID: =?utf-8?q?1748984796388194466?= |
Series |
[bpf-next] bpf: btf: don't log ignored BTF mismatches
|
|
Commit Message
Connor O'Brien
Nov. 9, 2022, 2:41 a.m. UTC
Enabling CONFIG_MODULE_ALLOW_BTF_MISMATCH is an indication that BTF
mismatches are expected and module loading should proceed
anyway. Logging with pr_warn() on every one of these "benign"
mismatches creates unnecessary noise when many such modules are
loaded. Instead, limit logging to the case where a BTF mismatch
actually prevents a module form loading.
Signed-off-by: Connor O'Brien <connoro@google.com>
---
kernel/bpf/btf.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
Comments
On 11/8/22 6:41 PM, Connor O'Brien wrote: > Enabling CONFIG_MODULE_ALLOW_BTF_MISMATCH is an indication that BTF > mismatches are expected and module loading should proceed > anyway. Logging with pr_warn() on every one of these "benign" > mismatches creates unnecessary noise when many such modules are > loaded. Instead, limit logging to the case where a BTF mismatch > actually prevents a module form loading. > > Signed-off-by: Connor O'Brien <connoro@google.com> > --- > kernel/bpf/btf.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c > index 5579ff3a5b54..406370487413 100644 > --- a/kernel/bpf/btf.c > +++ b/kernel/bpf/btf.c > @@ -7190,11 +7190,12 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, > } > btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size); > if (IS_ERR(btf)) { > - pr_warn("failed to validate module [%s] BTF: %ld\n", > - mod->name, PTR_ERR(btf)); I think such warning still useful even with CONFIG_MODULE_ALLOW_BTF_MISMATCH. Can we use pr_warn_ratelimited instead of pr_warn in the above to avoid excessive warnings? > kfree(btf_mod); > - if (!IS_ENABLED(CONFIG_MODULE_ALLOW_BTF_MISMATCH)) > + if (!IS_ENABLED(CONFIG_MODULE_ALLOW_BTF_MISMATCH)) { > + pr_warn("failed to validate module [%s] BTF: %ld\n", > + mod->name, PTR_ERR(btf)); > err = PTR_ERR(btf); > + } > goto out; > } > err = btf_alloc_id(btf);
On Wed, Nov 9, 2022 at 8:45 AM Yonghong Song <yhs@meta.com> wrote: > > > > On 11/8/22 6:41 PM, Connor O'Brien wrote: > > Enabling CONFIG_MODULE_ALLOW_BTF_MISMATCH is an indication that BTF > > mismatches are expected and module loading should proceed > > anyway. Logging with pr_warn() on every one of these "benign" > > mismatches creates unnecessary noise when many such modules are > > loaded. Instead, limit logging to the case where a BTF mismatch > > actually prevents a module form loading. > > > > Signed-off-by: Connor O'Brien <connoro@google.com> > > --- > > kernel/bpf/btf.c | 7 ++++--- > > 1 file changed, 4 insertions(+), 3 deletions(-) > > > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c > > index 5579ff3a5b54..406370487413 100644 > > --- a/kernel/bpf/btf.c > > +++ b/kernel/bpf/btf.c > > @@ -7190,11 +7190,12 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, > > } > > btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size); > > if (IS_ERR(btf)) { > > - pr_warn("failed to validate module [%s] BTF: %ld\n", > > - mod->name, PTR_ERR(btf)); > > I think such warning still useful even with > CONFIG_MODULE_ALLOW_BTF_MISMATCH. > Can we use pr_warn_ratelimited instead of pr_warn in the above to > avoid excessive warnings? I gave this a try on a Pixel 6 but I'm not sure it quite addresses the issue. The amount of logging doesn't seem to decrease much, I think because the interval between loading one mismatched module and the next can be greater than the default rate limit. To my mind, the issue is the total volume of these messages more so than their rate. For context, Android devices using the GKI may load hundreds of separately-built modules, and BTF mismatches are possible for any/all of these. It was pointed out to me that btf_verifier_log_type can also print several more lines per mismatched module. ~5 lines of logging for each mismatched module can start to add up, in terms of both overhead and the noise added to the kernel logs. This is more subjective but I think the warnings also read as though this is a more serious failure that might prevent affected modules from working correctly; anecdotally, I've gotten multiple questions about them asking if something is broken. This can be a red herring for anyone unfamiliar with BTF who is reading the logs to debug unrelated issues. In the CONFIG_MODULE_ALLOW_BTF_MISMATCH=y case the flood of warnings seems out of proportion to the actual result (modules still load successfully, just without debug info) especially since the user has explicitly enabled a config saying they expect mismatches. If there needs to be some logging in the "mismatch allowed" case, could an acceptable middle ground be to use pr_warn_once to send a single message reporting that mismatches were detected & module BTF debug info might be unavailable? Alternatively, if we could opt out of module BTF loading then that would also avoid this issue, but that's already been proposed before ([1], [2]) so I thought working with the existing config option might be preferred. [1] https://lore.kernel.org/bpf/20220209052141.140063-1-connoro@google.com/ [2] https://lore.kernel.org/bpf/20221004222725.2813510-1-sdf@google.com/
On 12/7/22 6:19 PM, Connor O'Brien wrote: > On Wed, Nov 9, 2022 at 8:45 AM Yonghong Song <yhs@meta.com> wrote: >> >> >> >> On 11/8/22 6:41 PM, Connor O'Brien wrote: >>> Enabling CONFIG_MODULE_ALLOW_BTF_MISMATCH is an indication that BTF >>> mismatches are expected and module loading should proceed >>> anyway. Logging with pr_warn() on every one of these "benign" >>> mismatches creates unnecessary noise when many such modules are >>> loaded. Instead, limit logging to the case where a BTF mismatch >>> actually prevents a module form loading. >>> >>> Signed-off-by: Connor O'Brien <connoro@google.com> >>> --- >>> kernel/bpf/btf.c | 7 ++++--- >>> 1 file changed, 4 insertions(+), 3 deletions(-) >>> >>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c >>> index 5579ff3a5b54..406370487413 100644 >>> --- a/kernel/bpf/btf.c >>> +++ b/kernel/bpf/btf.c >>> @@ -7190,11 +7190,12 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, >>> } >>> btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size); >>> if (IS_ERR(btf)) { >>> - pr_warn("failed to validate module [%s] BTF: %ld\n", >>> - mod->name, PTR_ERR(btf)); >> >> I think such warning still useful even with >> CONFIG_MODULE_ALLOW_BTF_MISMATCH. >> Can we use pr_warn_ratelimited instead of pr_warn in the above to >> avoid excessive warnings? > > I gave this a try on a Pixel 6 but I'm not sure it quite addresses the > issue. The amount of logging doesn't seem to decrease much, I think > because the interval between loading one mismatched module and the > next can be greater than the default rate limit. To my mind, the issue > is the total volume of these messages more so than their rate. > > For context, Android devices using the GKI may load hundreds of > separately-built modules, and BTF mismatches are possible for any/all > of these. It was pointed out to me that btf_verifier_log_type can also > print several more lines per mismatched module. ~5 lines of logging > for each mismatched module can start to add up, in terms of both > overhead and the noise added to the kernel logs. > > This is more subjective but I think the warnings also read as though > this is a more serious failure that might prevent affected modules > from working correctly; anecdotally, I've gotten multiple questions > about them asking if something is broken. This can be a red herring > for anyone unfamiliar with BTF who is reading the logs to debug > unrelated issues. In the CONFIG_MODULE_ALLOW_BTF_MISMATCH=y case the > flood of warnings seems out of proportion to the actual result > (modules still load successfully, just without debug info) especially > since the user has explicitly enabled a config saying they expect > mismatches. > > If there needs to be some logging in the "mismatch allowed" case, > could an acceptable middle ground be to use pr_warn_once to send a So it looks like pr_warn_ratelimited still produces a lot of warning. In this case, I guess pr_warn_once should be okay. > single message reporting that mismatches were detected & module BTF > debug info might be unavailable? Alternatively, if we could opt out of > module BTF loading then that would also avoid this issue, but that's > already been proposed before ([1], [2]) so I thought working with the > existing config option might be preferred. > > [1] https://lore.kernel.org/bpf/20220209052141.140063-1-connoro@google.com/ > [2] https://lore.kernel.org/bpf/20221004222725.2813510-1-sdf@google.com/
On Thu, Dec 8, 2022 at 10:01 AM Yonghong Song <yhs@meta.com> wrote: > > > > On 12/7/22 6:19 PM, Connor O'Brien wrote: > > On Wed, Nov 9, 2022 at 8:45 AM Yonghong Song <yhs@meta.com> wrote: > >> > >> > >> > >> On 11/8/22 6:41 PM, Connor O'Brien wrote: > >>> Enabling CONFIG_MODULE_ALLOW_BTF_MISMATCH is an indication that BTF > >>> mismatches are expected and module loading should proceed > >>> anyway. Logging with pr_warn() on every one of these "benign" > >>> mismatches creates unnecessary noise when many such modules are > >>> loaded. Instead, limit logging to the case where a BTF mismatch > >>> actually prevents a module form loading. > >>> > >>> Signed-off-by: Connor O'Brien <connoro@google.com> > >>> --- > >>> kernel/bpf/btf.c | 7 ++++--- > >>> 1 file changed, 4 insertions(+), 3 deletions(-) > >>> > >>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c > >>> index 5579ff3a5b54..406370487413 100644 > >>> --- a/kernel/bpf/btf.c > >>> +++ b/kernel/bpf/btf.c > >>> @@ -7190,11 +7190,12 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, > >>> } > >>> btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size); > >>> if (IS_ERR(btf)) { > >>> - pr_warn("failed to validate module [%s] BTF: %ld\n", > >>> - mod->name, PTR_ERR(btf)); > >> > >> I think such warning still useful even with > >> CONFIG_MODULE_ALLOW_BTF_MISMATCH. > >> Can we use pr_warn_ratelimited instead of pr_warn in the above to > >> avoid excessive warnings? > > > > I gave this a try on a Pixel 6 but I'm not sure it quite addresses the > > issue. The amount of logging doesn't seem to decrease much, I think > > because the interval between loading one mismatched module and the > > next can be greater than the default rate limit. To my mind, the issue > > is the total volume of these messages more so than their rate. > > > > For context, Android devices using the GKI may load hundreds of > > separately-built modules, and BTF mismatches are possible for any/all > > of these. It was pointed out to me that btf_verifier_log_type can also > > print several more lines per mismatched module. ~5 lines of logging > > for each mismatched module can start to add up, in terms of both > > overhead and the noise added to the kernel logs. > > > > This is more subjective but I think the warnings also read as though > > this is a more serious failure that might prevent affected modules > > from working correctly; anecdotally, I've gotten multiple questions > > about them asking if something is broken. This can be a red herring > > for anyone unfamiliar with BTF who is reading the logs to debug > > unrelated issues. In the CONFIG_MODULE_ALLOW_BTF_MISMATCH=y case the > > flood of warnings seems out of proportion to the actual result > > (modules still load successfully, just without debug info) especially > > since the user has explicitly enabled a config saying they expect > > mismatches. > > > > If there needs to be some logging in the "mismatch allowed" case, > > could an acceptable middle ground be to use pr_warn_once to send a > > So it looks like pr_warn_ratelimited still produces a lot of warning. > In this case, I guess pr_warn_once should be okay. Maybe pr_warn_once generic "some kernel module BTF mismatched". And also warn per-module message with details if CONFIG_MODULE_ALLOW_BTF_MISMATCH is not set? > > > single message reporting that mismatches were detected & module BTF > > debug info might be unavailable? Alternatively, if we could opt out of > > module BTF loading then that would also avoid this issue, but that's > > already been proposed before ([1], [2]) so I thought working with the > > existing config option might be preferred. > > > > [1] https://lore.kernel.org/bpf/20220209052141.140063-1-connoro@google.com/ > > [2] https://lore.kernel.org/bpf/20221004222725.2813510-1-sdf@google.com/
On Thu, Dec 8, 2022 at 4:03 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Thu, Dec 8, 2022 at 10:01 AM Yonghong Song <yhs@meta.com> wrote: > > > > > > > > On 12/7/22 6:19 PM, Connor O'Brien wrote: > > > On Wed, Nov 9, 2022 at 8:45 AM Yonghong Song <yhs@meta.com> wrote: > > >> > > >> > > >> > > >> On 11/8/22 6:41 PM, Connor O'Brien wrote: > > >>> Enabling CONFIG_MODULE_ALLOW_BTF_MISMATCH is an indication that BTF > > >>> mismatches are expected and module loading should proceed > > >>> anyway. Logging with pr_warn() on every one of these "benign" > > >>> mismatches creates unnecessary noise when many such modules are > > >>> loaded. Instead, limit logging to the case where a BTF mismatch > > >>> actually prevents a module form loading. > > >>> > > >>> Signed-off-by: Connor O'Brien <connoro@google.com> > > >>> --- > > >>> kernel/bpf/btf.c | 7 ++++--- > > >>> 1 file changed, 4 insertions(+), 3 deletions(-) > > >>> > > >>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c > > >>> index 5579ff3a5b54..406370487413 100644 > > >>> --- a/kernel/bpf/btf.c > > >>> +++ b/kernel/bpf/btf.c > > >>> @@ -7190,11 +7190,12 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, > > >>> } > > >>> btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size); > > >>> if (IS_ERR(btf)) { > > >>> - pr_warn("failed to validate module [%s] BTF: %ld\n", > > >>> - mod->name, PTR_ERR(btf)); > > >> > > >> I think such warning still useful even with > > >> CONFIG_MODULE_ALLOW_BTF_MISMATCH. > > >> Can we use pr_warn_ratelimited instead of pr_warn in the above to > > >> avoid excessive warnings? > > > > > > I gave this a try on a Pixel 6 but I'm not sure it quite addresses the > > > issue. The amount of logging doesn't seem to decrease much, I think > > > because the interval between loading one mismatched module and the > > > next can be greater than the default rate limit. To my mind, the issue > > > is the total volume of these messages more so than their rate. > > > > > > For context, Android devices using the GKI may load hundreds of > > > separately-built modules, and BTF mismatches are possible for any/all > > > of these. It was pointed out to me that btf_verifier_log_type can also > > > print several more lines per mismatched module. ~5 lines of logging > > > for each mismatched module can start to add up, in terms of both > > > overhead and the noise added to the kernel logs. > > > > > > This is more subjective but I think the warnings also read as though > > > this is a more serious failure that might prevent affected modules > > > from working correctly; anecdotally, I've gotten multiple questions > > > about them asking if something is broken. This can be a red herring > > > for anyone unfamiliar with BTF who is reading the logs to debug > > > unrelated issues. In the CONFIG_MODULE_ALLOW_BTF_MISMATCH=y case the > > > flood of warnings seems out of proportion to the actual result > > > (modules still load successfully, just without debug info) especially > > > since the user has explicitly enabled a config saying they expect > > > mismatches. > > > > > > If there needs to be some logging in the "mismatch allowed" case, > > > could an acceptable middle ground be to use pr_warn_once to send a > > > > So it looks like pr_warn_ratelimited still produces a lot of warning. > > In this case, I guess pr_warn_once should be okay. > > Maybe pr_warn_once generic "some kernel module BTF mismatched". And > also warn per-module message with details if > CONFIG_MODULE_ALLOW_BTF_MISMATCH is not set? > Yeah, I figured the logging behavior should remain unchanged when CONFIG_MODULE_ALLOW_BTF_MISMATCH is unset, the existing verbose logging makes sense when every mismatch stops a module from loading. Then for the pr_warn_once case something like "Kernel module BTF mismatch detected, BTF debug info may be unavailable for some modules" > > > > > single message reporting that mismatches were detected & module BTF > > > debug info might be unavailable? Alternatively, if we could opt out of > > > module BTF loading then that would also avoid this issue, but that's > > > already been proposed before ([1], [2]) so I thought working with the > > > existing config option might be preferred. > > > > > > [1] https://lore.kernel.org/bpf/20220209052141.140063-1-connoro@google.com/ > > > [2] https://lore.kernel.org/bpf/20221004222725.2813510-1-sdf@google.com/
On Thu, Dec 8, 2022 at 4:59 PM Connor O'Brien <connoro@google.com> wrote: > > On Thu, Dec 8, 2022 at 4:03 PM Andrii Nakryiko > <andrii.nakryiko@gmail.com> wrote: > > > > On Thu, Dec 8, 2022 at 10:01 AM Yonghong Song <yhs@meta.com> wrote: > > > > > > > > > > > > On 12/7/22 6:19 PM, Connor O'Brien wrote: > > > > On Wed, Nov 9, 2022 at 8:45 AM Yonghong Song <yhs@meta.com> wrote: > > > >> > > > >> > > > >> > > > >> On 11/8/22 6:41 PM, Connor O'Brien wrote: > > > >>> Enabling CONFIG_MODULE_ALLOW_BTF_MISMATCH is an indication that BTF > > > >>> mismatches are expected and module loading should proceed > > > >>> anyway. Logging with pr_warn() on every one of these "benign" > > > >>> mismatches creates unnecessary noise when many such modules are > > > >>> loaded. Instead, limit logging to the case where a BTF mismatch > > > >>> actually prevents a module form loading. > > > >>> > > > >>> Signed-off-by: Connor O'Brien <connoro@google.com> > > > >>> --- > > > >>> kernel/bpf/btf.c | 7 ++++--- > > > >>> 1 file changed, 4 insertions(+), 3 deletions(-) > > > >>> > > > >>> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c > > > >>> index 5579ff3a5b54..406370487413 100644 > > > >>> --- a/kernel/bpf/btf.c > > > >>> +++ b/kernel/bpf/btf.c > > > >>> @@ -7190,11 +7190,12 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, > > > >>> } > > > >>> btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size); > > > >>> if (IS_ERR(btf)) { > > > >>> - pr_warn("failed to validate module [%s] BTF: %ld\n", > > > >>> - mod->name, PTR_ERR(btf)); > > > >> > > > >> I think such warning still useful even with > > > >> CONFIG_MODULE_ALLOW_BTF_MISMATCH. > > > >> Can we use pr_warn_ratelimited instead of pr_warn in the above to > > > >> avoid excessive warnings? > > > > > > > > I gave this a try on a Pixel 6 but I'm not sure it quite addresses the > > > > issue. The amount of logging doesn't seem to decrease much, I think > > > > because the interval between loading one mismatched module and the > > > > next can be greater than the default rate limit. To my mind, the issue > > > > is the total volume of these messages more so than their rate. > > > > > > > > For context, Android devices using the GKI may load hundreds of > > > > separately-built modules, and BTF mismatches are possible for any/all > > > > of these. It was pointed out to me that btf_verifier_log_type can also > > > > print several more lines per mismatched module. ~5 lines of logging > > > > for each mismatched module can start to add up, in terms of both > > > > overhead and the noise added to the kernel logs. > > > > > > > > This is more subjective but I think the warnings also read as though > > > > this is a more serious failure that might prevent affected modules > > > > from working correctly; anecdotally, I've gotten multiple questions > > > > about them asking if something is broken. This can be a red herring > > > > for anyone unfamiliar with BTF who is reading the logs to debug > > > > unrelated issues. In the CONFIG_MODULE_ALLOW_BTF_MISMATCH=y case the > > > > flood of warnings seems out of proportion to the actual result > > > > (modules still load successfully, just without debug info) especially > > > > since the user has explicitly enabled a config saying they expect > > > > mismatches. > > > > > > > > If there needs to be some logging in the "mismatch allowed" case, > > > > could an acceptable middle ground be to use pr_warn_once to send a > > > > > > So it looks like pr_warn_ratelimited still produces a lot of warning. > > > In this case, I guess pr_warn_once should be okay. > > > > Maybe pr_warn_once generic "some kernel module BTF mismatched". And > > also warn per-module message with details if > > CONFIG_MODULE_ALLOW_BTF_MISMATCH is not set? > > > > Yeah, I figured the logging behavior should remain unchanged when > CONFIG_MODULE_ALLOW_BTF_MISMATCH is unset, the existing verbose > logging makes sense when every mismatch stops a module from loading. > > Then for the pr_warn_once case something like "Kernel module BTF > mismatch detected, BTF debug info may be unavailable for some modules" yep, makes sense to me > > > > > > > > > > single message reporting that mismatches were detected & module BTF > > > > debug info might be unavailable? Alternatively, if we could opt out of > > > > module BTF loading then that would also avoid this issue, but that's > > > > already been proposed before ([1], [2]) so I thought working with the > > > > existing config option might be preferred. > > > > > > > > [1] https://lore.kernel.org/bpf/20220209052141.140063-1-connoro@google.com/ > > > > [2] https://lore.kernel.org/bpf/20221004222725.2813510-1-sdf@google.com/
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index 5579ff3a5b54..406370487413 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -7190,11 +7190,12 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op, } btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size); if (IS_ERR(btf)) { - pr_warn("failed to validate module [%s] BTF: %ld\n", - mod->name, PTR_ERR(btf)); kfree(btf_mod); - if (!IS_ENABLED(CONFIG_MODULE_ALLOW_BTF_MISMATCH)) + if (!IS_ENABLED(CONFIG_MODULE_ALLOW_BTF_MISMATCH)) { + pr_warn("failed to validate module [%s] BTF: %ld\n", + mod->name, PTR_ERR(btf)); err = PTR_ERR(btf); + } goto out; } err = btf_alloc_id(btf);