Message ID | 20230209031159.2337445-1-ouyangweizhao@zeku.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp104560wrn; Wed, 8 Feb 2023 19:41:19 -0800 (PST) X-Google-Smtp-Source: AK7set/xrv3wHn3snKPHWunmzYDB7SLfEtXqCpPjrOzioxmDRsyLBZj46UCzA8ATtJcUQwV9ntUF X-Received: by 2002:a17:902:d2cd:b0:192:5a90:b047 with SMTP id n13-20020a170902d2cd00b001925a90b047mr10794123plc.3.1675914079613; Wed, 08 Feb 2023 19:41:19 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1675914079; cv=pass; d=google.com; s=arc-20160816; b=g6WxB41Pepev1ONMN6BebCluqVTZOmqFeHIj4RyfgMOBhQ81StcFpoGq01zHZgR6oF fOMLkptFJuqXTQEqgP82Bki0202ebNCroFd4S52rP9bIZ+3pOY39iRf2/4xrD4PhlXGj gvmr1ccex5IVhNNChD1aUTEYatbQOUTM/TqPnRS7y3ZEhTs22f7Wf3kMrA4RXsuzv1KN jm/ZMUehRRX54C8xdCaFKT1QpyVUrSmKmm3nWIL4s6rPGE3yaZbdEN5qCNhiy6QmYAl6 4YT5Ji6nIfbGPU3nG8XOXzUCr1o+IL/+mVHCDZ+VDrD4guoMc2QwdvrlCmtwRQ3dc91e bPmg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=CMsTfoXAVz7HPkrgNVxHHrE4a/fTKMFzP3+DIvN2VEI=; b=d1FzdkfcmrhcYe0Vh1f56Wqz5RhAbR7qzVlQ2c5SjfTQ1e6MRVEvC5SD4lxJv0G2zn RAXKdml1/qwU2Pe9kSSsl1mBjE5ubtubkVCWzeIJxugY/lXTlvkAIEw3X95jts8bZppp Te1Pf0Z7H5+kPPsp7o0cTOBhhCuKu+tzGihuhJJ3n+VWtwwjxKXs+CS4NW35LbFeyhLb 39FHJf8Q28NBNA14IPoPpumFQKn+ZybbQR3Jf15VHgtXxqQL2U4u7LfBkUVUnXb7EMb8 5WD+nAxh9pkncl7swQSTa9uXtrfcnY2qsgO0uvgUZcdeVLDIpHFj2H/X2spxIF3I/rqA Ds3g== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@zeku.com header.s=selector1 header.b=IJkFn9sK; arc=pass (i=1 spf=pass spfdomain=zeku.com dmarc=pass fromdomain=zeku.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zeku.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id u15-20020a17090341cf00b00193f8c6a020si612502ple.111.2023.02.08.19.41.06; Wed, 08 Feb 2023 19:41:19 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@zeku.com header.s=selector1 header.b=IJkFn9sK; arc=pass (i=1 spf=pass spfdomain=zeku.com dmarc=pass fromdomain=zeku.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=zeku.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232746AbjBID3K (ORCPT <rfc822;ivan.orlov0322@gmail.com> + 99 others); Wed, 8 Feb 2023 22:29:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60594 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232762AbjBID2m (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 8 Feb 2023 22:28:42 -0500 Received: from APC01-PSA-obe.outbound.protection.outlook.com (mail-psaapc01on2137.outbound.protection.outlook.com [40.107.255.137]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B38D515CB1 for <linux-kernel@vger.kernel.org>; Wed, 8 Feb 2023 19:27:39 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=j1sv53BuXfi62XjcaFBJDDvBM0U/wdOiZNHF9LOjdc5w9wdL9D378SNbTljqLzgFstQFJrX4Jk1ZrkyWkodqMQxd1JLksjfWq/KFedt5VMI/D3Q+XFKq7Fry/2+/D+aHoZ9sj34XoCvsBeOeE13E7bkAroMinYEvjbSXck241yT4NhEHP8sBQrPP4YSvtqhUV6VxW+ma/xT9C01gsU/D+SZ6/Vl/GEtBGeOOvxaUdi2sbHQnEHS6lhp6TecAQsoFBTU8UubLzPUzE2fzkavUruUNXNSI8m/rdHq2d4eyDK7hH+siGUaRwSCidOglFNs71jpne1p4x7fb17+vV1KQSA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=CMsTfoXAVz7HPkrgNVxHHrE4a/fTKMFzP3+DIvN2VEI=; b=XtnV/vTnrzZDVHRbMCNSvQ0qJ8DZcyEgl58FvxVQ3DPZlLQ8Cn6bRGl8ZZsdiAOj42tPNtc6dlUtoSGk4QZQxEn/XCQ+OnL8PJ0dFqRCjs5+1Ou1pMStiAn6Z9T6/ZZ5c2uClSWPmMC+U4vnoV0erlLN6U6t8/9fC+A+Zqf941YuNxxYtKGAaVp/OR00tbcOwB8jyoGsYb0Z6ymCpkCH00uUOScTf1GuBIl8tFD6P4Y2aoMm73N+kWN8z6s/M3atLvcvjRts294pgW9Hv2q7+Qf7UhfZP4F+yZnRtqxilIUtbThdtChjI+vJ2fKc15ZHEqCFQa8oVmCt4J8busF/TQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 103.192.253.182) smtp.rcpttodomain=gmail.com smtp.mailfrom=zeku.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=zeku.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=zeku.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=CMsTfoXAVz7HPkrgNVxHHrE4a/fTKMFzP3+DIvN2VEI=; b=IJkFn9sKeGGhLSBpchnYsIwRyUN6haT2wv4SuYaIX1KWpN+spGWVeiEJz0d42sIJZzaHulCx2y1YfAITYPy8WdbQPB63JvA4iQgkg6uy9Yig/IUUi1VdYix6lynVkNvtRLOEFZr3Rzy/+dfWgJkj93ZKiCUPQGOrZn0/6EqhHDroKBmAflt9ht9wZTPFMngGWam42e3Dbgb9JTSmKO961j31F1IpWFPVfYlafQPzVSlACgtOpQ5VWCkJeP+0IBD/mYxsH3NO7+2CuXNEuTS22xnNBb1s/OpQDfPjzeodwEHrE6xORvPlXvKwuXrPjOwVwF/cC32Aomy92rmU+fGRbA== Received: from SG2PR02CA0051.apcprd02.prod.outlook.com (2603:1096:4:54::15) by SG2PR02MB4347.apcprd02.prod.outlook.com (2603:1096:0:5::12) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6086.17; Thu, 9 Feb 2023 03:27:35 +0000 Received: from SG2APC01FT0026.eop-APC01.prod.protection.outlook.com (2603:1096:4:54:cafe::d0) by SG2PR02CA0051.outlook.office365.com (2603:1096:4:54::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6086.18 via Frontend Transport; Thu, 9 Feb 2023 03:27:35 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 103.192.253.182) smtp.mailfrom=zeku.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=zeku.com; Received-SPF: Pass (protection.outlook.com: domain of zeku.com designates 103.192.253.182 as permitted sender) receiver=protection.outlook.com; client-ip=103.192.253.182; helo=sh-exhtc2.internal.zeku.com; pr=C Received: from sh-exhtc2.internal.zeku.com (103.192.253.182) by SG2APC01FT0026.mail.protection.outlook.com (10.13.37.85) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6086.17 via Frontend Transport; Thu, 9 Feb 2023 03:27:34 +0000 Received: from sh-exhtc3.internal.zeku.com (10.123.154.250) by sh-exhtc2.internal.zeku.com (10.123.21.106) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.12; Thu, 9 Feb 2023 11:27:34 +0800 Received: from sh-exhtc1.internal.zeku.com (10.123.21.105) by sh-exhtc3.internal.zeku.com (10.123.154.250) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.2.986.5; Thu, 9 Feb 2023 11:27:33 +0800 Received: from localhost.localdomain (10.123.154.19) by sh-exhtc1.internal.zeku.com (10.123.21.105) with Microsoft SMTP Server id 15.1.2375.12 via Frontend Transport; Thu, 9 Feb 2023 11:27:33 +0800 From: Weizhao Ouyang <ouyangweizhao@zeku.com> To: Andrey Ryabinin <ryabinin.a.a@gmail.com>, Alexander Potapenko <glider@google.com>, Andrey Konovalov <andreyknvl@gmail.com>, Dmitry Vyukov <dvyukov@google.com>, Vincenzo Frascino <vincenzo.frascino@arm.com>, "Andrew Morton" <akpm@linux-foundation.org> CC: <kasan-dev@googlegroups.com>, <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>, Weizhao Ouyang <o451686892@gmail.com>, "Shuai Yuan" <yuanshuai@zeku.com>, Weizhao Ouyang <ouyangweizhao@zeku.com>, Peng Ren <renlipeng@zeku.com> Subject: [PATCH v2] kasan: fix deadlock in start_report() Date: Thu, 9 Feb 2023 11:11:59 +0800 Message-ID: <20230209031159.2337445-1-ouyangweizhao@zeku.com> X-Mailer: git-send-email 2.25.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: SG2APC01FT0026:EE_|SG2PR02MB4347:EE_ X-MS-Office365-Filtering-Correlation-Id: 5e1cfa4d-addf-44e7-25ac-08db0a4d961e X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: YPH2u0bp0ju5wCWJXcCtE/QYoZuaJu1/882wCqkNyaVht3NOrkfzFtc6hg5shXdZfVptF8XNfGh2h/I/VW/4LvIlKbKGCxeTVv6ucePatHtEpibCwdBvVohRDUd8LjHbhaMfPGZwRhUnhdzvljpqoYfjcZhJzxuKZLKNAt3xPZDp/8b25MRSjgOolpqWLTDDfi2b+E0Aue8OI06zykg+DSa6QiTJUaOp07I/Zktwv3rTBzQyD+gY2KJ7pmWe4/Odtj727BKp1FCC8MkWlOHNXNk16HOKi8lQaxu3HZzgwmpX79Claz0FD19ZSUjmqxi1fFRYWqXoafMiVfIqFJI2KjmIXcuU7UdEniQzHs7rxf35mufs8gig6AAPbUdm9pEcNn2aHq1cGkCpML6cUl0fHaxrciDo88ql7jyr34vfo67D/Tc/40Mx9Vsh3cRzWvHkhN1moBiH8m6RWG/DTldJiNt5k7tjmgZEgosMyfc4XMrBdlqUtA8FMFJC0jv1qi8zXUv7ptrHpMz0x6/rS63MWijiDihuL8P5KDjy9Nq/HqLa9KhRlHCLpvfmfemLJzxccsnS27j8t447DgZQTt7RnyHFVB6J+KlrsExxJojKvqdO5jESPXPfBzm9rXvPwej35IWxI+zXsb3Rv7OaU+EGW+1PezKq3egxmNeFTNEZN7mtIeXQBmkKLdj9ya0TGeat X-Forefront-Antispam-Report: CIP:103.192.253.182;CTRY:CN;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:sh-exhtc2.internal.zeku.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230025)(4636009)(346002)(396003)(376002)(39850400004)(136003)(451199018)(36840700001)(46966006)(70206006)(4326008)(70586007)(8676002)(336012)(36860700001)(316002)(83380400001)(110136005)(54906003)(86362001)(426003)(5660300002)(7416002)(81166007)(356005)(478600001)(1076003)(186003)(26005)(36756003)(6666004)(107886003)(2906002)(47076005)(82310400005)(41300700001)(8936002)(82740400003)(2616005)(40480700001)(36900700001);DIR:OUT;SFP:1102; X-OriginatorOrg: zeku.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Feb 2023 03:27:34.8968 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 5e1cfa4d-addf-44e7-25ac-08db0a4d961e X-MS-Exchange-CrossTenant-Id: 171aedba-f024-43df-bc82-290d40e185ac X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=171aedba-f024-43df-bc82-290d40e185ac;Ip=[103.192.253.182];Helo=[sh-exhtc2.internal.zeku.com] X-MS-Exchange-CrossTenant-AuthSource: SG2APC01FT0026.eop-APC01.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: SG2PR02MB4347 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1757321608893948504?= X-GMAIL-MSGID: =?utf-8?q?1757323281999533319?= |
Series |
[v2] kasan: fix deadlock in start_report()
|
|
Commit Message
Weizhao Ouyang
Feb. 9, 2023, 3:11 a.m. UTC
From: Weizhao Ouyang <o451686892@gmail.com> From: Shuai Yuan <yuanshuai@zeku.com> Calling start_report() again between start_report() and end_report() will result in a race issue for the report_lock. In extreme cases this problem arose in Kunit tests in the hardware tag-based Kasan mode. For example, when an invalid memory release problem is found, kasan_report_invalid_free() will print error log, but if an MTE exception is raised during the output log, the kasan_report() is called, resulting in a deadlock problem. The kasan_depth not protect it in hardware tag-based Kasan mode. Signed-off-by: Shuai Yuan <yuanshuai@zeku.com> Reviewed-by: Weizhao Ouyang <ouyangweizhao@zeku.com> Reviewed-by: Peng Ren <renlipeng@zeku.com> --- Changes in v2: -- remove redundant log mm/kasan/report.c | 25 ++++++++++++++++++++----- 1 file changed, 20 insertions(+), 5 deletions(-)
Comments
On Thu, 9 Feb 2023 at 04:27, Weizhao Ouyang <ouyangweizhao@zeku.com> wrote: > > From: Weizhao Ouyang <o451686892@gmail.com> > > From: Shuai Yuan <yuanshuai@zeku.com> > > Calling start_report() again between start_report() and end_report() > will result in a race issue for the report_lock. In extreme cases this > problem arose in Kunit tests in the hardware tag-based Kasan mode. > > For example, when an invalid memory release problem is found, > kasan_report_invalid_free() will print error log, but if an MTE exception > is raised during the output log, the kasan_report() is called, resulting > in a deadlock problem. The kasan_depth not protect it in hardware > tag-based Kasan mode. I think checking report_suppressed() would be cleaner and simpler than ignoring all trylock failures. If trylock fails, it does not mean that the current thread is holding it. We of course could do a custom lock which stores current->tid in the lock word, but it looks effectively equivalent to checking report_suppressed(). > Signed-off-by: Shuai Yuan <yuanshuai@zeku.com> > Reviewed-by: Weizhao Ouyang <ouyangweizhao@zeku.com> > Reviewed-by: Peng Ren <renlipeng@zeku.com> > --- > Changes in v2: > -- remove redundant log > > mm/kasan/report.c | 25 ++++++++++++++++++++----- > 1 file changed, 20 insertions(+), 5 deletions(-) > > diff --git a/mm/kasan/report.c b/mm/kasan/report.c > index 22598b20c7b7..aa39aa8b1855 100644 > --- a/mm/kasan/report.c > +++ b/mm/kasan/report.c > @@ -166,7 +166,7 @@ static inline void fail_non_kasan_kunit_test(void) { } > > static DEFINE_SPINLOCK(report_lock); > > -static void start_report(unsigned long *flags, bool sync) > +static bool start_report(unsigned long *flags, bool sync) > { > fail_non_kasan_kunit_test(); > /* Respect the /proc/sys/kernel/traceoff_on_warning interface. */ > @@ -175,8 +175,13 @@ static void start_report(unsigned long *flags, bool sync) > lockdep_off(); > /* Make sure we don't end up in loop. */ > kasan_disable_current(); > - spin_lock_irqsave(&report_lock, *flags); > + if (!spin_trylock_irqsave(&report_lock, *flags)) { > + lockdep_on(); > + kasan_enable_current(); > + return false; > + } > pr_err("==================================================================\n"); > + return true; > } > > static void end_report(unsigned long *flags, void *addr) > @@ -468,7 +473,10 @@ void kasan_report_invalid_free(void *ptr, unsigned long ip, enum kasan_report_ty > if (unlikely(!report_enabled())) > return; > > - start_report(&flags, true); > + if (!start_report(&flags, true)) { > + pr_err("%s: report ignore\n", __func__); > + return; > + } > > memset(&info, 0, sizeof(info)); > info.type = type; > @@ -503,7 +511,11 @@ bool kasan_report(unsigned long addr, size_t size, bool is_write, > goto out; > } > > - start_report(&irq_flags, true); > + if (!start_report(&irq_flags, true)) { > + ret = false; > + pr_err("%s: report ignore\n", __func__); > + goto out; > + } > > memset(&info, 0, sizeof(info)); > info.type = KASAN_REPORT_ACCESS; > @@ -536,7 +548,10 @@ void kasan_report_async(void) > if (unlikely(!report_enabled())) > return; > > - start_report(&flags, false); > + if (!start_report(&flags, false)) { > + pr_err("%s: report ignore\n", __func__); > + return; > + } > pr_err("BUG: KASAN: invalid-access\n"); > pr_err("Asynchronous fault: no details available\n"); > pr_err("\n"); > -- > 2.25.1 > > -- > You received this message because you are subscribed to the Google Groups "kasan-dev" group. > To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/20230209031159.2337445-1-ouyangweizhao%40zeku.com.
Hi Dmitry Vyukov Thanks, I see that your means. Currently, report_suppressed() seem not work in Kasan-HW mode, it always return false. Do you think should change the report_suppressed function? I don't know why CONFIG_KASAN_HW_TAGS was blocked separately before. -----邮件原件----- 发件人: Dmitry Vyukov <dvyukov@google.com> 发送时间: 2023年2月9日 16:56 收件人: 欧阳炜钊(Weizhao Ouyang) <ouyangweizhao@zeku.com> 抄送: Andrey Ryabinin <ryabinin.a.a@gmail.com>; Alexander Potapenko <glider@google.com>; Andrey Konovalov <andreyknvl@gmail.com>; Vincenzo Frascino <vincenzo.frascino@arm.com>; Andrew Morton <akpm@linux-foundation.org>; kasan-dev@googlegroups.com; linux-mm@kvack.org; linux-kernel@vger.kernel.org; Weizhao Ouyang <o451686892@gmail.com>; 袁帅(Shuai Yuan) <yuanshuai@zeku.com>; 任立鹏(Peng Ren) <renlipeng@zeku.com> 主题: Re: [PATCH v2] kasan: fix deadlock in start_report() On Thu, 9 Feb 2023 at 04:27, Weizhao Ouyang <ouyangweizhao@zeku.com> wrote: > > From: Weizhao Ouyang <o451686892@gmail.com> > > From: Shuai Yuan <yuanshuai@zeku.com> > > Calling start_report() again between start_report() and end_report() > will result in a race issue for the report_lock. In extreme cases this > problem arose in Kunit tests in the hardware tag-based Kasan mode. > > For example, when an invalid memory release problem is found, > kasan_report_invalid_free() will print error log, but if an MTE > exception is raised during the output log, the kasan_report() is > called, resulting in a deadlock problem. The kasan_depth not protect > it in hardware tag-based Kasan mode. I think checking report_suppressed() would be cleaner and simpler than ignoring all trylock failures. If trylock fails, it does not mean that the current thread is holding it. We of course could do a custom lock which stores current->tid in the lock word, but it looks effectively equivalent to checking report_suppressed(). > Signed-off-by: Shuai Yuan <yuanshuai@zeku.com> > Reviewed-by: Weizhao Ouyang <ouyangweizhao@zeku.com> > Reviewed-by: Peng Ren <renlipeng@zeku.com> > --- > Changes in v2: > -- remove redundant log > > mm/kasan/report.c | 25 ++++++++++++++++++++----- > 1 file changed, 20 insertions(+), 5 deletions(-) > > diff --git a/mm/kasan/report.c b/mm/kasan/report.c index > 22598b20c7b7..aa39aa8b1855 100644 > --- a/mm/kasan/report.c > +++ b/mm/kasan/report.c > @@ -166,7 +166,7 @@ static inline void fail_non_kasan_kunit_test(void) > { } > > static DEFINE_SPINLOCK(report_lock); > > -static void start_report(unsigned long *flags, bool sync) > +static bool start_report(unsigned long *flags, bool sync) > { > fail_non_kasan_kunit_test(); > /* Respect the /proc/sys/kernel/traceoff_on_warning interface. > */ @@ -175,8 +175,13 @@ static void start_report(unsigned long *flags, bool sync) > lockdep_off(); > /* Make sure we don't end up in loop. */ > kasan_disable_current(); > - spin_lock_irqsave(&report_lock, *flags); > + if (!spin_trylock_irqsave(&report_lock, *flags)) { > + lockdep_on(); > + kasan_enable_current(); > + return false; > + } > > pr_err("============================================================== > ====\n"); > + return true; > } > > static void end_report(unsigned long *flags, void *addr) @@ -468,7 > +473,10 @@ void kasan_report_invalid_free(void *ptr, unsigned long ip, enum kasan_report_ty > if (unlikely(!report_enabled())) > return; > > - start_report(&flags, true); > + if (!start_report(&flags, true)) { > + pr_err("%s: report ignore\n", __func__); > + return; > + } > > memset(&info, 0, sizeof(info)); > info.type = type; > @@ -503,7 +511,11 @@ bool kasan_report(unsigned long addr, size_t size, bool is_write, > goto out; > } > > - start_report(&irq_flags, true); > + if (!start_report(&irq_flags, true)) { > + ret = false; > + pr_err("%s: report ignore\n", __func__); > + goto out; > + } > > memset(&info, 0, sizeof(info)); > info.type = KASAN_REPORT_ACCESS; @@ -536,7 +548,10 @@ void > kasan_report_async(void) > if (unlikely(!report_enabled())) > return; > > - start_report(&flags, false); > + if (!start_report(&flags, false)) { > + pr_err("%s: report ignore\n", __func__); > + return; > + } > pr_err("BUG: KASAN: invalid-access\n"); > pr_err("Asynchronous fault: no details available\n"); > pr_err("\n"); > -- > 2.25.1 > > -- > You received this message because you are subscribed to the Google Groups "kasan-dev" group. > To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/20230209031159.2337445-1-ouyangweizhao%40zeku.com. ZEKU 信息安全声明:本邮件包含信息归发件人所在组织ZEKU所有。 禁止任何人在未经授权的情况下以任何形式(包括但不限于全部或部分披露、复制或传播)使用包含的信息。若您错收了本邮件,请立即电话或邮件通知发件人,并删除本邮件及附件。 Information Security Notice: The information contained in this mail is solely property of the sender's organization ZEKU. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email immediately and delete it.
aOn Thu, 9 Feb 2023 at 10:19, 袁帅(Shuai Yuan) <yuanshuai@zeku.com> wrote: > > Hi Dmitry Vyukov > > Thanks, I see that your means. > > Currently, report_suppressed() seem not work in Kasan-HW mode, it always return false. > Do you think should change the report_suppressed function? > I don't know why CONFIG_KASAN_HW_TAGS was blocked separately before. That logic was added by Andrey in: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c068664c97c7cf Andrey, can we make report_enabled() check current->kasan_depth and remove report_suppressed()? Then we can also remove the comment in kasan_report_invalid_free(). It looks like kasan_disable_current() in kmemleak needs to affect HW_TAGS mode as well: https://elixir.bootlin.com/linux/v6.2-rc7/source/mm/kmemleak.c#L301 So overall it looks like simplifications and it will fix what Shuai reported. > -----邮件原件----- > 发件人: Dmitry Vyukov <dvyukov@google.com> > 发送时间: 2023年2月9日 16:56 > 收件人: 欧阳炜钊(Weizhao Ouyang) <ouyangweizhao@zeku.com> > 抄送: Andrey Ryabinin <ryabinin.a.a@gmail.com>; Alexander Potapenko <glider@google.com>; Andrey Konovalov <andreyknvl@gmail.com>; Vincenzo Frascino <vincenzo.frascino@arm.com>; Andrew Morton <akpm@linux-foundation.org>; kasan-dev@googlegroups.com; linux-mm@kvack.org; linux-kernel@vger.kernel.org; Weizhao Ouyang <o451686892@gmail.com>; 袁帅(Shuai Yuan) <yuanshuai@zeku.com>; 任立鹏(Peng Ren) <renlipeng@zeku.com> > 主题: Re: [PATCH v2] kasan: fix deadlock in start_report() > > On Thu, 9 Feb 2023 at 04:27, Weizhao Ouyang <ouyangweizhao@zeku.com> wrote: > > > > From: Weizhao Ouyang <o451686892@gmail.com> > > > > From: Shuai Yuan <yuanshuai@zeku.com> > > > > Calling start_report() again between start_report() and end_report() > > will result in a race issue for the report_lock. In extreme cases this > > problem arose in Kunit tests in the hardware tag-based Kasan mode. > > > > For example, when an invalid memory release problem is found, > > kasan_report_invalid_free() will print error log, but if an MTE > > exception is raised during the output log, the kasan_report() is > > called, resulting in a deadlock problem. The kasan_depth not protect > > it in hardware tag-based Kasan mode. > > I think checking report_suppressed() would be cleaner and simpler than ignoring all trylock failures. If trylock fails, it does not mean that the current thread is holding it. We of course could do a custom lock which stores current->tid in the lock word, but it looks effectively equivalent to checking report_suppressed(). > > > > > Signed-off-by: Shuai Yuan <yuanshuai@zeku.com> > > Reviewed-by: Weizhao Ouyang <ouyangweizhao@zeku.com> > > Reviewed-by: Peng Ren <renlipeng@zeku.com> > > --- > > Changes in v2: > > -- remove redundant log > > > > mm/kasan/report.c | 25 ++++++++++++++++++++----- > > 1 file changed, 20 insertions(+), 5 deletions(-) > > > > diff --git a/mm/kasan/report.c b/mm/kasan/report.c index > > 22598b20c7b7..aa39aa8b1855 100644 > > --- a/mm/kasan/report.c > > +++ b/mm/kasan/report.c > > @@ -166,7 +166,7 @@ static inline void fail_non_kasan_kunit_test(void) > > { } > > > > static DEFINE_SPINLOCK(report_lock); > > > > -static void start_report(unsigned long *flags, bool sync) > > +static bool start_report(unsigned long *flags, bool sync) > > { > > fail_non_kasan_kunit_test(); > > /* Respect the /proc/sys/kernel/traceoff_on_warning interface. > > */ @@ -175,8 +175,13 @@ static void start_report(unsigned long *flags, bool sync) > > lockdep_off(); > > /* Make sure we don't end up in loop. */ > > kasan_disable_current(); > > - spin_lock_irqsave(&report_lock, *flags); > > + if (!spin_trylock_irqsave(&report_lock, *flags)) { > > + lockdep_on(); > > + kasan_enable_current(); > > + return false; > > + } > > > > pr_err("============================================================== > > ====\n"); > > + return true; > > } > > > > static void end_report(unsigned long *flags, void *addr) @@ -468,7 > > +473,10 @@ void kasan_report_invalid_free(void *ptr, unsigned long ip, enum kasan_report_ty > > if (unlikely(!report_enabled())) > > return; > > > > - start_report(&flags, true); > > + if (!start_report(&flags, true)) { > > + pr_err("%s: report ignore\n", __func__); > > + return; > > + } > > > > memset(&info, 0, sizeof(info)); > > info.type = type; > > @@ -503,7 +511,11 @@ bool kasan_report(unsigned long addr, size_t size, bool is_write, > > goto out; > > } > > > > - start_report(&irq_flags, true); > > + if (!start_report(&irq_flags, true)) { > > + ret = false; > > + pr_err("%s: report ignore\n", __func__); > > + goto out; > > + } > > > > memset(&info, 0, sizeof(info)); > > info.type = KASAN_REPORT_ACCESS; @@ -536,7 +548,10 @@ void > > kasan_report_async(void) > > if (unlikely(!report_enabled())) > > return; > > > > - start_report(&flags, false); > > + if (!start_report(&flags, false)) { > > + pr_err("%s: report ignore\n", __func__); > > + return; > > + } > > pr_err("BUG: KASAN: invalid-access\n"); > > pr_err("Asynchronous fault: no details available\n"); > > pr_err("\n"); > > -- > > 2.25.1 > > > > -- > > You received this message because you are subscribed to the Google Groups "kasan-dev" group. > > To unsubscribe from this group and stop receiving emails from it, send an email to kasan-dev+unsubscribe@googlegroups.com. > > To view this discussion on the web visit https://groups.google.com/d/msgid/kasan-dev/20230209031159.2337445-1-ouyangweizhao%40zeku.com. > ZEKU > 信息安全声明:本邮件包含信息归发件人所在组织ZEKU所有。 禁止任何人在未经授权的情况下以任何形式(包括但不限于全部或部分披露、复制或传播)使用包含的信息。若您错收了本邮件,请立即电话或邮件通知发件人,并删除本邮件及附件。 > Information Security Notice: The information contained in this mail is solely property of the sender's organization ZEKU. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email immediately and delete it.
On Thu, Feb 9, 2023 at 11:44 AM Dmitry Vyukov <dvyukov@google.com> wrote: > > On Thu, 9 Feb 2023 at 10:19, 袁帅(Shuai Yuan) <yuanshuai@zeku.com> wrote: > > > > Hi Dmitry Vyukov > > > > Thanks, I see that your means. > > > > Currently, report_suppressed() seem not work in Kasan-HW mode, it always return false. > > Do you think should change the report_suppressed function? > > I don't know why CONFIG_KASAN_HW_TAGS was blocked separately before. > > That logic was added by Andrey in: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c068664c97c7cf > > Andrey, can we make report_enabled() check current->kasan_depth and > remove report_suppressed()? I decided to not use kasan_depth for HW_TAGS, as we can always use a match-all tag to make "invalid" memory accesses. I think we can fix the reporting code to do exactly that so that it doesn't cause MTE faults. Shuai, could you clarify, at which point due kasan_report_invalid_free an MTE exception is raised in your tests? > Then we can also remove the comment in kasan_report_invalid_free(). > > It looks like kasan_disable_current() in kmemleak needs to affect > HW_TAGS mode as well: > https://elixir.bootlin.com/linux/v6.2-rc7/source/mm/kmemleak.c#L301 It uses kasan_reset_tag, so it should work properly with HW_TAGS.
On Friday, February 10, 2023 at 6:54 AM Andrey Konovalov <andreyknvl@gmail.com> wrote: > On Thu, Feb 9, 2023 at 11:44 AM Dmitry Vyukov <dvyukov@google.com> > wrote: > > > > On Thu, 9 Feb 2023 at 10:19, 袁帅(Shuai Yuan) <yuanshuai@zeku.com> > wrote: > > > > > > Hi Dmitry Vyukov > > > > > > Thanks, I see that your means. > > > > > > Currently, report_suppressed() seem not work in Kasan-HW mode, it > always return false. > > > Do you think should change the report_suppressed function? > > > I don't know why CONFIG_KASAN_HW_TAGS was blocked separately > before. > > > > That logic was added by Andrey in: > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/com > > mit/?id=c068664c97c7cf > > > > Andrey, can we make report_enabled() check current->kasan_depth and > > remove report_suppressed()? > > I decided to not use kasan_depth for HW_TAGS, as we can always use a > match-all tag to make "invalid" memory accesses. > > I think we can fix the reporting code to do exactly that so that it doesn't > cause MTE faults. > > Shuai, could you clarify, at which point due kasan_report_invalid_free an > MTE exception is raised in your tests? Yes, I need some time to clarify this problem with a clear log by test. > > Then we can also remove the comment in kasan_report_invalid_free(). > > > > It looks like kasan_disable_current() in kmemleak needs to affect > > HW_TAGS mode as well: > > https://elixir.bootlin.com/linux/v6.2-rc7/source/mm/kmemleak.c#L301 > > It uses kasan_reset_tag, so it should work properly with HW_TAGS. ZEKU 信息安全声明:本邮件包含信息归发件人所在组织ZEKU所有。 禁止任何人在未经授权的情况下以任何形式(包括但不限于全部或部分披露、复制或传播)使用包含的信息。若您错收了本邮件,请立即电话或邮件通知发件人,并删除本邮件及附件。 Information Security Notice: The information contained in this mail is solely property of the sender's organization ZEKU. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email immediately and delete it.
> On Friday, February 10, 2023 at 6:54 AM Andrey Konovalov > <andreyknvl@gmail.com> > wrote: > > On Thu, Feb 9, 2023 at 11:44 AM Dmitry Vyukov <dvyukov@google.com> > > wrote: > > > > > > On Thu, 9 Feb 2023 at 10:19, 袁帅(Shuai Yuan) <yuanshuai@zeku.com> > > wrote: > > > > > > > > Hi Dmitry Vyukov > > > > > > > > Thanks, I see that your means. > > > > > > > > Currently, report_suppressed() seem not work in Kasan-HW mode, it > > always return false. > > > > Do you think should change the report_suppressed function? > > > > I don't know why CONFIG_KASAN_HW_TAGS was blocked separately > > before. > > > > > > That logic was added by Andrey in: > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/c > > > om > > > mit/?id=c068664c97c7cf > > > > > > Andrey, can we make report_enabled() check current->kasan_depth and > > > remove report_suppressed()? > > > > I decided to not use kasan_depth for HW_TAGS, as we can always use a > > match-all tag to make "invalid" memory accesses. > > > > I think we can fix the reporting code to do exactly that so that it > > doesn't cause MTE faults. > > > > Shuai, could you clarify, at which point due kasan_report_invalid_free > > an MTE exception is raised in your tests? > > Yes, I need some time to clarify this problem with a clear log by test. > Hi Andrey and Dmitry I have got valid information to clarify the problem and solutions. I made a few changes to the code to do this. a) I was testing on a device that had hardware issues with MTE, and the memory tag sometimes changed randomly. b) I did this test on kernel version 5.15, but this problem should exist on the latest kernel version from a code perspective. c) Run the kernel with a single core by "maxcpus=1". d) Code modify, (1) Call dump_stack_lvl(KERN_ERR) when start_report() returns false, this is done based on the current patch v2. (2) Add some log in print_address_description() to show kmem_cache address and memory tag. https://elixir.bootlin.com/linux/v5.15.94/source/mm/kasan/report.c#L252 @@ -255,24 +260,25 @@ static void print_address_description(void *addr, u8 tag) dump_stack_lvl(KERN_ERR); pr_err("\n"); - +pr_err("ys:1\n"); if (page && PageSlab(page)) { struct kmem_cache *cache = page->slab_cache; -void *object = nearest_obj(cache, page,addr); +void *object = NULL; +pr_err("ys:cache start %llx, mtag:%x, page_address:%llx\n", +cache, hw_get_mem_tag(cache), page_address(page)); +object = nearest_obj(cache, page, addr); + pr_err("ys:cache end %llx, object %llx, page_address:%llx\n", + cache, object, page_address(page)); describe_object(cache, object, addr, tag); } (3) Add kasan_enable_tagging() to KUNIT_EXPECT_KASAN_FAIL in https://elixir.bootlin.com/linux/v5.15.94/source/lib/test_kasan.c#L94 This ensures that kunit is tested on this unstable device. e) With the above modification we can get the backtrace: ys:1 ys:cache start f4ffff8140005380, mtag:fe, page_address:ffffff8140328000 ys:cache change f4ffff8140005380, mtag:fe, page_address:ffffff8140328000 ys: error address:f4ffff8140005398 Pointer tag: [f4], memory tag: [fe] CPU: 0 PID: 100 Comm: kunit_try_catch Tainted: Call trace: dump_backtrace.cfi_jt+0x0/0x8 show_stack+0x28/0x38 dump_stack_lvl+0x68/0x98 __kasan_report+0x110/0x29c kasan_report+0x40/0x8c __do_kernel_fault+0xd4/0x2c4 do_bad_area+0x40/0x100 do_tag_check_fault+0x2c/0x40 do_mem_abort+0x74/0x138 el1_abort+0x40/0x64 el1h_64_sync_handler+0x60/0xa0 el1h_64_sync+0x7c/0x80 print_address_description+0x154/0x2e8 __kasan_report+0x200/0x29c kasan_report+0x40/0x8c __do_kernel_fault+0xd4/0x2c4 do_bad_area+0x40/0x100 do_tag_check_fault+0x2c/0x40 do_mem_abort+0x74/0x138 el1_abort+0x40/0x64 el1h_64_sync_handler+0x60/0xa0 el1h_64_sync+0x7c/0x80 enqueue_entity+0x23c/0x4b8 enqueue_task_fair+0x13c/0x48c enqueue_task.llvm.1684042887774774428+0xd0/0x250 __do_set_cpus_allowed+0x1ac/0x304 __set_cpus_allowed_ptr_locked+0x168/0x28c migrate_enable+0xf0/0x17c kasan_strings+0x59c/0x72c kunit_try_run_case+0x84/0x128 kunit_generic_run_threadfn_adapter+0x48/0x80 kthread+0x17c/0x1e8 ret_from_fork+0x10/0x20 ys:cache end f4ffff8140005380, object ffffff814032ca00, page_address:ffffff8140328000 f) From the above log, you can see that the system tried to call kasan_report() twice, because we visit tag address by kmem_cache and this tag have change.. Normally this doesn't happen easily. So I think we can add kasan_reset_tag() to handle the kmem_cache address. For example, the following changes are used for the latest kernel version. diff --git a/mm/kasan/report.c b/mm/kasan/report.c --- a/mm/kasan/report.c +++ b/mm/kasan/report.c @@ -412,7 +412,7 @@ static void complete_report_info(struct kasan_report_info *info) slab = kasan_addr_to_slab(addr); if (slab) { - info->cache = slab->slab_cache; + info->cache = kasan_reset_tag(slab->slab_cache); info->object = nearest_obj(info->cache, slab, addr); I have tested Kernel5.15 using a similar approach and it seems to work. On the other hand, I think there should be other solutions and hope to get your feedback. Thanks a lot. > > > Then we can also remove the comment in kasan_report_invalid_free(). > > > > > > It looks like kasan_disable_current() in kmemleak needs to affect > > > HW_TAGS mode as well: > > > https://elixir.bootlin.com/linux/v6.2-rc7/source/mm/kmemleak.c#L301 > > > > It uses kasan_reset_tag, so it should work properly with HW_TAGS. ZEKU 信息安全声明:本邮件包含信息归发件人所在组织ZEKU所有。 禁止任何人在未经授权的情况下以任何形式(包括但不限于全部或部分披露、复制或传播)使用包含的信息。若您错收了本邮件,请立即电话或邮件通知发件人,并删除本邮件及附件。 Information Security Notice: The information contained in this mail is solely property of the sender's organization ZEKU. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email immediately and delete it.
On Wed, Feb 15, 2023 at 2:22 PM 袁帅(Shuai Yuan) <yuanshuai@zeku.com> wrote: > > I have got valid information to clarify the problem and solutions. I made > a few changes to the code to do this. > > a) I was testing on a device that had hardware issues with MTE, > and the memory tag sometimes changed randomly. Ah, I see. Faulty hardware explains the problem. Thank you! > f) From the above log, you can see that the system tried to call kasan_report() twice, > because we visit tag address by kmem_cache and this tag have change.. > Normally this doesn't happen easily. So I think we can add kasan_reset_tag() to handle > the kmem_cache address. > > For example, the following changes are used for the latest kernel version. > diff --git a/mm/kasan/report.c b/mm/kasan/report.c > --- a/mm/kasan/report.c > +++ b/mm/kasan/report.c > @@ -412,7 +412,7 @@ static void complete_report_info(struct kasan_report_info *info) > slab = kasan_addr_to_slab(addr); > if (slab) { > - info->cache = slab->slab_cache; > + info->cache = kasan_reset_tag(slab->slab_cache); This fixes the problem for accesses to slab_cache, but KASAN reporting code also accesses stack depot memory and calls other routines that might access (faulty) tagged memory. And the accessed addresses aren't exposed to KASAN code, so we can't use kasan_reset_tag for those. I wonder what would be a good solution here. I really don't want to use kasan_depth or some other global/per-cpu flag here, as it would be too good of a target for attackers wishing to bypass MTE. Perhaps, disabling MTE once reporting started would be a better option: calling the disabling routine would arguably be a harder task for an attacker than overwriting a flag. +Catalin, would it be acceptable to implement a routine that disables in-kernel MTE tag checking (until the next mte_enable_kernel_sync/async/asymm call)? In a similar way an MTE fault does this, but without the fault itself. I.e., expose the part of do_tag_recovery functionality without report_tag_fault? TL;DR on the problem: Besides relying on CPU tag checks, KASAN also does explicit tag checks to detect double-frees and similar problems, see the calls to kasan_report_invalid_free. Thus, when e.g. a double-free report is printed, MTE checking is still on. This results in a deadlock in case invalid memory is accessed during KASAN reporting. Thanks!
On Mon, Feb 27, 2023 at 03:13:45AM +0100, Andrey Konovalov wrote: > +Catalin, would it be acceptable to implement a routine that disables > in-kernel MTE tag checking (until the next > mte_enable_kernel_sync/async/asymm call)? In a similar way an MTE > fault does this, but without the fault itself. I.e., expose the part > of do_tag_recovery functionality without report_tag_fault? I don't think we ever re-enable MTE after do_tag_recovery(). The mte_enable_kernel_*() are called at boot. We do call kasan_enable_tagging() explicitly in the kunit tests but that's a controlled fault environment. IIUC, the problem is that the kernel already got an MTE fault, so at that point the error is not really recoverable. If we want to avoid a fault in the first place, we could do something like __uaccess_enable_tco() (Vincenzo has some patches to generalise these routines) but if an MTE fault already triggered and MTE is to stay disabled after the reporting anyway, I don't think it's worth it. So I wonder whether it's easier to just disable MTE before calling report_tag_fault() so that it won't trigger additional faults: diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index f4cb0f85ccf4..1449d2bc6f10 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -329,8 +329,6 @@ static void do_tag_recovery(unsigned long addr, unsigned long esr, struct pt_regs *regs) { - report_tag_fault(addr, esr, regs); - /* * Disable MTE Tag Checking on the local CPU for the current EL. * It will be done lazily on the other CPUs when they will hit a @@ -339,6 +337,8 @@ static void do_tag_recovery(unsigned long addr, unsigned long esr, sysreg_clear_set(sctlr_el1, SCTLR_EL1_TCF_MASK, SYS_FIELD_PREP_ENUM(SCTLR_EL1, TCF, NONE)); isb(); + + report_tag_fault(addr, esr, regs); } static bool is_el1_mte_sync_tag_check_fault(unsigned long esr)
On Tue, Feb 28, 2023 at 5:09 PM Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Mon, Feb 27, 2023 at 03:13:45AM +0100, Andrey Konovalov wrote: > > +Catalin, would it be acceptable to implement a routine that disables > > in-kernel MTE tag checking (until the next > > mte_enable_kernel_sync/async/asymm call)? In a similar way an MTE > > fault does this, but without the fault itself. I.e., expose the part > > of do_tag_recovery functionality without report_tag_fault? > > I don't think we ever re-enable MTE after do_tag_recovery(). The > mte_enable_kernel_*() are called at boot. We do call > kasan_enable_tagging() explicitly in the kunit tests but that's a > controlled fault environment. Right, but here we don't want to re-enable MTE after a fault, we want to suppress faults when printing an error report. > IIUC, the problem is that the kernel already got an MTE fault, so at > that point the error is not really recoverable. No, the problem is with the following sequence of events: 1. KASAN detects a memory corruption and starts printing a report _without getting an MTE fault_. This happens when e.g. KASAN sees a free of an invalid address. 2. During error reporting, an MTE fault is triggered by the error reporting code. E.g. while collecting information about the accessed slab object. 3. KASAN tries to print another report while printing a report and goes into a deadlock. If we could avoid MTE faults being triggered during error reporting, this would solve the problem. > If we want to avoid a > fault in the first place, we could do something like > __uaccess_enable_tco() (Vincenzo has some patches to generalise these > routines) Ah, this looks exactly like what we need. Adding __uaccess_en/disable_tco to kasan_report_invalid_free solves the problem. Do you think it would be possible to expose these routines to KASAN? > but if an MTE fault already triggered and MTE is to stay > disabled after the reporting anyway, I don't think it's worth it. No MTE fault is triggered yet in the described sequence of events. > So I wonder whether it's easier to just disable MTE before calling > report_tag_fault() so that it won't trigger additional faults: This will only help in case the first error report is caused by an MTE fault. However, this won't help with the discussed problem: KASAN can detect a memory corruption and print a report without getting an MTE fault. Nevertheless, this change makes sense to avoid a similar scenario involving 2 MTE faults.
On Tue, Feb 28, 2023 at 10:50:46PM +0100, Andrey Konovalov wrote: > On Tue, Feb 28, 2023 at 5:09 PM Catalin Marinas <catalin.marinas@arm.com> wrote: > > On Mon, Feb 27, 2023 at 03:13:45AM +0100, Andrey Konovalov wrote: > > > +Catalin, would it be acceptable to implement a routine that disables > > > in-kernel MTE tag checking (until the next > > > mte_enable_kernel_sync/async/asymm call)? In a similar way an MTE > > > fault does this, but without the fault itself. I.e., expose the part > > > of do_tag_recovery functionality without report_tag_fault? > > > > I don't think we ever re-enable MTE after do_tag_recovery(). The > > mte_enable_kernel_*() are called at boot. We do call > > kasan_enable_tagging() explicitly in the kunit tests but that's a > > controlled fault environment. > > Right, but here we don't want to re-enable MTE after a fault, we want > to suppress faults when printing an error report. > > > IIUC, the problem is that the kernel already got an MTE fault, so at > > that point the error is not really recoverable. > > No, the problem is with the following sequence of events: > > 1. KASAN detects a memory corruption and starts printing a report > _without getting an MTE fault_. This happens when e.g. KASAN sees a > free of an invalid address. > > 2. During error reporting, an MTE fault is triggered by the error > reporting code. E.g. while collecting information about the accessed > slab object. > > 3. KASAN tries to print another report while printing a report and > goes into a deadlock. > > If we could avoid MTE faults being triggered during error reporting, > this would solve the problem. Ah, I get it now. So we just want to avoid triggering a benign MTE fault. > > If we want to avoid a > > fault in the first place, we could do something like > > __uaccess_enable_tco() (Vincenzo has some patches to generalise these > > routines) > > Ah, this looks exactly like what we need. Adding > __uaccess_en/disable_tco to kasan_report_invalid_free solves the > problem. > > Do you think it would be possible to expose these routines to KASAN? Yes. I'm including Vincenzo's patch below (part of fixing some potential strscpy() faults with its unaligned accesses eager reading; we'll get to posting that eventually). You can add some arch_kasan_enable/disable() macros on top and feel free to include the patch below. Now, I wonder whether we should link those into kasan_disable_current(). These functions only deal with the depth for KASAN_SW_TAGS but it would make sense for KASAN_HW_TAGS to enable tag-check-override so that we don't need to bother with a match-all tags on pointer dereferencing. ----8<---------------------------- From 0dcfc84d8b984001219cc3c9eaf698c26286624c Mon Sep 17 00:00:00 2001 From: Vincenzo Frascino <vincenzo.frascino@arm.com> Date: Thu, 13 Oct 2022 07:46:23 +0100 Subject: [PATCH] arm64: mte: Rename TCO routines The TCO related routines are used in uaccess methods and load_unaligned_zeropad() but are unrelated to both even if the naming suggest otherwise. Improve the readability of the code moving the away from uaccess.h and pre-pending them with "mte". Cc: Will Deacon <will@kernel.org> Signed-off-by: Vincenzo Frascino <vincenzo.frascino@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> --- arch/arm64/include/asm/mte-kasan.h | 81 +++++++++++++++++++++++++ arch/arm64/include/asm/mte.h | 12 ---- arch/arm64/include/asm/uaccess.h | 66 +++----------------- arch/arm64/include/asm/word-at-a-time.h | 4 +- 4 files changed, 93 insertions(+), 70 deletions(-) diff --git a/arch/arm64/include/asm/mte-kasan.h b/arch/arm64/include/asm/mte-kasan.h index 9f79425fc65a..598be32ed811 100644 --- a/arch/arm64/include/asm/mte-kasan.h +++ b/arch/arm64/include/asm/mte-kasan.h @@ -13,8 +13,73 @@ #include <linux/types.h> +#ifdef CONFIG_KASAN_HW_TAGS + +/* Whether the MTE asynchronous mode is enabled. */ +DECLARE_STATIC_KEY_FALSE(mte_async_or_asymm_mode); + +static inline bool system_uses_mte_async_or_asymm_mode(void) +{ + return static_branch_unlikely(&mte_async_or_asymm_mode); +} + +#else /* CONFIG_KASAN_HW_TAGS */ + +static inline bool system_uses_mte_async_or_asymm_mode(void) +{ + return false; +} + +#endif /* CONFIG_KASAN_HW_TAGS */ + #ifdef CONFIG_ARM64_MTE +/* + * The Tag Check Flag (TCF) mode for MTE is per EL, hence TCF0 + * affects EL0 and TCF affects EL1 irrespective of which TTBR is + * used. + * The kernel accesses TTBR0 usually with LDTR/STTR instructions + * when UAO is available, so these would act as EL0 accesses using + * TCF0. + * However futex.h code uses exclusives which would be executed as + * EL1, this can potentially cause a tag check fault even if the + * user disables TCF0. + * + * To address the problem we set the PSTATE.TCO bit in uaccess_enable() + * and reset it in uaccess_disable(). + * + * The Tag check override (TCO) bit disables temporarily the tag checking + * preventing the issue. + */ +static inline void __mte_disable_tco(void) +{ + asm volatile(ALTERNATIVE("nop", SET_PSTATE_TCO(0), + ARM64_MTE, CONFIG_KASAN_HW_TAGS)); +} + +static inline void __mte_enable_tco(void) +{ + asm volatile(ALTERNATIVE("nop", SET_PSTATE_TCO(1), + ARM64_MTE, CONFIG_KASAN_HW_TAGS)); +} + +/* + * These functions disable tag checking only if in MTE async mode + * since the sync mode generates exceptions synchronously and the + * nofault or load_unaligned_zeropad can handle them. + */ +static inline void __mte_disable_tco_async(void) +{ + if (system_uses_mte_async_or_asymm_mode()) + __mte_disable_tco(); +} + +static inline void __mte_enable_tco_async(void) +{ + if (system_uses_mte_async_or_asymm_mode()) + __mte_enable_tco(); +} + /* * These functions are meant to be only used from KASAN runtime through * the arch_*() interface defined in asm/memory.h. @@ -138,6 +203,22 @@ void mte_enable_kernel_asymm(void); #else /* CONFIG_ARM64_MTE */ +static inline void __mte_disable_tco(void) +{ +} + +static inline void __mte_enable_tco(void) +{ +} + +static inline void __mte_disable_tco_async(void) +{ +} + +static inline void __mte_enable_tco_async(void) +{ +} + static inline u8 mte_get_ptr_tag(void *ptr) { return 0xFF; diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h index 20dd06d70af5..c028afb1cd0b 100644 --- a/arch/arm64/include/asm/mte.h +++ b/arch/arm64/include/asm/mte.h @@ -178,14 +178,6 @@ static inline void mte_disable_tco_entry(struct task_struct *task) } #ifdef CONFIG_KASAN_HW_TAGS -/* Whether the MTE asynchronous mode is enabled. */ -DECLARE_STATIC_KEY_FALSE(mte_async_or_asymm_mode); - -static inline bool system_uses_mte_async_or_asymm_mode(void) -{ - return static_branch_unlikely(&mte_async_or_asymm_mode); -} - void mte_check_tfsr_el1(void); static inline void mte_check_tfsr_entry(void) @@ -212,10 +204,6 @@ static inline void mte_check_tfsr_exit(void) mte_check_tfsr_el1(); } #else -static inline bool system_uses_mte_async_or_asymm_mode(void) -{ - return false; -} static inline void mte_check_tfsr_el1(void) { } diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h index 5c7b2f9d5913..057ec1882326 100644 --- a/arch/arm64/include/asm/uaccess.h +++ b/arch/arm64/include/asm/uaccess.h @@ -136,55 +136,9 @@ static inline void __uaccess_enable_hw_pan(void) CONFIG_ARM64_PAN)); } -/* - * The Tag Check Flag (TCF) mode for MTE is per EL, hence TCF0 - * affects EL0 and TCF affects EL1 irrespective of which TTBR is - * used. - * The kernel accesses TTBR0 usually with LDTR/STTR instructions - * when UAO is available, so these would act as EL0 accesses using - * TCF0. - * However futex.h code uses exclusives which would be executed as - * EL1, this can potentially cause a tag check fault even if the - * user disables TCF0. - * - * To address the problem we set the PSTATE.TCO bit in uaccess_enable() - * and reset it in uaccess_disable(). - * - * The Tag check override (TCO) bit disables temporarily the tag checking - * preventing the issue. - */ -static inline void __uaccess_disable_tco(void) -{ - asm volatile(ALTERNATIVE("nop", SET_PSTATE_TCO(0), - ARM64_MTE, CONFIG_KASAN_HW_TAGS)); -} - -static inline void __uaccess_enable_tco(void) -{ - asm volatile(ALTERNATIVE("nop", SET_PSTATE_TCO(1), - ARM64_MTE, CONFIG_KASAN_HW_TAGS)); -} - -/* - * These functions disable tag checking only if in MTE async mode - * since the sync mode generates exceptions synchronously and the - * nofault or load_unaligned_zeropad can handle them. - */ -static inline void __uaccess_disable_tco_async(void) -{ - if (system_uses_mte_async_or_asymm_mode()) - __uaccess_disable_tco(); -} - -static inline void __uaccess_enable_tco_async(void) -{ - if (system_uses_mte_async_or_asymm_mode()) - __uaccess_enable_tco(); -} - static inline void uaccess_disable_privileged(void) { - __uaccess_disable_tco(); + __mte_disable_tco(); if (uaccess_ttbr0_disable()) return; @@ -194,7 +148,7 @@ static inline void uaccess_disable_privileged(void) static inline void uaccess_enable_privileged(void) { - __uaccess_enable_tco(); + __mte_enable_tco(); if (uaccess_ttbr0_enable()) return; @@ -302,8 +256,8 @@ do { \ #define get_user __get_user /* - * We must not call into the scheduler between __uaccess_enable_tco_async() and - * __uaccess_disable_tco_async(). As `dst` and `src` may contain blocking + * We must not call into the scheduler between __mte_enable_tco_async() and + * __mte_disable_tco_async(). As `dst` and `src` may contain blocking * functions, we must evaluate these outside of the critical section. */ #define __get_kernel_nofault(dst, src, type, err_label) \ @@ -312,10 +266,10 @@ do { \ __typeof__(src) __gkn_src = (src); \ int __gkn_err = 0; \ \ - __uaccess_enable_tco_async(); \ + __mte_enable_tco_async(); \ __raw_get_mem("ldr", *((type *)(__gkn_dst)), \ (__force type *)(__gkn_src), __gkn_err, K); \ - __uaccess_disable_tco_async(); \ + __mte_disable_tco_async(); \ \ if (unlikely(__gkn_err)) \ goto err_label; \ @@ -388,8 +342,8 @@ do { \ #define put_user __put_user /* - * We must not call into the scheduler between __uaccess_enable_tco_async() and - * __uaccess_disable_tco_async(). As `dst` and `src` may contain blocking + * We must not call into the scheduler between __mte_enable_tco_async() and + * __mte_disable_tco_async(). As `dst` and `src` may contain blocking * functions, we must evaluate these outside of the critical section. */ #define __put_kernel_nofault(dst, src, type, err_label) \ @@ -398,10 +352,10 @@ do { \ __typeof__(src) __pkn_src = (src); \ int __pkn_err = 0; \ \ - __uaccess_enable_tco_async(); \ + __mte_enable_tco_async(); \ __raw_put_mem("str", *((type *)(__pkn_src)), \ (__force type *)(__pkn_dst), __pkn_err, K); \ - __uaccess_disable_tco_async(); \ + __mte_disable_tco_async(); \ \ if (unlikely(__pkn_err)) \ goto err_label; \ diff --git a/arch/arm64/include/asm/word-at-a-time.h b/arch/arm64/include/asm/word-at-a-time.h index 1c8e4f2490bf..f3b151ed0d7a 100644 --- a/arch/arm64/include/asm/word-at-a-time.h +++ b/arch/arm64/include/asm/word-at-a-time.h @@ -55,7 +55,7 @@ static inline unsigned long load_unaligned_zeropad(const void *addr) { unsigned long ret; - __uaccess_enable_tco_async(); + __mte_enable_tco_async(); /* Load word from unaligned pointer addr */ asm( @@ -65,7 +65,7 @@ static inline unsigned long load_unaligned_zeropad(const void *addr) : "=&r" (ret) : "r" (addr), "Q" (*(unsigned long *)addr)); - __uaccess_disable_tco_async(); + __mte_disable_tco_async(); return ret; }
> On Tue, Feb 28, 2023 at 10:50:46PM +0100, Andrey Konovalov wrote: > > On Tue, Feb 28, 2023 at 5:09 PM Catalin Marinas <catalin.marinas@arm.com> > > > > Right, but here we don't want to re-enable MTE after a fault, we want > > to suppress faults when printing an error report. > > > > > IIUC, the problem is that the kernel already got an MTE fault, so at > > > that point the error is not really recoverable. > > > > No, the problem is with the following sequence of events: > > > > 1. KASAN detects a memory corruption and starts printing a report > > _without getting an MTE fault_. This happens when e.g. KASAN sees a > > free of an invalid address. > > > > 2. During error reporting, an MTE fault is triggered by the error > > reporting code. E.g. while collecting information about the accessed > > slab object. > > > > 3. KASAN tries to print another report while printing a report and > > goes into a deadlock. > > > > If we could avoid MTE faults being triggered during error reporting, > > this would solve the problem. > > Ah, I get it now. So we just want to avoid triggering a benign MTE fault. > > > > If we want to avoid a > > > fault in the first place, we could do something like > > > __uaccess_enable_tco() (Vincenzo has some patches to generalise > > > these > > > routines) > > > > Ah, this looks exactly like what we need. Adding > > __uaccess_en/disable_tco to kasan_report_invalid_free solves the > > problem. > > > > Do you think it would be possible to expose these routines to KASAN? > > Yes. I'm including Vincenzo's patch below (part of fixing some potential > strscpy() faults with its unaligned accesses eager reading; we'll get to posting > that eventually). You can add some arch_kasan_enable/disable() macros on > top and feel free to include the patch below. I have initially verified the following code on kernel version 5.15 and it is valid. Although not using the latest interface, there is no fundamental difference. I think this change should also apply to the latest kernel code. diff --git a/mm/kasan/report.c b/mm/kasan/report.c index 1f96a72c7edd..73b7fc532d81 100644 --- a/mm/kasan/report.c +++ b/mm/kasan/report.c @@ -28,7 +28,9 @@ #include <trace/events/error_report.h> #include <asm/sections.h> - +#ifdef CONFIG_KASAN_HW_TAGS +#include <asm/uaccess.h> +#endif #include <kunit/test.h> #include "kasan.h" @@ -107,6 +109,10 @@ static void start_report(unsigned long *flags) */ kasan_disable_current(); spin_lock_irqsave(&report_lock, *flags); +#ifdef CONFIG_KASAN_HW_TAGS +if (kasan_hw_tags_enabled()) +__uaccess_enable_tco(); +#endif pr_err("==================================================================\n"); } @@ -116,6 +122,10 @@ static void end_report(unsigned long *flags, unsigned long addr) trace_error_report_end(ERROR_DETECTOR_KASAN, addr); pr_err("==================================================================\n"); add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE); +#ifdef CONFIG_KASAN_HW_TAGS +if (kasan_hw_tags_enabled()) +__uaccess_disable_tco(); +#endif spin_unlock_irqrestore(&report_lock, *flags); if (panic_on_warn && !test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags)) { /* > Now, I wonder whether we should link those into kasan_disable_current(). > These functions only deal with the depth for KASAN_SW_TAGS but it would > make sense for KASAN_HW_TAGS to enable tag-check-override so that we > don't need to bother with a match-all tags on pointer dereferencing. ZEKU 信息安全声明:本邮件包含信息归发件人所在组织ZEKU所有。 禁止任何人在未经授权的情况下以任何形式(包括但不限于全部或部分披露、复制或传播)使用包含的信息。若您错收了本邮件,请立即电话或邮件通知发件人,并删除本邮件及附件。 Information Security Notice: The information contained in this mail is solely property of the sender's organization ZEKU. Any use of the information contained herein in any way (including, but not limited to, total or partial disclosure, reproduction, or dissemination) by persons other than the intended recipient(s) is prohibited. If you receive this email in error, please notify the sender by phone or email immediately and delete it.
On Wed, Mar 1, 2023 at 6:00 PM Catalin Marinas <catalin.marinas@arm.com> wrote: > > Yes. I'm including Vincenzo's patch below (part of fixing some potential > strscpy() faults with its unaligned accesses eager reading; we'll get to > posting that eventually). You can add some arch_kasan_enable/disable() > macros on top and feel free to include the patch below. Ah, perfect! I'll send a patchset soon. Thanks! > Now, I wonder whether we should link those into kasan_disable_current(). > These functions only deal with the depth for KASAN_SW_TAGS but it would > make sense for KASAN_HW_TAGS to enable tag-check-override so that we > don't need to bother with a match-all tags on pointer dereferencing. Using these TCO routines requires having (at least) migration disabled, right? It's not a problem for KASAN reporting code, as it already disables preemption anyway. The question is with the other kasan_disable/enable_current() call sites. But as within all of them, the code does either a single access or a memcpy or something similar, I think we can disable preemption for that duration. On a related note, I recalled that we also have a bug about using supporting no_sanitize_address for HW_TAGS KASAN. And Peter suggested using TCO entry/exit instrumentation to resolve it [2]. However, we will also need to disable preemption for the duration of no_sanitize_address-annotated functions, and I'm not sure if it's a good idea to do that via compiler instrumentation. Any thoughts? In the mean time, I'll send a simpler patchset without converting all kasan_disable/enable_current(). [1] https://bugzilla.kernel.org/show_bug.cgi?id=212513 [2] https://bugzilla.kernel.org/show_bug.cgi?id=212513#c2
On Sat, Mar 11, 2023 at 12:42:20AM +0100, Andrey Konovalov wrote: > On Wed, Mar 1, 2023 at 6:00 PM Catalin Marinas <catalin.marinas@arm.com> wrote: > > Yes. I'm including Vincenzo's patch below (part of fixing some potential > > strscpy() faults with its unaligned accesses eager reading; we'll get to > > posting that eventually). You can add some arch_kasan_enable/disable() > > macros on top and feel free to include the patch below. > > Ah, perfect! I'll send a patchset soon. Thanks! > > > Now, I wonder whether we should link those into kasan_disable_current(). > > These functions only deal with the depth for KASAN_SW_TAGS but it would > > make sense for KASAN_HW_TAGS to enable tag-check-override so that we > > don't need to bother with a match-all tags on pointer dereferencing. > > Using these TCO routines requires having (at least) migration disabled, right? Not necessarily. The TCO is set per CPU and disabling preemption (I don't think migration is sufficient) would work but these routines are also called on a uaccess fault path, so it needs to be preemptible. We used to clear TCO on exception entry prior to commit 38ddf7dafaea ("arm64: mte: avoid clearing PSTATE.TCO on entry unless necessary") but we restore it anyway on exception return. I think the only problem is if between these routines, we invoke cond_resched() directly. Not sure what the kasan code does but disabling preemption should avoid a reschedule. Another option is for mte_thread_switch() to context switch the TCO state.
diff --git a/mm/kasan/report.c b/mm/kasan/report.c index 22598b20c7b7..aa39aa8b1855 100644 --- a/mm/kasan/report.c +++ b/mm/kasan/report.c @@ -166,7 +166,7 @@ static inline void fail_non_kasan_kunit_test(void) { } static DEFINE_SPINLOCK(report_lock); -static void start_report(unsigned long *flags, bool sync) +static bool start_report(unsigned long *flags, bool sync) { fail_non_kasan_kunit_test(); /* Respect the /proc/sys/kernel/traceoff_on_warning interface. */ @@ -175,8 +175,13 @@ static void start_report(unsigned long *flags, bool sync) lockdep_off(); /* Make sure we don't end up in loop. */ kasan_disable_current(); - spin_lock_irqsave(&report_lock, *flags); + if (!spin_trylock_irqsave(&report_lock, *flags)) { + lockdep_on(); + kasan_enable_current(); + return false; + } pr_err("==================================================================\n"); + return true; } static void end_report(unsigned long *flags, void *addr) @@ -468,7 +473,10 @@ void kasan_report_invalid_free(void *ptr, unsigned long ip, enum kasan_report_ty if (unlikely(!report_enabled())) return; - start_report(&flags, true); + if (!start_report(&flags, true)) { + pr_err("%s: report ignore\n", __func__); + return; + } memset(&info, 0, sizeof(info)); info.type = type; @@ -503,7 +511,11 @@ bool kasan_report(unsigned long addr, size_t size, bool is_write, goto out; } - start_report(&irq_flags, true); + if (!start_report(&irq_flags, true)) { + ret = false; + pr_err("%s: report ignore\n", __func__); + goto out; + } memset(&info, 0, sizeof(info)); info.type = KASAN_REPORT_ACCESS; @@ -536,7 +548,10 @@ void kasan_report_async(void) if (unlikely(!report_enabled())) return; - start_report(&flags, false); + if (!start_report(&flags, false)) { + pr_err("%s: report ignore\n", __func__); + return; + } pr_err("BUG: KASAN: invalid-access\n"); pr_err("Asynchronous fault: no details available\n"); pr_err("\n");