Message ID | 20240131121401.3898735-1-zhaoyang.huang@unisoc.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-46407-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7301:2087:b0:106:209c:c626 with SMTP id gs7csp1843183dyb; Wed, 31 Jan 2024 04:15:00 -0800 (PST) X-Google-Smtp-Source: AGHT+IHdjycGfmQhSf4LYVvfZ4Nl5mM5WVxENtYxFUKbJKuTWGTYPkooALrh0KvKYf1oHOrg7mBT X-Received: by 2002:a17:906:a8f:b0:a35:b30a:6098 with SMTP id y15-20020a1709060a8f00b00a35b30a6098mr1015455ejf.71.1706703300489; Wed, 31 Jan 2024 04:15:00 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1706703300; cv=pass; d=google.com; s=arc-20160816; b=AudXUq4ate9ksVVXrSutUBoZEV174PyhFmMYqvkXHoutw4B3aE1uPjLAt63xKv+aTf 6nmj7dhyhkbw75f2fWw9p89KABSmi75DLutyPKbxJh0wS8EERdsw//yk78R6fQgDCOg5 TZvpTMtmfa2HaoPtuieBAcYMjaesvqq89m6nBG8zym0QeNwSvaynakm/01VXe+0N5fnm 0C4Rpdwm9lI82IK0hYuMaVG72lgV9FmQ1OO2i2EjhLz1aTmZ5Z/I5LPCEZt/Wm7Dgx2c 91Jb+jhpSzBWtxrSfLJ9Xj+Iy4VZdbfv1a01TbxGTDjvfT0A8NAn/5Y0oPk9efgSf0KA RMJw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:message-id:date:subject:to:from; bh=ca1LZ9NVnWU9DOwnduM6bPX+CJZzbjgKLF5HdzROB78=; fh=QkF7zwdUFPaMJ3KvyCuqHP9Ak25br7f8fdXfkp3+hSw=; b=ti4ts4zHei91gESa0ZnOmy6FKoPSr/C+OnCkta8HCnWtlD/V91hKi8iKhA+ba+PG/x SkjY6qgJRA4dkuGUaTfY687m9zci54Wk2ssKnn080wkCR9Ly431Zq70MHZ1YcbXiuAwE Xz3DI63055w35Dteve+7rndC2S45Lt0ViGLaHqgK3tDEW1q8UUJhkX6qXJ97wvktIzOl Iue99mywiBko6cYWwNE1LILPgDe7T/2QRr+T/OvW4Hal3JaM7jcY59CO7uNhnwqtxNkB NYQeepeI3576J9oxMlLO0h2uP9r4BhbI+s8nce7uYUGaMV/bIvJ1jQ28ALPy6lYfqZIy jZ6Q==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1 spf=pass spfdomain=unisoc.com); spf=pass (google.com: domain of linux-kernel+bounces-46407-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-46407-ouuuleilei=gmail.com@vger.kernel.org" X-Forwarded-Encrypted: i=1; AJvYcCVXTs5bGLkcZQIy7vYdkl9N+xIvgptuPuF7+uCWpiCZcyWxOfyTKLbvC/CoEBxqoQpmY7hHBsqe5bbBOBU+rgfVhVRkdA== Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [2604:1380:4601:e00::3]) by mx.google.com with ESMTPS id gt16-20020a170906f21000b00a35ddad87d3si2512339ejb.617.2024.01.31.04.15.00 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 31 Jan 2024 04:15:00 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-46407-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) client-ip=2604:1380:4601:e00::3; Authentication-Results: mx.google.com; arc=pass (i=1 spf=pass spfdomain=unisoc.com); spf=pass (google.com: domain of linux-kernel+bounces-46407-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:4601:e00::3 as permitted sender) smtp.mailfrom="linux-kernel+bounces-46407-ouuuleilei=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 1E55F1F216BB for <ouuuleilei@gmail.com>; Wed, 31 Jan 2024 12:15:00 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 4B67278688; Wed, 31 Jan 2024 12:14:49 +0000 (UTC) Received: from SHSQR01.spreadtrum.com (mx1.unisoc.com [222.66.158.135]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DE4F978676 for <linux-kernel@vger.kernel.org>; Wed, 31 Jan 2024 12:14:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=222.66.158.135 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706703282; cv=none; b=aCUb3oPvy4Yku5po3Znpb93ilDhq6s9pj3bwGMrg2G5/V4XlboJCq5UlGUkYBYonaOtFRGY+MRaeTq/zmgpg3sBL0lnmFdLtSFSMZwD8nOHh5kTon3BY//3La0HyPg07/0ot8uizTBvDAskmiF6OjxcTg386Tw3Kl9zugzduTqE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706703282; c=relaxed/simple; bh=Q0wV+TnItOkEHfIiIhxUFssbDbFUQK4VKdx1sKcggrI=; h=From:To:Subject:Date:Message-ID:MIME-Version:Content-Type; b=cq1KQty6iK0cmY9e6arFPTlpGafWMR2uUkgVjz+XUkZgB05EEM4WQa+Nt/yzKaJZ55gRMvJNCnU9LKY7rEFXZ7kb0sJp9gFJ/QGwnrHz66hWlUdUGeQP4BrSrbTXcK9WprLPHrh2VivrPoMema46blEWDUOwUaamxRlB8wK6ias= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=unisoc.com; spf=pass smtp.mailfrom=unisoc.com; arc=none smtp.client-ip=222.66.158.135 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=unisoc.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=unisoc.com Received: from dlp.unisoc.com ([10.29.3.86]) by SHSQR01.spreadtrum.com with ESMTP id 40VCE6du013504; Wed, 31 Jan 2024 20:14:06 +0800 (+08) (envelope-from zhaoyang.huang@unisoc.com) Received: from SHDLP.spreadtrum.com (bjmbx01.spreadtrum.com [10.0.64.7]) by dlp.unisoc.com (SkyGuard) with ESMTPS id 4TQ13T1Q8Jz2SjW7v; Wed, 31 Jan 2024 20:06:29 +0800 (CST) Received: from bj03382pcu01.spreadtrum.com (10.0.73.40) by BJMBX01.spreadtrum.com (10.0.64.7) with Microsoft SMTP Server (TLS) id 15.0.1497.23; Wed, 31 Jan 2024 20:14:04 +0800 From: "zhaoyang.huang" <zhaoyang.huang@unisoc.com> To: Damien Le Moal <dlemoal@kernel.org>, Niklas Cassel <niklas.cassel@wdc.com>, "Martin K . Petersen" <martin.petersen@oracle.com>, Hannes Reinecke <hare@suse.de>, <linux-kernel@vger.kernel.org>, Zhaoyang Huang <huangzhaoyang@gmail.com>, <steve.kang@unisoc.com> Subject: [PATCH 1/1] block: print warning when invalid domain set to ioprio Date: Wed, 31 Jan 2024 20:14:01 +0800 Message-ID: <20240131121401.3898735-1-zhaoyang.huang@unisoc.com> X-Mailer: git-send-email 2.25.1 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain X-ClientProxiedBy: SHCAS01.spreadtrum.com (10.0.1.201) To BJMBX01.spreadtrum.com (10.0.64.7) X-MAIL: SHSQR01.spreadtrum.com 40VCE6du013504 X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1789608119988426047 X-GMAIL-MSGID: 1789608119988426047 |
Series |
[1/1] block: print warning when invalid domain set to ioprio
|
|
Commit Message
zhaoyang.huang
Jan. 31, 2024, 12:14 p.m. UTC
From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> Since few caller check IOPRIO_PRIO_VALUE's return value and bio_set_prio is a open macro to set bio->ioprio directly.It is confused for the developer who run across kernel panic[1] but can find nothing in previous kernel log. Add a pr_err here to dump the information. [1] Here is the kernel panic I run across which caused by a out of bounds check introduced by CONFIG_FOTIFY_SOURCE. [exception_serialno]: [exception_kernel_version]: [exception_reboot_reason]: kernel_crash [exception_panic_reason]: UBSAN: array index out of bounds: Fatal exception [exception_time]: 1970-01-01_08-00-23 [exception_file_info]: not-bugon [exception_task_id]: 409 [exception_task_family]: [f2fs_ckpt-254:4, 409][kthreadd, 2] [exception_pc_symbol]: [<ffffffc080736974>] dd_request_merge+0x100/0x110 [exception_stack_info]: [<ffffffc07a27e274>] get_exception_stack_info+0x124/0x2d8 [sysdump]gc [<ffffffc07a27e670>] prepare_exception_info+0x158/0x1d4 [sysdump]gc [<ffffffc07a280128>] sysdump_panic_event+0x5d8/0x748 [sysdump]gc [<ffffffc08016a508>] notifier_call_chain+0x98/0x17cgc [<ffffffc08016a9b4>] atomic_notifier_call_chain+0x44/0x68gc [<ffffffc0810f0eb4>] panic+0x194/0x37cgc [<ffffffc0800a638c>] die+0x300/0x310gc [<ffffffc0800a77e8>] ubsan_handler+0x34/0x4cgc [<ffffffc0800960a8>] brk_handler+0x9c/0x11cgc [<ffffffc0800bf998>] do_debug_exception+0xb0/0x140gc [<ffffffc0810f8bf0>] el1_dbg+0x58/0x74gc [<ffffffc0810f89f4>] el1h_64_sync_handler+0x3c/0x90gc [<ffffffc080091298>] el1h_64_sync+0x68/0x6cgc [<ffffffc080736974>] dd_request_merge+0x100/0x110gc //out of bound here caused by the value of class transferred from ioprio [<ffffffc080707f28>] elv_merge+0x248/0x270gc [<ffffffc0807146e8>] blk_mq_sched_try_merge+0x4c/0x20cgc [<ffffffc080736824>] dd_bio_merge+0x64/0xb4gc [<ffffffc080723e3c>] blk_mq_sched_bio_merge+0x68/0x144gc [<ffffffc08071b944>] blk_mq_submit_bio+0x2e8/0x6c0gc [<ffffffc08070dd3c>] __submit_bio+0xbc/0x1b0gc [<ffffffc08070c440>] submit_bio_noacct_nocheck+0xe4/0x2f0gc [<ffffffc08070c8e4>] submit_bio_noacct+0x298/0x3d8gc [<ffffffc08070caf8>] submit_bio+0xd4/0xf0gc [<ffffffc080642644>] f2fs_submit_write_bio+0xcc/0x49cgc [<ffffffc0806442d4>] __submit_merged_bio+0x48/0x13cgc [<ffffffc080641de4>] __submit_merged_write_cond+0x18c/0x1f8gc [<ffffffc080641c4c>] f2fs_submit_merged_write+0x2c/0x38 [<ffffffc080655724>] f2fs_sync_node_pages+0x6e0/0x740gc [<ffffffc08063946c>] f2fs_write_checkpoint+0x4c0/0x97cgc [<ffffffc08063b37c>] __checkpoint_and_complete_reqs+0x88/0x248gc [<ffffffc08063ad70>] issue_checkpoint_thread+0x94/0xf4gc [<ffffffc080167c20>] kthread+0x110/0x1b8gc Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> --- include/uapi/linux/ioprio.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
Comments
On Wed, Jan 31, 2024 at 08:14:01PM +0800, zhaoyang.huang wrote: > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > Since few caller check IOPRIO_PRIO_VALUE's return value and bio_set_prio > is a open macro to set bio->ioprio directly.It is confused for the developer > who run across kernel panic[1] but can find nothing in previous kernel log. > Add a pr_err here to dump the information. > > [1] > Here is the kernel panic I run across which caused by a out of bounds check > introduced by CONFIG_FOTIFY_SOURCE. > > [exception_serialno]: > [exception_kernel_version]: > [exception_reboot_reason]: kernel_crash > [exception_panic_reason]: UBSAN: array index out of bounds: Fatal exception > [exception_time]: 1970-01-01_08-00-23 > [exception_file_info]: not-bugon > [exception_task_id]: 409 > [exception_task_family]: [f2fs_ckpt-254:4, 409][kthreadd, 2] > [exception_pc_symbol]: [<ffffffc080736974>] dd_request_merge+0x100/0x110 > [exception_stack_info]: [<ffffffc07a27e274>] get_exception_stack_info+0x124/0x2d8 [sysdump]gc > [<ffffffc07a27e670>] prepare_exception_info+0x158/0x1d4 [sysdump]gc > [<ffffffc07a280128>] sysdump_panic_event+0x5d8/0x748 [sysdump]gc > [<ffffffc08016a508>] notifier_call_chain+0x98/0x17cgc > [<ffffffc08016a9b4>] atomic_notifier_call_chain+0x44/0x68gc > [<ffffffc0810f0eb4>] panic+0x194/0x37cgc > [<ffffffc0800a638c>] die+0x300/0x310gc > [<ffffffc0800a77e8>] ubsan_handler+0x34/0x4cgc > [<ffffffc0800960a8>] brk_handler+0x9c/0x11cgc > [<ffffffc0800bf998>] do_debug_exception+0xb0/0x140gc > [<ffffffc0810f8bf0>] el1_dbg+0x58/0x74gc > [<ffffffc0810f89f4>] el1h_64_sync_handler+0x3c/0x90gc > [<ffffffc080091298>] el1h_64_sync+0x68/0x6cgc > [<ffffffc080736974>] dd_request_merge+0x100/0x110gc //out of bound > here caused by the value of class transferred from ioprio > [<ffffffc080707f28>] elv_merge+0x248/0x270gc > [<ffffffc0807146e8>] blk_mq_sched_try_merge+0x4c/0x20cgc > [<ffffffc080736824>] dd_bio_merge+0x64/0xb4gc > [<ffffffc080723e3c>] blk_mq_sched_bio_merge+0x68/0x144gc > [<ffffffc08071b944>] blk_mq_submit_bio+0x2e8/0x6c0gc > [<ffffffc08070dd3c>] __submit_bio+0xbc/0x1b0gc > [<ffffffc08070c440>] submit_bio_noacct_nocheck+0xe4/0x2f0gc > [<ffffffc08070c8e4>] submit_bio_noacct+0x298/0x3d8gc > [<ffffffc08070caf8>] submit_bio+0xd4/0xf0gc > [<ffffffc080642644>] f2fs_submit_write_bio+0xcc/0x49cgc > [<ffffffc0806442d4>] __submit_merged_bio+0x48/0x13cgc > [<ffffffc080641de4>] __submit_merged_write_cond+0x18c/0x1f8gc > [<ffffffc080641c4c>] f2fs_submit_merged_write+0x2c/0x38 > [<ffffffc080655724>] f2fs_sync_node_pages+0x6e0/0x740gc > [<ffffffc08063946c>] f2fs_write_checkpoint+0x4c0/0x97cgc > [<ffffffc08063b37c>] __checkpoint_and_complete_reqs+0x88/0x248gc > [<ffffffc08063ad70>] issue_checkpoint_thread+0x94/0xf4gc > [<ffffffc080167c20>] kthread+0x110/0x1b8gc > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > --- > include/uapi/linux/ioprio.h | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h > index bee2bdb0eedb..73c420a0df72 100644 > --- a/include/uapi/linux/ioprio.h > +++ b/include/uapi/linux/ioprio.h > @@ -112,8 +112,11 @@ static __always_inline __u16 ioprio_value(int prioclass, int priolevel, > { > if (IOPRIO_BAD_VALUE(prioclass, IOPRIO_NR_CLASSES) || > IOPRIO_BAD_VALUE(priolevel, IOPRIO_NR_LEVELS) || > - IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) > + IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) { > + pr_err("%s: get a invalid domain in class %d, level %d, hint %d\n", > + __func__, prioclass, priolevel, priohint); > return IOPRIO_CLASS_INVALID << IOPRIO_CLASS_SHIFT; > + } > > return (prioclass << IOPRIO_CLASS_SHIFT) | > (priohint << IOPRIO_HINT_SHIFT) | priolevel; > -- > 2.25.1 > Adding linux-block to CC. pr_err() is a kernel function for printing. ioprio_value() is a function in a uapi header, so this function will be used by user space programs. There is a reason: $ git grep pr_err include/uapi/ Gives no results. I think you should fix mq-deadline instead. It looks like the problem comes from: ioprio_value() will set class to IOPRIO_CLASS_INVALID (value 7), if the user specified an class/level/hint that was invalid. ioprio_class_to_prio[] array in mq-deadline.c does currently not have an entry in to translate IOPRIO_CLASS_INVALID (7) to a valid DD_*_PRIO value. Although, why does this I/O even reach the scheduler, shouldn't this I/O get rejected even earlier? Both io_uring and libaio will call ioprio_check_cap(), which should fail the I/O before it reaches the I/O scheduler, but in your case, you are submitting the I/O from the filesystem. Should we perhaps add a call to ioprio_check_cap() or similar in some path used to submit I/O by filesystems? Kind regards, Niklas
On Wed, Jan 31, 2024 at 01:52:51PM +0100, Niklas Cassel wrote: > On Wed, Jan 31, 2024 at 08:14:01PM +0800, zhaoyang.huang wrote: > > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > > > Since few caller check IOPRIO_PRIO_VALUE's return value and bio_set_prio > > is a open macro to set bio->ioprio directly.It is confused for the developer > > who run across kernel panic[1] but can find nothing in previous kernel log. > > Add a pr_err here to dump the information. > > > > [1] > > Here is the kernel panic I run across which caused by a out of bounds check > > introduced by CONFIG_FOTIFY_SOURCE. > > > > [exception_serialno]: > > [exception_kernel_version]: > > [exception_reboot_reason]: kernel_crash > > [exception_panic_reason]: UBSAN: array index out of bounds: Fatal exception > > [exception_time]: 1970-01-01_08-00-23 > > [exception_file_info]: not-bugon > > [exception_task_id]: 409 > > [exception_task_family]: [f2fs_ckpt-254:4, 409][kthreadd, 2] > > [exception_pc_symbol]: [<ffffffc080736974>] dd_request_merge+0x100/0x110 > > [exception_stack_info]: [<ffffffc07a27e274>] get_exception_stack_info+0x124/0x2d8 [sysdump]gc > > [<ffffffc07a27e670>] prepare_exception_info+0x158/0x1d4 [sysdump]gc > > [<ffffffc07a280128>] sysdump_panic_event+0x5d8/0x748 [sysdump]gc > > [<ffffffc08016a508>] notifier_call_chain+0x98/0x17cgc > > [<ffffffc08016a9b4>] atomic_notifier_call_chain+0x44/0x68gc > > [<ffffffc0810f0eb4>] panic+0x194/0x37cgc > > [<ffffffc0800a638c>] die+0x300/0x310gc > > [<ffffffc0800a77e8>] ubsan_handler+0x34/0x4cgc > > [<ffffffc0800960a8>] brk_handler+0x9c/0x11cgc > > [<ffffffc0800bf998>] do_debug_exception+0xb0/0x140gc > > [<ffffffc0810f8bf0>] el1_dbg+0x58/0x74gc > > [<ffffffc0810f89f4>] el1h_64_sync_handler+0x3c/0x90gc > > [<ffffffc080091298>] el1h_64_sync+0x68/0x6cgc > > [<ffffffc080736974>] dd_request_merge+0x100/0x110gc //out of bound > > here caused by the value of class transferred from ioprio > > [<ffffffc080707f28>] elv_merge+0x248/0x270gc > > [<ffffffc0807146e8>] blk_mq_sched_try_merge+0x4c/0x20cgc > > [<ffffffc080736824>] dd_bio_merge+0x64/0xb4gc > > [<ffffffc080723e3c>] blk_mq_sched_bio_merge+0x68/0x144gc > > [<ffffffc08071b944>] blk_mq_submit_bio+0x2e8/0x6c0gc > > [<ffffffc08070dd3c>] __submit_bio+0xbc/0x1b0gc > > [<ffffffc08070c440>] submit_bio_noacct_nocheck+0xe4/0x2f0gc > > [<ffffffc08070c8e4>] submit_bio_noacct+0x298/0x3d8gc > > [<ffffffc08070caf8>] submit_bio+0xd4/0xf0gc > > [<ffffffc080642644>] f2fs_submit_write_bio+0xcc/0x49cgc > > [<ffffffc0806442d4>] __submit_merged_bio+0x48/0x13cgc > > [<ffffffc080641de4>] __submit_merged_write_cond+0x18c/0x1f8gc > > [<ffffffc080641c4c>] f2fs_submit_merged_write+0x2c/0x38 > > [<ffffffc080655724>] f2fs_sync_node_pages+0x6e0/0x740gc > > [<ffffffc08063946c>] f2fs_write_checkpoint+0x4c0/0x97cgc > > [<ffffffc08063b37c>] __checkpoint_and_complete_reqs+0x88/0x248gc > > [<ffffffc08063ad70>] issue_checkpoint_thread+0x94/0xf4gc > > [<ffffffc080167c20>] kthread+0x110/0x1b8gc > > > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > --- > > include/uapi/linux/ioprio.h | 5 ++++- > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h > > index bee2bdb0eedb..73c420a0df72 100644 > > --- a/include/uapi/linux/ioprio.h > > +++ b/include/uapi/linux/ioprio.h > > @@ -112,8 +112,11 @@ static __always_inline __u16 ioprio_value(int prioclass, int priolevel, > > { > > if (IOPRIO_BAD_VALUE(prioclass, IOPRIO_NR_CLASSES) || > > IOPRIO_BAD_VALUE(priolevel, IOPRIO_NR_LEVELS) || > > - IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) > > + IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) { > > + pr_err("%s: get a invalid domain in class %d, level %d, hint %d\n", > > + __func__, prioclass, priolevel, priohint); > > return IOPRIO_CLASS_INVALID << IOPRIO_CLASS_SHIFT; > > + } > > > > return (prioclass << IOPRIO_CLASS_SHIFT) | > > (priohint << IOPRIO_HINT_SHIFT) | priolevel; > > -- > > 2.25.1 > > > > Adding linux-block to CC. > > pr_err() is a kernel function for printing. > ioprio_value() is a function in a uapi header, so this function will be > used by user space programs. > > There is a reason: > $ git grep pr_err include/uapi/ > > Gives no results. > > > > I think you should fix mq-deadline instead. > It looks like the problem comes from: > ioprio_value() will set class to IOPRIO_CLASS_INVALID (value 7), > if the user specified an class/level/hint that was invalid. > > ioprio_class_to_prio[] array in mq-deadline.c does currently not have an > entry in to translate IOPRIO_CLASS_INVALID (7) to a valid DD_*_PRIO value. > > Although, why does this I/O even reach the scheduler, shouldn't this I/O > get rejected even earlier? > > Both io_uring and libaio will call ioprio_check_cap(), which should fail > the I/O before it reaches the I/O scheduler, but in your case, you are > submitting the I/O from the filesystem. > > Should we perhaps add a call to ioprio_check_cap() or similar in some > path used to submit I/O by filesystems? On second thought, how can the FS have a ioprio class stored that would have been rejected at I/O submission time by the user? This sound like either a bug in the FS or by some of your local changes that you did for your other patch (ioprio based on activity). Kind regards, Niklas
On Wed, Jan 31, 2024 at 9:07 PM Niklas Cassel <Niklas.Cassel@wdc.com> wrote: > > On Wed, Jan 31, 2024 at 01:52:51PM +0100, Niklas Cassel wrote: > > On Wed, Jan 31, 2024 at 08:14:01PM +0800, zhaoyang.huang wrote: > > > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > > > > > Since few caller check IOPRIO_PRIO_VALUE's return value and bio_set_prio > > > is a open macro to set bio->ioprio directly.It is confused for the developer > > > who run across kernel panic[1] but can find nothing in previous kernel log. > > > Add a pr_err here to dump the information. > > > > > > [1] > > > Here is the kernel panic I run across which caused by a out of bounds check > > > introduced by CONFIG_FOTIFY_SOURCE. > > > > > > [exception_serialno]: > > > [exception_kernel_version]: > > > [exception_reboot_reason]: kernel_crash > > > [exception_panic_reason]: UBSAN: array index out of bounds: Fatal exception > > > [exception_time]: 1970-01-01_08-00-23 > > > [exception_file_info]: not-bugon > > > [exception_task_id]: 409 > > > [exception_task_family]: [f2fs_ckpt-254:4, 409][kthreadd, 2] > > > [exception_pc_symbol]: [<ffffffc080736974>] dd_request_merge+0x100/0x110 > > > [exception_stack_info]: [<ffffffc07a27e274>] get_exception_stack_info+0x124/0x2d8 [sysdump]gc > > > [<ffffffc07a27e670>] prepare_exception_info+0x158/0x1d4 [sysdump]gc > > > [<ffffffc07a280128>] sysdump_panic_event+0x5d8/0x748 [sysdump]gc > > > [<ffffffc08016a508>] notifier_call_chain+0x98/0x17cgc > > > [<ffffffc08016a9b4>] atomic_notifier_call_chain+0x44/0x68gc > > > [<ffffffc0810f0eb4>] panic+0x194/0x37cgc > > > [<ffffffc0800a638c>] die+0x300/0x310gc > > > [<ffffffc0800a77e8>] ubsan_handler+0x34/0x4cgc > > > [<ffffffc0800960a8>] brk_handler+0x9c/0x11cgc > > > [<ffffffc0800bf998>] do_debug_exception+0xb0/0x140gc > > > [<ffffffc0810f8bf0>] el1_dbg+0x58/0x74gc > > > [<ffffffc0810f89f4>] el1h_64_sync_handler+0x3c/0x90gc > > > [<ffffffc080091298>] el1h_64_sync+0x68/0x6cgc > > > [<ffffffc080736974>] dd_request_merge+0x100/0x110gc //out of bound > > > here caused by the value of class transferred from ioprio > > > [<ffffffc080707f28>] elv_merge+0x248/0x270gc > > > [<ffffffc0807146e8>] blk_mq_sched_try_merge+0x4c/0x20cgc > > > [<ffffffc080736824>] dd_bio_merge+0x64/0xb4gc > > > [<ffffffc080723e3c>] blk_mq_sched_bio_merge+0x68/0x144gc > > > [<ffffffc08071b944>] blk_mq_submit_bio+0x2e8/0x6c0gc > > > [<ffffffc08070dd3c>] __submit_bio+0xbc/0x1b0gc > > > [<ffffffc08070c440>] submit_bio_noacct_nocheck+0xe4/0x2f0gc > > > [<ffffffc08070c8e4>] submit_bio_noacct+0x298/0x3d8gc > > > [<ffffffc08070caf8>] submit_bio+0xd4/0xf0gc > > > [<ffffffc080642644>] f2fs_submit_write_bio+0xcc/0x49cgc > > > [<ffffffc0806442d4>] __submit_merged_bio+0x48/0x13cgc > > > [<ffffffc080641de4>] __submit_merged_write_cond+0x18c/0x1f8gc > > > [<ffffffc080641c4c>] f2fs_submit_merged_write+0x2c/0x38 > > > [<ffffffc080655724>] f2fs_sync_node_pages+0x6e0/0x740gc > > > [<ffffffc08063946c>] f2fs_write_checkpoint+0x4c0/0x97cgc > > > [<ffffffc08063b37c>] __checkpoint_and_complete_reqs+0x88/0x248gc > > > [<ffffffc08063ad70>] issue_checkpoint_thread+0x94/0xf4gc > > > [<ffffffc080167c20>] kthread+0x110/0x1b8gc > > > > > > Signed-off-by: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > > --- > > > include/uapi/linux/ioprio.h | 5 ++++- > > > 1 file changed, 4 insertions(+), 1 deletion(-) > > > > > > diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h > > > index bee2bdb0eedb..73c420a0df72 100644 > > > --- a/include/uapi/linux/ioprio.h > > > +++ b/include/uapi/linux/ioprio.h > > > @@ -112,8 +112,11 @@ static __always_inline __u16 ioprio_value(int prioclass, int priolevel, > > > { > > > if (IOPRIO_BAD_VALUE(prioclass, IOPRIO_NR_CLASSES) || > > > IOPRIO_BAD_VALUE(priolevel, IOPRIO_NR_LEVELS) || > > > - IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) > > > + IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) { > > > + pr_err("%s: get a invalid domain in class %d, level %d, hint %d\n", > > > + __func__, prioclass, priolevel, priohint); > > > return IOPRIO_CLASS_INVALID << IOPRIO_CLASS_SHIFT; > > > + } > > > > > > return (prioclass << IOPRIO_CLASS_SHIFT) | > > > (priohint << IOPRIO_HINT_SHIFT) | priolevel; > > > -- > > > 2.25.1 > > > > > > > Adding linux-block to CC. > > > > pr_err() is a kernel function for printing. > > ioprio_value() is a function in a uapi header, so this function will be > > used by user space programs. > > > > There is a reason: > > $ git grep pr_err include/uapi/ > > > > Gives no results. > > > > > > > > I think you should fix mq-deadline instead. > > It looks like the problem comes from: > > ioprio_value() will set class to IOPRIO_CLASS_INVALID (value 7), > > if the user specified an class/level/hint that was invalid. > > > > ioprio_class_to_prio[] array in mq-deadline.c does currently not have an > > entry in to translate IOPRIO_CLASS_INVALID (7) to a valid DD_*_PRIO value. > > > > Although, why does this I/O even reach the scheduler, shouldn't this I/O > > get rejected even earlier? > > > > Both io_uring and libaio will call ioprio_check_cap(), which should fail > > the I/O before it reaches the I/O scheduler, but in your case, you are > > submitting the I/O from the filesystem. > > > > Should we perhaps add a call to ioprio_check_cap() or similar in some > > path used to submit I/O by filesystems? > > On second thought, how can the FS have a ioprio class stored that would > have been rejected at I/O submission time by the user? > > This sound like either a bug in the FS or by some of your local changes > that you did for your other patch (ioprio based on activity). Yes. That's why I would like to suggest adding some information here to help developers find the clue quickly. > > > Kind regards, > Niklas
On 2/1/24 08:34, Zhaoyang Huang wrote: > On Wed, Jan 31, 2024 at 9:07 PM Niklas Cassel <Niklas.Cassel@wdc.com> wrote: >> This sound like either a bug in the FS or by some of your local changes >> that you did for your other patch (ioprio based on activity). > Yes. That's why I would like to suggest adding some information here > to help developers find the clue quickly. The backtrace was not clear enough ? When a request reaches mq-deadline, the request priority is supposed to be correct already. Your changes had a bug and broke that assumption. Please fix that instead of adding error messages for errors that should never happen in the first place.
diff --git a/include/uapi/linux/ioprio.h b/include/uapi/linux/ioprio.h index bee2bdb0eedb..73c420a0df72 100644 --- a/include/uapi/linux/ioprio.h +++ b/include/uapi/linux/ioprio.h @@ -112,8 +112,11 @@ static __always_inline __u16 ioprio_value(int prioclass, int priolevel, { if (IOPRIO_BAD_VALUE(prioclass, IOPRIO_NR_CLASSES) || IOPRIO_BAD_VALUE(priolevel, IOPRIO_NR_LEVELS) || - IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) + IOPRIO_BAD_VALUE(priohint, IOPRIO_NR_HINTS)) { + pr_err("%s: get a invalid domain in class %d, level %d, hint %d\n", + __func__, prioclass, priolevel, priohint); return IOPRIO_CLASS_INVALID << IOPRIO_CLASS_SHIFT; + } return (prioclass << IOPRIO_CLASS_SHIFT) | (priohint << IOPRIO_HINT_SHIFT) | priolevel;