[1/2] block: make the fair sharing of tag configurable

Message ID 20230509065230.32552-2-ed.tsai@mediatek.com
State New
Series block: improve the share tag set performance

Commit Message

Ed Tsai (蔡宗軒) May 9, 2023, 6:52 a.m. UTC
  Add a new queue flag QUEUE_FLAG_FAIR_TAG_SHARING to make the fair tag
sharing configurable.

Signed-off-by: Ed Tsai <ed.tsai@mediatek.com>
---
 block/blk-mq-debugfs.c | 1 +
 block/blk-mq-tag.c     | 1 +
 block/blk-mq.c         | 3 ++-
 include/linux/blkdev.h | 6 +++++-
 4 files changed, 9 insertions(+), 2 deletions(-)
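
To illustrate the change described above, here is a minimal sketch of how the new flag gates the existing fairness check. The struct and function names (tag_state, may_queue_fair, passes_fairness_gate) are hypothetical stand-ins, not kernel code. The fairness rule itself (an even split of the tag depth among active users, with a floor of 4) is an assumption modeled on hctx_may_queue() and is not shown in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's per-queue tag state. */
struct tag_state {
	unsigned int depth;        /* total tags in the shared tag set */
	unsigned int active_users; /* queues currently sharing the set */
	unsigned int in_flight;    /* tags currently held by this queue */
	bool fair_sharing;         /* models QUEUE_FLAG_FAIR_TAG_SHARING */
};

/* Assumed fairness rule: each active user may hold ceil(depth / users) tags, at least 4. */
static inline bool may_queue_fair(const struct tag_state *s)
{
	unsigned int share;

	assert(s != NULL);
	if (s->active_users == 0)
		return true;
	share = (s->depth + s->active_users - 1) / s->active_users;
	if (share < 4)
		share = 4;
	return s->in_flight < share;
}

/* Mirrors the patched condition: the fairness limit applies only when the flag is set. */
static inline bool passes_fairness_gate(const struct tag_state *s)
{
	assert(s != NULL);
	if (s->fair_sharing && !may_queue_fair(s))
		return false;
	return true; /* the caller would go on to the actual tag allocation */
}
```

With 32 tags shared by 4 users, a queue already holding 8 tags is refused while the flag is set and allowed through once the flag is cleared.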
  

Comments

kernel test robot May 9, 2023, 9:33 p.m. UTC | #1
Hi Ed,

kernel test robot noticed the following build warnings:

[auto build test WARNING on axboe-block/for-next]
[also build test WARNING on jejb-scsi/for-next mkp-scsi/for-next linus/master v6.4-rc1 next-20230509]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Ed-Tsai/block-make-the-fair-sharing-of-tag-configurable/20230509-145439
base:   https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git for-next
patch link:    https://lore.kernel.org/r/20230509065230.32552-2-ed.tsai%40mediatek.com
patch subject: [PATCH 1/2] block: make the fair sharing of tag configurable
config: openrisc-randconfig-r022-20230509 (https://download.01.org/0day-ci/archive/20230510/202305100557.gdIvlzRS-lkp@intel.com/config)
compiler: or1k-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/b1081024bc6d1cdaf5b39994b19040cd8e6099ec
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Ed-Tsai/block-make-the-fair-sharing-of-tag-configurable/20230509-145439
        git checkout b1081024bc6d1cdaf5b39994b19040cd8e6099ec
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=openrisc olddefconfig
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=openrisc SHELL=/bin/bash

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
| Link: https://lore.kernel.org/oe-kbuild-all/202305100557.gdIvlzRS-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from block/blk-mq.c:12:
   block/blk-mq.c: In function 'blk_mq_init_allocated_queue':
>> include/linux/blkdev.h:569:39: warning: left shift count >= width of type [-Wshift-count-overflow]
     569 |                                  (1UL << QUEUE_FLAG_FAIR_TAG_SHARING))
         |                                       ^~
   block/blk-mq.c:4232:27: note: in expansion of macro 'QUEUE_FLAG_MQ_DEFAULT'
    4232 |         q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT;
         |                           ^~~~~~~~~~~~~~~~~~~~~


vim +569 include/linux/blkdev.h

   565	
   566	#define QUEUE_FLAG_MQ_DEFAULT	((1UL << QUEUE_FLAG_IO_STAT) |		\
   567					 (1UL << QUEUE_FLAG_SAME_COMP) |	\
   568					 (1UL << QUEUE_FLAG_NOWAIT) |		\
 > 569					 (1UL << QUEUE_FLAG_FAIR_TAG_SHARING))
   570
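
The warning above comes down to the width of unsigned long on the build target: on a 32-bit target such as openrisc, 1UL << 32 shifts by the full width of the type, which C leaves undefined. The following is a minimal sketch of that constraint. The helper names flag_fits_in_word and flag_mask64 are hypothetical and are not part of the kernel or of this patch.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Bits in unsigned long on the build target: 32 on openrisc and i386, 64 on x86_64. */
#define ULONG_BITS ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/* Hypothetical helper: can bit index nr be used as (1UL << nr) in a word of word_bits bits? */
static inline bool flag_fits_in_word(unsigned int nr, unsigned int word_bits)
{
	return nr < word_bits;
}

/* Hypothetical helper: build the mask in a fixed 64-bit type, where shifts by 0..63 are well defined. */
static inline uint64_t flag_mask64(unsigned int nr)
{
	assert(nr < 64);
	return (uint64_t)1 << nr;
}
```

With a 32-bit unsigned long, flag_fits_in_word(32, ULONG_BITS) is false, which matches the warning at line 569, while indices 0 through 31 (including QUEUE_FLAG_NOWAIT at 29) still fit.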
  
Christoph Hellwig May 11, 2023, 3:33 p.m. UTC | #2
On Tue, May 09, 2023 at 02:52:29PM +0800, Ed Tsai wrote:
> Add a new queue flag QUEUE_FLAG_FAIR_TAG_SHARING to make the fair tag
> sharing configurable.

Why?
  
Liu, Yujie May 22, 2023, 5:30 a.m. UTC | #3
Hello,

kernel test robot noticed "UBSAN:shift-out-of-bounds_in(null)" on:

commit: b1081024bc6d1cdaf5b39994b19040cd8e6099ec ("[PATCH 1/2] block: make the fair sharing of tag configurable")
url: https://github.com/intel-lab-lkp/linux/commits/Ed-Tsai/block-make-the-fair-sharing-of-tag-configurable/20230509-145439
base: https://git.kernel.org/cgit/linux/kernel/git/axboe/linux-block.git for-next
patch subject: [PATCH 1/2] block: make the fair sharing of tag configurable
patch link: https://lore.kernel.org/all/20230509065230.32552-2-ed.tsai@mediatek.com/

in testcase: boot

compiler: clang-14
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)

+------------------------------------------------------------+------------+------------+
|                                                            | b2e48bd0db | b1081024bc |
+------------------------------------------------------------+------------+------------+
| boot_successes                                             | 8          | 0          |
| boot_failures                                              | 0          | 8          |
| UBSAN:shift-out-of-bounds_in(null)                         | 0          | 8          |
| WARNING:at_lib/ubsan.c:#__ubsan_handle_shift_out_of_bounds | 0          | 8          |
| EIP:__ubsan_handle_shift_out_of_bounds                     | 0          | 8          |
| BUG:unable_to_handle_page_fault_for_address                | 0          | 8          |
| Oops:#[##]                                                 | 0          | 8          |
| EIP:blk_mq_debugfs_register_sched                          | 0          | 8          |
| Kernel_panic-not_syncing:Fatal_exception                   | 0          | 8          |
+------------------------------------------------------------+------------+------------+


If you fix the issue, kindly add following tag
| Reported-by: kernel test robot <yujie.liu@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202305221236.5410a5c6-yujie.liu@intel.com


[    8.114565][    T1] UBSAN: shift-out-of-bounds in (null):0:-1017201787
[    8.115735][    T1] ------------[ cut here ]------------
[ 8.116722][ T1] WARNING: CPU: 0 PID: 1 at lib/ubsan.c:127 __ubsan_handle_shift_out_of_bounds (lib/ubsan.c:127) 
[    8.118211][    T1] Modules linked in:
[    8.118975][    T1] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 6.4.0-rc1-00004-gb1081024bc6d #1 db924219c7bf519b06320a8fa4e221875190bd2e
[    8.121026][    T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 8.122583][ T1] EIP: __ubsan_handle_shift_out_of_bounds (lib/ubsan.c:127) 
[ 8.123706][ T1] Code: 8b 0a 8b 7a 04 8d 45 88 57 51 68 c3 0c d1 c2 6a 28 50 e8 89 ef 94 00 83 c4 14 8b 55 ec 8b 45 f0 66 83 38 00 0f 84 4b fe ff ff <0f> 0b e9 44 fe ff ff 0f 0b 66 83 f8 0b 0f 86 5a fe ff ff 8b 45 e8
All code
========
   0:	8b 0a                	mov    (%rdx),%ecx
   2:	8b 7a 04             	mov    0x4(%rdx),%edi
   5:	8d 45 88             	lea    -0x78(%rbp),%eax
   8:	57                   	push   %rdi
   9:	51                   	push   %rcx
   a:	68 c3 0c d1 c2       	push   $0xffffffffc2d10cc3
   f:	6a 28                	push   $0x28
  11:	50                   	push   %rax
  12:	e8 89 ef 94 00       	call   0x94efa0
  17:	83 c4 14             	add    $0x14,%esp
  1a:	8b 55 ec             	mov    -0x14(%rbp),%edx
  1d:	8b 45 f0             	mov    -0x10(%rbp),%eax
  20:	66 83 38 00          	cmpw   $0x0,(%rax)
  24:	0f 84 4b fe ff ff    	je     0xfffffffffffffe75
  2a:*	0f 0b                	ud2		<-- trapping instruction
  2c:	e9 44 fe ff ff       	jmp    0xfffffffffffffe75
  31:	0f 0b                	ud2
  33:	66 83 f8 0b          	cmp    $0xb,%ax
  37:	0f 86 5a fe ff ff    	jbe    0xfffffffffffffe97
  3d:	8b 45 e8             	mov    -0x18(%rbp),%eax

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	e9 44 fe ff ff       	jmp    0xfffffffffffffe4b
   7:	0f 0b                	ud2
   9:	66 83 f8 0b          	cmp    $0xb,%ax
   d:	0f 86 5a fe ff ff    	jbe    0xfffffffffffffe6d
  13:	8b 45 e8             	mov    -0x18(%rbp),%eax
[    8.126748][    T1] EAX: ca11ec40 EBX: c5bf0000 ECX: 00000000 EDX: c83b66c0
[    8.127956][    T1] ESI: ffffffff EDI: c8118000 EBP: c59f1a58 ESP: c59f19e0
[    8.129141][    T1] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 EFLAGS: 00010282
[    8.130361][    T1] CR0: 80050033 CR2: b7f19cd4 CR3: 035f7000 CR4: 00040690
[    8.131528][    T1] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[    8.132744][    T1] DR6: fffe0ff0 DR7: 00000400
[    8.133723][    T1] Call Trace:
[ 8.134454][ T1] ? mutex_unlock (kernel/locking/mutex.c:544) 
[ 8.135290][ T1] blk_mq_init_allocated_queue (block/blk-mq.c:4232) 
[ 8.140517][ T1] ? blk_timeout_work (block/blk-core.c:374) 
[ 8.141538][ T1] ? blk_alloc_queue (block/blk-core.c:438) 
[ 8.142497][ T1] __blk_mq_alloc_disk (block/blk-mq.c:4043 block/blk-mq.c:4089) 
[ 8.143445][ T1] add_mtd_blktrans_dev (drivers/mtd/mtd_blkdevs.c:336) 
[ 8.144403][ T1] mtdblock_add_mtd (drivers/mtd/mtdblock.c:333) 
[ 8.145285][ T1] blktrans_notify_add (drivers/mtd/mtd_blkdevs.c:?) 
[ 8.146175][ T1] add_mtd_device (drivers/mtd/mtdcore.c:?) 
[ 8.147040][ T1] ? mtd_cls_resume (drivers/mtd/mtdcore.c:504) 
[ 8.147909][ T1] add_mtd_partitions (drivers/mtd/mtdpart.c:416) 
[ 8.148795][ T1] mtd_device_parse_register (drivers/mtd/mtdcore.c:?) 
[ 8.149747][ T1] ? nand_create_bbt (drivers/mtd/nand/raw/nand_bbt.c:936 drivers/mtd/nand/raw/nand_bbt.c:1266 drivers/mtd/nand/raw/nand_bbt.c:1425) 
[ 8.150623][ T1] ? ns_init (drivers/mtd/nand/raw/nandsim.c:766) 
[ 8.151425][ T1] ? ns_init (drivers/mtd/nand/raw/nandsim.c:?) 
[ 8.152240][ T1] ns_init_module (drivers/mtd/nand/raw/nandsim.c:2382) 
[ 8.153113][ T1] ? _printk (kernel/printk/printk.c:2331) 
[ 8.153903][ T1] do_one_initcall (init/main.c:1246) 
[ 8.154821][ T1] ? inftl_partscan (drivers/mtd/nand/raw/nandsim.c:2261) 
[ 8.155683][ T1] do_initcall_level (init/main.c:1318) 
[ 8.156564][ T1] ? rest_init (init/main.c:1454) 
[ 8.157391][ T1] do_initcalls (init/main.c:1332) 
[ 8.158194][ T1] do_basic_setup (init/main.c:1355) 
[ 8.159030][ T1] kernel_init_freeable (init/main.c:1575) 
[ 8.159939][ T1] kernel_init (init/main.c:1464) 
[ 8.160759][ T1] ret_from_fork (arch/x86/entry/entry_32.S:770) 
[    8.161568][    T1] irq event stamp: 494889
[ 8.162351][ T1] hardirqs last enabled at (494899): __up_console_sem (arch/x86/include/asm/irqflags.h:19 arch/x86/include/asm/irqflags.h:67 arch/x86/include/asm/irqflags.h:127 kernel/printk/printk.c:347) 
[ 8.163719][ T1] hardirqs last disabled at (494910): __up_console_sem (kernel/printk/printk.c:345) 
[ 8.165101][ T1] softirqs last enabled at (494786): do_softirq_own_stack (arch/x86/kernel/irq_32.c:57 arch/x86/kernel/irq_32.c:147) 
[ 8.166495][ T1] softirqs last disabled at (494775): do_softirq_own_stack (arch/x86/kernel/irq_32.c:57 arch/x86/kernel/irq_32.c:147) 
[    8.167899][    T1] ---[ end trace 0000000000000000 ]---


To reproduce:

        # build kernel
	cd linux
	cp config-6.4.0-rc1-00004-gb1081024bc6d .config
	make HOSTCC=clang-14 CC=clang-14 ARCH=i386 olddefconfig prepare modules_prepare bzImage modules
	make HOSTCC=clang-14 CC=clang-14 ARCH=i386 INSTALL_MOD_PATH=<mod-install-dir> modules_install
	cd <mod-install-dir>
	find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz


        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.
  

Patch

diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index d23a8554ec4a..f03b8bfe63be 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -103,6 +103,7 @@  static const char *const blk_queue_flag_name[] = {
 	QUEUE_FLAG_NAME(RQ_ALLOC_TIME),
 	QUEUE_FLAG_NAME(HCTX_ACTIVE),
 	QUEUE_FLAG_NAME(NOWAIT),
+	QUEUE_FLAG_NAME(FAIR_TAG_SHARING),
 };
 #undef QUEUE_FLAG_NAME
 
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index d6af9d431dc6..b8b36823f5f5 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -97,6 +97,7 @@  static int __blk_mq_get_tag(struct blk_mq_alloc_data *data,
 			    struct sbitmap_queue *bt)
 {
 	if (!data->q->elevator && !(data->flags & BLK_MQ_REQ_RESERVED) &&
+			blk_queue_fair_tag_sharing(data->q) &&
 			!hctx_may_queue(data->hctx, bt))
 		return BLK_MQ_NO_TAG;
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index f6dad0886a2f..f903107759f7 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1746,7 +1746,8 @@  static bool __blk_mq_alloc_driver_tag(struct request *rq)
 		bt = &rq->mq_hctx->tags->breserved_tags;
 		tag_offset = 0;
 	} else {
-		if (!hctx_may_queue(rq->mq_hctx, bt))
+		if (blk_queue_fair_tag_sharing(rq->q) &&
+		    !hctx_may_queue(rq->mq_hctx, bt))
 			return false;
 	}
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index b441e633f4dd..7fcb2356860d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -561,10 +561,12 @@  struct request_queue {
 #define QUEUE_FLAG_NOWAIT       29	/* device supports NOWAIT */
 #define QUEUE_FLAG_SQ_SCHED     30	/* single queue style io dispatch */
 #define QUEUE_FLAG_SKIP_TAGSET_QUIESCE	31 /* quiesce_tagset skip the queue*/
+#define QUEUE_FLAG_FAIR_TAG_SHARING	32 /* fair allocation of shared tags */
 
 #define QUEUE_FLAG_MQ_DEFAULT	((1UL << QUEUE_FLAG_IO_STAT) |		\
 				 (1UL << QUEUE_FLAG_SAME_COMP) |	\
-				 (1UL << QUEUE_FLAG_NOWAIT))
+				 (1UL << QUEUE_FLAG_NOWAIT) |		\
+				 (1UL << QUEUE_FLAG_FAIR_TAG_SHARING))
 
 void blk_queue_flag_set(unsigned int flag, struct request_queue *q);
 void blk_queue_flag_clear(unsigned int flag, struct request_queue *q);
@@ -602,6 +604,8 @@  bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 #define blk_queue_sq_sched(q)	test_bit(QUEUE_FLAG_SQ_SCHED, &(q)->queue_flags)
 #define blk_queue_skip_tagset_quiesce(q) \
 	test_bit(QUEUE_FLAG_SKIP_TAGSET_QUIESCE, &(q)->queue_flags)
+#define blk_queue_fair_tag_sharing(q) \
+	test_bit(QUEUE_FLAG_FAIR_TAG_SHARING, &(q)->queue_flags)
 
 extern void blk_set_pm_only(struct request_queue *q);
 extern void blk_clear_pm_only(struct request_queue *q);
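
The blk_queue_fair_tag_sharing() macro in the patch above passes bit index 32 to test_bit() on a single unsigned long field. Here is a minimal sketch of the word-and-offset arithmetic that bitmap helpers of this kind generally use. The names bit_word and bit_offset are hypothetical, and this is an illustration rather than the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: index of the bitmap word that holds bit nr. */
static inline size_t bit_word(unsigned int nr, unsigned int bits_per_long)
{
	assert(bits_per_long != 0);
	return nr / bits_per_long;
}

/* Hypothetical helper: position of bit nr inside its word. */
static inline unsigned int bit_offset(unsigned int nr, unsigned int bits_per_long)
{
	assert(bits_per_long != 0);
	return nr % bits_per_long;
}
```

With 32-bit longs, bit 32 maps to word 1 at offset 0, that is, the word after a single-word flags field; with 64-bit longs it stays in word 0 at offset 32.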