Message ID: 20230628124546.1056698-2-chengming.zhou@linux.dev
State: New
Headers:
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, tj@kernel.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com, ming.lei@redhat.com, hch@lst.de
Subject: [PATCH v3 1/3] blk-mq: always use __blk_mq_alloc_requests() to alloc and init rq
Date: Wed, 28 Jun 2023 20:45:44 +0800
Message-Id: <20230628124546.1056698-2-chengming.zhou@linux.dev>
In-Reply-To: <20230628124546.1056698-1-chengming.zhou@linux.dev>
References: <20230628124546.1056698-1-chengming.zhou@linux.dev>
Series: blk-mq: fix start_time_ns and alloc_time_ns for pre-allocated rq
Commit Message
Chengming Zhou
June 28, 2023, 12:45 p.m. UTC
From: Chengming Zhou <zhouchengming@bytedance.com>

This patch prepares for the next patch, which calls ktime_get_ns() only once when setting start_time_ns for batched pre-allocated requests.

1. data->flags is input to blk_mq_rq_ctx_init() and should not be updated on every blk_mq_rq_ctx_init() call during batched request allocation, so move the data->flags initialization into the caller.

2. Make blk_mq_alloc_request_hctx() reuse __blk_mq_alloc_requests() instead of calling blk_mq_rq_ctx_init() directly, so it avoids duplicating the same data->flags initialization.

After this cleanup, __blk_mq_alloc_requests() is the only entry point to allocate and initialize a request.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 block/blk-mq.c | 46 ++++++++++++++++++----------------------------
 1 file changed, 18 insertions(+), 28 deletions(-)
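To make the first point concrete, here is a minimal, self-contained C sketch of the pattern the patch moves to. All names (`init_alloc_data`, `rq_ctx_init`, the flag constants) are illustrative stand-ins, not the real kernel structures: the caller derives the request flags once, and the per-request init only reads them, so a batched allocation does not redo the derivation per request.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for BLK_MQ_REQ_PM / RQF_* -- illustrative only. */
enum { REQ_PM = 1u << 0 };                        /* input flag */
enum { RQF_PM = 1u << 0, RQF_IO_STAT = 1u << 1 }; /* derived rq flags */

struct alloc_data {
        unsigned int flags;    /* input, set by the caller */
        unsigned int rq_flags; /* derived once by the caller */
};

struct request {
        unsigned int rq_flags;
};

/* The caller derives rq_flags once per allocation (mirrors moving the
 * data->flags handling out of blk_mq_rq_ctx_init()). */
static void init_alloc_data(struct alloc_data *data, bool queue_io_stat)
{
        data->rq_flags = 0;
        if (data->flags & REQ_PM)
                data->rq_flags |= RQF_PM;
        if (queue_io_stat)
                data->rq_flags |= RQF_IO_STAT;
}

/* Per-request init treats data as read-only input, as the patch intends. */
static void rq_ctx_init(const struct alloc_data *data, struct request *rq)
{
        rq->rq_flags = data->rq_flags;
}
```

With this split, a batch of N requests calls init_alloc_data() once and rq_ctx_init() N times, and every request in the batch ends up with identical rq_flags.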
Comments
On Wed, Jun 28, 2023 at 08:45:44PM +0800, chengming.zhou@linux.dev wrote:
> After these cleanup, __blk_mq_alloc_requests() is the only entry to
> alloc and init rq.

I find the code a little hard to follow now, due to the optional setting of the ctx. We also introduce really odd behavior here if the caller for a hctx-specific allocation doesn't have free tags, as we'll now run into the normal retry path.

Is this really needed for your timestamp changes? If not I'd prefer to skip it.
On 2023/6/29 13:28, Christoph Hellwig wrote:
> On Wed, Jun 28, 2023 at 08:45:44PM +0800, chengming.zhou@linux.dev wrote:
>> After these cleanup, __blk_mq_alloc_requests() is the only entry to
>> alloc and init rq.
>
> I find the code a little hard to follow now, due to the optional
> setting of the ctx. We also introduce really odd behavior here
> if the caller for a hctx-specific allocation doesn't have free
> tags, as we'll now run into the normal retry path.
>
> Is this really needed for your timestamp changes? If not I'd prefer
> to skip it.

Thanks for your review!

Since the hctx-specific allocation path always has the BLK_MQ_REQ_NOWAIT flag, it won't retry.

But I agree, this makes the general __blk_mq_alloc_requests() more complex. The reason is that blk_mq_rq_ctx_init() does some data->rq_flags initialization:

```
if (data->flags & BLK_MQ_REQ_PM)
	data->rq_flags |= RQF_PM;
if (blk_queue_io_stat(q))
	data->rq_flags |= RQF_IO_STAT;
rq->rq_flags = data->rq_flags;
```

Because we need this data->rq_flags to tell whether we need start_time_ns, we have to put this initialization in the callers of blk_mq_rq_ctx_init(). Now we basically have two callers: the first is the general __blk_mq_alloc_requests(), the second is the special blk_mq_alloc_request_hctx(). So I changed the second caller to reuse the first, __blk_mq_alloc_requests().

Or should we put this data->rq_flags initialization in blk_mq_alloc_request_hctx() too?

Thanks.
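The dependency described above, that rq_flags must be settled before the code can decide whether a start timestamp is needed at all, can be sketched in plain C. The flag names and clock below are mocked for illustration; this is not the kernel implementation:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag bits standing in for RQF_IO_STAT / RQF_STATS. */
enum { RQF_IO_STAT = 1u << 0, RQF_STATS = 1u << 1 };

static uint64_t fake_clock; /* stand-in for ktime_get_ns() */
static uint64_t get_ns(void) { return ++fake_clock; }

/* Whether a time stamp is wanted depends on the already-derived rq_flags. */
static bool need_time_stamp(unsigned int rq_flags)
{
        return rq_flags & (RQF_IO_STAT | RQF_STATS);
}

/* Only touch the clock when the flags say a stamp is needed. */
static uint64_t init_start_time(unsigned int rq_flags)
{
        return need_time_stamp(rq_flags) ? get_ns() : 0;
}
```

The point of the sketch is the ordering constraint: `init_start_time()` cannot run until `rq_flags` has its final value, which is why the flag derivation has to happen before (or outside) the per-request init.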
On Thu, Jun 29, 2023 at 03:40:03PM +0800, Chengming Zhou wrote:
> Thanks for your review!
>
> Since hctx-specific allocation path always has BLK_MQ_REQ_NOWAIT flag,
> it won't retry.
>
> But I agree, this makes the general __blk_mq_alloc_requests() more complex.

And also very confusing as it pretends to share some code, while almost nothing of __blk_mq_alloc_requests is actually used.

> The reason is blk_mq_rq_ctx_init() has some data->rq_flags initialization:
>
> ```
> if (data->flags & BLK_MQ_REQ_PM)
> 	data->rq_flags |= RQF_PM;
> if (blk_queue_io_stat(q))
> 	data->rq_flags |= RQF_IO_STAT;
> rq->rq_flags = data->rq_flags;
> ```
>
> Because we need this data->rq_flags to tell if we need start_time_ns,
> we need to put these initialization in the callers of blk_mq_rq_ctx_init().

Why can't we just always initialize the time stamps after blk_mq_rq_ctx_init? Something like this (untested) variant of your patch 2 from the latest iteration:

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 5504719b970d59..55bf1009f3e32a 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -328,8 +328,26 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
 }
 EXPORT_SYMBOL(blk_rq_init);
 
+/* Set alloc and start time when pre-allocated rq is actually used */
+static inline void blk_mq_rq_time_init(struct request *rq, bool set_alloc_time)
+{
+	if (blk_mq_need_time_stamp(rq)) {
+		u64 now = ktime_get_ns();
+
+#ifdef CONFIG_BLK_RQ_ALLOC_TIME
+		/*
+		 * The alloc time is only used by iocost for now,
+		 * only possible when blk_mq_need_time_stamp().
+		 */
+		if (set_alloc_time)
+			rq->alloc_time_ns = now;
+#endif
+		rq->start_time_ns = now;
+	}
+}
+
 static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
-		struct blk_mq_tags *tags, unsigned int tag, u64 alloc_time_ns)
+		struct blk_mq_tags *tags, unsigned int tag)
 {
 	struct blk_mq_ctx *ctx = data->ctx;
 	struct blk_mq_hw_ctx *hctx = data->hctx;
@@ -356,14 +374,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 	}
 	rq->timeout = 0;
 
-	if (blk_mq_need_time_stamp(rq))
-		rq->start_time_ns = ktime_get_ns();
-	else
-		rq->start_time_ns = 0;
 	rq->part = NULL;
-#ifdef CONFIG_BLK_RQ_ALLOC_TIME
-	rq->alloc_time_ns = alloc_time_ns;
-#endif
 	rq->io_start_time_ns = 0;
 	rq->stats_sectors = 0;
 	rq->nr_phys_segments = 0;
@@ -393,8 +404,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 }
 
 static inline struct request *
-__blk_mq_alloc_requests_batch(struct blk_mq_alloc_data *data,
-		u64 alloc_time_ns)
+__blk_mq_alloc_requests_batch(struct blk_mq_alloc_data *data)
 {
 	unsigned int tag, tag_offset;
 	struct blk_mq_tags *tags;
@@ -413,7 +423,7 @@ __blk_mq_alloc_requests_batch(struct blk_mq_alloc_data *data,
 		tag = tag_offset + i;
 		prefetch(tags->static_rqs[tag]);
 		tag_mask &= ~(1UL << i);
-		rq = blk_mq_rq_ctx_init(data, tags, tag, alloc_time_ns);
+		rq = blk_mq_rq_ctx_init(data, tags, tag);
 		rq_list_add(data->cached_rq, rq);
 		nr++;
 	}
@@ -427,12 +437,13 @@ __blk_mq_alloc_requests_batch(struct blk_mq_alloc_data *data,
 static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
 {
 	struct request_queue *q = data->q;
+	bool set_alloc_time = blk_queue_rq_alloc_time(q);
 	u64 alloc_time_ns = 0;
 	struct request *rq;
 	unsigned int tag;
 
 	/* alloc_time includes depth and tag waits */
-	if (blk_queue_rq_alloc_time(q))
+	if (set_alloc_time)
 		alloc_time_ns = ktime_get_ns();
 
 	if (data->cmd_flags & REQ_NOWAIT)
@@ -474,9 +485,11 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
 	 * Try batched alloc if we want more than 1 tag.
 	 */
 	if (data->nr_tags > 1) {
-		rq = __blk_mq_alloc_requests_batch(data, alloc_time_ns);
-		if (rq)
+		rq = __blk_mq_alloc_requests_batch(data);
+		if (rq) {
+			blk_mq_rq_time_init(rq, true);
 			return rq;
+		}
 		data->nr_tags = 1;
 	}
 
@@ -499,8 +512,10 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
 		goto retry;
 	}
 
-	return blk_mq_rq_ctx_init(data, blk_mq_tags_from_data(data), tag,
-					alloc_time_ns);
+	rq = blk_mq_rq_ctx_init(data, blk_mq_tags_from_data(data), tag);
+	if (rq)
+		blk_mq_rq_time_init(rq, set_alloc_time);
+	return rq;
 }
 
 static struct request *blk_mq_rq_cache_fill(struct request_queue *q,
@@ -555,6 +570,7 @@ static struct request *blk_mq_alloc_cached_request(struct request_queue *q,
 			return NULL;
 
 		plug->cached_rq = rq_list_next(rq);
+		blk_mq_rq_time_init(rq, blk_queue_rq_alloc_time(rq->q));
 	}
 
 	rq->cmd_flags = opf;
@@ -656,8 +672,8 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 	tag = blk_mq_get_tag(&data);
 	if (tag == BLK_MQ_NO_TAG)
 		goto out_queue_exit;
-	rq = blk_mq_rq_ctx_init(&data, blk_mq_tags_from_data(&data), tag,
-			alloc_time_ns);
+	rq = blk_mq_rq_ctx_init(&data, blk_mq_tags_from_data(&data), tag);
+	blk_mq_rq_time_init(rq, blk_queue_rq_alloc_time(rq->q));
 	rq->__data_len = 0;
 	rq->__sector = (sector_t) -1;
 	rq->bio = rq->biotail = NULL;
@@ -2896,6 +2912,7 @@ static inline struct request *blk_mq_get_cached_request(struct request_queue *q,
 	plug->cached_rq = rq_list_next(rq);
 
 	rq_qos_throttle(q, *bio);
+	blk_mq_rq_time_init(rq, blk_queue_rq_alloc_time(rq->q));
 	rq->cmd_flags = (*bio)->bi_opf;
 	INIT_LIST_HEAD(&rq->queuelist);
 	return rq;
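The semantics of the blk_mq_rq_time_init() helper suggested above, a single clock read that fills start_time_ns and, only when asked, alloc_time_ns, can be modelled in userspace roughly like this. The clock and the trimmed `struct request` are mocks; the real helper additionally gates on blk_mq_need_time_stamp():

```c
#include <stdbool.h>
#include <stdint.h>

struct request {
        uint64_t start_time_ns;
        uint64_t alloc_time_ns;
};

static uint64_t fake_now = 1000; /* stand-in for ktime_get_ns() */
static uint64_t ktime_get_ns_mock(void) { return fake_now; }

/* One clock read serves both stamps; alloc_time_ns is conditional. */
static void rq_time_init(struct request *rq, bool set_alloc_time)
{
        uint64_t now = ktime_get_ns_mock();

        if (set_alloc_time)
                rq->alloc_time_ns = now;
        rq->start_time_ns = now;
}
```

When both stamps are requested they come from the same read, which is exactly what makes the helper cheap enough to call once per request in the batched path.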
On 2023/7/10 15:36, Christoph Hellwig wrote:
> On Thu, Jun 29, 2023 at 03:40:03PM +0800, Chengming Zhou wrote:
>> Thanks for your review!
>>
>> Since hctx-specific allocation path always has BLK_MQ_REQ_NOWAIT flag,
>> it won't retry.
>>
>> But I agree, this makes the general __blk_mq_alloc_requests() more complex.
>
> And also very confusing as it pretends to share some code, while almost
> nothing of __blk_mq_alloc_requests is actually used.

You are right. I will not mess with reusing __blk_mq_alloc_requests() in the next version.

>> The reason is blk_mq_rq_ctx_init() has some data->rq_flags initialization:
>> [...]
>> Because we need this data->rq_flags to tell if we need start_time_ns,
>> we need to put these initialization in the callers of blk_mq_rq_ctx_init().
>
> Why can't we just always initialize the time stamps after
> blk_mq_rq_ctx_init? Something like this (untested) variant of your
> patch 2 from the latest iteration:

I get what you mean: always initialize the two time stamps after blk_mq_rq_ctx_init(), so we know whether the time stamps are needed in blk_mq_rq_time_init(). It seems better and clearer indeed; I will try to change it as you suggest.

> +/* Set alloc and start time when pre-allocated rq is actually used */
> +static inline void blk_mq_rq_time_init(struct request *rq, bool set_alloc_time)

We need to pass "u64 alloc_time_ns" here, which includes depth and tag wait time by definition. So:

1. for a non-batched request that needs alloc_time_ns: the passed alloc_time_ns != 0
2. for a batched request that needs alloc_time_ns: the passed alloc_time_ns == 0 and will be set to start_time_ns

I have just updated the patch:
https://lore.kernel.org/all/20230710105516.2053478-1-chengming.zhou@linux.dev/

Thanks!

> [rest of quoted diff snipped]
diff --git a/block/blk-mq.c b/block/blk-mq.c
index decb6ab2d508..c50ef953759f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -349,11 +349,6 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 	rq->mq_ctx = ctx;
 	rq->mq_hctx = hctx;
 	rq->cmd_flags = data->cmd_flags;
-
-	if (data->flags & BLK_MQ_REQ_PM)
-		data->rq_flags |= RQF_PM;
-	if (blk_queue_io_stat(q))
-		data->rq_flags |= RQF_IO_STAT;
 	rq->rq_flags = data->rq_flags;
 
 	if (data->rq_flags & RQF_SCHED_TAGS) {
@@ -447,6 +442,15 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
 	if (data->cmd_flags & REQ_NOWAIT)
 		data->flags |= BLK_MQ_REQ_NOWAIT;
 
+	if (data->flags & BLK_MQ_REQ_RESERVED)
+		data->rq_flags |= RQF_RESV;
+
+	if (data->flags & BLK_MQ_REQ_PM)
+		data->rq_flags |= RQF_PM;
+
+	if (blk_queue_io_stat(q))
+		data->rq_flags |= RQF_IO_STAT;
+
 	if (q->elevator) {
 		/*
 		 * All requests use scheduler tags when an I/O scheduler is
@@ -471,14 +475,15 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
 	}
 
 retry:
-	data->ctx = blk_mq_get_ctx(q);
-	data->hctx = blk_mq_map_queue(q, data->cmd_flags, data->ctx);
+	/* See blk_mq_alloc_request_hctx() for details */
+	if (!data->ctx) {
+		data->ctx = blk_mq_get_ctx(q);
+		data->hctx = blk_mq_map_queue(q, data->cmd_flags, data->ctx);
+	}
+
 	if (!(data->rq_flags & RQF_SCHED_TAGS))
 		blk_mq_tag_busy(data->hctx);
 
-	if (data->flags & BLK_MQ_REQ_RESERVED)
-		data->rq_flags |= RQF_RESV;
-
 	/*
 	 * Try batched alloc if we want more than 1 tag.
 	 */
@@ -505,6 +510,7 @@ static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
 		 * is going away.
 		 */
 		msleep(3);
+		data->ctx = NULL;
 		goto retry;
 	}
 
@@ -613,16 +619,10 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 		.cmd_flags	= opf,
 		.nr_tags	= 1,
 	};
-	u64 alloc_time_ns = 0;
 	struct request *rq;
 	unsigned int cpu;
-	unsigned int tag;
 	int ret;
 
-	/* alloc_time includes depth and tag waits */
-	if (blk_queue_rq_alloc_time(q))
-		alloc_time_ns = ktime_get_ns();
-
 	/*
 	 * If the tag allocator sleeps we could get an allocation for a
 	 * different hardware context.  No need to complicate the low level
@@ -653,20 +653,10 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 		goto out_queue_exit;
 	data.ctx = __blk_mq_get_ctx(q, cpu);
 
-	if (q->elevator)
-		data.rq_flags |= RQF_SCHED_TAGS;
-	else
-		blk_mq_tag_busy(data.hctx);
-
-	if (flags & BLK_MQ_REQ_RESERVED)
-		data.rq_flags |= RQF_RESV;
-
 	ret = -EWOULDBLOCK;
-	tag = blk_mq_get_tag(&data);
-	if (tag == BLK_MQ_NO_TAG)
+	rq = __blk_mq_alloc_requests(&data);
+	if (!rq)
 		goto out_queue_exit;
-	rq = blk_mq_rq_ctx_init(&data, blk_mq_tags_from_data(&data), tag,
-			alloc_time_ns);
 	rq->__data_len = 0;
 	rq->__sector = (sector_t) -1;
 	rq->bio = rq->biotail = NULL;
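For the retry hunk above, here is a small hedged model (plain C, invented names, not kernel code) of the ctx handling the patch introduces in __blk_mq_alloc_requests(): a ctx preset by blk_mq_alloc_request_hctx() survives the first pass, the normal path picks one itself, and clearing data->ctx before `goto retry` forces a fresh pick on the next attempt.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative model of the "if (!data->ctx)" logic added by the patch. */
struct ctx {
        int cpu;
};

static struct ctx default_ctx = { .cpu = 0 };

/* Stand-in for blk_mq_get_ctx(): picks the "current" software queue. */
static struct ctx *get_ctx(void)
{
        return &default_ctx;
}

/* Returns the ctx that would be used for this allocation attempt: a
 * caller-preset ctx is kept, otherwise one is picked here. */
static struct ctx *alloc_attempt(struct ctx **data_ctx)
{
        if (!*data_ctx)
                *data_ctx = get_ctx();
        return *data_ctx;
}
```

A caller modelling the hctx-specific path presets `*data_ctx` and gets it back unchanged; the retry path sets `*data_ctx = NULL` first, so the next `alloc_attempt()` re-picks, mirroring `data->ctx = NULL; goto retry;` in the patch.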