From patchwork Fri Mar 24 13:58:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei <ming.lei@redhat.com>
X-Patchwork-Id: 74565
From: Ming Lei <ming.lei@redhat.com>
To: Jens Axboe, io-uring@vger.kernel.org, linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Miklos Szeredi, ZiyangZhang,
 Xiaoguang Wang, Bernd Schubert, Pavel Begunkov, Stefan Hajnoczi,
 Ming Lei <ming.lei@redhat.com>
Subject: [PATCH V4 13/17] block: ublk_drv: grab request reference when the
 request is handled by userspace
Date: Fri, 24 Mar 2023 21:58:04 +0800
Message-Id: <20230324135808.855245-14-ming.lei@redhat.com>
In-Reply-To: <20230324135808.855245-1-ming.lei@redhat.com>
References: <20230324135808.855245-1-ming.lei@redhat.com>
MIME-Version: 1.0
X-Mailing-List: linux-kernel@vger.kernel.org
Add a reference counter to the request pdu data, and hold this reference
for the request's whole lifetime. This is always safe: in theory the
ublk request won't be completed until its fused commands are done, but
fused commands come from userspace, and an application can submit one
at will.

This prepares for supporting zero copy, which needs to retrieve the
request buffer via fused command, so we have to guarantee:

- the fused command can't succeed unless the request is being handled
  by userspace (i.e. its reference can be grabbed)

- once any fused command succeeds, the request can't be freed until all
  fused commands on this request are done

Signed-off-by: Ming Lei <ming.lei@redhat.com>
---
 drivers/block/ublk_drv.c | 67 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 64 insertions(+), 3 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index 85ceb8c09d0e..88d5a657834d 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -43,6 +43,7 @@
 #include <asm/page.h>
 #include <linux/task_work.h>
 #include <linux/namei.h>
+#include <linux/kref.h>
 #include <uapi/linux/ublk_cmd.h>
 
 #define UBLK_MINORS		(1U << MINORBITS)
@@ -62,6 +63,17 @@
 struct ublk_rq_data {
 	struct llist_node node;
 	struct callback_head work;
+
+	/*
+	 * Only for applying fused command to support zero copy:
+	 *
+	 * - if there is any fused command aiming at this request, not complete
+	 *   request until all fused commands are done
+	 *
+	 * - fused command has to fail unless this reference is grabbed
+	 *   successfully
+	 */
+	struct kref ref;
 };
 
 struct ublk_uring_cmd_pdu {
@@ -180,6 +192,9 @@ struct ublk_params_header {
 	__u32	types;
 };
 
+static inline void __ublk_complete_rq(struct request *req);
+static void ublk_complete_rq(struct kref *ref);
+
 static dev_t ublk_chr_devt;
 static struct class *ublk_chr_class;
 
@@ -288,6 +303,35 @@ static int ublk_apply_params(struct ublk_device *ub)
 	return 0;
 }
 
+static inline bool ublk_support_zc(const struct ublk_queue *ubq)
+{
+	return ubq->flags & UBLK_F_SUPPORT_ZERO_COPY;
+}
+
+static inline bool ublk_get_req_ref(const struct ublk_queue *ubq,
+		struct request *req)
+{
+	if (ublk_support_zc(ubq)) {
+		struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
+
+		return kref_get_unless_zero(&data->ref);
+	}
+
+	return true;
+}
+
+static inline void ublk_put_req_ref(const struct ublk_queue *ubq,
+		struct request *req)
+{
+	if (ublk_support_zc(ubq)) {
+		struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
+
+		kref_put(&data->ref, ublk_complete_rq);
+	} else {
+		__ublk_complete_rq(req);
+	}
+}
+
 static inline bool ublk_can_use_task_work(const struct ublk_queue *ubq)
 {
 	if (IS_BUILTIN(CONFIG_BLK_DEV_UBLK) &&
@@ -632,13 +676,19 @@ static inline bool ubq_daemon_is_dying(struct ublk_queue *ubq)
 }
 
 /* todo: handle partial completion */
-static void ublk_complete_rq(struct request *req)
+static inline void __ublk_complete_rq(struct request *req)
 {
 	struct ublk_queue *ubq = req->mq_hctx->driver_data;
 	struct ublk_io *io = &ubq->ios[req->tag];
 	unsigned int unmapped_bytes;
 	blk_status_t res = BLK_STS_OK;
 
+	/* called from ublk_abort_queue() code path */
+	if (io->flags & UBLK_IO_FLAG_ABORTED) {
+		res = BLK_STS_IOERR;
+		goto exit;
+	}
+
 	/* failed read IO if nothing is read */
 	if (!io->res && req_op(req) == REQ_OP_READ)
 		io->res = -EIO;
@@ -678,6 +728,15 @@ static void ublk_complete_rq(struct request *req)
 	blk_mq_end_request(req, res);
 }
 
+static void ublk_complete_rq(struct kref *ref)
+{
+	struct ublk_rq_data *data = container_of(ref, struct ublk_rq_data,
+			ref);
+	struct request *req = blk_mq_rq_from_pdu(data);
+
+	__ublk_complete_rq(req);
+}
+
 /*
  * Since __ublk_rq_task_work always fails requests immediately during
  * exiting, __ublk_fail_req() is only called from abort context during
@@ -696,7 +755,7 @@ static void __ublk_fail_req(struct ublk_queue *ubq, struct ublk_io *io,
 		if (ublk_queue_can_use_recovery_reissue(ubq))
 			blk_mq_requeue_request(req, false);
 		else
-			blk_mq_end_request(req, BLK_STS_IOERR);
+			ublk_put_req_ref(ubq, req);
 	}
 }
 
@@ -732,6 +791,7 @@ static inline void __ublk_abort_rq(struct ublk_queue *ubq,
 static inline void __ublk_rq_task_work(struct request *req)
 {
 	struct ublk_queue *ubq = req->mq_hctx->driver_data;
+	struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
 	int tag = req->tag;
 	struct ublk_io *io = &ubq->ios[tag];
 	unsigned int mapped_bytes;
@@ -803,6 +863,7 @@ static inline void __ublk_rq_task_work(struct request *req)
 			mapped_bytes >> 9;
 	}
 
+	kref_init(&data->ref);
 	ubq_complete_io_cmd(io, UBLK_IO_RES_OK);
 }
 
@@ -1013,7 +1074,7 @@ static void ublk_commit_completion(struct ublk_device *ub,
 	req = blk_mq_tag_to_rq(ub->tag_set.tags[qid], tag);
 
 	if (req && likely(!blk_should_fake_timeout(req->q)))
-		ublk_complete_rq(req);
+		ublk_put_req_ref(ubq, req);
 }
 
 /*
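
To see the lifetime rule in isolation, here is a minimal userspace
sketch with C11 atomics standing in for the kernel's kref
(kref_get_unless_zero()/kref_put() behave like the grab/put below).
All names here (fake_request, fused_cmd_try_grab(), request_put()) are
hypothetical and only illustrate the pattern; they are not part of
ublk_drv.c:

/*
 * Minimal sketch of the reference scheme in the patch above.
 * fake_request->ref plays the role of ublk_rq_data->ref.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct fake_request {
	atomic_int ref;
};

/* like kref_init(): the driver holds one reference while the request is live */
static void request_start(struct fake_request *rq)
{
	atomic_store(&rq->ref, 1);
}

/* like kref_get_unless_zero(): a fused command may only pin a live request */
static bool fused_cmd_try_grab(struct fake_request *rq)
{
	int old = atomic_load(&rq->ref);

	while (old > 0) {
		/* on failure, 'old' is reloaded and the liveness check reruns */
		if (atomic_compare_exchange_weak(&rq->ref, &old, old + 1))
			return true;	/* grabbed: request can't be freed */
	}
	return false;			/* already completed: command must fail */
}

/* like kref_put(): whoever drops the last reference completes the request */
static void request_put(struct fake_request *rq)
{
	if (atomic_fetch_sub(&rq->ref, 1) == 1)
		printf("request completed\n");	/* __ublk_complete_rq() here */
}

int main(void)
{
	struct fake_request rq;

	request_start(&rq);

	if (fused_cmd_try_grab(&rq)) {	/* fused command pins the request */
		request_put(&rq);	/* driver's put: no completion yet */
		request_put(&rq);	/* fused command done: completes now */
	}

	if (!fused_cmd_try_grab(&rq))	/* late fused command must fail */
		printf("late fused command rejected\n");

	return 0;
}

This is why ublk_commit_completion() and __ublk_fail_req() switch to
ublk_put_req_ref() in the patch: with UBLK_F_SUPPORT_ZERO_COPY, the
driver's own completion merely drops its reference, and the request is
only really ended once the last fused command drops its reference.
Without the flag, ublk_get_req_ref() always succeeds and
ublk_put_req_ref() completes directly, so queues that don't use zero
copy pay no refcounting cost.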