Message ID | 20230704005144.1172-1-lihongweizz@inspur.com |
---|---|
State | New |
Headers |
From: lihongweizz <lihongweizz@inspur.com>
To: <sagi@grimberg.me>, <mgurtovoy@nvidia.com>, <jgg@ziepe.ca>, <leon@kernel.org>
CC: <linux-rdma@vger.kernel.org>, <linux-kernel@vger.kernel.org>, Rock Li <lihongweizz@inspur.com>
Subject: [PATCH] IB/iser: Protect tasks cleanup in case iser connection was stopped
Date: Tue, 4 Jul 2023 08:51:44 +0800
Message-ID: <20230704005144.1172-1-lihongweizz@inspur.com>
X-Mailer: git-send-email 2.40.1.windows.1
List-ID: <linux-kernel.vger.kernel.org> |
Series |
IB/iser: Protect tasks cleanup in case iser connection was stopped
|
|
Commit Message
lihongweizz
July 4, 2023, 12:51 a.m. UTC
From: Rock Li <lihongweizz@inspur.com>

We hit the following crash:

    ...
    #7 [ff61b991f6f63d10] page_fault at ffffffffab80111e
       [exception RIP: iscsi_iser_cleanup_task+13]
       RIP: ffffffffc046c04d RSP: ff61b991f6f63dc0 RFLAGS: 00010246
       RAX: 0000000000000000 RBX: ff4bd0aalf7a5610 RCX: ff61b991f6f63dc8
       RDX: ff61b991f6f63d68 RSI: ff61b991f6f63d58 RDI: ff4bd0aalf6cdc00
       RBP: 0000000000000005 R8: 0000000000000073 R9: 0000000000000005
       R10: 0000000000000000 R11: 00000ccde3e0f5c0 R12: ff4bd08c0e0631f8
       R13: ff4bd0a95ffd3c78 R14: ff4bd0a95ffd3c78 R15: ff4bd0aalf6cdc00
       ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
    #8 [ff616991f6f63dc0] __iscsi_put_task at ffffffffc0bd3652 [libiscsi]
    #9 [ff61b991f6f63e00] iscsi_put_task at ffffffffc0bd36e9 [libiscsi]
    ...

Analysis of the vmcore shows that the iser connection had already been
stopped before the abort handler ran, so the iser_conn had already been
unbound and released. Add a validity check on the iser connection inside
the cleanup task to handle this corner case.

Signed-off-by: Rock Li <lihongweizz@inspur.com>
---
 drivers/infiniband/ulp/iser/iscsi_iser.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)
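The guard described in the commit message can be modelled in userspace. The sketch below is only an illustration: the `demo_*` structures and `guard_cleanup_task()` are hypothetical stand-ins, not the kernel definitions, and it assumes the stopped connection shows up as a NULL `dd_data` pointer, which is what the proposed check tests for.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver structures (not kernel code). */
struct demo_device { int id; };
struct demo_iser_conn { struct demo_device *device; };
struct demo_conn { struct demo_iser_conn *dd_data; };

/*
 * Returns 1 when cleanup work would proceed, 0 when it is skipped
 * because the connection or the device has already gone away.
 */
static int guard_cleanup_task(const struct demo_conn *conn)
{
	const struct demo_iser_conn *iser_conn = conn->dd_data;
	const struct demo_device *device;

	/* Assumed failure mode: dd_data is NULL after the connection stops. */
	if (!iser_conn)
		return 0;

	/* Dereference only after the pointer has been checked. */
	device = iser_conn->device;
	if (!device)
		return 0;

	return 1;
}
```

Calling `guard_cleanup_task()` on a connection whose `dd_data` is NULL returns 0 instead of dereferencing the pointer. The check only helps if the pointer is actually cleared on teardown; it says nothing about memory that is freed while the pointer still looks valid.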
Comments
On Tue, Jul 04, 2023 at 08:51:44AM +0800, lihongweizz wrote:
> From: Rock Li <lihongweizz@inspur.com>
>
> We met a crash issue as below:
> ...
> #7 [ff61b991f6f63d10] page_fault at ffffffffab80111e
> [exception RIP: iscsi_iser_cleanup_task+13]
> RIP: ffffffffc046c04d RSP: ff61b991f6f63dc0 RFLAGS: 00010246
> RAX: 0000000000000000 RBX: ff4bd0aalf7a5610 RCX: ff61b991f6f63dc8
> RDX: ff61b991f6f63d68 RSI: ff61b991f6f63d58 RDI: ff4bd0aalf6cdc00
> RBP: 0000000000000005 R8: 0000000000000073 R9: 0000000000000005
> R10: 0000000000000000 R11: 00000ccde3e0f5c0 R12: ff4bd08c0e0631f8
> R13: ff4bd0a95ffd3c78 R14: ff4bd0a95ffd3c78 R15: ff4bd0aalf6cdc00
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #8 [ff616991f6f63dc0] __iscsi_put_task at ffffffffc0bd3652 [libiscsi]
> #9 [ff61b991f6f63e00] iscsi_put_task at ffffffffc0bd36e9 [libiscsi]
> ...
>
> After analysing the vmcore, we find that the iser connection was already
> stopped before abort handler running. The iser_conn is already unbindded
> and released. So we add iser connection validation check inside cleanup
> task to fix this corner case.
>
> Signed-off-by: Rock Li <lihongweizz@inspur.com>
> ---
>  drivers/infiniband/ulp/iser/iscsi_iser.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
> index bb9aaff92ca3..35dfbf41fc40 100644
> --- a/drivers/infiniband/ulp/iser/iscsi_iser.c
> +++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
> @@ -366,7 +366,12 @@ static void iscsi_iser_cleanup_task(struct iscsi_task *task)
>  	struct iscsi_iser_task *iser_task = task->dd_data;
>  	struct iser_tx_desc *tx_desc = &iser_task->desc;
>  	struct iser_conn *iser_conn = task->conn->dd_data;
> -	struct iser_device *device = iser_conn->ib_conn.device;
> +	struct iser_device *device;
> +
> +	/* stop connection might happens before iser cleanup work */
> +	if (!iser_conn)
> +		return;

And what prevents iser_conn from becoming invalid here?
For example, in this flow:
 1. Start iscsi_iser_cleanup_task
 2. Get a valid task->conn->dd_data
 3. Pass this if (..) check
 4. Context switch and release of the connection
 5. iser_conn now points to released memory.

Thanks

> +	device = iser_conn->ib_conn.device;
>
>  	/* DEVICE_REMOVAL event might have already released the device */
>  	if (!device)
> --
> 2.27.0
>
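The flow in the review comment is a check-then-use race: a NULL test alone does not stop the connection from being released between the test and the dereference. One generic way to close such a window is to make the check and the use a single critical section that teardown also takes. The sketch below is a userspace illustration under that assumption; the `demo_conn` type, its `lock` field, and the `locked_*` helpers are hypothetical, and the thread does not say whether a lock, a reference count, or some other mechanism is the right fix for iSER.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a connection with driver-private data. */
struct demo_iser_conn { int device_id; };

struct demo_conn {
	pthread_mutex_t lock;           /* protects dd_data */
	struct demo_iser_conn *dd_data; /* NULL once the connection is stopped */
};

static void locked_conn_init(struct demo_conn *conn, int device_id)
{
	pthread_mutex_init(&conn->lock, NULL);
	conn->dd_data = malloc(sizeof(*conn->dd_data));
	if (conn->dd_data)
		conn->dd_data->device_id = device_id;
}

/* Teardown frees and clears the pointer while holding the lock. */
static void locked_conn_stop(struct demo_conn *conn)
{
	pthread_mutex_lock(&conn->lock);
	free(conn->dd_data);
	conn->dd_data = NULL;
	pthread_mutex_unlock(&conn->lock);
}

/*
 * Check and dereference happen under the same lock, so a concurrent
 * locked_conn_stop() cannot run between them (steps 3 to 5 of the flow).
 * Returns the device id, or -1 if the connection was already stopped.
 */
static int locked_cleanup_task(struct demo_conn *conn)
{
	int id = -1;

	pthread_mutex_lock(&conn->lock);
	if (conn->dd_data)
		id = conn->dd_data->device_id;
	pthread_mutex_unlock(&conn->lock);

	return id;
}
```

After `locked_conn_init(&conn, 7)`, `locked_cleanup_task(&conn)` returns 7; once `locked_conn_stop(&conn)` has run it returns -1. The design point is that the teardown path and the cleanup path agree on one synchronization rule; the unlocked `if (!iser_conn)` test in the patch has no such partner on the teardown side.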
diff --git a/drivers/infiniband/ulp/iser/iscsi_iser.c b/drivers/infiniband/ulp/iser/iscsi_iser.c
index bb9aaff92ca3..35dfbf41fc40 100644
--- a/drivers/infiniband/ulp/iser/iscsi_iser.c
+++ b/drivers/infiniband/ulp/iser/iscsi_iser.c
@@ -366,7 +366,12 @@ static void iscsi_iser_cleanup_task(struct iscsi_task *task)
 	struct iscsi_iser_task *iser_task = task->dd_data;
 	struct iser_tx_desc *tx_desc = &iser_task->desc;
 	struct iser_conn *iser_conn = task->conn->dd_data;
-	struct iser_device *device = iser_conn->ib_conn.device;
+	struct iser_device *device;
+
+	/* stop connection might happens before iser cleanup work */
+	if (!iser_conn)
+		return;
+	device = iser_conn->ib_conn.device;
 
 	/* DEVICE_REMOVAL event might have already released the device */
 	if (!device)