Message ID | 20230208200957.14073-1-djeffery@redhat.com |
---|---|
State | New |
Headers | From: David Jeffery <djeffery@redhat.com> To: target-devel@vger.kernel.org Cc: "Martin K . Petersen" <martin.petersen@oracle.com>, Mike Christie <michael.christie@oracle.com>, Maurizio Lombardi <mlombard@redhat.com>, Laurence Oberman <loberman@redhat.com>, linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org, David Jeffery <djeffery@redhat.com> Subject: [PATCH] scsi: target: iscsi: set memalloc_noio with loopback network connections Date: Wed, 8 Feb 2023 15:09:57 -0500 Message-Id: <20230208200957.14073-1-djeffery@redhat.com> |
Series | scsi: target: iscsi: set memalloc_noio with loopback network connections |
Commit Message
David Jeffery
Feb. 8, 2023, 8:09 p.m. UTC
If an admin connects an iscsi initiator to an iscsi target on the same
system, the iscsi connection is vulnerable to deadlocks during memory
allocations. Memory allocations in the target task accepting the I/O from
the initiator can wait on the initiator's I/O when the system is under
memory pressure, causing a deadlock situation between the iscsi target and
initiator.

When in this configuration, the deadlock scenario can be avoided by use of
GFP_NOIO allocations. Rather than force all configurations to use NOIO,
memalloc_noio_save/restore can be used to force GFP_NOIO allocations only
when in this loopback configuration.

Signed-off-by: David Jeffery <djeffery@redhat.com>
---
 drivers/target/iscsi/iscsi_target.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)
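
For readers unfamiliar with the scoped-NOIO idea the commit message relies on, the following is a minimal userspace model of it, not kernel code. Every `sim_`/`SIM_` name is hypothetical and the flag values are arbitrary; the model tracks only the IO and FS bits, so a masked allocation shows up as 0 here, whereas the real GFP_NOIO still allows some reclaim. It only illustrates that the save call returns the previous state, so scopes nest, and that allocations inside the scope lose the ability to start I/O.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins for kernel flags; values are arbitrary. */
#define SIM_PF_MEMALLOC_NOIO 0x1u
#define SIM_GFP_IO           0x1u
#define SIM_GFP_FS           0x2u
#define SIM_GFP_KERNEL       (SIM_GFP_IO | SIM_GFP_FS)

/* Models the per-task flags word of the calling thread. */
static unsigned int sim_task_flags;

/* Set the NOIO bit and return its previous state so scopes can nest. */
static unsigned int sim_memalloc_noio_save(void)
{
	unsigned int old = sim_task_flags & SIM_PF_MEMALLOC_NOIO;

	sim_task_flags |= SIM_PF_MEMALLOC_NOIO;
	return old;
}

/* Put the NOIO bit back to whatever it was before the matching save. */
static void sim_memalloc_noio_restore(unsigned int old)
{
	sim_task_flags = (sim_task_flags & ~SIM_PF_MEMALLOC_NOIO) | old;
}

/* Models how an allocation's effective mask loses IO/FS inside the scope. */
static unsigned int sim_effective_gfp(unsigned int gfp)
{
	if (sim_task_flags & SIM_PF_MEMALLOC_NOIO)
		gfp &= ~(SIM_GFP_IO | SIM_GFP_FS);
	return gfp;
}

/* Mirrors the shape of the patched rx thread: scope NOIO only on loopback. */
static unsigned int sim_rx_thread_gfp(bool loopback)
{
	unsigned int flags = 0;
	unsigned int seen;

	if (loopback)
		flags = sim_memalloc_noio_save();

	/* Stands in for an allocation made while receiving PDUs. */
	seen = sim_effective_gfp(SIM_GFP_KERNEL);

	if (loopback)
		sim_memalloc_noio_restore(flags);

	return seen;
}
```

Calling `sim_rx_thread_gfp(false)` leaves the mask untouched, while `sim_rx_thread_gfp(true)` strips the IO and FS bits only for the duration of the call, which is the "only when in this loopback configuration" behaviour described above.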
Comments
On Wed, 2023-02-08 at 15:09 -0500, David Jeffery wrote:
> If an admin connects an iscsi initiator to an iscsi target on the same
> system, the iscsi connection is vulnerable to deadlocks during memory
> allocations. Memory allocations in the target task accepting the I/O from
> the initiator can wait on the initiator's I/O when the system is under
> memory pressure, causing a deadlock situation between the iscsi target and
> initiator.
>
> When in this configuration, the deadlock scenario can be avoided by use of
> GFP_NOIO allocations. Rather than force all configurations to use NOIO,
> memalloc_noio_save/restore can be used to force GFP_NOIO allocations only
> when in this loopback configuration.
>
> Signed-off-by: David Jeffery <djeffery@redhat.com>
> ---
>  drivers/target/iscsi/iscsi_target.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
>
> diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> index baf4da7bb3b4..a68e47e2cdf9 100644
> --- a/drivers/target/iscsi/iscsi_target.c
> +++ b/drivers/target/iscsi/iscsi_target.c
> @@ -16,6 +16,7 @@
>  #include <linux/vmalloc.h>
>  #include <linux/idr.h>
>  #include <linux/delay.h>
> +#include <linux/sched/mm.h>
>  #include <linux/sched/signal.h>
>  #include <asm/unaligned.h>
>  #include <linux/inet.h>
> @@ -4168,7 +4169,10 @@ int iscsi_target_rx_thread(void *arg)
>  {
>  	int rc;
>  	struct iscsit_conn *conn = arg;
> +	struct dst_entry *dst;
>  	bool conn_freed = false;
> +	bool loopback = false;
> +	unsigned int flags;
>
>  	/*
>  	 * Allow ourselves to be interrupted by SIGINT so that a
> @@ -4186,8 +4190,25 @@ int iscsi_target_rx_thread(void *arg)
>  	if (!conn->conn_transport->iscsit_get_rx_pdu)
>  		return 0;
>
> +	/*
> +	 * If the iscsi connection is over a loopback device from using
> +	 * iscsi and iscsit on the same system, we need to set memalloc_noio to
> +	 * prevent memory allocation deadlocks between target and initiator.
> +	 */
> +	rcu_read_lock();
> +	dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
> +	if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
> +		loopback = true;
> +	rcu_read_unlock();
> +
> +	if (loopback)
> +		flags = memalloc_noio_save();
> +
>  	conn->conn_transport->iscsit_get_rx_pdu(conn);
>
> +	if (loopback)
> +		memalloc_noio_restore(flags);
> +
>  	if (!signal_pending(current))
>  		atomic_set(&conn->transport_failed, 1);
>  	iscsit_take_action_for_connection_exit(conn, &conn_freed);

I had mentioned to Mike that this was already tested at a large customer
and in our labs and resolved the deadlocks.

Regards
Laurence Oberman

On Wed, 2023-02-08 at 15:58 -0500, Laurence Oberman wrote:
> On Wed, 2023-02-08 at 15:09 -0500, David Jeffery wrote:
> > If an admin connects an iscsi initiator to an iscsi target on the same
> > system, the iscsi connection is vulnerable to deadlocks during memory
> > allocations. Memory allocations in the target task accepting the I/O from
> > the initiator can wait on the initiator's I/O when the system is under
> > memory pressure, causing a deadlock situation between the iscsi target and
> > initiator.
> >
> > When in this configuration, the deadlock scenario can be avoided by use of
> > GFP_NOIO allocations. Rather than force all configurations to use NOIO,
> > memalloc_noio_save/restore can be used to force GFP_NOIO allocations only
> > when in this loopback configuration.
> >
> > Signed-off-by: David Jeffery <djeffery@redhat.com>
> > ---
> >  drivers/target/iscsi/iscsi_target.c | 21 +++++++++++++++++++++
> >  1 file changed, 21 insertions(+)
> >
> > diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
> > index baf4da7bb3b4..a68e47e2cdf9 100644
> > --- a/drivers/target/iscsi/iscsi_target.c
> > +++ b/drivers/target/iscsi/iscsi_target.c
> > @@ -16,6 +16,7 @@
> >  #include <linux/vmalloc.h>
> >  #include <linux/idr.h>
> >  #include <linux/delay.h>
> > +#include <linux/sched/mm.h>
> >  #include <linux/sched/signal.h>
> >  #include <asm/unaligned.h>
> >  #include <linux/inet.h>
> > @@ -4168,7 +4169,10 @@ int iscsi_target_rx_thread(void *arg)
> >  {
> >  	int rc;
> >  	struct iscsit_conn *conn = arg;
> > +	struct dst_entry *dst;
> >  	bool conn_freed = false;
> > +	bool loopback = false;
> > +	unsigned int flags;
> >
> >  	/*
> >  	 * Allow ourselves to be interrupted by SIGINT so that a
> > @@ -4186,8 +4190,25 @@ int iscsi_target_rx_thread(void *arg)
> >  	if (!conn->conn_transport->iscsit_get_rx_pdu)
> >  		return 0;
> >
> > +	/*
> > +	 * If the iscsi connection is over a loopback device from using
> > +	 * iscsi and iscsit on the same system, we need to set memalloc_noio to
> > +	 * prevent memory allocation deadlocks between target and initiator.
> > +	 */
> > +	rcu_read_lock();
> > +	dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
> > +	if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
> > +		loopback = true;
> > +	rcu_read_unlock();
> > +
> > +	if (loopback)
> > +		flags = memalloc_noio_save();
> > +
> >  	conn->conn_transport->iscsit_get_rx_pdu(conn);
> >
> > +	if (loopback)
> > +		memalloc_noio_restore(flags);
> > +
> >  	if (!signal_pending(current))
> >  		atomic_set(&conn->transport_failed, 1);
> >  	iscsit_take_action_for_connection_exit(conn, &conn_freed);
>
> I had mentioned to Mike that this was already tested at a large
> customer and in our labs and resolved the deadlocks.
>
> Regards
> Laurence Oberman
>

Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>

I hate to nag here but we have a pressing customer issue and are keen to
get others to weigh in here.

Regards
Laurence

Thanks
Laurence

On Wed, 8 Feb 2023 at 21:10, David Jeffery <djeffery@redhat.com> wrote:
>
> +	/*
> +	 * If the iscsi connection is over a loopback device from using
> +	 * iscsi and iscsit on the same system, we need to set memalloc_noio to
> +	 * prevent memory allocation deadlocks between target and initiator.
> +	 */
> +	rcu_read_lock();
> +	dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
> +	if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
> +		loopback = true;
> +	rcu_read_unlock();

Hi Mike,
I tested it, it works. The customer also confirmed that it fixes the
deadlock on his setup.

Maurizio

On 2/13/23 5:59 AM, Maurizio Lombardi wrote:
> On Wed, 8 Feb 2023 at 21:10, David Jeffery <djeffery@redhat.com> wrote:
>>
>> +	/*
>> +	 * If the iscsi connection is over a loopback device from using
>> +	 * iscsi and iscsit on the same system, we need to set memalloc_noio to
>> +	 * prevent memory allocation deadlocks between target and initiator.
>> +	 */
>> +	rcu_read_lock();
>> +	dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
>> +	if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
>> +		loopback = true;
>> +	rcu_read_unlock();
>
> Hi Mike,
> I tested it, it works. The customer also confirmed that it fixes the
> deadlock on his setup.

You never responded about why/how it's used in production. Is it some sort
of clustering or container or what?

The login-related code can still swing back on you if it's run for a relogin.
It would happen if we overqueue and a nop times out because the iscsi recv
thread is waiting for backend resources like a request/queue slot, or if
management tools disable/enable the tpgt for reconfigs, etc.

On Mon, 2023-02-13 at 10:22 -0600, Mike Christie wrote:
> On 2/13/23 5:59 AM, Maurizio Lombardi wrote:
> > On Wed, 8 Feb 2023 at 21:10, David Jeffery <djeffery@redhat.com> wrote:
> > >
> > > +	/*
> > > +	 * If the iscsi connection is over a loopback device from using
> > > +	 * iscsi and iscsit on the same system, we need to set memalloc_noio to
> > > +	 * prevent memory allocation deadlocks between target and initiator.
> > > +	 */
> > > +	rcu_read_lock();
> > > +	dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
> > > +	if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
> > > +		loopback = true;
> > > +	rcu_read_unlock();
> >
> > Hi Mike,
> > I tested it, it works. The customer also confirmed that it fixes the
> > deadlock on his setup.
>
> You never responded about why/how it's used in production. Is it some sort
> of clustering or container or what?
>
> The login-related code can still swing back on you if it's run for a relogin.
> It would happen if we overqueue and a nop times out because the iscsi recv
> thread is waiting for backend resources like a request/queue slot, or if
> management tools disable/enable the tpgt for reconfigs, etc.
>

Hi Mike,

The use case described is as follows:

"This customer moved their on-premise system to the cloud. Their
on-premise system runs with two servers and one external storage and uses
data mirroring software to mirror data. When moving to the cloud, customer
wanted to implement a data mirror using data mirror software with two
instances to reduce the cost of using the cloud infrastructure. To build a
system with two instances, we use iSCSI to mirror data between a local disk
on one instance and a local disk on the other instance. We coexist iSCSI
initiator and target so that data mirroring software can access each disk
through a unified interface."

Thanks
Laurence

diff --git a/drivers/target/iscsi/iscsi_target.c b/drivers/target/iscsi/iscsi_target.c
index baf4da7bb3b4..a68e47e2cdf9 100644
--- a/drivers/target/iscsi/iscsi_target.c
+++ b/drivers/target/iscsi/iscsi_target.c
@@ -16,6 +16,7 @@
 #include <linux/vmalloc.h>
 #include <linux/idr.h>
 #include <linux/delay.h>
+#include <linux/sched/mm.h>
 #include <linux/sched/signal.h>
 #include <asm/unaligned.h>
 #include <linux/inet.h>
@@ -4168,7 +4169,10 @@ int iscsi_target_rx_thread(void *arg)
 {
 	int rc;
 	struct iscsit_conn *conn = arg;
+	struct dst_entry *dst;
 	bool conn_freed = false;
+	bool loopback = false;
+	unsigned int flags;
 
 	/*
 	 * Allow ourselves to be interrupted by SIGINT so that a
@@ -4186,8 +4190,25 @@ int iscsi_target_rx_thread(void *arg)
 	if (!conn->conn_transport->iscsit_get_rx_pdu)
 		return 0;
 
+	/*
+	 * If the iscsi connection is over a loopback device from using
+	 * iscsi and iscsit on the same system, we need to set memalloc_noio to
+	 * prevent memory allocation deadlocks between target and initiator.
+	 */
+	rcu_read_lock();
+	dst = rcu_dereference(conn->sock->sk->sk_dst_cache);
+	if (dst && dst->dev && dst->dev->flags & IFF_LOOPBACK)
+		loopback = true;
+	rcu_read_unlock();
+
+	if (loopback)
+		flags = memalloc_noio_save();
+
 	conn->conn_transport->iscsit_get_rx_pdu(conn);
 
+	if (loopback)
+		memalloc_noio_restore(flags);
+
 	if (!signal_pending(current))
 		atomic_set(&conn->transport_failed, 1);
 	iscsit_take_action_for_connection_exit(conn, &conn_freed);