Message ID | cover.1669036433.git.bcodding@redhat.com |
---|---|
Headers |
From: Benjamin Coddington <bcodding@redhat.com> To: netdev@vger.kernel.org Cc: linux-kernel@vger.kernel.org Subject: [PATCH v1 0/3] Stop corrupting socket's task_frag Date: Mon, 21 Nov 2022 08:35:16 -0500 Message-Id: <cover.1669036433.git.bcodding@redhat.com> |
Series |
Stop corrupting socket's task_frag
|
|
Message
Benjamin Coddington
Nov. 21, 2022, 1:35 p.m. UTC
The networking code uses flags in sk_allocation to determine if it can use
current->task_frag. However, in-kernel users of sockets may stop setting
sk_allocation when they convert to the preferred memalloc_nofs_save/restore,
as SUNRPC has done in commit a1231fda7e94 ("SUNRPC: Set memalloc_nofs_save()
on all rpciod/xprtiod jobs").

This will cause corruption in current->task_frag when recursing into the
network layer for those subsystems during page fault or reclaim. The
corruption is difficult to diagnose because stack traces may not contain the
offending subsystem at all. The corruption is unlikely to show up in
testing because it requires memory pressure, and so subsystems that
convert to memalloc_nofs_save/restore are likely to continue to run into
this issue.

Previous reports and proposed fixes:
https://lore.kernel.org/netdev/96a18bd00cbc6cb554603cc0d6ef1c551965b078.1663762494.git.gnault@redhat.com/
https://lore.kernel.org/netdev/b4d8cb09c913d3e34f853736f3f5628abfd7f4b6.1656699567.git.gnault@redhat.com/
https://lore.kernel.org/linux-nfs/de6d99321d1dcaa2ad456b92b3680aa77c07a747.1665401788.git.gnault@redhat.com/

Guillaume Nault has done all of the hard work tracking this problem down and
finding the best fix for this issue. I'm just taking a turn posting another
fix.

Benjamin Coddington (2):
  Treewide: Stop corrupting socket's task_frag
  net: simplify sk_page_frag

Guillaume Nault (1):
  net: Introduce sk_use_task_frag in struct sock.

 drivers/block/drbd/drbd_receiver.c |  3 +++
 drivers/block/nbd.c                |  1 +
 drivers/nvme/host/tcp.c            |  1 +
 drivers/scsi/iscsi_tcp.c           |  1 +
 drivers/usb/usbip/usbip_common.c   |  1 +
 fs/afs/rxrpc.c                     |  1 +
 fs/cifs/connect.c                  |  1 +
 fs/dlm/lowcomms.c                  |  2 ++
 fs/ocfs2/cluster/tcp.c             |  1 +
 include/net/sock.h                 | 10 ++++++----
 net/9p/trans_fd.c                  |  1 +
 net/ceph/messenger.c               |  1 +
 net/core/sock.c                    |  1 +
 net/sunrpc/xprtsock.c              |  3 +++
 14 files changed, 24 insertions(+), 4 deletions(-)
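The failure mode described in the cover letter may be easier to follow with a small userspace model. The sketch below is not the kernel code: every model_* name, the MODEL_GFP_* bit values, and the exact flag test are assumptions made for illustration. Only sk_allocation, sk_use_task_frag, task_frag and the sk_page_frag idea come from the series itself. It contrasts inferring the fragment from allocation flags with consulting an explicit per-socket field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag bits; the values are NOT the kernel's. */
#define MODEL_GFP_DIRECT_RECLAIM 0x1u
#define MODEL_GFP_FS             0x2u
#define MODEL_GFP_MEMALLOC       0x4u

/* Stand-in for struct page_frag. */
struct model_frag { size_t offset; };

/* Stand-in for the current task, which owns one shared fragment. */
struct model_task { struct model_frag task_frag; };

/* Stand-in for struct sock, reduced to the fields discussed here. */
struct model_sock {
    unsigned int sk_allocation;   /* allocation flags set by the socket user */
    bool sk_use_task_frag;        /* explicit opt-in proposed by the series */
    struct model_frag sk_frag;    /* per-socket fragment */
};

/*
 * Flag-based selection (assumed shape): the per-task fragment is used
 * whenever the socket's allocation flags look like a normal sleeping
 * allocation. A socket user that switches to a scoped NOFS context and
 * stops adjusting sk_allocation is misclassified as safe.
 */
static struct model_frag *frag_by_flags(struct model_sock *sk,
                                        struct model_task *cur)
{
    const unsigned int mask = MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_FS |
                              MODEL_GFP_MEMALLOC;

    assert(sk != NULL && cur != NULL);
    if ((sk->sk_allocation & mask) ==
        (MODEL_GFP_DIRECT_RECLAIM | MODEL_GFP_FS))
        return &cur->task_frag;
    return &sk->sk_frag;
}

/*
 * Field-based selection: the socket owner states directly whether the
 * per-task fragment may be used, independent of allocation flags.
 */
static struct model_frag *frag_by_field(struct model_sock *sk,
                                        struct model_task *cur)
{
    assert(sk != NULL && cur != NULL);
    return sk->sk_use_task_frag ? &cur->task_frag : &sk->sk_frag;
}
```

In this model, a socket whose owner left sk_allocation at its default while relying on a scoped NOFS context is handed the per-task fragment by the flag-based check, whereas an owner that clears the explicit field always gets the per-socket fragment.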
Comments
Hi Dave, Eric, Jakub, Paolo,

I think it makes sense for all three of these to go together through netdev.
If you agree, would you like me to chase down individual ACKs for each
treewide touch? What can I do from netdev's perspective to move this
forward?

Ben

On 21 Nov 2022, at 8:35, Benjamin Coddington wrote:

> The networking code uses flags in sk_allocation to determine if it can use
> current->task_frag. However, in-kernel users of sockets may stop setting
> sk_allocation when they convert to the preferred memalloc_nofs_save/restore,
> as SUNRPC has done in commit a1231fda7e94 ("SUNRPC: Set memalloc_nofs_save()
> on all rpciod/xprtiod jobs").
>
> This will cause corruption in current->task_frag when recursing into the
> network layer for those subsystems during page fault or reclaim. The
> corruption is difficult to diagnose because stack traces may not contain the
> offending subsystem at all. The corruption is unlikely to show up in
> testing because it requires memory pressure, and so subsystems that
> convert to memalloc_nofs_save/restore are likely to continue to run into
> this issue.
>
> Previous reports and proposed fixes:
> https://lore.kernel.org/netdev/96a18bd00cbc6cb554603cc0d6ef1c551965b078.1663762494.git.gnault@redhat.com/
> https://lore.kernel.org/netdev/b4d8cb09c913d3e34f853736f3f5628abfd7f4b6.1656699567.git.gnault@redhat.com/
> https://lore.kernel.org/linux-nfs/de6d99321d1dcaa2ad456b92b3680aa77c07a747.1665401788.git.gnault@redhat.com/
>
> Guillaume Nault has done all of the hard work tracking this problem down and
> finding the best fix for this issue. I'm just taking a turn posting another
> fix.
>
> Benjamin Coddington (2):
>   Treewide: Stop corrupting socket's task_frag
>   net: simplify sk_page_frag
>
> Guillaume Nault (1):
>   net: Introduce sk_use_task_frag in struct sock.
>
>  drivers/block/drbd/drbd_receiver.c |  3 +++
>  drivers/block/nbd.c                |  1 +
>  drivers/nvme/host/tcp.c            |  1 +
>  drivers/scsi/iscsi_tcp.c           |  1 +
>  drivers/usb/usbip/usbip_common.c   |  1 +
>  fs/afs/rxrpc.c                     |  1 +
>  fs/cifs/connect.c                  |  1 +
>  fs/dlm/lowcomms.c                  |  2 ++
>  fs/ocfs2/cluster/tcp.c             |  1 +
>  include/net/sock.h                 | 10 ++++++----
>  net/9p/trans_fd.c                  |  1 +
>  net/ceph/messenger.c               |  1 +
>  net/core/sock.c                    |  1 +
>  net/sunrpc/xprtsock.c              |  3 +++
>  14 files changed, 24 insertions(+), 4 deletions(-)
>
> --
> 2.31.1