From patchwork Thu Mar 2 13:06:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Max Kellermann
X-Patchwork-Id: 63422
From: Max Kellermann
To: xiubli@redhat.com, idryomov@gmail.com, jlayton@kernel.org,
 ceph-devel@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 Max Kellermann, stable@vger.kernel.org
Subject: [PATCH] fs/ceph/mds_client: ignore responses for waiting requests
Date: Thu, 2 Mar 2023 14:06:50 +0100
Message-Id: <20230302130650.2209938-1-max.kellermann@ionos.com>
X-Mailer: git-send-email 2.39.2
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

If a request is put on the waiting list, its submission is postponed
until the session becomes ready (e.g. via `mdsc->waiting_for_map` or
`session->s_waiting`). If a `CEPH_MSG_CLIENT_REPLY` happens to be
received before `CEPH_MSG_MDS_MAP`, the request gets freed, and then
this assertion fails:

 WARN_ON_ONCE(!list_empty(&req->r_wait));

This occurred on a server after the Ceph MDS connection failed, and a
corrupt reply packet was received for a waiting request:

 libceph: mds1 (1)10.0.0.10:6801 socket error on write
 libceph: mds1 (1)10.0.0.10:6801 session reset
 ceph: mds1 closed our session
 ceph: mds1 reconnect start
 ceph: mds1 reconnect success
 ceph: problem parsing mds trace -5
 ceph: mds parse_reply err -5
 ceph: mdsc_handle_reply got corrupt reply mds1(tid:5530222)
 [...]
 ------------[ cut here ]------------
 WARNING: CPU: 9 PID: 229180 at fs/ceph/mds_client.c:966 ceph_mdsc_release_request+0x17a/0x180
 Modules linked in:
 CPU: 9 PID: 229180 Comm: kworker/9:3 Not tainted 6.1.8-cm4all1 #45
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
 Workqueue: ceph-msgr ceph_con_workfn
 RIP: 0010:ceph_mdsc_release_request+0x17a/0x180
 Code: 39 d8 75 26 5b 48 89 ee 48 8b 3d f9 2d 04 02 5d e9 fb 01 c9 ff e8 56 85 ab ff eb 9c 48 8b 7f 58 e8 db 4d ff ff e9 a4 fe ff ff <0f> 0b eb d6 66 90 0f 1f 44 00 00 41 54 48 8d 86 b8 03 00 00 55 4c
 RSP: 0018:ffffa6f2c0e2bd20 EFLAGS: 00010287
 RAX: ffff8f58b93687f8 RBX: ffff8f592f6374a8 RCX: 0000000000000aed
 RDX: 0000000000000ac0 RSI: 0000000000000000 RDI: 0000000000000000
 RBP: ffff8f592f637148 R08: 0000000000000001 R09: ffffffffa901de00
 R10: 0000000000000001 R11: ffffd630ad09dfc8 R12: ffff8f58b9368000
 R13: ffff8f5806b33800 R14: ffff8f58894f6780 R15: 000000000054626e
 FS: 0000000000000000(0000) GS:ffff8f630f040000(0000) knlGS:0000000000000000
 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
 CR2: 00007ffc2926df68 CR3: 0000000218dce002 CR4: 00000000001706e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
 Call Trace:
  mds_dispatch+0xec5/0x1460
  ? inet_recvmsg+0x4d/0xf0
  ceph_con_process_message+0x6b/0x80
  ceph_con_v1_try_read+0xb92/0x1400
  ceph_con_workfn+0x383/0x4e0
  process_one_work+0x1da/0x370
  ? process_one_work+0x370/0x370
  worker_thread+0x4d/0x3c0
  ? process_one_work+0x370/0x370
  kthread+0xbb/0xe0
  ? kthread_complete_and_exit+0x20/0x20
  ret_from_fork+0x22/0x30
 ---[ end trace 0000000000000000 ]---
 ceph: mds1 caps renewed

If we know that a request has not yet been submitted, we should ignore
all responses for it, just like we ignore responses for unknown TIDs.
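
For illustration only, the idea of ignoring replies for requests that
are still parked on a waiting list can be modelled in plain userspace
C. This is a rough sketch, not the kernel code: `struct toy_request`,
`toy_handle_reply()` and `toy_demo()` are hypothetical names, and the
small list helpers only imitate the kernel's `struct list_head` API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, imitating the kernel's list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

/* A node that points to itself is not linked into any list. */
static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

static void list_add_tail(struct list_head *item, struct list_head *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

/* Unlink the node and reset it so list_empty() on it is true again. */
static void list_del_init(struct list_head *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
	init_list_head(item);
}

/* Hypothetical stand-in for a request; not struct ceph_mds_request. */
struct toy_request {
	unsigned long long tid;
	struct list_head wait;	/* linked while parked on a waiting list */
	bool completed;
};

/*
 * Hypothetical reply handler. Returns true if the reply was accepted,
 * false if it was ignored because the request was never submitted.
 */
static bool toy_handle_reply(struct toy_request *req)
{
	if (!list_empty(&req->wait))
		return false;	/* still waiting: ignore the reply */
	req->completed = true;
	return true;
}

/* Scenario: park the request, stray reply, submit, genuine reply. */
static void toy_demo(void)
{
	struct list_head waiting;
	struct toy_request req = { .tid = 1, .completed = false };

	init_list_head(&waiting);
	init_list_head(&req.wait);

	list_add_tail(&req.wait, &waiting);	/* session not ready yet */
	assert(!toy_handle_reply(&req));	/* stray reply is dropped */
	assert(!req.completed);

	list_del_init(&req.wait);		/* session ready: submitted */
	assert(toy_handle_reply(&req));		/* genuine reply is accepted */
	assert(req.completed);
	assert(list_empty(&waiting));
}
```

Calling `toy_demo()` from a test program runs the scenario. In this
model the request is never completed while it is still linked on the
waiting list, which is the invariant that the `WARN_ON_ONCE()` quoted
above checks at release time.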

To: ceph-devel@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Max Kellermann
---
 fs/ceph/mds_client.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 27a245d959c0..fa74fdb2cbfb 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -3275,6 +3275,13 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg)
 	}
 	dout("handle_reply %p\n", req);
 
+	/* waiting, not yet submitted? */
+	if (!list_empty(&req->r_wait)) {
+		pr_err("mdsc_handle_reply on waiting request tid %llu\n", tid);
+		mutex_unlock(&mdsc->mutex);
+		goto out;
+	}
+
 	/* correct session? */
 	if (req->r_session != session) {
 		pr_err("mdsc_handle_reply got %llu on session mds%d"