From patchwork Tue Feb 7 02:10:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Demi Marie Obenour X-Patchwork-Id: 53596 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp2594560wrn; Mon, 6 Feb 2023 18:15:06 -0800 (PST) X-Google-Smtp-Source: AK7set+eAwg3EFAlcjfPy4USJf3KcH7ix83f3Y/jvhNf1qlH2LHPsc4403rH+/EzHjmj1XhShNNG X-Received: by 2002:a17:902:d0ca:b0:194:d78a:fc80 with SMTP id n10-20020a170902d0ca00b00194d78afc80mr830726pln.61.1675736105795; Mon, 06 Feb 2023 18:15:05 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1675736105; cv=none; d=google.com; s=arc-20160816; b=noMubh8WE41QTLYr2htZ9W+2TFf+3uqp//GHjZtw0CEI5PkURkNLb2QWQdRI9InwDZ qXjzcyZrsSXCeV5pTrxQFeNt6hePNBqjijIg8z/BG6r0//AdxRaEOuWG8AqbIOsG8Z9/ sDqJBaWlPwh1EMrSOpoIi9fjpM8j0mkeCnFmZAAv2o6LrbqcnpEEV0PoE+aI7jVIvFhW KGHizlj7anwfUMT5JkUN/sh2yZZPvxajSf1wpNaVCTPDOCMhgfeXq1rjym26daobUfZX I2C9X7LQMCXiIfBhIpuV40IxTgqDcS+kWKUZ33W4jb6PKZ/Tef5XXjLvwim+7zINO6A1 T3qg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:feedback-id:dkim-signature :dkim-signature; bh=0/QcbN0ctdlX8hZfS9z2Phx50pOCQWlpB7rg/xsi3T8=; b=jR1KjeNyF1WHVUaRagIqeQdU4Or8bgj4+S+5mEAWxiLvBW72xlHUNOKvlZeJ551FwA bBVniArVtE5JefVDXNJiVwbxvTwM9yJkNN573kQ01YrcO56PZz66y2lF2CDobHYLOrFw hN0ZmMmZRWLuX8fUeIoZIAefFAmiaCmVlMAfMMXA4KcqHjm7SkcvwkRbEj2EoQDYmS2o Q9gdwAo8cIN7uuvU1Fu5DZyDPxOc9jjEJ1X9UYjWnSoVSy2Eiirp6yO5BFKrQ7STwUoX 0KfBqnyIvEFI5iqWteFBX8TjBqq/cJ4+MquEHaR9hwAiY19IswvBtCmg0G12HZOVW1sE iOCw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@invisiblethingslab.com header.s=fm3 header.b=gKIMsGmT; dkim=pass header.i=@messagingengine.com header.s=fm3 header.b=hb2PbVPN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x16-20020a170902821000b001963669c7d4si11800525pln.532.2023.02.06.18.14.53; Mon, 06 Feb 2023 18:15:05 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@invisiblethingslab.com header.s=fm3 header.b=gKIMsGmT; dkim=pass header.i=@messagingengine.com header.s=fm3 header.b=hb2PbVPN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229628AbjBGCLA (ORCPT + 99 others); Mon, 6 Feb 2023 21:11:00 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56278 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229745AbjBGCK5 (ORCPT ); Mon, 6 Feb 2023 21:10:57 -0500 Received: from out2-smtp.messagingengine.com (out2-smtp.messagingengine.com [66.111.4.26]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C05C82CFDF; Mon, 6 Feb 2023 18:10:41 -0800 (PST) Received: from compute6.internal (compute6.nyi.internal [10.202.2.47]) by mailout.nyi.internal (Postfix) with ESMTP id 2898B5C0172; Mon, 6 Feb 2023 21:10:41 -0500 (EST) Received: from mailfrontend2 ([10.202.2.163]) by compute6.internal (MEProxy); Mon, 06 Feb 2023 21:10:41 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= invisiblethingslab.com; h=cc:cc:content-transfer-encoding:date :date:from:from:in-reply-to:message-id:mime-version:reply-to :sender:subject:subject:to:to; s=fm3; t=1675735841; x= 1675822241; bh=0/QcbN0ctdlX8hZfS9z2Phx50pOCQWlpB7rg/xsi3T8=; b=g KIMsGmT/w5H9y1dmBcwk3+EuiJVcncOMk+yfPyNXHgrIcp4CDmqk2RFQQy4RJwhB FVtgklsUkVPcUHpc063Zw7r4pKQCr0mDeKglx3SMNE19OXcbccjitm09hHq5aFYe QA0LPGu1N8xInLYh2QLzBj0CVo4eioeTLPVHXbnEElLcrJ1IFMH8Czc/IPmnZkrH cEoSnslw4s4aII+uGqT+wIuHnUhTO8rpJWqnac1eaydxzI67KGQ7sEmWryf0UznS cnCZgyikSbrkOOt71NO0EGTNZnniIs673RSLus88v8A6XnD1fS4ixVJgeRRybNdQ wMQiRa4k8DBuU4cEJN8Ow== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding:date:date :feedback-id:feedback-id:from:from:in-reply-to:message-id :mime-version:reply-to:sender:subject:subject:to:to:x-me-proxy :x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm3; t= 1675735841; x=1675822241; bh=0/QcbN0ctdlX8hZfS9z2Phx50pOCQWlpB7r g/xsi3T8=; b=hb2PbVPN3WfbPNLBQ9T4tGr5IrsBE7dbEnpTs8tq9aMNH8NkGVP JY+CW8PDPHWa76W7eElX52MQRkTtijle4Q7tOsCeHQLZgDdZQXu8R2ZHnkBh9pH5 GtkhocApSdYzy2ig8uylwpG41Kj+zTP/Ym7Hb2nvtzxvvyzOmIF/t1I3wesrCvSp GP4McVSIEJJaM44SuiOUdMKfZSmUjlr0WZ/8f98fO1JjmXHBbFdLTrKbuKXtBcKx cTaA/Ck7UoX6QmVeNlSjLycSJq2eYLJdgObIur+bzJCSGPU/dOdgsr7DjfJi1iFM 6mq2T+vgHNxmpAEzaNHXUl9Kq0lERwJN0fw== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgedvhedrudegjedggeefucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfqfgfvpdfurfetoffkrfgpnffqhgen uceurghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmne cujfgurhephffvvefufffkofgggfestdekredtredttdenucfhrhhomhepffgvmhhiucfo rghrihgvucfqsggvnhhouhhruceouggvmhhisehinhhvihhsihgslhgvthhhihhnghhslh grsgdrtghomheqnecuggftrfgrthhtvghrnhepvdefgeekvdekgfffgeekhfeijedtffek hefhleehfeejueetgfelgefgtdevieelnecuvehluhhsthgvrhfuihiivgeptdenucfrrg hrrghmpehmrghilhhfrhhomhepuggvmhhisehinhhvihhsihgslhgvthhhihhnghhslhgr sgdrtghomh X-ME-Proxy: Feedback-ID: iac594737:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 6 Feb 2023 21:10:39 -0500 (EST) From: Demi Marie Obenour To: Juergen Gross , Stefano Stabellini , Oleksandr Tyshchenko Cc: Demi Marie Obenour , =?utf-8?q?Marek_Marczy?= =?utf-8?q?kowski-G=C3=B3recki?= , xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org Subject: [PATCH] xen: speed up grant-table reclaim Date: Mon, 6 Feb 2023 21:10:33 -0500 Message-Id: <20230207021033.1797-1-demi@invisiblethingslab.com> X-Mailer: git-send-email 2.39.1 MIME-Version: 1.0 X-Spam-Status: No, score=-2.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_LOW, RCVD_IN_MSPIKE_H3,RCVD_IN_MSPIKE_WL,SPF_HELO_PASS,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1757134714501597549?= X-GMAIL-MSGID: =?utf-8?q?1757136662863428912?= When a grant entry is still in use by the remote domain, Linux must put it on a deferred list. Normally, this list is very short, because the PV network and block protocols expect the backend to unmap the grant first. However, Qubes OS's GUI protocol is subject to the constraints of the X Window System, and as such winds up with the frontend unmapping the window first. As a result, the list can grow very large, resulting in a massive memory leak and eventual VM freeze. Fix this problem by bumping the number of entries that the VM will attempt to free at each iteration to 10000. This is an ugly hack that may well make a denial of service easier, but for Qubes OS that is less bad than the problem Qubes OS users are facing today. There really needs to be a way for a frontend to be notified when the backend has unmapped the grants. Additionally, a module parameter is provided to allow tuning the reclaim speed. The code previously used printk(KERN_DEBUG) whenever it had to defer reclaiming a page because the grant was still mapped. This resulted in a large volume of log messages that bothered users. Use pr_debug instead, which suppresses the messages by default. Developers can enable them using the dynamic debug mechanism. Fixes: QubesOS/qubes-issues#7410 (memory leak) Fixes: QubesOS/qubes-issues#7359 (excessive logging) Fixes: 569ca5b3f94c ("xen/gnttab: add deferred freeing logic") Cc: stable@vger.kernel.org Signed-off-by: Demi Marie Obenour --- Anyone have suggestions for improving the grant mechanism? Argo isn't a good option, as in the GUI protocol there are substantial performance wins to be had by using true shared memory. Resending as I forgot the Signed-off-by on the first submission. Sorry about that. drivers/xen/grant-table.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/xen/grant-table.c b/drivers/xen/grant-table.c index 5c83d41..2c2faa7 100644 --- a/drivers/xen/grant-table.c +++ b/drivers/xen/grant-table.c @@ -355,14 +355,20 @@ static void gnttab_handle_deferred(struct timer_list *); static DEFINE_TIMER(deferred_timer, gnttab_handle_deferred); +static atomic64_t deferred_count; +static atomic64_t leaked_count; +static unsigned int free_per_iteration = 10000; + static void gnttab_handle_deferred(struct timer_list *unused) { - unsigned int nr = 10; + unsigned int nr = READ_ONCE(free_per_iteration); + const bool ignore_limit = nr == 0; struct deferred_entry *first = NULL; unsigned long flags; + size_t freed = 0; spin_lock_irqsave(&gnttab_list_lock, flags); - while (nr--) { + while ((ignore_limit || nr--) && !list_empty(&deferred_list)) { struct deferred_entry *entry = list_first_entry(&deferred_list, struct deferred_entry, list); @@ -372,10 +378,13 @@ list_del(&entry->list); spin_unlock_irqrestore(&gnttab_list_lock, flags); if (_gnttab_end_foreign_access_ref(entry->ref)) { + uint64_t ret = atomic64_sub_return(1, &deferred_count); put_free_entry(entry->ref); - pr_debug("freeing g.e. %#x (pfn %#lx)\n", - entry->ref, page_to_pfn(entry->page)); + pr_debug("freeing g.e. %#x (pfn %#lx), %llu remaining\n", + entry->ref, page_to_pfn(entry->page), + (unsigned long long)ret); put_page(entry->page); + freed++; kfree(entry); entry = NULL; } else { @@ -387,14 +396,15 @@ spin_lock_irqsave(&gnttab_list_lock, flags); if (entry) list_add_tail(&entry->list, &deferred_list); - else if (list_empty(&deferred_list)) - break; } - if (!list_empty(&deferred_list) && !timer_pending(&deferred_timer)) { + if (list_empty(&deferred_list)) + WARN_ON(atomic64_read(&deferred_count)); + else if (!timer_pending(&deferred_timer)) { deferred_timer.expires = jiffies + HZ; add_timer(&deferred_timer); } spin_unlock_irqrestore(&gnttab_list_lock, flags); + pr_debug("Freed %zu references", freed); } static void gnttab_add_deferred(grant_ref_t ref, struct page *page) @@ -402,7 +412,7 @@ { struct deferred_entry *entry; gfp_t gfp = (in_atomic() || irqs_disabled()) ? GFP_ATOMIC : GFP_KERNEL; - const char *what = KERN_WARNING "leaking"; + uint64_t leaked, deferred; entry = kmalloc(sizeof(*entry), gfp); if (!page) { @@ -426,12 +436,20 @@ add_timer(&deferred_timer); } spin_unlock_irqrestore(&gnttab_list_lock, flags); - what = KERN_DEBUG "deferring"; + deferred = atomic64_add_return(1, &deferred_count); + leaked = atomic64_read(&leaked_count); + pr_debug("deferring g.e. %#x (pfn %#lx) (total deferred %llu, total leaked %llu)\n", + ref, page ? page_to_pfn(page) : -1, deferred, leaked); + } else { + deferred = atomic64_read(&deferred_count); + leaked = atomic64_add_return(1, &leaked_count); + pr_warn("leaking g.e. %#x (pfn %#lx) (total deferred %llu, total leaked %llu)\n", + ref, page ? page_to_pfn(page) : -1, deferred, leaked); } - printk("%s g.e. %#x (pfn %#lx)\n", - what, ref, page ? page_to_pfn(page) : -1); } +module_param(free_per_iteration, uint, 0600); + int gnttab_try_end_foreign_access(grant_ref_t ref) { int ret = _gnttab_end_foreign_access_ref(ref);