Message ID | 20231016132819.1002933-17-michael.roth@amd.com |
---|---|
State | New |
Series | Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support |
Commit Message
Michael Roth
Oct. 16, 2023, 1:27 p.m. UTC
From: Ashish Kalra <ashish.kalra@amd.com>

Pages are unsafe to be released back to the page-allocator, if they
have been transitioned to firmware/guest state and can't be reclaimed
or transitioned back to hypervisor/shared state. In this case add them
to an internal leaked pages list to ensure that they are not freed or
touched/accessed to cause fatal page faults.

Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
[mdr: relocate to arch/x86/coco/sev/host.c]
Signed-off-by: Michael Roth <michael.roth@amd.com>
---
 arch/x86/include/asm/sev-host.h |  3 +++
 arch/x86/virt/svm/sev.c         | 28 ++++++++++++++++++++++++++++
 2 files changed, 31 insertions(+)
Comments
On Mon, Oct 16, 2023 at 08:27:45AM -0500, Michael Roth wrote:
> +	spin_lock(&snp_leaked_pages_list_lock);
> +	while (npages--) {
> +		/*
> +		 * Reuse the page's buddy list for chaining into the leaked
> +		 * pages list. This page should not be on a free list currently
> +		 * and is also unsafe to be added to a free list.
> +		 */
> +		list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
> +		sev_dump_rmpentry(pfn);
> +		pfn++;
> +	}
> +	spin_unlock(&snp_leaked_pages_list_lock);
> +	atomic_long_inc(&snp_nr_leaked_pages);

How is this supposed to count?

You're leaking @npages as the function's parameter but are incrementing
snp_nr_leaked_pages only once?

Just make it a bog-normal unsigned long and increment it inside the
locked section.

Or do at the beginning of the function:

	atomic_long_add(npages, &snp_nr_leaked_pages);
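A minimal sketch of the two accounting alternatives suggested above, assuming the rest of snp_leak_pages() stays as posted; only the counter handling changes:

	/* Alternative 1: count the whole range up front with a single atomic add. */
	atomic_long_add(npages, &snp_nr_leaked_pages);

	/* Alternative 2: a plain counter, bumped inside the already-locked section. */
	spin_lock(&snp_leaked_pages_list_lock);
	snp_nr_leaked_pages += npages;	/* now a plain unsigned long, protected by the list lock */
	/* ... chain the pages as before ... */
	spin_unlock(&snp_leaked_pages_list_lock);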
On 10/16/23 15:27, Michael Roth wrote:
> From: Ashish Kalra <ashish.kalra@amd.com>
>
> Pages are unsafe to be released back to the page-allocator, if they
> have been transitioned to firmware/guest state and can't be reclaimed
> or transitioned back to hypervisor/shared state. In this case add
> them to an internal leaked pages list to ensure that they are not freed

Note the adding to the list doesn't ensure anything like that. Not
dropping the refcount to zero does. But tracking them might indeed not
be bad for e.g. crashdump investigations so no objection there.

> or touched/accessed to cause fatal page faults.
>
> Signed-off-by: Ashish Kalra <ashish.kalra@amd.com>
> [mdr: relocate to arch/x86/coco/sev/host.c]
> Signed-off-by: Michael Roth <michael.roth@amd.com>
> ---
>  arch/x86/include/asm/sev-host.h |  3 +++
>  arch/x86/virt/svm/sev.c         | 28 ++++++++++++++++++++++++++++
>  2 files changed, 31 insertions(+)
>
> diff --git a/arch/x86/include/asm/sev-host.h b/arch/x86/include/asm/sev-host.h
> index 1df989411334..7490a665e78f 100644
> --- a/arch/x86/include/asm/sev-host.h
> +++ b/arch/x86/include/asm/sev-host.h
> @@ -19,6 +19,8 @@ void sev_dump_hva_rmpentry(unsigned long address);
>  int psmash(u64 pfn);
>  int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid, bool immutable);
>  int rmp_make_shared(u64 pfn, enum pg_level level);
> +void snp_leak_pages(u64 pfn, unsigned int npages);
> +
>  #else
>  static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENXIO; }
>  static inline void sev_dump_hva_rmpentry(unsigned long address) {}
> @@ -29,6 +31,7 @@ static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int as
>  	return -ENXIO;
>  }
>  static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENXIO; }
> +static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
>  #endif
>
>  #endif
> diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
> index bf9b97046e05..29a69f4b8cfb 100644
> --- a/arch/x86/virt/svm/sev.c
> +++ b/arch/x86/virt/svm/sev.c
> @@ -59,6 +59,12 @@ struct rmpentry {
>  static struct rmpentry *rmptable_start __ro_after_init;
>  static u64 rmptable_max_pfn __ro_after_init;
>
> +/* list of pages which are leaked and cannot be reclaimed */
> +static LIST_HEAD(snp_leaked_pages_list);
> +static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
> +
> +static atomic_long_t snp_nr_leaked_pages = ATOMIC_LONG_INIT(0);
> +
>  #undef pr_fmt
>  #define pr_fmt(fmt) "SEV-SNP: " fmt
>
> @@ -518,3 +524,25 @@ int rmp_make_shared(u64 pfn, enum pg_level level)
>  	return rmpupdate(pfn, &val);
>  }
>  EXPORT_SYMBOL_GPL(rmp_make_shared);
> +
> +void snp_leak_pages(u64 pfn, unsigned int npages)
> +{
> +	struct page *page = pfn_to_page(pfn);
> +
> +	pr_debug("%s: leaking PFN range 0x%llx-0x%llx\n", __func__, pfn, pfn + npages);
> +
> +	spin_lock(&snp_leaked_pages_list_lock);
> +	while (npages--) {
> +		/*
> +		 * Reuse the page's buddy list for chaining into the leaked
> +		 * pages list. This page should not be on a free list currently
> +		 * and is also unsafe to be added to a free list.
> +		 */
> +		list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
> +		sev_dump_rmpentry(pfn);
> +		pfn++;

You increment pfn, but not page, which is always pointing to the page of
the initial pfn, so need to do page++ too.

But that assumes it's all order-0 pages (hard to tell for me whether
that's true as we start with a pfn), if there can be compound pages, it
would be best to only add the head page and skip the tail pages - it's
not expected to use page->buddy_list of tail pages.

> +	}
> +	spin_unlock(&snp_leaked_pages_list_lock);
> +	atomic_long_inc(&snp_nr_leaked_pages);
> +}
> +EXPORT_SYMBOL_GPL(snp_leak_pages);
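A minimal sketch of the loop with the page/pfn fix pointed out above, still assuming only order-0 pages (the compound-page question is taken up further down the thread); the added page++ reflects the review suggestion, not code from the posted patch:

	spin_lock(&snp_leaked_pages_list_lock);
	while (npages--) {
		/* Chain this page into the leaked-pages list via its buddy_list. */
		list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
		sev_dump_rmpentry(pfn);
		pfn++;
		page++;		/* keep the struct page pointer in step with the PFN */
	}
	spin_unlock(&snp_leaked_pages_list_lock);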
On 12/6/2023 2:42 PM, Borislav Petkov wrote:
> On Mon, Oct 16, 2023 at 08:27:45AM -0500, Michael Roth wrote:
>> +	spin_lock(&snp_leaked_pages_list_lock);
>> +	while (npages--) {
>> +		/*
>> +		 * Reuse the page's buddy list for chaining into the leaked
>> +		 * pages list. This page should not be on a free list currently
>> +		 * and is also unsafe to be added to a free list.
>> +		 */
>> +		list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
>> +		sev_dump_rmpentry(pfn);
>> +		pfn++;
>> +	}
>> +	spin_unlock(&snp_leaked_pages_list_lock);
>> +	atomic_long_inc(&snp_nr_leaked_pages);
>
> How is this supposed to count?
>
> You're leaking @npages as the function's parameter but are incrementing
> snp_nr_leaked_pages only once?
>
> Just make it a bog-normal unsigned long and increment it inside the
> locked section.
>
> Or do at the beginning of the function:
>
> 	atomic_long_add(npages, &snp_nr_leaked_pages);

Yes will fix accordingly by incrementing it inside the locked section.

Thanks,
Ashish
Hello Vlastimil,

On 12/7/2023 10:20 AM, Vlastimil Babka wrote:
>> +
>> +void snp_leak_pages(u64 pfn, unsigned int npages)
>> +{
>> +	struct page *page = pfn_to_page(pfn);
>> +
>> +	pr_debug("%s: leaking PFN range 0x%llx-0x%llx\n", __func__, pfn, pfn + npages);
>> +
>> +	spin_lock(&snp_leaked_pages_list_lock);
>> +	while (npages--) {
>> +		/*
>> +		 * Reuse the page's buddy list for chaining into the leaked
>> +		 * pages list. This page should not be on a free list currently
>> +		 * and is also unsafe to be added to a free list.
>> +		 */
>> +		list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
>> +		sev_dump_rmpentry(pfn);
>> +		pfn++;
>
> You increment pfn, but not page, which is always pointing to the page of
> the initial pfn, so need to do page++ too.

Yes, that is a bug and needs to be fixed.

> But that assumes it's all order-0 pages (hard to tell for me whether
> that's true as we start with a pfn), if there can be compound pages, it
> would be best to only add the head page and skip the tail pages - it's
> not expected to use page->buddy_list of tail pages.

Can't we use PageCompound() to check if the page is a compound page and
then use page->compound_head to get and add the head page to leaked
pages list. I understand the tail pages for compound pages are really
limited for usage.

Thanks,
Ashish
On 12/8/23 23:10, Kalra, Ashish wrote:
> Hello Vlastimil,
>
> On 12/7/2023 10:20 AM, Vlastimil Babka wrote:
>
>>> +
>>> +void snp_leak_pages(u64 pfn, unsigned int npages)
>>> +{
>>> +	struct page *page = pfn_to_page(pfn);
>>> +
>>> +	pr_debug("%s: leaking PFN range 0x%llx-0x%llx\n", __func__, pfn, pfn + npages);
>>> +
>>> +	spin_lock(&snp_leaked_pages_list_lock);
>>> +	while (npages--) {
>>> +		/*
>>> +		 * Reuse the page's buddy list for chaining into the leaked
>>> +		 * pages list. This page should not be on a free list currently
>>> +		 * and is also unsafe to be added to a free list.
>>> +		 */
>>> +		list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
>>> +		sev_dump_rmpentry(pfn);
>>> +		pfn++;
>>
>> You increment pfn, but not page, which is always pointing to the page of
>> the initial pfn, so need to do page++ too.
>
> Yes, that is a bug and needs to be fixed.
>
>> But that assumes it's all order-0 pages (hard to tell for me whether
>> that's true as we start with a pfn), if there can be compound pages, it
>> would be best to only add the head page and skip the tail pages - it's
>> not expected to use page->buddy_list of tail pages.
>
> Can't we use PageCompound() to check if the page is a compound page and
> then use page->compound_head to get and add the head page to leaked
> pages list. I understand the tail pages for compound pages are really
> limited for usage.

Yeah that should work. Need to be careful though, should probably only
process head pages and check if the whole compound_order() is within the
range we are to leak, and then leak the head page and advance the loop
by compound_order(). And if we encounter a tail page, it should probably
be just skipped. I'm looking at snp_reclaim_pages() which seems to
process a number of pages with SEV_CMD_SNP_PAGE_RECLAIM and once any
fails, call snp_leak_pages() on the rest. Could that invoke
snp_leak_pages with the first pfn being a tail page?

> Thanks,
> Ashish
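A rough sketch of the loop shape Vlastimil outlines here, leaking compound pages via their head page only and skipping tail pages; this is an illustration of the review suggestion under those assumptions, not the code that was eventually merged:

	spin_lock(&snp_leaked_pages_list_lock);
	while (npages) {
		struct page *page = pfn_to_page(pfn);
		unsigned int n = 1;

		if (PageHead(page) && (1u << compound_order(page)) <= npages) {
			/* Whole compound page is in range: leak it via the head page only. */
			n = 1u << compound_order(page);
			list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
		} else if (!PageTail(page)) {
			/* Order-0 page: chain it directly. */
			list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
		}
		/* Tail pages are skipped: their page->buddy_list must not be reused. */

		sev_dump_rmpentry(pfn);
		pfn += n;
		npages -= n;
	}
	spin_unlock(&snp_leaked_pages_list_lock);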
Hello Vlastimil,

On 12/11/2023 7:08 AM, Vlastimil Babka wrote:
> On 12/8/23 23:10, Kalra, Ashish wrote:
>> Hello Vlastimil,
>>
>> On 12/7/2023 10:20 AM, Vlastimil Babka wrote:
>>
>>>> +
>>>> +void snp_leak_pages(u64 pfn, unsigned int npages)
>>>> +{
>>>> +	struct page *page = pfn_to_page(pfn);
>>>> +
>>>> +	pr_debug("%s: leaking PFN range 0x%llx-0x%llx\n", __func__, pfn, pfn + npages);
>>>> +
>>>> +	spin_lock(&snp_leaked_pages_list_lock);
>>>> +	while (npages--) {
>>>> +		/*
>>>> +		 * Reuse the page's buddy list for chaining into the leaked
>>>> +		 * pages list. This page should not be on a free list currently
>>>> +		 * and is also unsafe to be added to a free list.
>>>> +		 */
>>>> +		list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
>>>> +		sev_dump_rmpentry(pfn);
>>>> +		pfn++;
>>>
>>> You increment pfn, but not page, which is always pointing to the page of
>>> the initial pfn, so need to do page++ too.
>>
>> Yes, that is a bug and needs to be fixed.
>>
>>> But that assumes it's all order-0 pages (hard to tell for me whether
>>> that's true as we start with a pfn), if there can be compound pages, it
>>> would be best to only add the head page and skip the tail pages - it's
>>> not expected to use page->buddy_list of tail pages.
>>
>> Can't we use PageCompound() to check if the page is a compound page and
>> then use page->compound_head to get and add the head page to leaked
>> pages list. I understand the tail pages for compound pages are really
>> limited for usage.
>
> Yeah that should work. Need to be careful though, should probably only
> process head pages and check if the whole compound_order() is within the
> range we are to leak, and then leak the head page and advance the loop
> by compound_order(). And if we encounter a tail page, it should probably
> be just skipped. I'm looking at snp_reclaim_pages() which seems to
> process a number of pages with SEV_CMD_SNP_PAGE_RECLAIM and once any
> fails, call snp_leak_pages() on the rest. Could that invoke
> snp_leak_pages with the first pfn being a tail page?

Yes i don't think we can assume that the first pfn will not be a tail
page. But then this becomes complex as we might have already reclaimed
the head page and one or more tail pages successfully or probably never
transitioned head page to FW state as alloc_page()/alloc_pages() would
have returned subpage(s) of a largepage.

But then we really can't use the buddy_list of a tail page to insert it
in the snp leaked pages list, right? These non-reclaimed pages are not
usable anymore anyway, any access to them will cause fatal RMP #PF, so
don't know if i can use the buddy_list to insert tail pages as that will
corrupt the page metadata?

We initially used to invoke memory_failure() here to try to gracefully
handle failure of these non-reclaimed pages and that used to handle
hugepages, etc., but as pointed in previous review feedback that is not
a logical approach for this as that's meant more for the RAS stuff.

Maybe it is a simpler approach to have our own container object on top
and have this page pointer and list_head in it and use that list_head
to insert into the snp leaked list instead of re-using the buddy_list
for chaining into the leaked pages list?

Thanks,
Ashish
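A rough sketch of the container-object idea floated above; the struct and helper names are hypothetical, the point being that the leaked-pages list would then carry its own list_head and could track any page, including tail pages, without touching page->buddy_list:

/* Hypothetical wrapper, not from the posted patch. */
struct snp_leaked_page {
	struct list_head list;
	struct page *page;
	u64 pfn;
};

static void snp_leak_one_page(u64 pfn)
{
	/* Allocation must happen outside snp_leaked_pages_list_lock (or use GFP_ATOMIC). */
	struct snp_leaked_page *lp = kzalloc(sizeof(*lp), GFP_KERNEL);

	if (!lp)
		return;		/* the page is still leaked, just not tracked */

	lp->page = pfn_to_page(pfn);
	lp->pfn = pfn;

	spin_lock(&snp_leaked_pages_list_lock);
	list_add_tail(&lp->list, &snp_leaked_pages_list);
	spin_unlock(&snp_leaked_pages_list_lock);
}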
diff --git a/arch/x86/include/asm/sev-host.h b/arch/x86/include/asm/sev-host.h
index 1df989411334..7490a665e78f 100644
--- a/arch/x86/include/asm/sev-host.h
+++ b/arch/x86/include/asm/sev-host.h
@@ -19,6 +19,8 @@ void sev_dump_hva_rmpentry(unsigned long address);
 int psmash(u64 pfn);
 int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int asid, bool immutable);
 int rmp_make_shared(u64 pfn, enum pg_level level);
+void snp_leak_pages(u64 pfn, unsigned int npages);
+
 #else
 static inline int snp_lookup_rmpentry(u64 pfn, bool *assigned, int *level) { return -ENXIO; }
 static inline void sev_dump_hva_rmpentry(unsigned long address) {}
@@ -29,6 +31,7 @@ static inline int rmp_make_private(u64 pfn, u64 gpa, enum pg_level level, int as
 	return -ENXIO;
 }
 static inline int rmp_make_shared(u64 pfn, enum pg_level level) { return -ENXIO; }
+static inline void snp_leak_pages(u64 pfn, unsigned int npages) {}
 #endif
 
 #endif
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c
index bf9b97046e05..29a69f4b8cfb 100644
--- a/arch/x86/virt/svm/sev.c
+++ b/arch/x86/virt/svm/sev.c
@@ -59,6 +59,12 @@ struct rmpentry {
 static struct rmpentry *rmptable_start __ro_after_init;
 static u64 rmptable_max_pfn __ro_after_init;
 
+/* list of pages which are leaked and cannot be reclaimed */
+static LIST_HEAD(snp_leaked_pages_list);
+static DEFINE_SPINLOCK(snp_leaked_pages_list_lock);
+
+static atomic_long_t snp_nr_leaked_pages = ATOMIC_LONG_INIT(0);
+
 #undef pr_fmt
 #define pr_fmt(fmt) "SEV-SNP: " fmt
 
@@ -518,3 +524,25 @@ int rmp_make_shared(u64 pfn, enum pg_level level)
 	return rmpupdate(pfn, &val);
 }
 EXPORT_SYMBOL_GPL(rmp_make_shared);
+
+void snp_leak_pages(u64 pfn, unsigned int npages)
+{
+	struct page *page = pfn_to_page(pfn);
+
+	pr_debug("%s: leaking PFN range 0x%llx-0x%llx\n", __func__, pfn, pfn + npages);
+
+	spin_lock(&snp_leaked_pages_list_lock);
+	while (npages--) {
+		/*
+		 * Reuse the page's buddy list for chaining into the leaked
+		 * pages list. This page should not be on a free list currently
+		 * and is also unsafe to be added to a free list.
+		 */
+		list_add_tail(&page->buddy_list, &snp_leaked_pages_list);
+		sev_dump_rmpentry(pfn);
+		pfn++;
+	}
+	spin_unlock(&snp_leaked_pages_list_lock);
+	atomic_long_inc(&snp_nr_leaked_pages);
+}
+EXPORT_SYMBOL_GPL(snp_leak_pages);
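For context, a hedged sketch of the kind of call site the review thread refers to: a reclaim path that, once transitioning a page back to shared/hypervisor state fails partway through a range, leaks the remainder rather than ever freeing it. The real driver issues the SEV_CMD_SNP_PAGE_RECLAIM firmware command; rmp_make_shared() is used here only as a stand-in, and the function name is illustrative:

static void reclaim_or_leak(u64 start_pfn, unsigned int npages)
{
	unsigned int i;

	for (i = 0; i < npages; i++) {
		/* Try to transition the page back to hypervisor/shared state. */
		if (rmp_make_shared(start_pfn + i, PG_LEVEL_4K)) {
			/*
			 * From this point on the pages cannot safely be
			 * returned to the page allocator - leak the remainder.
			 */
			snp_leak_pages(start_pfn + i, npages - i);
			return;
		}
	}

	/* All pages transitioned back successfully; they may be freed normally. */
}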