Message ID | 20240112055251.36101-6-vannapurve@google.com |
---|---|
State | New |
Headers | Subject: [RFC V1 5/5] x86: CVMs: Ensure that memory conversions happen at 2M alignment; From: Vishal Annapurve <vannapurve@google.com>; To: x86@kernel.org, linux-kernel@vger.kernel.org; Date: Fri, 12 Jan 2024 05:52:51 +0000; Message-ID: <20240112055251.36101-6-vannapurve@google.com>; In-Reply-To: <20240112055251.36101-1-vannapurve@google.com>; X-Mailer: git-send-email 2.43.0.275.g3460e3d667-goog |
Series | x86: CVMs: Align memory conversions to 2M granularity |
Commit Message
Vishal Annapurve
Jan. 12, 2024, 5:52 a.m. UTC
Return an error on conversion of memory ranges not aligned to 2M size.
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
---
arch/x86/mm/pat/set_memory.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
Comments
On 1/11/24 21:52, Vishal Annapurve wrote:
> @@ -2133,8 +2133,10 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>  	int ret;
>
>  	/* Should not be working on unaligned addresses */
> -	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
> -		addr &= PAGE_MASK;
> +	if (WARN_ONCE(addr & ~HPAGE_MASK, "misaligned address: %#lx\n", addr)
> +	    || WARN_ONCE((numpages << PAGE_SHIFT) & ~HPAGE_MASK,
> +	    "misaligned numpages: %#lx\n", numpages))
> +		return -EINVAL;

This series is talking about swiotlb and DMA, then this applies a
restriction to what I *thought* was a much more generic function:
__set_memory_enc_pgtable().  What prevents this function from getting
used on 4k mappings?
On Wed, Jan 31, 2024 at 10:03 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 1/11/24 21:52, Vishal Annapurve wrote:
> > @@ -2133,8 +2133,10 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
> >  	int ret;
> >
> >  	/* Should not be working on unaligned addresses */
> > -	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
> > -		addr &= PAGE_MASK;
> > +	if (WARN_ONCE(addr & ~HPAGE_MASK, "misaligned address: %#lx\n", addr)
> > +	    || WARN_ONCE((numpages << PAGE_SHIFT) & ~HPAGE_MASK,
> > +	    "misaligned numpages: %#lx\n", numpages))
> > +		return -EINVAL;
>
> This series is talking about swiotlb and DMA, then this applies a
> restriction to what I *thought* was a much more generic function:
> __set_memory_enc_pgtable().  What prevents this function from getting
> used on 4k mappings?
>

The end goal here is to limit the conversion granularity to hugepage
sizes. SWIOTLB allocations are the major source of unaligned
allocations (and so the conversions) that need to be fixed before
achieving this goal.

This change will ensure that conversion fails for unaligned ranges, as
I don't foresee the need for 4K aligned conversions apart from DMA
allocations.
On 01/02/2024 04:46, Vishal Annapurve wrote:
> On Wed, Jan 31, 2024 at 10:03 PM Dave Hansen <dave.hansen@intel.com> wrote:
>>
>> On 1/11/24 21:52, Vishal Annapurve wrote:
>>> @@ -2133,8 +2133,10 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>>>  	int ret;
>>>
>>>  	/* Should not be working on unaligned addresses */
>>> -	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
>>> -		addr &= PAGE_MASK;
>>> +	if (WARN_ONCE(addr & ~HPAGE_MASK, "misaligned address: %#lx\n", addr)
>>> +	    || WARN_ONCE((numpages << PAGE_SHIFT) & ~HPAGE_MASK,
>>> +	    "misaligned numpages: %#lx\n", numpages))
>>> +		return -EINVAL;
>>
>> This series is talking about swiotlb and DMA, then this applies a
>> restriction to what I *thought* was a much more generic function:
>> __set_memory_enc_pgtable().  What prevents this function from getting
>> used on 4k mappings?
>>
>
> The end goal here is to limit the conversion granularity to hugepage
> sizes. SWIOTLB allocations are the major source of unaligned
> allocations (and so the conversions) that need to be fixed before
> achieving this goal.
>
> This change will ensure that conversion fails for unaligned ranges, as
> I don't foresee the need for 4K aligned conversions apart from DMA
> allocations.

Hi Vishal,

This assumption is wrong. set_memory_decrypted is called from various
parts of the kernel: kexec, sev-guest, kvmclock, hyperv code. These
conversions are for non-DMA allocations that need to be done at 4KB
granularity because the data structures in question are page sized.

Thanks,
Jeremi
On 02/02/2024 06:08, Vishal Annapurve wrote:
> On Thu, Feb 1, 2024 at 5:32 PM Jeremi Piotrowski
> <jpiotrowski@linux.microsoft.com> wrote:
>>
>> On 01/02/2024 04:46, Vishal Annapurve wrote:
>>> On Wed, Jan 31, 2024 at 10:03 PM Dave Hansen <dave.hansen@intel.com> wrote:
>>>>
>>>> On 1/11/24 21:52, Vishal Annapurve wrote:
>>>>> @@ -2133,8 +2133,10 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
>>>>>  	int ret;
>>>>>
>>>>>  	/* Should not be working on unaligned addresses */
>>>>> -	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
>>>>> -		addr &= PAGE_MASK;
>>>>> +	if (WARN_ONCE(addr & ~HPAGE_MASK, "misaligned address: %#lx\n", addr)
>>>>> +	    || WARN_ONCE((numpages << PAGE_SHIFT) & ~HPAGE_MASK,
>>>>> +	    "misaligned numpages: %#lx\n", numpages))
>>>>> +		return -EINVAL;
>>>>
>>>> This series is talking about swiotlb and DMA, then this applies a
>>>> restriction to what I *thought* was a much more generic function:
>>>> __set_memory_enc_pgtable().  What prevents this function from getting
>>>> used on 4k mappings?
>>>>
>>>
>>> The end goal here is to limit the conversion granularity to hugepage
>>> sizes. SWIOTLB allocations are the major source of unaligned
>>> allocations (and so the conversions) that need to be fixed before
>>> achieving this goal.
>>>
>>> This change will ensure that conversion fails for unaligned ranges, as
>>> I don't foresee the need for 4K aligned conversions apart from DMA
>>> allocations.
>>
>> Hi Vishal,
>>
>> This assumption is wrong. set_memory_decrypted is called from various
>> parts of the kernel: kexec, sev-guest, kvmclock, hyperv code. These
>> conversions are for non-DMA allocations that need to be done at 4KB
>> granularity because the data structures in question are page sized.
>>
>> Thanks,
>> Jeremi
>
> Thanks Jeremi for pointing out these usecases.
>
> My brief analysis for these call sites:
> 1) machine_kexec_64.c, realmode/init.c, kvm/mmu/mmu.c - shared memory
> allocation/conversion happens when host side memory encryption
> (CC_ATTR_HOST_MEM_ENCRYPT) is enabled.
> 2) kernel/kvmclock.c - Shared memory allocation can be made to align
> 2M even if the memory needed is lesser.
> 3) drivers/virt/coco/sev-guest/sev-guest.c,
> drivers/virt/coco/tdx-guest/tdx-guest.c - Shared memory allocation can
> be made to align 2M even if the memory needed is lesser.
>
> I admit I haven't analyzed hyperv code in context of these changes,
> but will take a better look to see if the calls for memory conversion
> here can fit the category of "Shared memory allocation can be made to
> align 2M even if the memory needed is lesser".
>
> Agree that this patch should be modified to look something like
> (subject to more changes on the call sites)

No, this patch is still built on the wrong assumptions. You're trying
to alter a generic function in the guest for the constraints of a very
specific hypervisor + host userspace + memory backend combination.
That's not right.

Is the numpages check supposed to ensure that the guest *only* toggles
visibility in chunks of 2MB? Then you're exposing more memory to the
host than the guest intends.

If you must - focus on getting swiotlb conversions to happen at the
desired granularity but don't try to force every single conversion to
be >4K.

Thanks,
Jeremi

> =============
> diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
> index e9b448d1b1b7..8c608d6913c4 100644
> --- a/arch/x86/mm/pat/set_memory.c
> +++ b/arch/x86/mm/pat/set_memory.c
> @@ -2132,10 +2132,15 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
> 	struct cpa_data cpa;
> 	int ret;
>
> 	/* Should not be working on unaligned addresses */
> 	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
> 		addr &= PAGE_MASK;
>
> +	if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) &&
> +	    (WARN_ONCE(addr & ~HPAGE_MASK, "misaligned address: %#lx\n", addr)
> +	    || WARN_ONCE((numpages << PAGE_SHIFT) & ~HPAGE_MASK,
> +	    "misaligned numpages: %#lx\n", numpages)))
> +		return -EINVAL;
> +
> 	memset(&cpa, 0, sizeof(cpa));
> 	cpa.vaddr = &addr;
> 	cpa.numpages = numpages;
On 2/2/24 08:22, Vishal Annapurve wrote:
>> If you must - focus on getting swiotlb conversions to happen at the desired
>> granularity but don't try to force every single conversion to be >4K.
> If any conversion within a guest happens at 4K granularity, then this
> will effectively cause non-hugepage aligned EPT/NPT entries. This
> series is trying to get all private and shared memory regions to be
> hugepage aligned to address the problem statement.

Yeah, but the series is trying to do that by being awfully myopic at
this stage and without being _declared_ to be so myopic.

Take a look at all of the set_memory_decrypted() calls.  How many of
them even operate on the part of the guest address space rooted in the
memfd where splits matter?  They're not doing conversions.  They're
just setting up shared mappings in the page tables of gunk that was
never private in the first place.
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index bda9f129835e..6f7b06a502f4 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -2133,8 +2133,10 @@ static int __set_memory_enc_pgtable(unsigned long addr, int numpages, bool enc)
 	int ret;
 
 	/* Should not be working on unaligned addresses */
-	if (WARN_ONCE(addr & ~PAGE_MASK, "misaligned address: %#lx\n", addr))
-		addr &= PAGE_MASK;
+	if (WARN_ONCE(addr & ~HPAGE_MASK, "misaligned address: %#lx\n", addr)
+	    || WARN_ONCE((numpages << PAGE_SHIFT) & ~HPAGE_MASK,
+	    "misaligned numpages: %#lx\n", numpages))
+		return -EINVAL;
 
 	memset(&cpa, 0, sizeof(cpa));
 	cpa.vaddr = &addr;