[v7,6/6] x86/efi: Safely enable unaccepted memory in UEFI

Message ID	1d38d28c2731075d66ac65b56b813a138900f638.1680628986.git.thomas.lendacky@amd.com
State	New
Headers	From: Tom Lendacky <thomas.lendacky@amd.com>
	To: <linux-kernel@vger.kernel.org>, <x86@kernel.org>
	CC: Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, "Kirill A. Shutemov" <kirill@shutemov.name>, "H. Peter Anvin" <hpa@zytor.com>, Michael Roth <michael.roth@amd.com>, Joerg Roedel <jroedel@suse.de>, Dionna Glaze <dionnaglaze@google.com>, Andy Lutomirski <luto@kernel.org>, Peter Zijlstra <peterz@infradead.org>, Ard Biescheuvel <ardb@kernel.org>, "Min M. Xu" <min.m.xu@intel.com>, Gerd Hoffmann <kraxel@redhat.com>, James Bottomley <jejb@linux.ibm.com>, Tom Lendacky <Thomas.Lendacky@amd.com>, Jiewen Yao <jiewen.yao@intel.com>, Erdem Aktas <erdemaktas@google.com>, "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
	Subject: [PATCH v7 6/6] x86/efi: Safely enable unaccepted memory in UEFI
	Date: Tue, 4 Apr 2023 12:23:06 -0500
	Message-ID: <1d38d28c2731075d66ac65b56b813a138900f638.1680628986.git.thomas.lendacky@amd.com>
	In-Reply-To: <cover.1680628986.git.thomas.lendacky@amd.com>
	References: <20230330114956.20342-1-kirill.shutemov@linux.intel.com> <cover.1680628986.git.thomas.lendacky@amd.com>
Series	Provide SEV-SNP support for unaccepted memory
	[v7,0/6] Provide SEV-SNP support for unaccepted memory
	[v7,1/6] x86/sev: Fix calculation of end address based on number of pages
	[v7,2/6] x86/sev: Put PSC struct on the stack in prep for unaccepted memory support
	[v7,3/6] x86/sev: Allow for use of the early boot GHCB for PSC requests
	[v7,4/6] x86/sev: Use large PSC requests if applicable
	[v7,5/6] x86/sev: Add SNP-specific unaccepted memory support
	[v7,6/6] x86/efi: Safely enable unaccepted memory in UEFI

Commit Message

Tom Lendacky April 4, 2023, 5:23 p.m. UTC

  From: Dionna Glaze <dionnaglaze@google.com>

The UEFI v2.9 specification includes a new memory type to be used in
environments where the OS must accept memory that is provided from its
host. Before the introduction of this memory type, all memory was
accepted eagerly in the firmware. In order for the firmware to safely
stop accepting memory on the OS's behalf, the OS must affirmatively
indicate support to the firmware. This is only a problem for AMD
SEV-SNP, since Linux has had support for it since 5.19. The other
technology that can make use of unaccepted memory, Intel TDX, does not
yet have Linux support, so it can strictly require unaccepted memory
support as a dependency of CONFIG_TDX and not require communication with
the firmware.

Enabling unaccepted memory requires calling a 0-argument enablement
protocol before ExitBootServices. This call is only made if the kernel
is compiled with UNACCEPTED_MEMORY=y.

This protocol will be removed after the end of life of the first LTS
that includes it, in order to give firmware implementations an
expiration date for it. When the protocol is removed, firmware will
strictly infer that a SEV-SNP VM is running an OS that supports the
unaccepted memory type. At the earliest convenience, when unaccepted
memory support is added to Linux, SEV-SNP may take strict dependence on
it. After the firmware removes support for the protocol, this patch
should be reverted.

  [tl: address some checkscript warnings]

Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: "Min M. Xu" <min.m.xu@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: James Bottomley <jejb@linux.ibm.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Cc: Jiewen Yao <jiewen.yao@intel.com>
Cc: Erdem Aktas <erdemaktas@google.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 drivers/firmware/efi/libstub/x86-stub.c | 36 +++++++++++++++++++++++++
 include/linux/efi.h                     |  3 +++
 2 files changed, 39 insertions(+)
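
The commit message describes locating an enablement protocol and calling it before ExitBootServices, only when the kernel is built with unaccepted memory support. The diff itself is not reproduced above, so the following is only a minimal, self-contained sketch of that flow against a mocked firmware. All type, function, and status names here (mem_accept_protocol, mock_locate_protocol, enable_unaccepted_memory, the simplified status values) are hypothetical stand-ins, not the identifiers used in x86-stub.c or efi.h. The method takes the usual protocol "this" pointer, which is an assumption about how the "0-argument" call would look in C.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for EFI status codes; not the real UEFI encodings. */
typedef unsigned long efi_status_t;
#define EFI_SUCCESS   0UL
#define EFI_NOT_FOUND 14UL

/* Hypothetical protocol: one method with no arguments besides "this". */
struct mem_accept_protocol {
	efi_status_t (*allow_unaccepted_memory)(struct mem_accept_protocol *self);
};

/* Mock firmware state: set once the OS has opted in to unaccepted memory. */
static int fw_leaves_memory_unaccepted;

static efi_status_t mock_allow(struct mem_accept_protocol *self)
{
	(void)self;
	fw_leaves_memory_unaccepted = 1;
	return EFI_SUCCESS;
}

static struct mem_accept_protocol mock_proto = { mock_allow };

/* Mock of a LocateProtocol-style lookup; fw_has_proto models the firmware. */
static efi_status_t mock_locate_protocol(int fw_has_proto,
					 struct mem_accept_protocol **out)
{
	assert(out != NULL);
	if (!fw_has_proto) {
		*out = NULL;
		return EFI_NOT_FOUND;
	}
	*out = &mock_proto;
	return EFI_SUCCESS;
}

/*
 * Would run before ExitBootServices. kernel_supports models a kernel built
 * with unaccepted memory support; without it the protocol is never called
 * and the firmware keeps accepting memory eagerly.
 */
static efi_status_t enable_unaccepted_memory(int kernel_supports,
					     int fw_has_proto)
{
	struct mem_accept_protocol *proto;

	if (!kernel_supports)
		return EFI_SUCCESS;

	/* A missing protocol is not an error: there is nothing to opt in to. */
	if (mock_locate_protocol(fw_has_proto, &proto) != EFI_SUCCESS)
		return EFI_SUCCESS;

	return proto->allow_unaccepted_memory(proto);
}
```

In this sketch only the combination of a supporting kernel and a firmware that exposes the protocol flips the firmware into leaving memory unaccepted; every other combination leaves the firmware's default behaviour untouched.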

Comments

Kirill A. Shutemov April 4, 2023, 5:45 p.m. UTC | #1

On Tue, Apr 04, 2023 at 12:23:06PM -0500, Tom Lendacky wrote:
> From: Dionna Glaze <dionnaglaze@google.com>
> 
> The UEFI v2.9 specification includes a new memory type to be used in
> environments where the OS must accept memory that is provided from its
> host. Before the introduction of this memory type, all memory was
> accepted eagerly in the firmware. In order for the firmware to safely
> stop accepting memory on the OS's behalf, the OS must affirmatively
> indicate support to the firmware. This is only a problem for AMD
> SEV-SNP, since Linux has had support for it since 5.19. The other
> technology that can make use of unaccepted memory, Intel TDX, does not
> yet have Linux support, so it can strictly require unaccepted memory
> support as a dependency of CONFIG_TDX and not require communication with
> the firmware.
> 
> Enabling unaccepted memory requires calling a 0-argument enablement
> protocol before ExitBootServices. This call is only made if the kernel
> is compiled with UNACCEPTED_MEMORY=y
> 
> This protocol will be removed after the end of life of the first LTS
> that includes it, in order to give firmware implementations an
> expiration date for it. When the protocol is removed, firmware will
> strictly infer that a SEV-SNP VM is running an OS that supports the
> unaccepted memory type. At the earliest convenience, when unaccepted
> memory support is added to Linux, SEV-SNP may take strict dependence in
> it. After the firmware removes support for the protocol, this patch
> should be reverted.
> 
>   [tl: address some checkscript warnings]
> 
> Cc: Ard Biescheuvel <ardb@kernel.org>
> Cc: "Min M. Xu" <min.m.xu@intel.com>
> Cc: Gerd Hoffmann <kraxel@redhat.com>
> Cc: James Bottomley <jejb@linux.ibm.com>
> Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
> Cc: Jiewen Yao <jiewen.yao@intel.com>
> Cc: Erdem Aktas <erdemaktas@google.com>
> Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Signed-off-by: Dionna Glaze <dionnaglaze@google.com>
> Signed-off-by: Tom Lendacky <thomas.lendacky@amd.com>

I still think it is a bad idea.

As I asked before, please include my

Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>

into the patch.

Dave Hansen April 4, 2023, 5:57 p.m. UTC | #2

On 4/4/23 10:45, Kirill A. Shutemov wrote:
> I still think it is a bad idea.
> 
> As I asked before, please include my
> 
> Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> 
> into the patch.

I was pretty opposed to this when I first saw it too.  But, Tom and
company have worn down my opposition a bit.

The fact is that we have upstream kernels out there with SEV-SNP support
that don't know anything about unaccepted memory.  They're either
relegated to using the pre-accepted memory (4GB??) or _some_ entity
needs to accept the memory.  That entity obviously can't be the kernel
unless we backport unaccepted memory support.

This both lets the BIOS be the page-accepting entity _and_ allows the
entity to delegate that to the kernel when it needs to.

As much as I want to nak this and pretend that those existing
kernels don't exist, my powers of self-delusion do have their limits.

If our AMD friends don't do this, what is their alternative?

Kirill A. Shutemov April 4, 2023, 6:09 p.m. UTC | #3

On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote:
> On 4/4/23 10:45, Kirill A. Shutemov wrote:
> > I still think it is a bad idea.
> > 
> > As I asked before, please include my
> > 
> > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > 
> > into the patch.
> 
> I was pretty opposed to this when I first saw it too.  But, Tom and
> company have worn down my opposition a bit.
> 
> The fact is that we have upstream kernels out there with SEV-SNP support
> that don't know anything about unaccepted memory.  They're either
> relegated to using the pre-accepted memory (4GB??) or _some_ entity
> needs to accept the memory.  That entity obviously can't be the kernel
> unless we backport unaccepted memory support.
> 
> This both lets the BIOS be the page-accepting entity _and_ allows the
> entity to delegate that to the kernel when it needs to.
> 
> As much as I want to nak this and pretend that that those existing
> kernel's don't exist, my powers of self-delusion do have their limits.
> 
> If our AMD friends don't do this, what is their alternative?

The alternative is coordination on the host side: VMM can load a BIOS that
pre-accepts all memory if the kernel is older.

I know that it is not convenient for VMM, but it is technically possible.

Introducing an ABI with an expiration date is much uglier. And nobody
will care about the expiration date until you try to remove it.

Dave Hansen April 4, 2023, 7:27 p.m. UTC | #4

On 4/4/23 11:09, Kirill A. Shutemov wrote:
>> If our AMD friends don't do this, what is their alternative?
> The alternative is coordination on the host side: VMM can load a BIOS that
> pre-accepts all memory if the kernel is older.
> 
> I know that it is not convenient for VMM, but it is technically possible.

Yeah, either a specific BIOS or a knob to tell the BIOS what it has to
do.  But, either way, that requires coordination between the BIOS (or
BIOS configuration) and the specific guest.  I can see why that's
unpalatable.

> Introduce an ABI with an expiration date is much more ugly. And nobody
> will care about the expiration date, until you will try to remove it.

Yeah, the only real expiration date for an ABI is "never".  I don't
believe for a second that we'll ever be able to remove the interface.

Either way, I'd love to hear more from folks about why a BIOS-side
option (configuration or otherwise) is not a good option.  I know we've
discussed this in a few mail threads, but it would be even better to get
it into the cover letter or documentation.

Ard Biesheuvel April 4, 2023, 7:49 p.m. UTC | #5

On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote:
> > On 4/4/23 10:45, Kirill A. Shutemov wrote:
> > > I still think it is a bad idea.
> > >
> > > As I asked before, please include my
> > >
> > > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > >
> > > into the patch.
> >
> > I was pretty opposed to this when I first saw it too.  But, Tom and
> > company have worn down my opposition a bit.
> >
> > The fact is that we have upstream kernels out there with SEV-SNP support
> > that don't know anything about unaccepted memory.  They're either
> > relegated to using the pre-accepted memory (4GB??) or _some_ entity
> > needs to accept the memory.  That entity obviously can't be the kernel
> > unless we backport unaccepted memory support.
> >
> > This both lets the BIOS be the page-accepting entity _and_ allows the
> > entity to delegate that to the kernel when it needs to.
> >
> > As much as I want to nak this and pretend that that those existing
> > kernel's don't exist, my powers of self-delusion do have their limits.
> >
> > If our AMD friends don't do this, what is their alternative?
>
> The alternative is coordination on the host side: VMM can load a BIOS that
> pre-accepts all memory if the kernel is older.
>

And how does one identify such a kernel? How does the VMM know which
kernel the guest is going to load after it boots?

> I know that it is not convenient for VMM, but it is technically possible.
>
> Introduce an ABI with an expiration date is much more ugly. And nobody
> will care about the expiration date, until you will try to remove it.
>

None of us are thrilled about this, but the simple reality is that
there are kernels that do not understand unaccepted memory. EFI being
an extensible, generic, protocol based programmatic interface, the
best way of informing the loader that a kernel does understand it is
/not/ by adding some flag to some highly arch and OS specific header,
but to discover a protocol and call it.

We're past arguing that a legitimate need exists for a solution to
this problem. So what solution are you proposing?

Kirill A. Shutemov April 4, 2023, 8:24 p.m. UTC | #6

On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote:
> On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> >
> > On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote:
> > > On 4/4/23 10:45, Kirill A. Shutemov wrote:
> > > > I still think it is a bad idea.
> > > >
> > > > As I asked before, please include my
> > > >
> > > > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > >
> > > > into the patch.
> > >
> > > I was pretty opposed to this when I first saw it too.  But, Tom and
> > > company have worn down my opposition a bit.
> > >
> > > The fact is that we have upstream kernels out there with SEV-SNP support
> > > that don't know anything about unaccepted memory.  They're either
> > > relegated to using the pre-accepted memory (4GB??) or _some_ entity
> > > needs to accept the memory.  That entity obviously can't be the kernel
> > > unless we backport unaccepted memory support.
> > >
> > > This both lets the BIOS be the page-accepting entity _and_ allows the
> > > entity to delegate that to the kernel when it needs to.
> > >
> > > As much as I want to nak this and pretend that that those existing
> > > kernel's don't exist, my powers of self-delusion do have their limits.
> > >
> > > If our AMD friends don't do this, what is their alternative?
> >
> > The alternative is coordination on the host side: VMM can load a BIOS that
> > pre-accepts all memory if the kernel is older.
> >
> 
> And how does one identify such a kernel? How does the VMM know which
> kernel the guest is going to load after it boots?

The VMM has to know what it is running. Yes, it is cumbersome. But the
enabling phase for a feature is often rough. It will get smoother over time.

> > I know that it is not convenient for VMM, but it is technically possible.
> >
> > Introduce an ABI with an expiration date is much more ugly. And nobody
> > will care about the expiration date, until you will try to remove it.
> >
> 
> None of us are thrilled about this, but the simple reality is that
> there are kernels that do not understand unaccepted memory.

How is it different from any other feature the kernel is not [yet] aware
of?

For example, if we boot a legacy kernel on a machine with persistent memory
or memory attached over CXL, it will not see it as conventional memory.

> EFI being
> an extensible, generic, protocol based programmatic interface, the
> best way of informing the loader that a kernel does understand it is
> /not/ by adding some flag to some highly arch and OS specific header,
> but to discover a protocol and call it.
> 
> We're past arguing that a legitimate need exists for a solution to
> this problem. So what solution are you proposing?

I described the solution multiple times. You just don't like it.

Ard Biesheuvel April 4, 2023, 8:41 p.m. UTC | #7

On Tue, 4 Apr 2023 at 22:24, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote:
> > On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > >
> > > On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote:
> > > > On 4/4/23 10:45, Kirill A. Shutemov wrote:
> > > > > I still think it is a bad idea.
> > > > >
> > > > > As I asked before, please include my
> > > > >
> > > > > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > >
> > > > > into the patch.
> > > >
> > > > I was pretty opposed to this when I first saw it too.  But, Tom and
> > > > company have worn down my opposition a bit.
> > > >
> > > > The fact is that we have upstream kernels out there with SEV-SNP support
> > > > that don't know anything about unaccepted memory.  They're either
> > > > relegated to using the pre-accepted memory (4GB??) or _some_ entity
> > > > needs to accept the memory.  That entity obviously can't be the kernel
> > > > unless we backport unaccepted memory support.
> > > >
> > > > This both lets the BIOS be the page-accepting entity _and_ allows the
> > > > entity to delegate that to the kernel when it needs to.
> > > >
> > > > As much as I want to nak this and pretend that that those existing
> > > > kernel's don't exist, my powers of self-delusion do have their limits.
> > > >
> > > > If our AMD friends don't do this, what is their alternative?
> > >
> > > The alternative is coordination on the host side: VMM can load a BIOS that
> > > pre-accepts all memory if the kernel is older.
> > >
> >
> > And how does one identify such a kernel? How does the VMM know which
> > kernel the guest is going to load after it boots?
>
> VMM has to know what it is running. Yes, it is cumbersome. But enabling
> phase for a feature is often rough. It will get smoother overtime.
>

So how does the VMM get informed about what it is running? How does it
distinguish between kernels that support unaccepted memory and ones
that don't? And how does it predict which kernel a guest is going to
load?

If the solution you described many times addresses these questions,
could you please share a link?

Kirill A. Shutemov April 4, 2023, 9:01 p.m. UTC | #8

On Tue, Apr 04, 2023 at 10:41:02PM +0200, Ard Biesheuvel wrote:
> On Tue, 4 Apr 2023 at 22:24, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> >
> > On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote:
> > > On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > > >
> > > > On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote:
> > > > > On 4/4/23 10:45, Kirill A. Shutemov wrote:
> > > > > > I still think it is a bad idea.
> > > > > >
> > > > > > As I asked before, please include my
> > > > > >
> > > > > > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > >
> > > > > > into the patch.
> > > > >
> > > > > I was pretty opposed to this when I first saw it too.  But, Tom and
> > > > > company have worn down my opposition a bit.
> > > > >
> > > > > The fact is that we have upstream kernels out there with SEV-SNP support
> > > > > that don't know anything about unaccepted memory.  They're either
> > > > > relegated to using the pre-accepted memory (4GB??) or _some_ entity
> > > > > needs to accept the memory.  That entity obviously can't be the kernel
> > > > > unless we backport unaccepted memory support.
> > > > >
> > > > > This both lets the BIOS be the page-accepting entity _and_ allows the
> > > > > entity to delegate that to the kernel when it needs to.
> > > > >
> > > > > As much as I want to nak this and pretend that that those existing
> > > > > kernel's don't exist, my powers of self-delusion do have their limits.
> > > > >
> > > > > If our AMD friends don't do this, what is their alternative?
> > > >
> > > > The alternative is coordination on the host side: VMM can load a BIOS that
> > > > pre-accepts all memory if the kernel is older.
> > > >
> > >
> > > And how does one identify such a kernel? How does the VMM know which
> > > kernel the guest is going to load after it boots?
> >
> > VMM has to know what it is running. Yes, it is cumbersome. But enabling
> > phase for a feature is often rough. It will get smoother overtime.
> >
> 
> So how does the VMM get informed about what it is running? How does it
> distinguish between kernels that support unaccepted memory and ones
> that don't? And how does it predict which kernel a guest is going to
> load?

The user will specify whether they want unaccepted memory for the VM. If
they do, it is their responsibility to have a kernel that supports it.

And you have not addressed my question:

	How is it different from any other feature the kernel is not [yet] aware
	of?

Ard Biesheuvel April 5, 2023, 7:46 a.m. UTC | #9

On Tue, 4 Apr 2023 at 23:02, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Tue, Apr 04, 2023 at 10:41:02PM +0200, Ard Biesheuvel wrote:
> > On Tue, 4 Apr 2023 at 22:24, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > >
> > > On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote:
> > > > On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > > > >
> > > > > On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote:
> > > > > > On 4/4/23 10:45, Kirill A. Shutemov wrote:
> > > > > > > I still think it is a bad idea.
> > > > > > >
> > > > > > > As I asked before, please include my
> > > > > > >
> > > > > > > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > > >
> > > > > > > into the patch.
> > > > > >
> > > > > > I was pretty opposed to this when I first saw it too.  But, Tom and
> > > > > > company have worn down my opposition a bit.
> > > > > >
> > > > > > The fact is that we have upstream kernels out there with SEV-SNP support
> > > > > > that don't know anything about unaccepted memory.  They're either
> > > > > > relegated to using the pre-accepted memory (4GB??) or _some_ entity
> > > > > > needs to accept the memory.  That entity obviously can't be the kernel
> > > > > > unless we backport unaccepted memory support.
> > > > > >
> > > > > > This both lets the BIOS be the page-accepting entity _and_ allows the
> > > > > > entity to delegate that to the kernel when it needs to.
> > > > > >
> > > > > > As much as I want to nak this and pretend that that those existing
> > > > > > kernel's don't exist, my powers of self-delusion do have their limits.
> > > > > >
> > > > > > If our AMD friends don't do this, what is their alternative?
> > > > >
> > > > > The alternative is coordination on the host side: VMM can load a BIOS that
> > > > > pre-accepts all memory if the kernel is older.
> > > > >
> > > >
> > > > And how does one identify such a kernel? How does the VMM know which
> > > > kernel the guest is going to load after it boots?
> > >
> > > VMM has to know what it is running. Yes, it is cumbersome. But enabling
> > > phase for a feature is often rough. It will get smoother overtime.
> > >
> >
> > So how does the VMM get informed about what it is running? How does it
> > distinguish between kernels that support unaccepted memory and ones
> > that don't? And how does it predict which kernel a guest is going to
> > load?
>
> User will specify if it wants unaccepted memory or not for the VM. And if
> it does it is his responsibility to have kernel that supports it.
>
> And you have not addressed my question:
>
>         How is it different from any other feature the kernel is not [yet] aware
>         of?
>

It is the same problem, but this is just a better solution. Having a
BIOS menu option (or similar) to choose between unaccepted memory or
not (or to expose CXL memory via the EFI memory map, which is another
hack I have seen) is just unnecessary complication, if the kernel can
simply inform the loader about what it supports. We do this all the
time with things like OsIndications.

We can phase out the protocol implementation from the firmware once we
no longer need it, at which point the LocateProtocol() call just
becomes a NOP (we do the same thing for UGA support, which
disappeared a long time ago, but we still look for the protocol in the
EFI stub).

Once the firmware stops exposing this protocol (and ceases to accept
memory on the OS's behalf), we can phase it out from the kernel as
well.

The only other potential solution I see is exposing the unaccepted
memory as coldplugged ACPI memory objects, and implementing the accept
calls via PRM methods. But PRM has had very little test coverage, so
it is anybody's guess whether it works for the stable kernels that we
need to support with this. It would also mean that the new unaccepted
memory logic would need to be updated to cross-reference these memory
regions with EFI unaccepted memory regions and avoid claiming them
twice.

Gerd Hoffmann April 5, 2023, 10:06 a.m. UTC | #10

Hi,

> User will specify if it wants unaccepted memory or not for the VM. And if
> it does it is his responsibility to have kernel that supports it.
> 
> And you have not addressed my question:
> 
> 	How is it different from any other feature the kernel is not [yet] aware
> 	of?

Come on.  Automatic feature negotiation is standard procedure in many
places.  It's not like we are inventing something totally new here.

Just one example:  When a virtio device learns a new trick, a feature flag
is added for it, and if both guest and host support it, it can be
enabled, otherwise not.  There is no need for the user to configure the
virtio device features manually according to the capabilities of the
kernel it is going to boot.

take care,
  Gerd
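
The virtio example in this reply boils down to intersecting two feature bitmaps: what the host offers and what the guest understands. The following is only a rough sketch of that idea; the feature bit names (FEAT_BASE, FEAT_NEW_TRICK) and helper functions are made up for illustration and are not taken from the virtio specification or any driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical feature bits, for illustration only. */
#define FEAT_BASE      (1ULL << 0)
#define FEAT_NEW_TRICK (1ULL << 1)

/* A feature is usable only if the host offers it and the guest knows it. */
static uint64_t negotiate_features(uint64_t host_offered,
				   uint64_t guest_understood)
{
	return host_offered & guest_understood;
}

/* Test a single feature bit in the negotiated set. */
static int feature_enabled(uint64_t negotiated, uint64_t bit)
{
	/* Callers are expected to pass exactly one bit. */
	assert(bit != 0 && (bit & (bit - 1)) == 0);
	return (negotiated & bit) != 0;
}
```

With this scheme an older guest that only understands FEAT_BASE ends up without FEAT_NEW_TRICK even when the host offers it, and nobody has to configure that by hand.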

Dave Hansen April 5, 2023, 1 p.m. UTC | #11

On 4/5/23 00:46, Ard Biesheuvel wrote:
> Once the firmware stops exposing this protocol (and ceases to accept
> memory on the OS's behalf), we can phase it out from the kernel as
> well.

This is a part of the story that I have doubts about.

How and when do you think this phase-out would happen, realistically?

The firmware will need the unaccepted memory protocol support as long as
there are guests around that need it, right?

People like to keep running old kernels for a _long_ time.  Doesn't that
mean _some_ firmware will need to keep doing this dance for a long time?

As long as there is firmware out there in the wild that people want to
run new kernels on, the support needs to stay in mainline.  It can't be
dropped.

Kirill A. Shutemov April 5, 2023, 1:42 p.m. UTC | #12

On Wed, Apr 05, 2023 at 09:46:59AM +0200, Ard Biesheuvel wrote:
> On Tue, 4 Apr 2023 at 23:02, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> >
> > On Tue, Apr 04, 2023 at 10:41:02PM +0200, Ard Biesheuvel wrote:
> > > On Tue, 4 Apr 2023 at 22:24, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > > >
> > > > On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote:
> > > > > On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > > > > >
> > > > > > On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote:
> > > > > > > On 4/4/23 10:45, Kirill A. Shutemov wrote:
> > > > > > > > I still think it is a bad idea.
> > > > > > > >
> > > > > > > > As I asked before, please include my
> > > > > > > >
> > > > > > > > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > > > >
> > > > > > > > into the patch.
> > > > > > >
> > > > > > > I was pretty opposed to this when I first saw it too.  But, Tom and
> > > > > > > company have worn down my opposition a bit.
> > > > > > >
> > > > > > > The fact is that we have upstream kernels out there with SEV-SNP support
> > > > > > > that don't know anything about unaccepted memory.  They're either
> > > > > > > relegated to using the pre-accepted memory (4GB??) or _some_ entity
> > > > > > > needs to accept the memory.  That entity obviously can't be the kernel
> > > > > > > unless we backport unaccepted memory support.
> > > > > > >
> > > > > > > This both lets the BIOS be the page-accepting entity _and_ allows the
> > > > > > > entity to delegate that to the kernel when it needs to.
> > > > > > >
> > > > > > > As much as I want to nak this and pretend that that those existing
> > > > > > > kernel's don't exist, my powers of self-delusion do have their limits.
> > > > > > >
> > > > > > > If our AMD friends don't do this, what is their alternative?
> > > > > >
> > > > > > The alternative is coordination on the host side: VMM can load a BIOS that
> > > > > > pre-accepts all memory if the kernel is older.
> > > > > >
> > > > >
> > > > > And how does one identify such a kernel? How does the VMM know which
> > > > > kernel the guest is going to load after it boots?
> > > >
> > > > VMM has to know what it is running. Yes, it is cumbersome. But enabling
> > > > phase for a feature is often rough. It will get smoother overtime.
> > > >
> > >
> > > So how does the VMM get informed about what it is running? How does it
> > > distinguish between kernels that support unaccepted memory and ones
> > > that don't? And how does it predict which kernel a guest is going to
> > > load?
> >
> > User will specify if it wants unaccepted memory or not for the VM. And if
> > it does it is his responsibility to have kernel that supports it.
> >
> > And you have not addressed my question:
> >
> >         How is it different from any other feature the kernel is not [yet] aware
> >         of?
> >
> 
> It is the same problem, but this is just a better solution.

Okay, we at least agree that there is more than one solution to the
problem.

> Having a BIOS menu option (or similar) to choose between unaccepted
> memory or not (or to expose CXL memory via the EFI memory map, which is
> another hack I have seen) is just unnecessary complication, if the
> kernel can simply inform the loader about what it supports. We do this
> all the time with things like OsIndications.

It assumes that the kernel calls ExitBootServices(), which is not always
true. A bootloader in between will make it impossible for the kernel to use
any of the features exposed this way.

But we talked about this before.

BTW, can we at least acknowledge the limitation in the commit message?

> We can phase out the protocol implementation from the firmware once we
> no longer need it, at which point the LocateProtocol() call just
> becomes a NOP (we do the same thing for UGA support, which has
> disappeared a long time ago, but we still look for the protocol in the
> EFI stub).
> 
> Once the firmware stops exposing this protocol (and ceases to accept
> memory on the OS's behalf), we can phase it out from the kernel as
> well.

It is unlikely to ever happen. In a few years everybody will forget about
this conversation, regardless of what is written in the commit message.

Everything works, why bother?

> The only other potential solution I see is exposing the unaccepted
> memory as coldplugged ACPI memory objects, and implementing the accept
> calls via PRM methods. But PRM has had very little test coverage, so
> it is anybody's guess whether it works for the stable kernels that we
> need to support with this. It would also mean that the new unaccepted
> memory logic would need to be updated and cross reference these memory
> regions with EFI unaccepted memory regions and avoid claiming them
> both.

Nah. That is a lot of complexity for no particular reason.

Ard Biesheuvel April 5, 2023, 1:44 p.m. UTC | #13

On Wed, 5 Apr 2023 at 15:00, Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 4/5/23 00:46, Ard Biesheuvel wrote:
> > Once the firmware stops exposing this protocol (and ceases to accept
> > memory on the OS's behalf), we can phase it out from the kernel as
> > well.
>
> This is a part of the story that I have doubts about.
>
> How and when do you think this phase-out would happen, realistically?
>
> The firmware will need the unaccepted memory protocol support as long as
> there are guests around that need it, right?
>

Current firmware will accept all memory on behalf of the OS unless the
OS invokes the protocol to prevent it from doing so.

Future firmware will simply never accept all memory on behalf of the
OS, and not expose the protocol at all.

So the difference of opinion mainly comes down to whether or not the
intermediate first step is needed.

Unenlightened OS kernels will not invoke the protocol, and will
therefore need current firmware in order to see all of their memory.

Enlightened OS kernels will invoke the protocol unless it does not
exist, and so will be able to accept their memory lazily both on
current and future firmware.

We will be able to move to future firmware once we no longer need to
support unenlightened kernels.
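
As a rough illustration of the four combinations described above, here is a
minimal sketch in C. It is a toy model, not code from this series: the enum
names, the struct and simulate_boot() are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only. */
enum firmware { FW_CURRENT, FW_FUTURE };
enum kernel { KERNEL_UNENLIGHTENED, KERNEL_ENLIGHTENED };

struct boot_result {
	bool firmware_accepts_all;  /* firmware eagerly accepted all memory */
	bool kernel_accepts_lazily; /* kernel accepts memory on demand */
};

struct boot_result simulate_boot(enum firmware fw, enum kernel k)
{
	struct boot_result r = { false, false };
	/* Only current firmware exposes the protocol. */
	bool protocol_exposed = (fw == FW_CURRENT);
	/* Only an enlightened kernel invokes it, and only if it exists. */
	bool protocol_invoked = protocol_exposed && (k == KERNEL_ENLIGHTENED);

	assert(fw == FW_CURRENT || fw == FW_FUTURE);

	/* Current firmware accepts everything unless told not to. */
	if (fw == FW_CURRENT && !protocol_invoked)
		r.firmware_accepts_all = true;

	/* An enlightened kernel accepts lazily whatever is left unaccepted. */
	if (k == KERNEL_ENLIGHTENED && !r.firmware_accepts_all)
		r.kernel_accepts_lazily = true;

	return r;
}
```

In this model the only combination where nobody accepts the remaining memory
is future firmware with an unenlightened kernel, which is why future firmware
has to wait until unenlightened kernels no longer need support.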

> People like to keep running old kernels for a _long_ time.  Doesn't that
> mean _some_ firmware will need to keep doing this dance for a long time?
>

Yes.

> As long as there is firmware out there in the wild that people want to
> run new kernels on, the support needs to stay in mainline.  It can't be
> dropped.

The penalty for not calling the protocol on firmware that implements
it is a much slower boot, but everything works as it should beyond
that.

Given that the intent here is to retain compatibility with
unenlightened workloads (i.e., which do not upgrade their kernels), I
think it is perfectly reasonable to drop this from mainline at some
point.

Ard Biesheuvel April 5, 2023, 1:51 p.m. UTC | #14

On Wed, 5 Apr 2023 at 15:42, Kirill A. Shutemov <kirill@shutemov.name> wrote:
>
> On Wed, Apr 05, 2023 at 09:46:59AM +0200, Ard Biesheuvel wrote:
> > On Tue, 4 Apr 2023 at 23:02, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > >
> > > On Tue, Apr 04, 2023 at 10:41:02PM +0200, Ard Biesheuvel wrote:
> > > > On Tue, 4 Apr 2023 at 22:24, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > > > >
> > > > > On Tue, Apr 04, 2023 at 09:49:52PM +0200, Ard Biesheuvel wrote:
> > > > > > On Tue, 4 Apr 2023 at 20:09, Kirill A. Shutemov <kirill@shutemov.name> wrote:
> > > > > > >
> > > > > > > On Tue, Apr 04, 2023 at 10:57:52AM -0700, Dave Hansen wrote:
> > > > > > > > On 4/4/23 10:45, Kirill A. Shutemov wrote:
> > > > > > > > > I still think it is a bad idea.
> > > > > > > > >
> > > > > > > > > As I asked before, please include my
> > > > > > > > >
> > > > > > > > > Nacked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> > > > > > > > >
> > > > > > > > > into the patch.
> > > > > > > >
> > > > > > > > I was pretty opposed to this when I first saw it too.  But, Tom and
> > > > > > > > company have worn down my opposition a bit.
> > > > > > > >
> > > > > > > > The fact is that we have upstream kernels out there with SEV-SNP support
> > > > > > > > that don't know anything about unaccepted memory.  They're either
> > > > > > > > relegated to using the pre-accepted memory (4GB??) or _some_ entity
> > > > > > > > needs to accept the memory.  That entity obviously can't be the kernel
> > > > > > > > unless we backport unaccepted memory support.
> > > > > > > >
> > > > > > > > This both lets the BIOS be the page-accepting entity _and_ allows the
> > > > > > > > entity to delegate that to the kernel when it needs to.
> > > > > > > >
> > > > > > > > As much as I want to nak this and pretend that those existing
> > > > > > > > kernels don't exist, my powers of self-delusion do have their limits.
> > > > > > > >
> > > > > > > > If our AMD friends don't do this, what is their alternative?
> > > > > > >
> > > > > > > The alternative is coordination on the host side: VMM can load a BIOS that
> > > > > > > pre-accepts all memory if the kernel is older.
> > > > > > >
> > > > > >
> > > > > > And how does one identify such a kernel? How does the VMM know which
> > > > > > kernel the guest is going to load after it boots?
> > > > >
> > > > > VMM has to know what it is running. Yes, it is cumbersome. But enabling
> > > > > phase for a feature is often rough. It will get smoother over time.
> > > > >
> > > >
> > > > So how does the VMM get informed about what it is running? How does it
> > > > distinguish between kernels that support unaccepted memory and ones
> > > > that don't? And how does it predict which kernel a guest is going to
> > > > load?
> > >
> > > The user will specify whether they want unaccepted memory for the VM. And
> > > if they do, it is their responsibility to have a kernel that supports it.
> > >
> > > And you have not addressed my question:
> > >
> > >         How is it different from any other feature the kernel is not [yet] aware
> > >         of?
> > >
> >
> > It is the same problem, but this is just a better solution.
>
> Okay, we at least agree that there is more than one solution to the
> problem.
>
> > Having a BIOS menu option (or similar) to choose between unaccepted
> > memory or not (or to expose CXL memory via the EFI memory map, which is
> > another hack I have seen) is just unnecessary complication, if the
> > kernel can simply inform the loader about what it supports. We do this
> > all the time with things like OsIndications.
>
> It assumes that the kernel calls ExitBootServices(), which is not always true.
> A bootloader in between will make it impossible for the kernel to use any of
> the features exposed this way.
>
> But we talked about this before.
>

Yes, we have. But this is a theoretical concern, as nobody who is
deploying this stuff is interested in booting the kernel without the
stub: even the trenchboot folks are bending over backwards to
incorporate execution of the kernel's EFI stub into the D-RTM
bootflow, and all of the confidential compute attestation logic is
based on EFI protocols as well. So using a bootloader that calls
ExitBootServices() and subsequently boots the Linux kernel using the
legacy boot protocol is simply not something anyone is interested in
doing. But don't take my word for it.

> BTW, can we at least acknowledge the limitation in the commit message?
>

Sure.

> > We can phase out the protocol implementation from the firmware once we
> > no longer need it, at which point the LocateProtocol() call just
> > becomes a NOP (we do the same thing for UGA support, which has
> > disappeared a long time ago, but we still look for the protocol in the
> > EFI stub).
> >
> > Once the firmware stops exposing this protocol (and ceases to accept
> > memory on the OS's behalf), we can phase it out from the kernel as
> > well.
>
> It is unlikely to ever happen. In a few years everybody will forget about
> this conversation, regardless of what is written in the commit message.
>
> Everything works, why bother?
>

That is a good question. If it doesn't get in the way and does not
prevent us from doing any of the things we want to do, why would we
even care?

But as I argued in my reply to Dave, we can actually drop it from
mainline later if we provide an upgrade path for legacy workloads that
want to upgrade their kernels.

> > The only other potential solution I see is exposing the unaccepted
> > memory as coldplugged ACPI memory objects, and implementing the accept
> > calls via PRM methods. But PRM has had very little test coverage, so
> > it is anybody's guess whether it works for the stable kernels that we
> > need to support with this. It would also mean that the new unaccepted
> > memory logic would need to be updated and cross reference these memory
> > regions with EFI unaccepted memory regions and avoid claiming them
> > both.
>
> Nah. That is a lot of complexity for no particular reason.
>

Good, at least we agree on that :-)

Dave Hansen April 5, 2023, 4:15 p.m. UTC | #15

On 4/5/23 06:44, Ard Biesheuvel wrote:
> Given that the intent here is to retain compatibility with
> unenlightened workloads (i.e., which do not upgrade their kernels), I
> think it is perfectly reasonable to drop this from mainline at some
> point.

OK, so there are three firmware types that matter:

1. Today's SEV-SNP deployed firmware.
2. Near future SEV-SNP firmware that exposes the new ExitBootServices()
   protocol that allows guests that speak the protocol to boot faster
   by participating in the unaccepted memory dance.
3. Far future firmware that doesn't have the ExitBootServices() protocol

There are also three kernel types:
1. Old kernels with zero unaccepted memory support: no
   ExitBootServices() protocol support and no hypercalls to accept pages
2. Kernels that can accept pages and twiddle the ExitBootServices() flag
3. Future kernels that can accept pages, but have had ExitBootServices()
   support removed.

That leads to nine possible mix-and-match firmware/kernel combos.  I'm
personally assuming that folks are going to *try* to run with all of
these combos and will send us kernel folks bug reports if they see
regressions.  Let's just enumerate all of them and their implications
before we go consult our crystal balls about what folks will actually do
in the future.

So, here we go:

              |                   Kernel                   |
              |                                            |
              | Unenlightened | Enlightened | Dropped UEFI |
Firmware      |     ~5.19??   |    ~6.4??   | protocol     |
              |---------------+-------------+--------------|
Deployed      |   Slow boot   |  Slow boot  |  Slow boot   |
Near future   |   Slow boot   |  Fast boot  |  Slow boot   |
Far future    |   Crashes??   |  Fast Boot  |  Fast boot   |

I hope I got that all right.

The thing that worries me is the "Near future firmware" where someone
runs a ~6.4 kernel and has a fast boot experience.  They upgrade to a
newer, "dropped protocol" kernel and their boot gets slower.

I'm also a little fuzzy about what an ancient enlightened kernel would
do on a "far future" firmware that requires unaccepted memory support.
I _think_ those kernels would hit some unaccepted memory, and
#VC/#VE/#whatever and die.  Is that right, or is there some fallback there?

Kirill A. Shutemov April 5, 2023, 7:06 p.m. UTC | #16

On Wed, Apr 05, 2023 at 09:15:15AM -0700, Dave Hansen wrote:
> On 4/5/23 06:44, Ard Biesheuvel wrote:
> > Given that the intent here is to retain compatibility with
> > unenlightened workloads (i.e., which do not upgrade their kernels), I
> > think it is perfectly reasonable to drop this from mainline at some
> > point.
> 
> OK, so there are three firmware types that matter:
> 
> 1. Today's SEV-SNP deployed firmware.
> 2. Near future SEV-SNP firmware that exposes the new ExitBootServices()
>    protocol that allows guests that speak the protocol to boot faster
>    by participating in the unaccepted memory dance.
> 3. Far future firmware that doesn't have the ExitBootServices() protocol
> 
> There are also three kernel types:
> 1. Old kernels with zero unaccepted memory support: no
>    ExitBootServices() protocol support and no hypercalls to accept pages
> 2. Kernels that can accept pages and twiddle the ExitBootServices() flag
> 3. Future kernels that can accept pages, but have had ExitBootServices()
>    support removed.
> 
> That leads to nine possible mix-and-match firmware/kernel combos.  I'm
> personally assuming that folks are going to *try* to run with all of
> these combos and will send us kernel folks bug reports if they see
> regressions.  Let's just enumerate all of them and their implications
> before we go consult our crystal balls about what folks will actually do
> in the future.
> 
> So, here we go:
> 
>               |                   Kernel                   |
>               |                                            |
>               | Unenlightened | Enlightened | Dropped UEFI |
> Firmware      |     ~5.19??   |    ~6.4??   | protocol     |
>               |---------------+-------------+--------------|
> Deployed      |   Slow boot   |  Slow boot  |  Slow boot   |
> Near future   |   Slow boot   |  Fast boot  |  Slow boot   |
> Far future    |   Crashes??   |  Fast Boot  |  Fast boot   |
> 
> I hope I got that all right.
> 
> The thing that worries me is the "Near future firmware" where someone
> runs a ~6.4 kernel and has a fast boot experience.  They upgrade to a
> newer, "dropped protocol" kernel and their boot gets slower.
> 
> I'm also a little fuzzy about what an ancient enlightened kernel would
> do on a "far future" firmware that requires unaccepted memory support.
> I _think_ those kernels would hit some unaccepted memory, and
> #VC/#VE/#whatever and die.  Is that right, or is there some fallback there?

The far future firmware in this scheme would expose unaccepted memory in
EFI memory map without need of kernel to declare unaccepted memory
support. The unenlightened kernel in this case will not be able to use the
memory and consider it reserved. Only memory accepted by firmware will be
accessible. Depending on how much memory firmware would pre-accept it can
be OOM, but more likely it will boot fine with the fraction of memory
usable.
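
To make the "fraction of memory usable" point concrete, here is a minimal
sketch in C of how the two kinds of kernel would count usable memory. It is a
simplified model under stated assumptions: the memory types, struct mem_region
and usable_bytes() are hypothetical and are not the kernel's EFI memory map
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical memory types; not the kernel's definitions. */
enum mem_type { MEM_CONVENTIONAL, MEM_UNACCEPTED, MEM_RESERVED };

struct mem_region {
	enum mem_type type;
	uint64_t size; /* bytes */
};

/*
 * Sum the memory a kernel could use. An unenlightened kernel does not
 * know the unaccepted type and treats it like reserved memory.
 */
uint64_t usable_bytes(const struct mem_region *map, size_t n, int enlightened)
{
	uint64_t total = 0;

	assert(map != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (map[i].type == MEM_CONVENTIONAL)
			total += map[i].size;
		else if (map[i].type == MEM_UNACCEPTED && enlightened)
			total += map[i].size;
	}
	return total;
}
```

With a made-up map in which the firmware pre-accepts only part of the RAM,
the unenlightened count is just the pre-accepted part, while the enlightened
count also includes the unaccepted regions.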

Tom Lendacky April 5, 2023, 8:11 p.m. UTC | #17

On 4/5/23 14:06, Kirill A. Shutemov wrote:
> On Wed, Apr 05, 2023 at 09:15:15AM -0700, Dave Hansen wrote:
>> On 4/5/23 06:44, Ard Biesheuvel wrote:
>>> Given that the intent here is to retain compatibility with
>>> unenlightened workloads (i.e., which do not upgrade their kernels), I
>>> think it is perfectly reasonable to drop this from mainline at some
>>> point.
>>
>> OK, so there are three firmware types that matter:
>>
>> 1. Today's SEV-SNP deployed firmware.

SNP support was originally made available as part of the edk2-stable202202 release.

>> 2. Near future SEV-SNP firmware that exposes the new ExitBootServices()
>>     protocol that allows guests that speak the protocol to boot faster
>>     by participating in the unaccepted memory dance.

This is already out and available as part of the edk2-stable202302 release.

But it did come out after general SNP support, so the near future 
terminology works.

>> 3. Far future firmware that doesn't have the ExitBootServices() protocol
>>
>> There are also three kernel types:
>> 1. Old kernels with zero unaccepted memory support: no
>>     ExitBootServices() protocol support and no hypercalls to accept pages
>> 2. Kernels that can accept pages and twiddle the ExitBootServices() flag
>> 3. Future kernels that can accept pages, but have had ExitBootServices()
>>     support removed.
>>
>> That leads to nine possible mix-and-match firmware/kernel combos.  I'm
>> personally assuming that folks are going to *try* to run with all of
>> these combos and will send us kernel folks bug reports if they see
>> regressions.  Let's just enumerate all of them and their implications
>> before we go consult our crystal balls about what folks will actually do
>> in the future.
>>
>> So, here we go:
>>
>>               |                   Kernel                   |
>>               |                                            |
>>               | Unenlightened | Enlightened | Dropped UEFI |
>> Firmware      |     ~5.19??   |    ~6.4??   | protocol     |
>>               |---------------+-------------+--------------|
>> Deployed      |   Slow boot   |  Slow boot  |  Slow boot   |
>> Near future   |   Slow boot   |  Fast boot  |  Slow boot   |
>> Far future    |   Crashes??   |  Fast Boot  |  Fast boot   |
>>
>> I hope I got that all right.

Looks correct to me (with Kirill's description below in place of the 
"Crashes??").

>>
>> The thing that worries me is the "Near future firmware" where someone
>> runs a ~6.4 kernel and has a fast boot experience.  They upgrade to a
>> newer, "dropped protocol" kernel and their boot gets slower.

Right, so that raises the question of when to actually drop the
call. Or does it really need to be dropped? It's a small patch to execute
a boot services call; I guess I don't see the big deal of it being there.
If the firmware still has the protocol, the call is made; if it doesn't,
it's not. In the overall support for unaccepted memory, this seems to be a
very minor piece.

>>
>> I'm also a little fuzzy about what an ancient enlightened kernel would
>> do on a "far future" firmware that requires unaccepted memory support.
>> I _think_ those kernels would hit some unaccepted memory, and
>> #VC/#VE/#whatever and die.  Is that right, or is there some fallback there?
> 
> The far future firmware in this scheme would expose unaccepted memory in
> EFI memory map without need of kernel to declare unaccepted memory
> support. The unenlightened kernel in this case will not be able to use the
> memory and consider it reserved. Only memory accepted by firmware will be
> accessible. Depending on how much memory firmware would pre-accept it can
> be OOM, but more likely it will boot fine with the fraction of memory
> usable.

Right, since a typical Qemu VM has a 2GB hole for PCI/MMIO, the guest is 
likely to only see 2GB of memory available to it.

Thanks,
Tom

>

Dave Hansen April 5, 2023, 9:22 p.m. UTC | #18

On 4/5/23 13:11, Tom Lendacky wrote:
>>> The thing that worries me is the "Near future firmware" where someone
>>> runs a ~6.4 kernel and has a fast boot experience.  They upgrade to a
>>> newer, "dropped protocol" kernel and their boot gets slower.
> 
> Right, so that raises the question of when to actually drop the
> call. Or does it really need to be dropped? It's a small patch to
> execute a boot services call; I guess I don't see the big deal of it
> being there.
> If the firmware still has the protocol, the call is made; if it doesn't,
> it's not. In the overall support for unaccepted memory, this seems to be
> a very minor piece.

I honestly don't think it's a big deal either, at least on the kernel
side.  Maybe it's a bigger deal to the firmware folks on their side.

So, the corrected table looks something like this:

              |                   Kernel                   |
              |                                            |
              | Unenlightened | Enlightened | Dropped UEFI |
Firmware      |     ~5.19??   |    ~6.4??   | protocol     |
              |---------------+-------------+--------------|
Deployed      |   Slow boot   |  Slow boot  |  Slow boot   |
Near future   |   Slow boot   |  Fast boot  |  Slow boot   |
Far future    |  2GB limited  |  Fast Boot  |  Fast boot   |

But, honestly, I don't see much benefit to the "dropped UEFI protocol".
It adds complexity and will represent a regression either in boot
speeds, or in unenlightened kernels losing RAM when moving to newer
firmware.  Neither of those is great.

Looking at this _purely_ from the kernel perspective, I think I'd prefer
this situation:

          |            Kernel           |
          |                             |
          | Unenlightened | Enlightened |
Firmware  |     ~5.19??   |    ~6.4??   |
          |---------------+-------------+
Deployed  |   Slow boot   |  Slow boot  |
Future    |   Slow boot   |  Fast boot  |

and not have future firmware drop support for the handshake protocol.
That way there are no potential regressions.

Is there a compelling reason on the firmware side to drop the
ExitBootServices() protocol that I'm missing?

Ard Biesheuvel April 5, 2023, 9:34 p.m. UTC | #19

On Wed, 5 Apr 2023 at 23:23, Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 4/5/23 13:11, Tom Lendacky wrote:
> >>> The thing that worries me is the "Near future firmware" where someone
> >>> runs a ~6.4 kernel and has a fast boot experience.  They upgrade to a
> >>> newer, "dropped protocol" kernel and their boot gets slower.
> >
> > Right, so that raises the question of when to actually drop the
> > call. Or does it really need to be dropped? It's a small patch to
> > execute a boot services call; I guess I don't see the big deal of it
> > being there.
> > If the firmware still has the protocol, the call is made; if it doesn't,
> > it's not. In the overall support for unaccepted memory, this seems to be
> > a very minor piece.
>
> I honestly don't think it's a big deal either, at least on the kernel
> side.  Maybe it's a bigger deal to the firmware folks on their side.
>
> So, the corrected table looks something like this:
>
>               |                   Kernel                   |
>               |                                            |
>               | Unenlightened | Enlightened | Dropped UEFI |
> Firmware      |     ~5.19??   |    ~6.4??   | protocol     |
>               |---------------+-------------+--------------|
> Deployed      |   Slow boot   |  Slow boot  |  Slow boot   |
> Near future   |   Slow boot   |  Fast boot  |  Slow boot   |
> Far future    |  2GB limited  |  Fast Boot  |  Fast boot   |
>

I don't think there is any agreement on the firmware side on what
constitutes a reasonable minimum to accept when lazy accept is in
use, so the 2 GiB is really the upper bound here, and it could be
substantially less.

>
> But, honestly, I don't see much benefit to the "dropped UEFI protocol".
> It adds complexity and will represent a regression either in boot
> speeds, or in unenlightened kernels losing RAM when moving to newer
> firmware.  Neither of those is great.
>
> Looking at this _purely_ from the kernel perspective, I think I'd prefer
> this situation:
>
>           |            Kernel           |
>           |                             |
>           | Unenlightened | Enlightened |
> Firmware  |     ~5.19??   |    ~6.4??   |
>           |---------------+-------------+
> Deployed  |   Slow boot   |  Slow boot  |
> Future    |   Slow boot   |  Fast boot  |
>
> and not have future firmware drop support for the handshake protocol.
> That way there are no potential regressions.
>
> Is there a compelling reason on the firmware side to drop the
> ExitBootServices() protocol that I'm missing?

The protocol only exists to stop the firmware from eagerly accepting
all memory on behalf of the OS. So from the firmware side, it would be
more about removing that functionality (making the protocol call moot)
rather than removing the protocol itself.
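
A minimal sketch in C of the firmware-side behavior described above, as I read
it from this thread. This is a toy model and not OVMF code: struct fw_state,
its fields and on_exit_boot_services() are hypothetical; only the name
allow_unaccepted_memory is borrowed from the protocol member in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical firmware state, for illustration only. */
struct fw_state {
	bool allow_unaccepted;   /* set when the OS invokes the protocol */
	size_t unaccepted_pages; /* pages not yet accepted */
	size_t accepted_pages;   /* pages already accepted */
};

/* Models the protocol call: the OS promises to accept memory itself. */
void allow_unaccepted_memory(struct fw_state *fw)
{
	assert(fw != NULL);
	fw->allow_unaccepted = true;
}

/* Models what the firmware does when the OS exits boot services. */
void on_exit_boot_services(struct fw_state *fw)
{
	assert(fw != NULL);

	/* The OS opted in: leave unaccepted memory for it to handle. */
	if (fw->allow_unaccepted)
		return;

	/* Otherwise eagerly accept everything on the OS's behalf. */
	fw->accepted_pages += fw->unaccepted_pages;
	fw->unaccepted_pages = 0;
}
```

Removing the eager-accept branch from on_exit_boot_services() would make the
protocol call moot in this model, whether or not the protocol stays exposed.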

Patch

diff --git a/drivers/firmware/efi/libstub/x86-stub.c b/drivers/firmware/efi/libstub/x86-stub.c
index 1afe7b5b02e1..119e201cfc68 100644
--- a/drivers/firmware/efi/libstub/x86-stub.c
+++ b/drivers/firmware/efi/libstub/x86-stub.c
@@ -27,6 +27,17 @@  const efi_dxe_services_table_t *efi_dxe_table;
 u32 image_offset __section(".data");
 static efi_loaded_image_t *image = NULL;
 
+typedef union sev_memory_acceptance_protocol sev_memory_acceptance_protocol_t;
+union sev_memory_acceptance_protocol {
+	struct {
+		efi_status_t (__efiapi * allow_unaccepted_memory)(
+			sev_memory_acceptance_protocol_t *);
+	};
+	struct {
+		u32 allow_unaccepted_memory;
+	} mixed_mode;
+};
+
 static efi_status_t
 preserve_pci_rom_image(efi_pci_io_protocol_t *pci, struct pci_setup_rom **__rom)
 {
@@ -311,6 +322,29 @@  setup_memory_protection(unsigned long image_base, unsigned long image_size)
 #endif
 }
 
+static void setup_unaccepted_memory(void)
+{
+	efi_guid_t mem_acceptance_proto = OVMF_SEV_MEMORY_ACCEPTANCE_PROTOCOL_GUID;
+	sev_memory_acceptance_protocol_t *proto;
+	efi_status_t status;
+
+	if (!IS_ENABLED(CONFIG_UNACCEPTED_MEMORY))
+		return;
+
+	/*
+	 * Enable unaccepted memory before calling exit boot services in order
+	 * for the UEFI to not accept all memory on EBS.
+	 */
+	status = efi_bs_call(locate_protocol, &mem_acceptance_proto, NULL,
+			     (void **)&proto);
+	if (status != EFI_SUCCESS)
+		return;
+
+	status = efi_call_proto(proto, allow_unaccepted_memory);
+	if (status != EFI_SUCCESS)
+		efi_err("Memory acceptance protocol failed\n");
+}
+
 static const efi_char16_t apple[] = L"Apple";
 
 static void setup_quirks(struct boot_params *boot_params,
@@ -967,6 +1001,8 @@  asmlinkage unsigned long efi_main(efi_handle_t handle,
 
 	setup_quirks(boot_params, bzimage_addr, buffer_end - buffer_start);
 
+	setup_unaccepted_memory();
+
 	status = exit_boot(boot_params, handle);
 	if (status != EFI_SUCCESS) {
 		efi_err("exit_boot() failed!\n");
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 1d4f0343c710..e728b8cf6b73 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -436,6 +436,9 @@  void efi_native_runtime_setup(void);
 #define DELLEMC_EFI_RCI2_TABLE_GUID		EFI_GUID(0x2d9f28a2, 0xa886, 0x456a,  0x97, 0xa8, 0xf1, 0x1e, 0xf2, 0x4f, 0xf4, 0x55)
 #define AMD_SEV_MEM_ENCRYPT_GUID		EFI_GUID(0x0cf29b71, 0x9e51, 0x433a,  0xa3, 0xb7, 0x81, 0xf3, 0xab, 0x16, 0xb8, 0x75)
 
+/* OVMF protocol GUIDs */
+#define OVMF_SEV_MEMORY_ACCEPTANCE_PROTOCOL_GUID	EFI_GUID(0xc5a010fe, 0x38a7, 0x4531,  0x8a, 0x4a, 0x05, 0x00, 0xd2, 0xfd, 0x16, 0x49)
+
 typedef struct {
 	efi_guid_t guid;
 	u64 table;