[PATCHv13,5/9] efi: Add unaccepted memory support

Message ID 20230601182543.19036-6-kirill.shutemov@linux.intel.com
State New
Series mm, x86/cc, efi: Implement support for unaccepted memory

Commit Message

Kirill A. Shutemov June 1, 2023, 6:25 p.m. UTC
efi_config_parse_tables() reserves the memory that holds the unaccepted
memory configuration table so it won't be reused by the page allocator.

Core-mm requires a few helpers to support unaccepted memory:

 - accept_memory() checks the range of addresses against the bitmap and
   accepts memory if needed.

 - range_contains_unaccepted_memory() checks if anything within the
   range requires acceptance.

Architectural code has to provide efi_get_unaccepted_table(), which
returns a pointer to the unaccepted memory configuration table.

arch_accept_memory() handles the arch-specific part of memory acceptance.
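
As a rough illustration of the two helpers described above, here is a
userspace sketch of the bitmap logic. It is not the kernel code: struct
model_table, MODEL_UNITS and the model_*() names are hypothetical, the
bitmap is a single 64-bit word, and the call to arch_accept_memory() is
replaced by a comment. Ranges are treated as half-open [start, end).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_UNITS 64 /* hypothetical: the model bitmap covers 64 units */

/* Userspace stand-in for the unaccepted memory table; not the kernel layout. */
struct model_table {
	uint64_t phys_base; /* first physical address covered by the bitmap */
	uint64_t unit_size; /* bytes of memory represented by one bit */
	uint64_t bitmap;    /* bit N set => unit N is still unaccepted */
};

/* Clamp [start, end) to the covered window and convert it to unit indices. */
static bool model_clamp(const struct model_table *t, uint64_t start,
			uint64_t end, uint64_t *first, uint64_t *last)
{
	uint64_t limit = MODEL_UNITS * t->unit_size;

	assert(t->unit_size != 0);

	if (start >= end || end <= t->phys_base)
		return false; /* empty, or entirely below the bitmap */
	if (start < t->phys_base)
		start = t->phys_base;

	/* Translate to offsets from the beginning of the bitmap */
	start -= t->phys_base;
	end -= t->phys_base;

	if (start >= limit)
		return false; /* entirely above the bitmap */
	if (end > limit)
		end = limit;

	*first = start / t->unit_size;
	*last = (end + t->unit_size - 1) / t->unit_size; /* round up */
	return true;
}

/* True if any unit overlapping [start, end) is still marked unaccepted. */
bool model_range_contains_unaccepted(const struct model_table *t,
				     uint64_t start, uint64_t end)
{
	uint64_t first, last;

	if (!model_clamp(t, start, end, &first, &last))
		return false;

	for (uint64_t i = first; i < last; i++) {
		if (t->bitmap & (UINT64_C(1) << i))
			return true;
	}
	return false;
}

/* Clear every set bit overlapping [start, end); returns units accepted. */
unsigned int model_accept_memory(struct model_table *t,
				 uint64_t start, uint64_t end)
{
	uint64_t first, last;
	unsigned int accepted = 0;

	if (!model_clamp(t, start, end, &first, &last))
		return 0;

	for (uint64_t i = first; i < last; i++) {
		uint64_t mask = UINT64_C(1) << i;

		if (t->bitmap & mask) {
			/* The kernel would call arch_accept_memory() here. */
			t->bitmap &= ~mask;
			accepted++;
		}
	}
	return accepted;
}
```

In this model a request that lies partly or wholly outside the window
covered by the bitmap is simply trimmed, and a second accept over the
same range finds no set bits and does nothing.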

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Tom Lendacky <thomas.lendacky@amd.com>
---
 arch/x86/platform/efi/efi.c              |   3 +
 drivers/firmware/efi/Makefile            |   1 +
 drivers/firmware/efi/efi.c               |  25 ++++++
 drivers/firmware/efi/unaccepted_memory.c | 103 +++++++++++++++++++++++
 include/linux/efi.h                      |   1 +
 5 files changed, 133 insertions(+)
 create mode 100644 drivers/firmware/efi/unaccepted_memory.c
  

Comments

Borislav Petkov June 5, 2023, 3:43 p.m. UTC | #1
On Thu, Jun 01, 2023 at 09:25:39PM +0300, Kirill A. Shutemov wrote:
> +void accept_memory(phys_addr_t start, phys_addr_t end)
> +{
> +	struct efi_unaccepted_memory *unaccepted;
> +	unsigned long range_start, range_end;
> +	unsigned long flags;
> +	u64 unit_size;
> +
> +	if (efi.unaccepted == EFI_INVALID_TABLE_ADDR)
> +		return;

efi_get_unaccepted_table() already does this test.

> +	unaccepted = efi_get_unaccepted_table();
> +	if (!unaccepted)
> +		return;

So this looks weird: callers can call accept_memory() and that function
can fail. But they can't know whether it failed or not because it
returns void.

> +	unit_size = unaccepted->unit_size;
> +
> +	/*
> +	 * Only care for the part of the range that is represented
> +	 * in the bitmap.
> +	 */
> +	if (start < unaccepted->phys_base)
> +		start = unaccepted->phys_base;

So this silently trims start...

> +	if (end < unaccepted->phys_base)
> +		return;

But fails only when end is outside of range.

I'd warn here at least. And return an error so that the callers know.

> +	/* Translate to offsets from the beginning of the bitmap */
> +	start -= unaccepted->phys_base;
> +	end -= unaccepted->phys_base;
> +
> +	/* Make sure not to overrun the bitmap */
> +	if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
> +		end = unaccepted->size * unit_size * BITS_PER_BYTE;

How is all that trimming not important to the caller?

It would assume that its memory got accepted but not really.

> +	range_start = start / unit_size;
> +
> +	spin_lock_irqsave(&unaccepted_memory_lock, flags);
> +	for_each_set_bitrange_from(range_start, range_end, unaccepted->bitmap,
> +				   DIV_ROUND_UP(end, unit_size)) {
> +		unsigned long phys_start, phys_end;
> +		unsigned long len = range_end - range_start;
> +
> +		phys_start = range_start * unit_size + unaccepted->phys_base;
> +		phys_end = range_end * unit_size + unaccepted->phys_base;
> +
> +		arch_accept_memory(phys_start, phys_end);
> +		bitmap_clear(unaccepted->bitmap, range_start, len);
> +	}
> +	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
> +}
> +
> +bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
> +{
> +	struct efi_unaccepted_memory *unaccepted;
> +	unsigned long flags;
> +	bool ret = false;
> +	u64 unit_size;
> +
> +	unaccepted = efi_get_unaccepted_table();
> +	if (!unaccepted)
> +		return false;
> +
> +	unit_size = unaccepted->unit_size;
> +
> +	/*
> +	 * Only care for the part of the range that is represented
> +	 * in the bitmap.
> +	 */
> +	if (start < unaccepted->phys_base)
> +		start = unaccepted->phys_base;

Same comment as above. Trimming start is fine?

> +	if (end < unaccepted->phys_base)
> +		return false;
> +
> +	/* Translate to offsets from the beginning of the bitmap */
> +	start -= unaccepted->phys_base;
> +	end -= unaccepted->phys_base;

Ditto as above.

> +
> +	/* Make sure not to overrun the bitmap */
> +	if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
> +		end = unaccepted->size * unit_size * BITS_PER_BYTE;

Ditto.

> +	spin_lock_irqsave(&unaccepted_memory_lock, flags);
> +	while (start < end) {
> +		if (test_bit(start / unit_size, unaccepted->bitmap)) {
> +			ret = true;
> +			break;

I have a faint memory we've had this before but you need to check
*every* bit in the unaccepted bitmap before returning true. Doh.
  
Kirill A. Shutemov June 5, 2023, 5:33 p.m. UTC | #2
On Mon, Jun 05, 2023 at 05:43:33PM +0200, Borislav Petkov wrote:
> On Thu, Jun 01, 2023 at 09:25:39PM +0300, Kirill A. Shutemov wrote:
> > +void accept_memory(phys_addr_t start, phys_addr_t end)
> > +{
> > +	struct efi_unaccepted_memory *unaccepted;
> > +	unsigned long range_start, range_end;
> > +	unsigned long flags;
> > +	u64 unit_size;
> > +
> > +	if (efi.unaccepted == EFI_INVALID_TABLE_ADDR)
> > +		return;
> 
> efi_get_unaccepted_table() already does this test.

Okay.

> > +	unaccepted = efi_get_unaccepted_table();
> > +	if (!unaccepted)
> > +		return;
> 
> So this looks weird: callers can call accept_memory() and that function
> can fail. But they can't know whether it failed or not because it
> returns void.

It is not a failure here. If there's no unaccepted memory in the system,
accept_memory() always succeeds.

> > +	unit_size = unaccepted->unit_size;
> > +
> > +	/*
> > +	 * Only care for the part of the range that is represented
> > +	 * in the bitmap.
> > +	 */
> > +	if (start < unaccepted->phys_base)
> > +		start = unaccepted->phys_base;
> 
> So this silently trims start...
> 
> > +	if (end < unaccepted->phys_base)
> > +		return;
> 
> But fails only when end is outside of range.
> 
> I'd warn here at least. And return an error so that the callers know.

There's nothing to warn about. The range (or part of it) is not
represented in the bitmap because it is not unaccepted. We only allocate
the bitmap for the range that has unaccepted memory. This reduces the
memory overhead of the bitmap if the unaccepted memory starts very high,
or if it ends early but there's something else very high in the physical
address space.
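
To put rough numbers on the overhead argument, here is a small userspace
sketch of the sizing arithmetic. model_bitmap_bytes() is a hypothetical
helper, not something from the patch, and the 2 MiB unit size and the
4 GiB to 8 GiB range used below are assumptions picked only for the
example.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for a bitmap with one bit per unit covering [base, end).
 * Hypothetical helper for illustration; not part of the patch.
 */
uint64_t model_bitmap_bytes(uint64_t base, uint64_t end, uint64_t unit_size)
{
	uint64_t units;

	assert(unit_size != 0);

	if (end <= base)
		return 0;

	units = (end - base + unit_size - 1) / unit_size; /* round up to units */
	return (units + 7) / 8;                           /* round up to bytes */
}
```

With those assumed values, a bitmap starting at address 0 would need
model_bitmap_bytes(0, 8ULL << 30, 2ULL << 20) = 512 bytes, while one
starting at a phys_base of 4 GiB needs
model_bitmap_bytes(4ULL << 30, 8ULL << 30, 2ULL << 20) = 256 bytes.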

> > +	/* Translate to offsets from the beginning of the bitmap */
> > +	start -= unaccepted->phys_base;
> > +	end -= unaccepted->phys_base;
> > +
> > +	/* Make sure not to overrun the bitmap */
> > +	if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
> > +		end = unaccepted->size * unit_size * BITS_PER_BYTE;
> 
> How is all that trimming not important to the caller?
> 
> It would assume that its memory got accepted but not really.

See above: not represented in the bitmap means pre-accepted.

...

> > +	spin_lock_irqsave(&unaccepted_memory_lock, flags);
> > +	while (start < end) {
> > +		if (test_bit(start / unit_size, unaccepted->bitmap)) {
> > +			ret = true;
> > +			break;
> 
> I have a faint memory we've had this before but you need to check
> *every* bit in the unaccepted bitmap before returning true. Doh.

Yes, it was discussed before. Here's context:

https://lore.kernel.org/all/Ynt8vDY78/YeXO99@zn.tnic
  
Borislav Petkov June 5, 2023, 7:12 p.m. UTC | #3
On Mon, Jun 05, 2023 at 08:33:03PM +0300, Kirill A. Shutemov wrote:
> There's nothing to warn about. The range (or part of it) is not
> represented in the bitmap because it is not unaccepted.

Sorry but how am I supposed to know that?!

I've read the whole patchset up until now and all text talks like *all*
*memory* needs to be accepted and before that has happened, it is
unaccepted.

So how about you explain that explicitly somewhere, perhaps in a comment
above accept_memory(), that the unaccepted range is not the whole memory
but only, well, what is unaccepted and the rest is implicitly accepted?

And I went and looked at the final result - we error() if we fail
accepting.

I guess that's the only action we can do anyway...

> Yes, it was discussed before. Here's context:
> 
> https://lore.kernel.org/all/Ynt8vDY78/YeXO99@zn.tnic

You should try those before you paste them - it says "Not found" because
of the '/' in the Message-ID and it needs to be escaped.

This works:

https://lore.kernel.org/all/Ynt8vDY78%2FYeXO99@zn.tnic/
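
For reference, the escaping amounts to replacing each '/' in the
Message-ID with "%2F" before appending it to the URL. A minimal sketch,
assuming only '/' needs handling; model_escape_msgid() is a hypothetical
helper and not a general URL encoder:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy msgid into out, replacing each '/' with "%2F".
 * Returns 0 on success, -1 if out is too small for the result.
 */
int model_escape_msgid(const char *msgid, char *out, size_t out_len)
{
	size_t pos = 0;

	assert(msgid != NULL && out != NULL);

	if (out_len == 0)
		return -1;

	for (; *msgid != '\0'; msgid++) {
		size_t need = (*msgid == '/') ? 3 : 1;

		/* Always keep one byte free for the terminating NUL. */
		if (pos + need >= out_len)
			return -1;

		if (*msgid == '/')
			memcpy(out + pos, "%2F", 3);
		else
			out[pos] = *msgid;
		pos += need;
	}

	out[pos] = '\0';
	return 0;
}
```

Feeding it "Ynt8vDY78/YeXO99@zn.tnic" yields
"Ynt8vDY78%2FYeXO99@zn.tnic", which matches the working link above.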

Now I remember.

Thx.
  
Kirill A. Shutemov June 5, 2023, 9:37 p.m. UTC | #4
On Mon, Jun 05, 2023 at 09:12:25PM +0200, Borislav Petkov wrote:
> On Mon, Jun 05, 2023 at 08:33:03PM +0300, Kirill A. Shutemov wrote:
> > There's nothing to warn about. The range (or part of it) is not
> > represented in the bitmap because it is not unaccepted.
> 
> Sorry but how am I supposed to know that?!
> 
> I've read the whole patchset up until now and all text talks like *all*
> > *memory* needs to be accepted and before that has happened, it is
> unaccepted.
> 
> So how about you explain that explicitly somewhere, perhaps in a comment
> above accept_memory(), that the unaccepted range is not the whole memory
> but only, well, what is unaccepted and the rest is implicitly accepted?

Okay, will do.

> And I went and looked at the final result - we error() if we fail
> accepting.
> 
> I guess that's the only action we can do anyway...

Right, there's no recovery from the error.
  
Kirill A. Shutemov June 6, 2023, 12:19 p.m. UTC | #5
On Mon, Jun 05, 2023 at 09:12:25PM +0200, Borislav Petkov wrote:
> So how about you explain that explicitly somewhere, perhaps in a comment
> above accept_memory(), that the unaccepted range is not the whole memory
> but only, well, what is unaccepted and the rest is implicitly accepted?

Does it look okay to you?

/*
 * accept_memory() -- Consult bitmap and accept the memory if needed.
 *
 * Only memory that explicitly marked as unaccepted in the bitmap requires
 * an action.
 *
 * No need to accept:
 *  - anything if the system has no unaccepted table;
 *  - memory that is below phys_base;
 *  - memory that is above the memory that addressable by the bitmap;
 */
  
Borislav Petkov June 6, 2023, 12:29 p.m. UTC | #6
On Tue, Jun 06, 2023 at 03:19:24PM +0300, Kirill A. Shutemov wrote:
> Does it look okay to you?
> 
> /*
>  * accept_memory() -- Consult bitmap and accept the memory if needed.
>  *
>  * Only memory that explicitly marked as unaccepted in the bitmap requires

		... that is ...

>  * an action.

And let's add an additional sentence stating it all clearly:

"All the remaining memory is implicitly accepted and doesn't need acceptance."

>  *
>  * No need to accept:
>  *  - anything if the system has no unaccepted table;
>  *  - memory that is below phys_base;
>  *  - memory that is above the memory that addressable by the bitmap;

And this is an additional clarification.

Good, thanks.
  

Patch

diff --git a/arch/x86/platform/efi/efi.c b/arch/x86/platform/efi/efi.c
index f3f2d87cce1b..e9f99c56f3ce 100644
--- a/arch/x86/platform/efi/efi.c
+++ b/arch/x86/platform/efi/efi.c
@@ -96,6 +96,9 @@  static const unsigned long * const efi_tables[] = {
 #ifdef CONFIG_EFI_COCO_SECRET
 	&efi.coco_secret,
 #endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	&efi.unaccepted,
+#endif
 };
 
 u64 efi_setup;		/* efi setup_data physical address */
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index b51f2a4c821e..e489fefd23da 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -41,3 +41,4 @@  obj-$(CONFIG_EFI_CAPSULE_LOADER)	+= capsule-loader.o
 obj-$(CONFIG_EFI_EARLYCON)		+= earlycon.o
 obj-$(CONFIG_UEFI_CPER_ARM)		+= cper-arm.o
 obj-$(CONFIG_UEFI_CPER_X86)		+= cper-x86.o
+obj-$(CONFIG_UNACCEPTED_MEMORY)		+= unaccepted_memory.o
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 7dce06e419c5..d817e7afd266 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -50,6 +50,9 @@  struct efi __read_mostly efi = {
 #ifdef CONFIG_EFI_COCO_SECRET
 	.coco_secret		= EFI_INVALID_TABLE_ADDR,
 #endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	.unaccepted		= EFI_INVALID_TABLE_ADDR,
+#endif
 };
 EXPORT_SYMBOL(efi);
 
@@ -605,6 +608,9 @@  static const efi_config_table_type_t common_tables[] __initconst = {
 #ifdef CONFIG_EFI_COCO_SECRET
 	{LINUX_EFI_COCO_SECRET_AREA_GUID,	&efi.coco_secret,	"CocoSecret"	},
 #endif
+#ifdef CONFIG_UNACCEPTED_MEMORY
+	{LINUX_EFI_UNACCEPTED_MEM_TABLE_GUID,	&efi.unaccepted,	"Unaccepted"	},
+#endif
 #ifdef CONFIG_EFI_GENERIC_STUB
 	{LINUX_EFI_SCREEN_INFO_TABLE_GUID,	&screen_info_table			},
 #endif
@@ -759,6 +765,25 @@  int __init efi_config_parse_tables(const efi_config_table_t *config_tables,
 		}
 	}
 
+	if (IS_ENABLED(CONFIG_UNACCEPTED_MEMORY) &&
+	    efi.unaccepted != EFI_INVALID_TABLE_ADDR) {
+		struct efi_unaccepted_memory *unaccepted;
+
+		unaccepted = early_memremap(efi.unaccepted, sizeof(*unaccepted));
+		if (unaccepted) {
+			unsigned long size;
+
+			if (unaccepted->version == 1) {
+				size = sizeof(*unaccepted) + unaccepted->size;
+				memblock_reserve(efi.unaccepted, size);
+			} else {
+				efi.unaccepted = EFI_INVALID_TABLE_ADDR;
+			}
+
+			early_memunmap(unaccepted, sizeof(*unaccepted));
+		}
+	}
+
 	return 0;
 }
 
diff --git a/drivers/firmware/efi/unaccepted_memory.c b/drivers/firmware/efi/unaccepted_memory.c
new file mode 100644
index 000000000000..bb91c41f76fb
--- /dev/null
+++ b/drivers/firmware/efi/unaccepted_memory.c
@@ -0,0 +1,103 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include <linux/efi.h>
+#include <linux/memblock.h>
+#include <linux/spinlock.h>
+#include <asm/unaccepted_memory.h>
+
+/* Protects unaccepted memory bitmap */
+static DEFINE_SPINLOCK(unaccepted_memory_lock);
+
+void accept_memory(phys_addr_t start, phys_addr_t end)
+{
+	struct efi_unaccepted_memory *unaccepted;
+	unsigned long range_start, range_end;
+	unsigned long flags;
+	u64 unit_size;
+
+	if (efi.unaccepted == EFI_INVALID_TABLE_ADDR)
+		return;
+
+	unaccepted = efi_get_unaccepted_table();
+	if (!unaccepted)
+		return;
+
+	unit_size = unaccepted->unit_size;
+
+	/*
+	 * Only care for the part of the range that is represented
+	 * in the bitmap.
+	 */
+	if (start < unaccepted->phys_base)
+		start = unaccepted->phys_base;
+	if (end < unaccepted->phys_base)
+		return;
+
+	/* Translate to offsets from the beginning of the bitmap */
+	start -= unaccepted->phys_base;
+	end -= unaccepted->phys_base;
+
+	/* Make sure not to overrun the bitmap */
+	if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
+		end = unaccepted->size * unit_size * BITS_PER_BYTE;
+
+	range_start = start / unit_size;
+
+	spin_lock_irqsave(&unaccepted_memory_lock, flags);
+	for_each_set_bitrange_from(range_start, range_end, unaccepted->bitmap,
+				   DIV_ROUND_UP(end, unit_size)) {
+		unsigned long phys_start, phys_end;
+		unsigned long len = range_end - range_start;
+
+		phys_start = range_start * unit_size + unaccepted->phys_base;
+		phys_end = range_end * unit_size + unaccepted->phys_base;
+
+		arch_accept_memory(phys_start, phys_end);
+		bitmap_clear(unaccepted->bitmap, range_start, len);
+	}
+	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
+}
+
+bool range_contains_unaccepted_memory(phys_addr_t start, phys_addr_t end)
+{
+	struct efi_unaccepted_memory *unaccepted;
+	unsigned long flags;
+	bool ret = false;
+	u64 unit_size;
+
+	unaccepted = efi_get_unaccepted_table();
+	if (!unaccepted)
+		return false;
+
+	unit_size = unaccepted->unit_size;
+
+	/*
+	 * Only care for the part of the range that is represented
+	 * in the bitmap.
+	 */
+	if (start < unaccepted->phys_base)
+		start = unaccepted->phys_base;
+	if (end < unaccepted->phys_base)
+		return false;
+
+	/* Translate to offsets from the beginning of the bitmap */
+	start -= unaccepted->phys_base;
+	end -= unaccepted->phys_base;
+
+	/* Make sure not to overrun the bitmap */
+	if (end > unaccepted->size * unit_size * BITS_PER_BYTE)
+		end = unaccepted->size * unit_size * BITS_PER_BYTE;
+
+	spin_lock_irqsave(&unaccepted_memory_lock, flags);
+	while (start < end) {
+		if (test_bit(start / unit_size, unaccepted->bitmap)) {
+			ret = true;
+			break;
+		}
+
+		start += unit_size;
+	}
+	spin_unlock_irqrestore(&unaccepted_memory_lock, flags);
+
+	return ret;
+}
diff --git a/include/linux/efi.h b/include/linux/efi.h
index 29cc622910da..9864f9c00da2 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -646,6 +646,7 @@  extern struct efi {
 	unsigned long			tpm_final_log;		/* TPM2 Final Events Log table */
 	unsigned long			mokvar_table;		/* MOK variable config table */
 	unsigned long			coco_secret;		/* Confidential computing secret table */
+	unsigned long			unaccepted;		/* Unaccepted memory table */
 
 	efi_get_time_t			*get_time;
 	efi_set_time_t			*set_time;