[v2] x86/boot: Add a message about ignored early NMIs

Message ID ZaTziftQSSg/v5Np@jeru.linux.bs1.fc.nec.co.jp
State New
Series [v2] x86/boot: Add a message about ignored early NMIs

Commit Message

NOMURA JUNICHI(野村 淳一) Jan. 15, 2024, 8:57 a.m. UTC
Commit 78a509fba9c9 ("x86/boot: Ignore NMIs during very early boot") added
an empty handler in early boot stage to avoid boot failure due to spurious NMIs.

Add a diagnostic message in case we need to know whether early NMIs have
occurred and/or what happened to them.

Signed-off-by: Jun'ichi Nomura <junichi.nomura@nec.com>
Suggested-by: Borislav Petkov <bp@alien8.de>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Link: https://lore.kernel.org/lkml/20231130103339.GCZWhlA196uRklTMNF@fat_crate.local/
--
v2
  * Moved variable declaration and definition to the right place
    based on comments from Kirill.
    No functional changes.
  

Comments

Kirill A. Shutemov Jan. 15, 2024, 10:14 a.m. UTC | #1
On Mon, Jan 15, 2024 at 08:57:45AM +0000, NOMURA JUNICHI(野村 淳一) wrote:
> Commit 78a509fba9c9 ("x86/boot: Ignore NMIs during very early boot") added
> empty handler in early boot stage to avoid boot failure by spurious NMIs.
> 
> Add a diagnostic message in case we need to know whether early NMIs have
> occurred and/or what happened to them.
> 
> Signed-off-by: Jun'ichi Nomura <junichi.nomura@nec.com>
> Suggested-by: Borislav Petkov <bp@alien8.de>
> Suggested-by: H. Peter Anvin <hpa@zytor.com>

Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
  
Borislav Petkov Jan. 23, 2024, 11:26 a.m. UTC | #2
On Mon, Jan 15, 2024 at 08:57:45AM +0000, NOMURA JUNICHI(野村 淳一) wrote:
> +	if (spurious_nmi_count) {
> +		error_putstr("Spurious early NMI ignored. Number of NMIs: 0x");
> +		error_puthex(spurious_nmi_count);
> +		error_putstr("\n");

Uff, that's just silly:

Spurious early NMIs ignored: 0x0000000000000017

Would you like to add an error_putnum() or so in a prepatch which would
make this output

Spurious early NMIs ignored: 23.

?

So that it is human readable and doesn't make me wonder what that hex
value is supposed to mean?

Thx.

Btw, please use this version when sending next time:

---
From: NOMURA JUNICHI(野村 淳一) <junichi.nomura@nec.com>
Date: Mon, 15 Jan 2024 08:57:45 +0000
Subject: [PATCH] x86/boot: Add a message about ignored early NMIs

Commit

  78a509fba9c9 ("x86/boot: Ignore NMIs during very early boot")

added an empty handler in early boot stage to avoid boot failure due to
spurious NMIs.

Add a diagnostic message to show when early NMIs have occurred and/or
what happened to them.

  [ bp: Touchups. ]

Suggested-by: Borislav Petkov <bp@alien8.de>
Suggested-by: H. Peter Anvin <hpa@zytor.com>
Signed-off-by: Jun'ichi Nomura <junichi.nomura@nec.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Link: https://lore.kernel.org/lkml/20231130103339.GCZWhlA196uRklTMNF@fat_crate.local
---
 arch/x86/boot/compressed/ident_map_64.c | 2 +-
 arch/x86/boot/compressed/misc.c         | 7 +++++++
 arch/x86/boot/compressed/misc.h         | 1 +
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index ff09ca6dbb87..909f2a35b60c 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -389,5 +389,5 @@ void do_boot_page_fault(struct pt_regs *regs, unsigned long error_code)
 
 void do_boot_nmi_trap(struct pt_regs *regs, unsigned long error_code)
 {
-	/* Empty handler to ignore NMI during early boot */
+	spurious_nmi_count++;
 }
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
index b99e08e6815b..e7f4eb24a9a4 100644
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -52,6 +52,7 @@ struct port_io_ops pio_ops;
 
 memptr free_mem_ptr;
 memptr free_mem_end_ptr;
+int spurious_nmi_count;
 
 static char *vidmem;
 static int vidport;
@@ -493,6 +494,12 @@ asmlinkage __visible void *extract_kernel(void *rmode, unsigned char *output)
 	/* Disable exception handling before booting the kernel */
 	cleanup_exception_handling();
 
+	if (spurious_nmi_count) {
+		error_putstr("Spurious early NMIs ignored: 0x");
+		error_puthex(spurious_nmi_count);
+		error_putstr("\n");
+	}
+
 	return output + entry_offset;
 }
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
index bc2f0f17fb90..b858d6aa648c 100644
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -59,6 +59,7 @@ extern char _head[], _end[];
 /* misc.c */
 extern memptr free_mem_ptr;
 extern memptr free_mem_end_ptr;
+extern int spurious_nmi_count;
 void *malloc(int size);
 void free(void *where);
 void __putstr(const char *s);
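
The diff above splits the work in two: the NMI handler only increments spurious_nmi_count, and the message is emitted once, later, from extract_kernel(). A minimal userspace sketch of that pattern follows. The function names fake_boot_nmi_trap() and report_spurious_nmis() are hypothetical stand-ins, snprintf() replaces the boot-time error_putstr()/error_puthex() helpers, and the 16-digit hex width is only taken from the sample output quoted earlier in this mail.

```c
#include <assert.h>
#include <stdio.h>

static int spurious_nmi_count;	/* same name as in the patch */

/* Hypothetical stand-in for do_boot_nmi_trap(): count and do nothing else. */
static void fake_boot_nmi_trap(void)
{
	spurious_nmi_count++;
}

/*
 * Hypothetical stand-in for the check added to extract_kernel().
 * Formats the message into buf when at least one NMI was counted.
 * Returns the snprintf() result, or 0 when there is nothing to report.
 */
static int report_spurious_nmis(char *buf, size_t len)
{
	assert(buf && len > 0);

	buf[0] = '\0';
	if (!spurious_nmi_count)
		return 0;

	/* Zero-padded hex, mirroring the 0x0000000000000017 sample. */
	return snprintf(buf, len, "Spurious early NMIs ignored: 0x%016llx\n",
			(unsigned long long)spurious_nmi_count);
}
```

With no NMIs counted the report stays empty, so a clean boot prints nothing extra.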
  
NOMURA JUNICHI(野村 淳一) Jan. 24, 2024, 11:44 a.m. UTC | #3
> From: Borislav Petkov <bp@alien8.de>
> On Mon, Jan 15, 2024 at 08:57:45AM +0000, NOMURA JUNICHI(野村 淳一) wrote:
> > +	if (spurious_nmi_count) {
> > +		error_putstr("Spurious early NMI ignored. Number of NMIs: 0x");
> > +		error_puthex(spurious_nmi_count);
> > +		error_putstr("\n");
> 
> Uff, that's just silly:
> 
> Spurious early NMIs ignored: 0x0000000000000017
> 
> Would you like to add a error_putnum() or so in a prepatch which would
> make this output
> 
> Spurious early NMIs ignored: 23.
> 
> ?
> 
> So that it is human readable and doesn't make me wonder what that hex
> value is supposed to mean?

Yes, it would be nicer to print that way.  I used the existing error_puthex() just
to keep the patch minimal.  I will try to add error_putnum().

> Btw, please use this version when sending next time:

Thank you.

--
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.
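
For readers following the hex versus decimal point, a decimal formatter of the kind being discussed could look roughly like the sketch below. This is not the error_putnum() posted later in the thread (that patch is not reproduced here); the name format_decimal() and its buffer-based interface are assumptions made so the example can run in userspace.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: write the decimal digits of val into buf and
 * return a pointer to the first digit. Digits come out least significant
 * first, so the string is built backwards from the end of the buffer.
 */
static const char *format_decimal(unsigned long val, char *buf, size_t len)
{
	char *p;

	assert(buf && len >= 2);	/* room for one digit plus NUL */

	p = buf + len - 1;
	*p = '\0';

	do {
		assert(p > buf);	/* caller's buffer is too small */
		*--p = (char)('0' + val % 10);
		val /= 10;
	} while (val);

	return p;
}
```

In the boot code the result could then be handed to the existing string output, for example error_putstr(format_decimal(spurious_nmi_count, buf, sizeof(buf))), which would turn 0x17 into the "23" requested above.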
  
H. Peter Anvin Jan. 25, 2024, 2 a.m. UTC | #4
On 1/24/24 03:44, NOMURA JUNICHI(野村 淳一) wrote:
>> From: Borislav Petkov <bp@alien8.de>
>> On Mon, Jan 15, 2024 at 08:57:45AM +0000, NOMURA JUNICHI(野村 淳一) wrote:
>>> +	if (spurious_nmi_count) {
>>> +		error_putstr("Spurious early NMI ignored. Number of NMIs: 0x");
>>> +		error_puthex(spurious_nmi_count);
>>> +		error_putstr("\n");
>>
>> Uff, that's just silly:
>>
>> Spurious early NMIs ignored: 0x0000000000000017
>>
>> Would you like to add a error_putnum() or so in a prepatch which would
>> make this output
>>
>> Spurious early NMIs ignored: 23.
>>
>> ?
>>
>> So that it is human readable and doesn't make me wonder what that hex
>> value is supposed to mean?
> 
> Yes, it would be nicer to print that way.  I used the existing error_puthex() just
> to keep the patch minimal.  I will try to add error_putnum().
> 
>> Btw, please use this version when sending next time:
> 

Here is a *completely* untested patch for you...

	-hpa
  
NOMURA JUNICHI(野村 淳一) Jan. 26, 2024, 2:15 a.m. UTC | #5
> From: H. Peter Anvin <hpa@zytor.com>
> On 1/24/24 03:44, NOMURA JUNICHI(野村 淳一) wrote:
> >> From: Borislav Petkov <bp@alien8.de>
> >> On Mon, Jan 15, 2024 at 08:57:45AM +0000, NOMURA JUNICHI(野村 淳一) wrote:
> >>> +	if (spurious_nmi_count) {
> >>> +		error_putstr("Spurious early NMI ignored. Number of NMIs: 0x");
> >>> +		error_puthex(spurious_nmi_count);
> >>> +		error_putstr("\n");
> >>
> >> Uff, that's just silly:
> >>
> >> Spurious early NMIs ignored: 0x0000000000000017
> >>
> >> Would you like to add a error_putnum() or so in a prepatch which would
> >> make this output
> >>
> >> Spurious early NMIs ignored: 23.
> >>
> >> ?
> >>
> >> So that it is human readable and doesn't make me wonder what that hex
> >> value is supposed to mean?
> >
> > Yes, it would be nicer to print that way.  I used the existing error_puthex() just
> > to keep the patch minimal.  I will try to add error_putnum().
> >
> >> Btw, please use this version when sending next time:
> 
> Here is a *completely* untested patch for you...

Ah, I was preparing a decimal-only version but yours is much better.
I tested and it just works.

I would like to use yours as a prepatch.  May I have your signed-off-by?

--
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.
  
H. Peter Anvin Feb. 1, 2024, 5:22 p.m. UTC | #6
On January 25, 2024 6:15:15 PM PST, "NOMURA JUNICHI(野村 淳一)" <junichi.nomura@nec.com> wrote:
>> From: H. Peter Anvin <hpa@zytor.com>
>> On 1/24/24 03:44, NOMURA JUNICHI(野村 淳一) wrote:
>> >> From: Borislav Petkov <bp@alien8.de>
>> >> On Mon, Jan 15, 2024 at 08:57:45AM +0000, NOMURA JUNICHI(野村 淳一) wrote:
>> >>> +	if (spurious_nmi_count) {
>> >>> +		error_putstr("Spurious early NMI ignored. Number of NMIs: 0x");
>> >>> +		error_puthex(spurious_nmi_count);
>> >>> +		error_putstr("\n");
>> >>
>> >> Uff, that's just silly:
>> >>
>> >> Spurious early NMIs ignored: 0x0000000000000017
>> >>
>> >> Would you like to add a error_putnum() or so in a prepatch which would
>> >> make this output
>> >>
>> >> Spurious early NMIs ignored: 23.
>> >>
>> >> ?
>> >>
>> >> So that it is human readable and doesn't make me wonder what that hex
>> >> value is supposed to mean?
>> >
>> > Yes, it would be nicer to print that way.  I used the existing error_puthex() just
>> > to keep the patch minimal.  I will try to add error_putnum().
>> >
>> >> Btw, please use this version when sending next time:
>> 
>> Here is a *completely* untested patch for you...
>
>Ah, I was preparing decimal only version but yours is much better.
>I tested and it just works.
>
>I would like to use yours as a prepatch.  May I have your signed-off-by?
>
>--
>Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
  

Patch

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -390,4 +390,5 @@  void do_boot_page_fault(struct pt_regs *regs, unsigned long error_code)
 void do_boot_nmi_trap(struct pt_regs *regs, unsigned long error_code)
 {
 	/* Empty handler to ignore NMI during early boot */
+	spurious_nmi_count++;
 }
diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -52,6 +52,7 @@  struct port_io_ops pio_ops;
 
 memptr free_mem_ptr;
 memptr free_mem_end_ptr;
+int spurious_nmi_count;
 
 static char *vidmem;
 static int vidport;
@@ -493,6 +494,12 @@  asmlinkage __visible void *extract_kernel(void *rmode, unsigned char *output)
 	/* Disable exception handling before booting the kernel */
 	cleanup_exception_handling();
 
+	if (spurious_nmi_count) {
+		error_putstr("Spurious early NMI ignored. Number of NMIs: 0x");
+		error_puthex(spurious_nmi_count);
+		error_putstr("\n");
+	}
+
 	return output + entry_offset;
 }
 
diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h
--- a/arch/x86/boot/compressed/misc.h
+++ b/arch/x86/boot/compressed/misc.h
@@ -59,6 +59,7 @@  extern char _head[], _end[];
 /* misc.c */
 extern memptr free_mem_ptr;
 extern memptr free_mem_end_ptr;
+extern int spurious_nmi_count;
 void *malloc(int size);
 void free(void *where);
 void __putstr(const char *s);