[RFC,V3,00/16] x86/hyperv/sev: Add AMD sev-snp enlightened guest support on hyperv

Message ID 20230122024607.788454-1-ltykernel@gmail.com

Message

Tianyu Lan Jan. 22, 2023, 2:45 a.m. UTC
  From: Tianyu Lan <tiala@microsoft.com>

This patchset adds AMD SEV-SNP enlightened guest support
on Hyper-V. Hyper-V uses Linux direct boot mode to boot
the Linux kernel, so the kernel needs to pvalidate system
memory by itself.

In the Hyper-V case there is no boot loader, so the cc
blob is prepared by the hypervisor. In this series, the
hypervisor sets the cc blob address directly in the boot
parameters of the Linux kernel. If the magic number at the
cc blob address is valid, the kernel reads the cc blob.
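
As a rough illustration of the magic-number check described above, here is a
small userspace sketch. The header layout and the magic value 0x45444d41 are
assumptions based on the kernel's struct cc_blob_sev_info and
CC_BLOB_SEV_HDR_MAGIC as I understand them. The names cc_blob_hdr,
CC_BLOB_MAGIC_ASSUMED and cc_blob_is_valid() are hypothetical and do not come
from this series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed magic value; check the kernel headers before relying on it. */
#define CC_BLOB_MAGIC_ASSUMED 0x45444d41u

/* Simplified stand-in for the cc blob header; the layout is an assumption. */
struct cc_blob_hdr {
	uint32_t magic;
	uint16_t version;
	uint16_t reserved;
	uint64_t secrets_phys;
	uint32_t secrets_len;
	uint32_t rsvd1;
	uint64_t cpuid_phys;
	uint32_t cpuid_len;
	uint32_t rsvd2;
};

/* Hypothetical helper: accept the blob only if the magic number matches. */
static bool cc_blob_is_valid(const struct cc_blob_hdr *blob)
{
	return blob != NULL && blob->magic == CC_BLOB_MAGIC_ASSUMED;
}
```

A caller would read the secrets and CPUID page addresses from the blob only
when cc_blob_is_valid() returns true, and otherwise treat the address as
carrying no blob.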

Memory shared between the guest and the hypervisor should
be decrypted and then zeroed after decryption. The data at
the target address may be smeared, and zeroing avoids
using smeared data.
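
The decrypt-then-zero ordering can be sketched in userspace as follows. This
is only an illustration: stub_decrypt_pages() is a hypothetical stand-in for
the kernel's page-decryption call, the fail argument exists solely to exercise
the error path, and share_pages_with_host() is not a function from this
series.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical stand-in for the page-decryption call.
 * Returns 0 on success and a negative value when 'fail' is set.
 */
static int stub_decrypt_pages(void *addr, size_t npages, int fail)
{
	(void)addr;
	(void)npages;
	return fail ? -1 : 0;
}

/* Decrypt first, then zero: old contents are not meaningful afterwards. */
static int share_pages_with_host(void *addr, size_t npages, int fail)
{
	int ret = stub_decrypt_pages(addr, npages, fail);

	if (ret)
		return ret;	/* leave the buffer untouched on failure */

	memset(addr, 0, npages * SKETCH_PAGE_SIZE);
	return 0;
}
```

Returning the error to the caller instead of continuing matches the spirit of
the "handle error of decrypt page" item in the v2 changelog, although the real
vmbus code paths differ.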

Introduce #HV exception support in the AMD SEV-SNP code
and add a #HV handler.

Changes since v2:
       - Remove the code that validates kernel memory at boot stage
       - Split the #HV page patch into two parts
       - Remove the HV-APIC change because x2apic is enabled
         from the host side
       - Rework the vmbus code to handle page decryption errors
       - Split the memory and cpu initialization patch

Changes since v1:
       - Remove boot param changes for the cc blob address and
         use the setup header to pass cc blob info
       - Remove unnecessary WARN and BUG checks
       - Add system vector table map in the #HV exception
       - Fix interrupt exit issue when using the #HV exception

Ashish Kalra (2):
  x86/sev: optimize system vector processing invoked from #HV exception
  x86/sev: Fix interrupt exit code paths from #HV exception

Tianyu Lan (14):
  x86/hyperv: Add sev-snp enlightened guest specific config
  x86/hyperv: Decrypt hv vp assist page in sev-snp enlightened guest
  x86/hyperv: Set Virtual Trust Level in vmbus init message
  x86/hyperv: Use vmmcall to implement Hyper-V hypercall in sev-snp
    enlightened guest
  clocksource/drivers/hyper-v: decrypt hyperv tsc page in sev-snp
    enlightened guest
  x86/hyperv: decrypt vmbus pages for sev-snp enlightened guest
  drivers: hv: Decrypt percpu hvcall input arg page in sev-snp
    enlightened guest
  x86/hyperv: Initialize cpu and memory for sev-snp enlightened guest
  x86/hyperv: SEV-SNP enlightened guest don't support legacy rtc
  x86/hyperv: Add smp support for sev-snp guest
  x86/hyperv: Add hyperv-specific hadling for VMMCALL under SEV-ES
  x86/sev: Add a #HV exception handler
  x86/sev: Add Check of #HV event in path
  x86/sev: Initialize #HV doorbell and handle interrupt requests

 arch/x86/entry/entry_64.S             |  82 ++++++
 arch/x86/hyperv/hv_init.c             |  43 +++
 arch/x86/hyperv/ivm.c                 |  10 +
 arch/x86/include/asm/cpu_entry_area.h |   6 +
 arch/x86/include/asm/hyperv-tlfs.h    |   4 +
 arch/x86/include/asm/idtentry.h       | 105 ++++++-
 arch/x86/include/asm/irqflags.h       |  10 +
 arch/x86/include/asm/mem_encrypt.h    |   2 +
 arch/x86/include/asm/mshyperv.h       |  56 +++-
 arch/x86/include/asm/msr-index.h      |   6 +
 arch/x86/include/asm/page_64_types.h  |   1 +
 arch/x86/include/asm/sev.h            |  13 +
 arch/x86/include/asm/svm.h            |  59 +++-
 arch/x86/include/asm/trapnr.h         |   1 +
 arch/x86/include/asm/traps.h          |   1 +
 arch/x86/include/asm/x86_init.h       |   2 +
 arch/x86/include/uapi/asm/svm.h       |   4 +
 arch/x86/kernel/cpu/common.c          |   1 +
 arch/x86/kernel/cpu/mshyperv.c        | 228 ++++++++++++++-
 arch/x86/kernel/dumpstack_64.c        |   9 +-
 arch/x86/kernel/idt.c                 |   1 +
 arch/x86/kernel/sev.c                 | 395 ++++++++++++++++++++++----
 arch/x86/kernel/traps.c               |  42 +++
 arch/x86/kernel/vmlinux.lds.S         |   7 +
 arch/x86/kernel/x86_init.c            |   4 +-
 arch/x86/mm/cpu_entry_area.c          |   2 +
 drivers/clocksource/hyperv_timer.c    |   2 +-
 drivers/hv/connection.c               |   1 +
 drivers/hv/hv.c                       |  33 ++-
 drivers/hv/hv_common.c                |  26 +-
 include/asm-generic/hyperv-tlfs.h     |  19 ++
 include/asm-generic/mshyperv.h        |   2 +
 include/linux/hyperv.h                |   4 +-
 33 files changed, 1102 insertions(+), 79 deletions(-)
  

Comments

Zhi Wang Feb. 2, 2023, 11 p.m. UTC | #1
On Sat, 21 Jan 2023 21:45:50 -0500
Tianyu Lan <ltykernel@gmail.com> wrote:

1) I wonder whether this is a good time to organize a common code path for
enlightened VMs on Hyper-V.

Wouldn't it be better to have a common flag for enlightened VMs?
Something like bool hv_isolation_type_enlightened().

Many of the decryption steps (the post msg page...) are also required
in the enlightened TDX guest; they are not AMD-specific.

Then in the "TDX guest on Hyper-V" patch set, Dexuan can save some LOCs instead
of ending up with if (hv_isolation_type_en_snp() ||
hv_isolation_type_en_tdx())...
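
A minimal userspace sketch of the suggested common predicate might look like
the following. The enum, the sketch_iso variable and the bodies of the two
per-vendor helpers are hypothetical stubs for illustration;
hv_isolation_type_enlightened() is only the proposal above, not an existing
kernel function.

```c
#include <stdbool.h>

/* Hypothetical isolation types, used by this sketch only. */
enum sketch_isolation {
	SKETCH_ISO_NONE,
	SKETCH_ISO_EN_SNP,
	SKETCH_ISO_EN_TDX,
};

static enum sketch_isolation sketch_iso = SKETCH_ISO_NONE;

/* Stubbed per-vendor checks; the real ones query Hyper-V state. */
static bool hv_isolation_type_en_snp(void)
{
	return sketch_iso == SKETCH_ISO_EN_SNP;
}

static bool hv_isolation_type_en_tdx(void)
{
	return sketch_iso == SKETCH_ISO_EN_TDX;
}

/* Proposed common predicate: true for any enlightened isolation type. */
static bool hv_isolation_type_enlightened(void)
{
	return hv_isolation_type_en_snp() || hv_isolation_type_en_tdx();
}
```

Shared paths such as page decryption could then test the single predicate,
while vendor-specific code keeps using the individual helpers.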

2) It seems the AMD SEV-SNP enlightened guest on Hyper-V is implemented as
CC_VENDOR_AMD, while the TDX enlightened guest is still implemented as
CC_VENDOR_HYPERV. I am curious about the reason.

  
Michael Kelley (LINUX) Feb. 3, 2023, 4:04 a.m. UTC | #2
From: Zhi Wang <zhi.wang.linux@gmail.com> Sent: Thursday, February 2, 2023 3:01 PM
> 
> On Sat, 21 Jan 2023 21:45:50 -0500
> Tianyu Lan <ltykernel@gmail.com> wrote:
> 
> 1) I wonder whether this is a good time to organize a common code path for
> enlightened VMs on Hyper-V.
> 
> Wouldn't it be better to have a common flag for enlightened VMs?
> Something like bool hv_isolation_type_enlightened().
> 
> Many of the decryption steps (the post msg page...) are also required
> in the enlightened TDX guest; they are not AMD-specific.
> 
> Then in the "TDX guest on Hyper-V" patch set, Dexuan can save some LOCs instead
> of ending up with if (hv_isolation_type_en_snp() ||
> hv_isolation_type_en_tdx())...

I've had the same thought, and have briefly discussed the
idea with Dexuan and Tianyu.  But there's some code coming
for a non-confidential VM scenario that hasn't yet been posted
upstream, and it adds yet more cases to consider.   We were
thinking to wait a bit until all the cases were evident, and then
find the right simplification.  If we try to do the simplification
now, we may need to do it again.

> 
> 2) It seems the AMD SEV-SNP enlightened guest on Hyper-V is implemented as
> CC_VENDOR_AMD, while the TDX enlightened guest is still implemented as
> CC_VENDOR_HYPERV. I am curious about the reason.

Patch set [1] makes CC_VENDOR_HYPERV go away.  Once that
happens, the TDX enlightened guest uses CC_VENDOR_INTEL.

Michael

[1] https://lore.kernel.org/linux-hyperv/1673559753-94403-1-git-send-email-mikelley@microsoft.com/T/#m4639d697e9a6619edfcdceffc1b0613a9016f601



  
Gupta, Pankaj Feb. 9, 2023, 11:36 a.m. UTC | #3
Hi Tianyu,

> This patchset adds AMD SEV-SNP enlightened guest support
> on Hyper-V. Hyper-V uses Linux direct boot mode to boot
> the Linux kernel, so the kernel needs to pvalidate system
> memory by itself.
> 
> In the Hyper-V case there is no boot loader, so the cc
> blob is prepared by the hypervisor. In this series, the
> hypervisor sets the cc blob address directly in the boot
> parameters of the Linux kernel. If the magic number at the
> cc blob address is valid, the kernel reads the cc blob.
> 
> Memory shared between the guest and the hypervisor should
> be decrypted and then zeroed after decryption. The data at
> the target address may be smeared, and zeroing avoids
> using smeared data.
> 
> Introduce #HV exception support in the AMD SEV-SNP code
> and add a #HV handler.

I am interested in testing the Linux guest #HV exception handling (patches
12-16 in this series) for restricted interrupt injection with the
Linux/KVM host.

Do you have a git tree, or a base commit, on which
I can apply these patches?

Thank You,
Pankaj
  
Gupta, Pankaj Feb. 17, 2023, 12:47 p.m. UTC | #4
On 2/9/2023 12:36 PM, Gupta, Pankaj wrote:
> Hi Tianyu,
> 
> I am interested in testing the Linux guest #HV exception handling (patches
> 12-16 in this series) for restricted interrupt injection with the
> Linux/KVM host.
> 
> Do you have a git tree, or a base commit, on which
> I can apply these patches?

Never mind. I was able to apply patches 12-16 on master (with a minor
tweak to patch 14). I will now try to test them.

Thanks,
Pankaj
  
Tianyu Lan Feb. 18, 2023, 7:15 a.m. UTC | #5
On 2/17/2023 8:47 PM, Gupta, Pankaj wrote:
> On 2/9/2023 12:36 PM, Gupta, Pankaj wrote:
>> Hi Tianyu,
>>
>> I am interested in testing the Linux guest #HV exception handling
>> (patches 12-16 in this series) for restricted interrupt injection
>> with the Linux/KVM host.
>>
>> Do you have a git tree, or a base commit, on which
>> I can apply these patches?
> 
> Never mind. I was able to apply patches 12-16 on master (with a minor
> tweak to patch 14). I will now try to test them.
> 

Hi Pankaj,
	Sorry, I missed your first mail. Please let me know about any issues
on the KVM side, if there are any. Thanks in advance.
  
Gupta, Pankaj March 10, 2023, 3:35 p.m. UTC | #6
Hi Tianyu,

While testing the guest patches on a KVM host, my guest kernel gets stuck
at early bootup. It does not look like a hang but rather a loop in which
interrupts are processed repeatedly from the "pv_native_irq_enable" path,
preventing the boot process from making progress, IIUC. Did you see
any such scenario in your testing?

It seems to me that "native_irq_enable" enables interrupts and
"check_hv_pending_irq_enable" starts handling the interrupts (after
disabling irqs). But "check_hv_pending_irq_enable=>do_exc_hv" can again
call "pv_native_irq_enable" in the interrupt handling path and execute the
same loop?

Also pasting below the stack dump [1].

Thanks,
Pankaj

[1]
[   20.530786] Call Trace:
[   20.531099]  <IRQ>
[   20.531360]  dump_stack_lvl+0x4d/0x67
[   20.531820]  dump_stack+0x14/0x1a
[   20.532235]  do_exc_hv.cold+0x11/0xec
[   20.532792]  check_hv_pending_irq_enable+0x64/0x80
[   20.533390]  pv_native_irq_enable+0xe/0x20   ====> here
[   20.533902]  __do_softirq+0x89/0x2f3
[   20.534352]  __irq_exit_rcu+0x9f/0x110
[   20.534825]  irq_exit_rcu+0x12/0x20
[   20.535267]  common_interrupt+0xca/0xf0
[   20.535745]  </IRQ>
[   20.536014]  <TASK>
[   20.536286]  do_exc_hv.cold+0xda/0xec
[   20.536826]  check_hv_pending_irq_enable+0x64/0x80
[   20.537429]  pv_native_irq_enable+0xe/0x20    ====> here
[   20.537942]  _raw_spin_unlock_irqrestore+0x21/0x50
[   20.538539]  __setup_irq+0x3be/0x740
[   20.538990]  request_threaded_irq+0x116/0x180
[   20.539533]  hpet_time_init+0x35/0x56
[   20.539994]  x86_late_time_init+0x1f/0x3d
[   20.540556]  start_kernel+0x8af/0x970
[   20.541033]  x86_64_start_reservations+0x28/0x2e
[   20.541607]  x86_64_start_kernel+0x96/0xa0
[   20.542126]  secondary_startup_64_no_verify+0xe5/0xeb
[   20.542757]  </TASK>
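
The nesting visible in the trace above can be illustrated with a toy userspace
model. It is only a sketch of the suspected re-entrancy, not the series' code
and not a proposed fix. All names (model_irq_enable, model_handle_event,
model_run) and the guard flag are hypothetical. The model also uses a finite
number of pending events, whereas a real guest keeps receiving new ones.

```c
#include <stdbool.h>

static int pending;      /* events still waiting to be delivered */
static int depth;        /* current nesting depth of the handler */
static int max_depth;    /* deepest nesting observed in this run */
static bool in_handler;  /* hypothetical re-entrancy guard */
static bool use_guard;   /* whether the guard is honoured */

static void model_irq_enable(void);

/* Handle one event; the exit path re-enables interrupts, as in the trace. */
static void model_handle_event(void)
{
	depth++;
	if (depth > max_depth)
		max_depth = depth;
	pending--;
	model_irq_enable();
	depth--;
}

/* Enabling interrupts drains pending events, which may nest. */
static void model_irq_enable(void)
{
	if (use_guard && in_handler)
		return;	/* events are already being drained up the stack */

	in_handler = true;
	while (pending > 0)
		model_handle_event();
	in_handler = false;
}

/* Run the model and report the deepest handler nesting that occurred. */
static int model_run(int events, bool guard)
{
	pending = events;
	depth = 0;
	max_depth = 0;
	in_handler = false;
	use_guard = guard;
	model_irq_enable();
	return max_depth;
}
```

Without the guard, every pending event adds one more level of nesting, which
matches the shape of the two "====> here" frames. With the guard, the
outermost call drains events one at a time and the nesting stays at one level.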
  
Tianyu Lan March 10, 2023, 4:19 p.m. UTC | #7
On 3/10/2023 11:35 PM, Gupta, Pankaj wrote:
> 
> 
> Hi Tianyu,
> 
> While testing the guest patches on a KVM host, my guest kernel gets stuck
> at early bootup. It does not look like a hang but rather a loop in which
> interrupts are processed repeatedly from the "pv_native_irq_enable" path,
> preventing the boot process from making progress, IIUC. Did you see
> any such scenario in your testing?
> 
> It seems to me that "native_irq_enable" enables interrupts and
> "check_hv_pending_irq_enable" starts handling the interrupts (after
> disabling irqs). But "check_hv_pending_irq_enable=>do_exc_hv" can again
> call "pv_native_irq_enable" in the interrupt handling path and execute the
> same loop?


I haven't hit this issue. Thanks for the report. I will double-check and
report back.

> Also pasting below the stack dump [1].
> 
> Thanks,
> Pankaj
> 
> [1]
> [   20.530786] Call Trace:
> [   20.531099]  <IRQ>
> [   20.531360]  dump_stack_lvl+0x4d/0x67
> [   20.531820]  dump_stack+0x14/0x1a
> [   20.532235]  do_exc_hv.cold+0x11/0xec
> [   20.532792]  check_hv_pending_irq_enable+0x64/0x80
> [   20.533390]  pv_native_irq_enable+0xe/0x20   ====> here
> [   20.533902]  __do_softirq+0x89/0x2f3
> [   20.534352]  __irq_exit_rcu+0x9f/0x110
> [   20.534825]  irq_exit_rcu+0x12/0x20
> [   20.535267]  common_interrupt+0xca/0xf0
> [   20.535745]  </IRQ>
> [   20.536014]  <TASK>
> [   20.536286]  do_exc_hv.cold+0xda/0xec
> [   20.536826]  check_hv_pending_irq_enable+0x64/0x80
> [   20.537429]  pv_native_irq_enable+0xe/0x20    ====> here
> [   20.537942]  _raw_spin_unlock_irqrestore+0x21/0x50
> [   20.538539]  __setup_irq+0x3be/0x740
> [   20.538990]  request_threaded_irq+0x116/0x180
> [   20.539533]  hpet_time_init+0x35/0x56
> [   20.539994]  x86_late_time_init+0x1f/0x3d
> [   20.540556]  start_kernel+0x8af/0x970
> [   20.541033]  x86_64_start_reservations+0x28/0x2e
> [   20.541607]  x86_64_start_kernel+0x96/0xa0
> [   20.542126]  secondary_startup_64_no_verify+0xe5/0xeb
> [   20.542757]  </TASK>
  
Gupta, Pankaj March 15, 2023, 6:40 a.m. UTC | #8
Hi Tianyu,

>> Hi Tianyu,
>>
>> While testing the guest patches on a KVM host, my guest kernel gets stuck
>> at early bootup. It does not look like a hang but rather a loop in which
>> interrupts are processed repeatedly from the "pv_native_irq_enable" path,
>> preventing the boot process from making progress, IIUC. Did you see
>> any such scenario in your testing?
>>
>> It seems to me that "native_irq_enable" enables interrupts and
>> "check_hv_pending_irq_enable" starts handling the interrupts (after
>> disabling irqs). But "check_hv_pending_irq_enable=>do_exc_hv" can
>> again call "pv_native_irq_enable" in the interrupt handling path and
>> execute the same loop?
> 
> 
> I haven't hit this issue. Thanks for the report. I will double-check and
> report back.

Thank you!

More testing with the patches: I commented out "do_exc_hv" in the
pv_native_irq_enable()->check_hv_pending_irq_enable() code path. Now I
get the stack trace below [2] repeatedly when I dump the stack.

It seems to me that after the IST stack return from #VC handling
for "native_cpuid", paranoid_exit=>"do_exc_hv" is handling interrupts.
As we don't disable interrupts in check_hv_pending()=>do_exc_hv(),
interrupts are handled continuously here. This also prevents the boot
processor from making progress, and it gets stuck here.

Thoughts, please? I might be missing some important details here.

Thanks,
Pankaj

[2]

[   59.845396] Call Trace:
[   59.845703]  <TASK>
[   59.845980]  dump_stack_lvl+0x4d/0x67
[   59.846432]  dump_stack+0x14/0x1a
[   59.846842]  do_exc_hv.cold+0x22/0xfd
[   59.847301]  check_hv_pending+0x38/0x50
[   59.847773]  paranoid_exit+0x8/0x70
[   59.848205] RIP: 0010:native_cpuid+0x19/0x30
[   59.848729] Code: 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 55 49 89 f8 49 89 c9 48 89 d7 41 8b 00 48 89 e5 53 8b 0a 0f a2 <41> 89 00 89 1e 48 8b 5d f8 89 0f 41 89 11 c9 e9 f7 bc df 00 0f 1f
[   59.850995] RSP: 0000:ffffffffbd403e48 EFLAGS: 00010202
[   59.851636] RAX: 000000000100007b RBX: 0000000000000000 RCX: 0000000000000000
[   59.852498] RDX: 0000000000000000 RSI: ffffffffbd403e64 RDI: ffffffffbd403e68
[   59.853361] RBP: ffffffffbd403e50 R08: ffffffffbd403e60 R09: ffffffffbd403e6c
[   59.854240] R10: ffffffffbd403d10 R11: ffff9af5bff3cfe8 R12: 0000000000000056
[   59.855111] R13: ffff9af5bffc8e40 R14: 0000000000000000 R15: ffffffffbd41a120
[   59.855976]  kvm_arch_para_features+0x4e/0x80
[   59.856511]  pv_ipi_supported+0xe/0x34
[   59.856973]  kvm_apic_init+0x12/0x3f
[   59.857414]  apic_intr_mode_init+0x8d/0x10d
[   59.857939]  x86_late_time_init+0x28/0x3d
[   59.858435]  start_kernel+0x8af/0x970
[   59.858894]  x86_64_start_reservations+0x28/0x2e
[   59.859461]  x86_64_start_kernel+0x96/0xa0
[   59.859965]  secondary_startup_64_no_verify+0xe5/0xeb
[   59.860583]  </TASK>