[RESEND,v9,00/36] x86: enable FRED for x86-64

Message ID 20230801083318.8363-1-xin3.li@intel.com
Series x86: enable FRED for x86-64

Message

Li, Xin3 Aug. 1, 2023, 8:32 a.m. UTC
  Resend because the mail system failed to deliver some messages yesterday.

This patch set enables the Intel flexible return and event delivery
(FRED) architecture for x86-64.

The FRED architecture defines simple new transitions that change
privilege level (ring transitions). The FRED architecture was
designed with the following goals:

1) Improve overall performance and response time by replacing event
   delivery through the interrupt descriptor table (IDT event
   delivery) and event return by the IRET instruction with lower
   latency transitions.

2) Improve software robustness by ensuring that event delivery
   establishes the full supervisor context and that event return
   establishes the full user context.

The new transitions defined by the FRED architecture are FRED event
delivery and, for returning from events, two FRED return instructions.
FRED event delivery can effect a transition from ring 3 to ring 0, but
it is used also to deliver events incident to ring 0. One FRED
instruction (ERETU) effects a return from ring 0 to ring 3, while the
other (ERETS) returns while remaining in ring 0. Collectively, FRED
event delivery and the FRED return instructions are FRED transitions.
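
As a rough illustration of the two return paths described above, the
choice between ERETU and ERETS can be modelled as a function of the
privilege level saved in the CS selector of the interrupted context.
This is a minimal sketch, not code from this patch set; the enum and
function names are hypothetical and exist only for the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration; not taken from the patch set. */
enum fred_return_insn {
	FRED_RETURN_ERETS,	/* return while remaining in ring 0 */
	FRED_RETURN_ERETU,	/* return from ring 0 to ring 3 */
};

/* The low two bits of a saved CS selector hold the privilege level. */
static inline bool saved_cs_is_user(uint16_t cs)
{
	return (cs & 0x3) == 3;
}

/* Pick the return instruction that matches the interrupted context. */
static inline enum fred_return_insn fred_pick_return(uint16_t saved_cs)
{
	return saved_cs_is_user(saved_cs) ? FRED_RETURN_ERETU
					  : FRED_RETURN_ERETS;
}
```

A selector such as 0x33 (privilege level 3) maps to ERETU, while 0x10
(privilege level 0) maps to ERETS.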

The latest FRED spec can be found with most search engines using this search pattern:

  site:intel.com FRED (flexible return and event delivery) specification

As of now there is no publicly available CPU supporting FRED, so the Intel
Simics® Simulator is used as the software development and testing vehicle.
It can be downloaded from:
  https://www.intel.com/content/www/us/en/developer/articles/tool/simics-simulator.html

To enable FRED, the Simics package 8112 QSP-CPU needs to be installed with CPU
model configured as:
	$cpu_comp_class = "x86-experimental-fred"


Changes since v8:
* Move the FRED initialization patch after all required changes are in
  place (Thomas Gleixner).
* Don't do syscall early out in fred_entry_from_user() before there are
  proper performance numbers and justifications (Thomas Gleixner).
* Add the control exception handler to the FRED exception handler table
  (Thomas Gleixner).
* Introduce a macro sysvec_install() to derive the asm handler name from
  a C handler, which simplifies the code and avoids an ugly typecast
  (Thomas Gleixner).
* Remove junk code that assumes no local APIC on x86_64 (Thomas Gleixner).
* Put IDTENTRY changes in a separate patch (Thomas Gleixner).
* Use the high-order 48 bits above the lowest 16-bit SS only when FRED is
  enabled (Thomas Gleixner).
* Explain why writing directly to the IA32_KERNEL_GS_BASE MSR is
  doing the right thing (Thomas Gleixner).
* Reword some patch descriptions (Thomas Gleixner).
* Add a new macro VMX_DO_FRED_EVENT_IRQOFF for FRED instead of
  refactoring VMX_DO_EVENT_IRQOFF (Sean Christopherson).
* Do NOT use a trampoline, just LEA+PUSH the return RIP, PUSH the error
  code, and jump to the FRED kernel entry point for NMI or call
  external_interrupt() for IRQs (Sean Christopherson).
* Call external_interrupt() only when FRED is enabled, and convert the
  non-FRED handling to external_interrupt() after FRED lands (Sean
  Christopherson).
* Use __packed instead of __attribute__((__packed__)) (Borislav Petkov).
* Put all comments above the members, like the rest of the file does
  (Borislav Petkov).
* Reflect the FRED spec 5.0 change that ERETS and ERETU add 8 to %rsp
  before popping the return context from the stack.
* Reflect stack frame definition changes from FRED spec 3.0 to 5.0.
* Add ENDBR to the FRED_ENTER asm macro after kernel IBT is added to
  FRED base line in FRED spec 5.0.
* Add a document which briefly introduces FRED features.
* Remove 2 patches, "allow FRED systems to use interrupt vectors
  0x10-0x1f" and "allow dynamic stack frame size", from this patch set,
  as they are "optimizations" only.
* Send 2 patches, "header file for event types" and "do not modify the
  DPL bits for a null selector", as pre-FRED patches.
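
The sysvec_install() item above relies on deriving the asm handler name
from the C handler name. A minimal sketch of that idea using token
pasting follows. It assumes an "asm_" prefix naming convention, and
install_handler(), sysvec_example() and the recorded globals are
hypothetical stand-ins so the example is self-contained; the macro in
the actual patch may differ.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins so the sketch is self-contained. */
typedef void (*handler_fn)(void);

static handler_fn installed_c;		/* C handler recorded by the sketch */
static handler_fn installed_asm;	/* asm entry stub recorded by the sketch */
static const char *installed_name;	/* derived stub name, for inspection */

static void install_handler(unsigned int vector, handler_fn c_fn,
			    handler_fn asm_fn, const char *asm_name)
{
	(void)vector;
	installed_c = c_fn;
	installed_asm = asm_fn;
	installed_name = asm_name;
}

/* A C handler and its asm entry stub, modelled here as plain functions. */
static void sysvec_example(void) { }
static void asm_sysvec_example(void) { }

/*
 * Derive the asm stub from the C handler name by token pasting, so
 * callers never spell "asm_..." or cast a function pointer by hand.
 */
#define sysvec_install(vector, function) \
	install_handler(vector, function, asm_##function, "asm_" #function)
```

With this shape a caller only writes sysvec_install(vector,
sysvec_example), and the compiler rejects the call if no matching
asm_sysvec_example symbol is declared.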

Changes since v7:
* Always call external_interrupt() for VMX IRQ handling on x86_64, thus
  avoiding re-entering the noinstr code.
* Create a FRED stack frame when FRED is compiled-in but not enabled, which
  uses some extra stack space but simplifies the code.
* Add a log message when FRED is enabled.

Changes since v6:
* Add a comment to explain why it is safe to write to a previous FRED stack
  frame (Lai Jiangshan).
* Export fred_entrypoint_kernel(), required when kvm-intel built as a module.
* Reserve a REDZONE for CALL emulation and align RSP to a 64-byte boundary
  before pushing a new FRED stack frame.
* Replace pt_regs csx flags prefix FRED_CSL_ with FRED_CSX_.
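
The REDZONE reservation and 64-byte alignment mentioned above come down
to simple pointer arithmetic. The sketch below shows one way to compute
where a new frame could start; the red zone size used here is a
hypothetical value picked for illustration, not the one from the patch.

```c
#include <stdint.h>

/* Hypothetical red zone size, chosen only for illustration. */
#define SKETCH_REDZONE_SIZE	16ULL
#define SKETCH_FRAME_ALIGN	64ULL

/*
 * Skip over the reserved red zone, then round down to a 64-byte
 * boundary. The stack grows down, so rounding down never moves the
 * result back into the reserved area.
 */
static inline uint64_t sketch_new_frame_top(uint64_t rsp)
{
	return (rsp - SKETCH_REDZONE_SIZE) & ~(SKETCH_FRAME_ALIGN - 1);
}
```

For example, with RSP at 0x1000 the red zone ends at 0xff0, and rounding
down to a 64-byte boundary gives 0xfc0.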

Changes since v5:
* Initialize system_interrupt_handlers with dispatch_table_spurious_interrupt()
  instead of NULL to get rid of a branch (Peter Zijlstra).
* Disallow #DB inside #MCE for robustness' sake (Peter Zijlstra).
* Add a comment for FRED stack level settings (Lai Jiangshan).
* Move the NMI bit from an invalid stack frame, which caused ERETU to fault,
  to the fault handler's stack frame, so that NMI is unblocked ASAP if it
  was blocked (Lai Jiangshan).
* Refactor VMX_DO_EVENT_IRQOFF to handle IRQ/NMI in IRQ/NMI induced VM exits
  when FRED is enabled (Sean Christopherson).
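
The first item above, filling the handler table with a spurious-interrupt
handler instead of NULL, removes the NULL check from the dispatch path.
A minimal sketch of the pattern follows. All names and the table size
are hypothetical; this is not the table from the patch.

```c
#include <stddef.h>

#define SKETCH_NR_VECTORS 8	/* hypothetical, small table for illustration */

typedef void (*sketch_handler_t)(unsigned int vector);

static unsigned int spurious_count;
static unsigned int handled_vector;

/* Default handler: only counts unexpected vectors. */
static void sketch_spurious(unsigned int vector)
{
	(void)vector;
	spurious_count++;
}

/* An example of a real handler installed for one vector. */
static void sketch_timer(unsigned int vector)
{
	handled_vector = vector;
}

static sketch_handler_t sketch_table[SKETCH_NR_VECTORS];

/* Point every slot at the spurious handler before anything is installed. */
static void sketch_table_init(void)
{
	for (size_t i = 0; i < SKETCH_NR_VECTORS; i++)
		sketch_table[i] = sketch_spurious;
}

static void sketch_install(unsigned int vector, sketch_handler_t fn)
{
	sketch_table[vector] = fn;
}

/* No "if (handler)" test is needed: every slot is always callable. */
static void sketch_dispatch(unsigned int vector)
{
	sketch_table[vector](vector);
}
```

The cost is one extra indirect call for vectors that have no handler,
which is the rare case, in exchange for a branch-free common path.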

Changes since v4:
* Do NOT use the term "injection", which in the KVM context means to
  reinject an event into the guest (Sean Christopherson).
* Explain why "int $2" is executed to invoke the NMI handler in NMI caused
  VM exits (Sean Christopherson).
* Use cs/ss instead of csx/ssx when initializing the pt_regs structure
  for calling external_interrupt(), otherwise it breaks i386 build.

Changes since v3:
* Call external_interrupt() to handle IRQ in IRQ caused VM exits.
* Execute "int $2" to handle NMI in NMI caused VM exits.
* Rename csl/ssl of the pt_regs structure to csx/ssx (x for extended)
  (Andrew Cooper).

Changes since v2:
* Improve comments for changes in arch/x86/include/asm/idtentry.h.

Changes since v1:
* Call irqentry_nmi_{enter,exit}() in both the IDT and FRED debug fault
  kernel handlers (Peter Zijlstra).
* Initialize a FRED exception handler to fred_bad_event() instead of NULL
  if no FRED handler defined for an exception vector (Peter Zijlstra).
* Push calling irqentry_{enter,exit}() and instrumentation_{begin,end}()
  down into individual FRED exception handlers, instead of in the dispatch
  framework (Peter Zijlstra).

H. Peter Anvin (Intel) (22):
  x86/fred: Add Kconfig option for FRED (CONFIG_X86_FRED)
  x86/fred: Disable FRED support if CONFIG_X86_FRED is disabled
  x86/cpufeatures: Add the cpu feature bit for FRED
  x86/opcode: Add ERETU, ERETS instructions to x86-opcode-map
  x86/objtool: Teach objtool about ERETU and ERETS
  x86/cpu: Add X86_CR4_FRED macro
  x86/cpu: Add MSR numbers for FRED configuration
  x86/fred: Make unions for the cs and ss fields in struct pt_regs
  x86/fred: Add a new header file for FRED definitions
  x86/fred: Reserve space for the FRED stack frame
  x86/fred: Update MSR_IA32_FRED_RSP0 during task switch
  x86/fred: Let ret_from_fork_asm() jmp to fred_exit_user when FRED is
    enabled
  x86/fred: Disallow the swapgs instruction when FRED is enabled
  x86/fred: No ESPFIX needed when FRED is enabled
  x86/fred: Allow single-step trap and NMI when starting a new task
  x86/fred: Add a page fault entry stub for FRED
  x86/fred: Add a debug fault entry stub for FRED
  x86/fred: Add a NMI entry stub for FRED
  x86/traps: Add a system interrupt handler table for system interrupt
    dispatch
  x86/traps: Add external_interrupt() to dispatch external interrupts
  x86/fred: FRED entry/exit and dispatch code
  x86/fred: FRED initialization code

Xin Li (14):
  Documentation/x86/64: Add documentation for FRED
  x86/fred: Define a common function type fred_handler
  x86/fred: Add a machine check entry stub for FRED
  x86/fred: Add a double fault entry stub for FRED
  x86/entry: Remove idtentry_sysvec from entry_{32,64}.S
  x86/idtentry: Incorporate definitions/declarations of the FRED
    external interrupt handler type
  x86/traps: Add sysvec_install() to install a system interrupt handler
  x86/idtentry: Incorporate declaration/definition of the FRED exception
    handler type
  x86/fred: Fixup fault on ERETU by jumping to fred_entrypoint_user
  x86/traps: Export external_interrupt() for handling IRQ in IRQ induced
    VM exits
  x86/fred: Export fred_entrypoint_kernel() for handling NMI in NMI
    induced VM exits
  KVM: VMX: Add VMX_DO_FRED_EVENT_IRQOFF for IRQ/NMI handling
  x86/syscall: Split IDT syscall setup code into idt_syscall_init()
  x86/fred: Disable FRED by default in its early stage

 .../admin-guide/kernel-parameters.txt         |   4 +
 Documentation/arch/x86/x86_64/fred.rst        | 102 ++++++++
 Documentation/arch/x86/x86_64/index.rst       |   1 +
 arch/x86/Kconfig                              |   9 +
 arch/x86/entry/Makefile                       |   5 +-
 arch/x86/entry/entry_32.S                     |   4 -
 arch/x86/entry/entry_64.S                     |  14 +-
 arch/x86/entry/entry_64_fred.S                |  58 +++++
 arch/x86/entry/entry_fred.c                   | 220 ++++++++++++++++++
 arch/x86/entry/vsyscall/vsyscall_64.c         |   2 +-
 arch/x86/include/asm/asm-prototypes.h         |   1 +
 arch/x86/include/asm/cpufeatures.h            |   1 +
 arch/x86/include/asm/disabled-features.h      |   8 +-
 arch/x86/include/asm/extable_fixup_types.h    |   4 +-
 arch/x86/include/asm/fred.h                   | 157 +++++++++++++
 arch/x86/include/asm/idtentry.h               | 115 ++++++++-
 arch/x86/include/asm/msr-index.h              |  13 +-
 arch/x86/include/asm/ptrace.h                 |  57 ++++-
 arch/x86/include/asm/switch_to.h              |  11 +-
 arch/x86/include/asm/thread_info.h            |  12 +-
 arch/x86/include/asm/traps.h                  |  23 ++
 arch/x86/include/uapi/asm/processor-flags.h   |   2 +
 arch/x86/kernel/Makefile                      |   1 +
 arch/x86/kernel/cpu/acrn.c                    |   5 +-
 arch/x86/kernel/cpu/common.c                  |  47 +++-
 arch/x86/kernel/cpu/mce/core.c                |  15 ++
 arch/x86/kernel/cpu/mshyperv.c                |  16 +-
 arch/x86/kernel/espfix_64.c                   |   8 +
 arch/x86/kernel/fred.c                        |  67 ++++++
 arch/x86/kernel/irqinit.c                     |   7 +-
 arch/x86/kernel/kvm.c                         |   2 +-
 arch/x86/kernel/nmi.c                         |  19 ++
 arch/x86/kernel/process_64.c                  |  31 ++-
 arch/x86/kernel/traps.c                       | 153 ++++++++++--
 arch/x86/kvm/vmx/vmenter.S                    |  88 +++++++
 arch/x86/kvm/vmx/vmx.c                        |  19 +-
 arch/x86/lib/x86-opcode-map.txt               |   2 +-
 arch/x86/mm/extable.c                         |  79 +++++++
 arch/x86/mm/fault.c                           |  18 +-
 drivers/xen/events/events_base.c              |   3 +-
 tools/arch/x86/include/asm/cpufeatures.h      |   1 +
 .../arch/x86/include/asm/disabled-features.h  |   8 +-
 tools/arch/x86/include/asm/msr-index.h        |  13 +-
 tools/arch/x86/lib/x86-opcode-map.txt         |   2 +-
 tools/objtool/arch/x86/decode.c               |  19 +-
 45 files changed, 1348 insertions(+), 98 deletions(-)
 create mode 100644 Documentation/arch/x86/x86_64/fred.rst
 create mode 100644 arch/x86/entry/entry_64_fred.S
 create mode 100644 arch/x86/entry/entry_fred.c
 create mode 100644 arch/x86/include/asm/fred.h
 create mode 100644 arch/x86/kernel/fred.c
  

Comments

Peter Zijlstra Aug. 1, 2023, 10:52 a.m. UTC | #1
On Tue, Aug 01, 2023 at 01:32:42AM -0700, Xin Li wrote:
> Resend because the mail system failed to deliver some messages yesterday.

Well, you need to figure out how to send patches, because both yesterday
and today are screwy.

The one from yesterday came in 6 thread groups: 0-25, 26, 27, 28, 29, 30-36,
while the one from today comes in 2 thread groups: 0-26, 27-36. Which I
suppose one can count as an improvement :/

Seriously, it should not be hard to send 36 patches in a single thread.

I see you're trying to send through the regular corporate email
trainwreck; do you have a linux.intel.com account? Or really anything
else besides intel.com? You can try sending the series to yourself to
see if it arrives correctly as a whole before sending it out to the list
again.

I also believe there is a kernel.org service for sending patch series,
but i'm not sure I remember the details.
  
Peter Zijlstra Aug. 1, 2023, 1:02 p.m. UTC | #2
On Tue, Aug 01, 2023 at 12:52:36PM +0200, Peter Zijlstra wrote:

> I also believe there is a kernel.org service for sending patch series,
> but i'm not sure I remember the details.

https://b4.docs.kernel.org/en/latest/contributor/send.html
  
Li, Xin3 Aug. 1, 2023, 5:09 p.m. UTC | #3
> > Resend because the mail system failed to deliver some messages yesterday.
> 
> The one from yesterday came in 6 thread groups: 0-25, 26, 27, 28, 29, 30-36,
> while the one from today comes in 2 thread groups: 0-26, 27-36. Which I
> suppose one can count as an improvement :/

Sigh, sorry for the chaos.

>
> Seriously, it should not be hard to send 36 patches in a single thread.

No, but it worked fine before, so I didn't realize there was an email
service policy which prevents sending email to too many recipients
in a short period (I have a long CC list in this v9 patch set).

> I see you're trying to send through the regular corporate email
> trainwreck; do you have a linux.intel.com account? Or really anything
> else besides intel.com? You can try sending the series to yourself to
> see if it arrives correctly as a whole before sending it out to the list
> again.

I did try sending to myself before sending to LKML, and it worked fine.
But now you know why it happened.

As mentioned, I should avoid "the regular corporate email trainwreck"
before doing it again. Working on it...
  
Li, Xin3 Aug. 1, 2023, 5:41 p.m. UTC | #4
> > I also believe there is a kernel.org service for sending patch series,
> > but i'm not sure I remember the details.
> 
> https://b4.docs.kernel.org/en/latest/contributor/send.html

It says:
The kernel.org endpoint can only be used for kernel.org-hosted projects.
If there are no recognized mailing lists in the to/cc headers, then the
submission will be rejected.

If I want to test the email sending service, how could I test it by sending
just to myself?  Maybe it allows only sending to the sender.