[v7,0/4] kexec: Fix kexec_file_load for llvm16 with PGO

Message ID 20230321-kexec_clang16-v7-0-b05c520b7296@chromium.org

Message

Ricardo Ribalda May 19, 2023, 2:47 p.m. UTC
  When upreving llvm I realised that kexec stopped working on my test
platform.

The reason seems to be that due to PGO there are multiple .text sections
in the purgatory, and kexec does not support that.

Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
---
Changes in v7:
- Fix $SUBJECT of riscv patch
- Rename PGO as Profile-guided optimization
- Link to v6: https://lore.kernel.org/r/20230321-kexec_clang16-v6-0-a2255e81ab45@chromium.org

Changes in v6:
- Replace linker script with Makefile rule. Thanks Nick
- Link to v5: https://lore.kernel.org/r/20230321-kexec_clang16-v5-0-5563bf7c4173@chromium.org

Changes in v5:
- Add warning when multiple text sections are found. Thanks Simon!
- Add Fixes tag.
- Link to v4: https://lore.kernel.org/r/20230321-kexec_clang16-v4-0-1340518f98e9@chromium.org

Changes in v4:
- Add Cc: stable
- Add linker script for x86
- Add a warning when the kernel image has overlapping sections.
- Link to v3: https://lore.kernel.org/r/20230321-kexec_clang16-v3-0-5f016c8d0e87@chromium.org

Changes in v3:
- Fix initial value. Thanks Ross!
- Link to v2: https://lore.kernel.org/r/20230321-kexec_clang16-v2-0-d10e5d517869@chromium.org

Changes in v2:
- Fix if condition. Thanks Steven!
- Update Philipp email. Thanks Baoquan.
- Link to v1: https://lore.kernel.org/r/20230321-kexec_clang16-v1-0-a768fc2c7c4d@chromium.org

---
Ricardo Ribalda (4):
      kexec: Support purgatories with .text.hot sections
      x86/purgatory: Remove PGO flags
      powerpc/purgatory: Remove PGO flags
      riscv/purgatory: Remove PGO flags

 arch/powerpc/purgatory/Makefile |  5 +++++
 arch/riscv/purgatory/Makefile   |  5 +++++
 arch/x86/purgatory/Makefile     |  5 +++++
 kernel/kexec_file.c             | 14 +++++++++++++-
 4 files changed, 28 insertions(+), 1 deletion(-)
---
base-commit: 58390c8ce1bddb6c623f62e7ed36383e7fa5c02f
change-id: 20230321-kexec_clang16-4510c23d129c

Best regards,
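
For readers following the thread, the failure mode described in the cover letter can be sketched in plain C. This is a simplified model, not the code in kernel/kexec_file.c: `struct sketch_shdr`, `start_rebase_all()` and `start_rebase_once()` are hypothetical names, and the sketch assumes that every section of a relocatable purgatory object has a section address of 0, so more than one executable section can appear to contain the entry point.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for an ELF section header. */
struct sketch_shdr {
    uint64_t addr; /* section address; assumed 0 in a relocatable object */
    uint64_t size; /* section size in bytes */
    int exec;      /* non-zero when the section holds executable code */
    uint64_t load; /* address the loader placed the section at */
};

/* True when an executable section's address range covers the entry point. */
static int contains_entry(const struct sketch_shdr *s, uint64_t entry)
{
    return s->exec && entry >= s->addr && entry < s->addr + s->size;
}

/*
 * Unguarded model: every executable section that appears to contain the
 * entry point rebases the start address again, so the result is only
 * meaningful when exactly one section matches.
 */
uint64_t start_rebase_all(const struct sketch_shdr *s, size_t n, uint64_t entry)
{
    uint64_t start = entry;

    for (size_t i = 0; i < n; i++)
        if (contains_entry(&s[i], entry))
            start = start - s[i].addr + s[i].load;
    return start;
}

/*
 * Guarded model: rebase on the first match only and count later matches,
 * which is where a warning could be raised.
 */
uint64_t start_rebase_once(const struct sketch_shdr *s, size_t n,
                           uint64_t entry, int *ambiguous)
{
    uint64_t start = entry;
    int found = 0;

    assert(ambiguous != NULL);
    *ambiguous = 0;
    for (size_t i = 0; i < n; i++) {
        if (!contains_entry(&s[i], entry))
            continue;
        if (found) {
            (*ambiguous)++;
            continue;
        }
        start = start - s[i].addr + s[i].load;
        found = 1;
    }
    return start;
}
```

With two executable sections that both start at address 0, the unguarded model adds both load addresses to the entry offset, while the guarded model keeps the first rebase and reports one ambiguous candidate. Whether the first match is the right one still depends on section order, which is why the series also removes the PGO flags from the purgatory build.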
  

Comments

Song Liu Sept. 8, 2023, 9:48 p.m. UTC | #1
Hi Ricardo,

Thanks for your kind reply.

On Fri, Sep 8, 2023 at 2:18 PM Ricardo Ribalda <ribalda@chromium.org> wrote:
>
> Hi Song
>
> On Fri, 8 Sept 2023 at 01:08, Song Liu <song@kernel.org> wrote:
> >
> > Hi Ricardo and folks,
> >
> > On Fri, May 19, 2023 at 7:48 AM Ricardo Ribalda <ribalda@chromium.org> wrote:
> > >
> > > When upreving llvm I realised that kexec stopped working on my test
> > > platform.
> > >
> > > The reason seems to be that due to PGO there are multiple .text sections
> > > in the purgatory, and kexec does not support that.
> > >
> > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> >
> > We are seeing WARNINGs like the following while kexec'ing a PGO and
> > LTO enabled kernel:
> >
> > WARNING: CPU: 26 PID: 110894 at kernel/kexec_file.c:919 kexec_load_purgatory+0x37f/0x390
> >
> > AFAICT, the warning was added by this set, and it was triggered when
> > we have many .text sections in purgatory.ro. The kexec was actually
> > successful. So I wonder whether we really need the WARNING here. If
> > we disable LTO (PGO is still enabled), we don't see the WARNING any
> > more.
> >
> > I also tested an older kernel (5.19 based), where we also see many
> > .text sections with LTO. It kexec()'ed fine. (It doesn't have the
> > WARN_ON() in kexec_purgatory_setup_sechdrs).
>
> You have been "lucky" that the code has chosen the correct start
> address. You need to modify the linker script of your kernel to
> disable PGO.
> You need to backport a patch like this:
> https://lore.kernel.org/lkml/CAPhsuW5_qAvV0N3o+hOiAnb1=buJ1pLzqYW9D+Bwft6hxJvAeQ@mail.gmail.com/T/#md68b7f832216b0c56bbec0c9b07332e180b9ba2b

We already have this commit in our branch. AFAICT, the issue was
triggered by LTO. So something like the following seems to fix it
(I haven't finished the end-to-end test yet). Does this change make
sense to you?

Thanks again,
Song

diff --git i/arch/x86/purgatory/Makefile w/arch/x86/purgatory/Makefile
index 8f71aaa04cc2..dc306fa7197d 100644
--- i/arch/x86/purgatory/Makefile
+++ w/arch/x86/purgatory/Makefile
@@ -19,6 +19,10 @@ CFLAGS_sha256.o := -D__DISABLE_EXPORTS
 # optimization flags.
 KBUILD_CFLAGS := $(filter-out -fprofile-sample-use=% -fprofile-use=%,$(KBUILD_CFLAGS))

+# When LTO is enabled, llvm emits many text sections, which is not supported
+# by kexec. Remove -flto=* flags.
+KBUILD_CFLAGS := $(filter-out -flto=%,$(KBUILD_CFLAGS))
+
 # When linking purgatory.ro with -r unresolved symbols are not checked,
 # also link a purgatory.chk binary without -r to check for unresolved symbols.
 PURGATORY_LDFLAGS := -e purgatory_start -z nodefaultlib
  
Song Liu Sept. 8, 2023, 10:53 p.m. UTC | #2
On Fri, Sep 8, 2023 at 2:52 PM Ricardo Ribalda <ribalda@chromium.org> wrote:
>
> Hi Song
>
> On Fri, 8 Sept 2023 at 23:48, Song Liu <song@kernel.org> wrote:
> >
> > Hi Ricardo,
> >
> > Thanks for your kind reply.
> >
> > On Fri, Sep 8, 2023 at 2:18 PM Ricardo Ribalda <ribalda@chromium.org> wrote:
> > >
> > > Hi Song
> > >
> > > On Fri, 8 Sept 2023 at 01:08, Song Liu <song@kernel.org> wrote:
> > > >
> > > > Hi Ricardo and folks,
> > > >
> > > > On Fri, May 19, 2023 at 7:48 AM Ricardo Ribalda <ribalda@chromium.org> wrote:
> > > > >
> > > > > When upreving llvm I realised that kexec stopped working on my test
> > > > > platform.
> > > > >
> > > > > The reason seems to be that due to PGO there are multiple .text sections
> > > > > in the purgatory, and kexec does not support that.
> > > > >
> > > > > Signed-off-by: Ricardo Ribalda <ribalda@chromium.org>
> > > >
> > > > We are seeing WARNINGs like the following while kexec'ing a PGO and
> > > > LTO enabled kernel:
> > > >
> > > > WARNING: CPU: 26 PID: 110894 at kernel/kexec_file.c:919 kexec_load_purgatory+0x37f/0x390
> > > >
> > > > AFAICT, the warning was added by this set, and it was triggered when
> > > > we have many .text sections in purgatory.ro. The kexec was actually
> > > > successful. So I wonder whether we really need the WARNING here. If
> > > > we disable LTO (PGO is still enabled), we don't see the WARNING any
> > > > more.
> > > >
> > > > I also tested an older kernel (5.19 based), where we also see many
> > > > .text sections with LTO. It kexec()'ed fine. (It doesn't have the
> > > > WARN_ON() in kexec_purgatory_setup_sechdrs).
> > >
> > > You have been "lucky" that the code has chosen the correct start
> > > address. You need to modify the linker script of your kernel to
> > > disable PGO.
> > > You need to backport a patch like this:
> > > https://lore.kernel.org/lkml/CAPhsuW5_qAvV0N3o+hOiAnb1=buJ1pLzqYW9D+Bwft6hxJvAeQ@mail.gmail.com/T/#md68b7f832216b0c56bbec0c9b07332e180b9ba2b
> >
> > We already have this commit in our branch. AFAICT, the issue was
> > triggered by LTO. So something like the following seems to fix it
> > (I haven't finished the end-to-end test yet). Does this change make
> > sense to you?
>
> If the end-to-end test works, please send it as a patch to the mailing list.
>
> Thanks! :)

OK, it works (AFAICT). Sending the patch.

Thanks,
Song