Message ID | cover.1697614386.git.andrea.porta@suse.com |
---|---|
Headers |
From: Andrea della Porta <andrea.porta@suse.com> To: Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org Cc: nik.borisov@suse.com, Andrea della Porta <andrea.porta@suse.com> Subject: [PATCH 0/4] arm64: Make Aarch32 compatibility enablement optional at boot Date: Wed, 18 Oct 2023 13:13:18 +0200 Message-ID: <cover.1697614386.git.andrea.porta@suse.com> X-Mailer: git-send-email 2.41.0 List-ID: <linux-kernel.vger.kernel.org> |
Series |
arm64: Make Aarch32 compatibility enablement optional at boot
|
|
Message
Andrea della Porta
Oct. 18, 2023, 11:13 a.m. UTC
Aarch32 compatibility mode is enabled at compile time through the
CONFIG_COMPAT Kconfig option. This patchset lets 32-bit support
(for both processes and syscalls) be enabled at boot time using
a kernel parameter. Also, it provides a means for distributions
to set their own default without sacrificing compatibility support,
that is, users can override the default behaviour through the kernel
parameter.

*** Notes about syscall management ***
The VBAR_EL1 register, which holds the exception table address,
is set up very early in the boot process, before parse_early_param().
This means that it's not possible to access the boot parameter before
setting the register. Also, setting the aforementioned register
for secondary CPUs is done later in the boot flow.
Several ways to work around this have been considered, among which:

* resetting VBAR_EL1 to point to one of two vector tables (the
  former with 32-bit exception handlers enabled and the latter
  pointing to an unhandled stub, just as if CONFIG_COMPAT is enabled)
  depending on the proposed boot parameter. This has the disadvantage
  of producing a somewhat messy patchset involving several lines,
  has a higher cognitive load since there are at least three places
  where the register is getting changed (not near to each other),
  and has implications on other code segments (namely kpti, kvm
  and vdso), requiring special care.

* patching the vector table contents once the early param is available.
  This has most of the implications of the previous option
  (except maybe not impacting other code segments), plus it sounds
  a little 'hackish'.

The chosen approach involves conditionally executing 32-bit syscalls
depending on the parameter value.
This of course results in a little performance loss, but has the following
advantages:

* all the cons of the previously explained alternatives are avoided
* users of 32-bit apps on a 64-bit kernel are already suffering from
  performance losses due to 32-bit apps not fully leveraging the 64-bit
  processor, so they are already aware of this
* users of 32-bit apps on a 64-bit kernel are believed
  to be a minority and most of the time there are sources available
  to be recompiled for 64-bit as a workaround for better performance

It is worth mentioning that users of 64-bit apps are, of course,
unaffected.

Based on the work from Nikolay Borisov, see:
Link: https://lkml.org/lkml/2023/6/23/387

Andrea della Porta (4):
  arm64: Introduce aarch32_enabled()
  arm64/process: Make loading of 32bit processes depend on
    aarch32_enabled()
  arm64/entry-common: Make Aarch32 syscalls' availability depend on
    aarch32_enabled()
  arm64: Make Aarch32 emulation boot time configurable

 .../admin-guide/kernel-parameters.txt | 7 ++++
 arch/arm64/Kconfig | 9 +++++
 arch/arm64/include/asm/compat.h | 12 +++++++
 arch/arm64/kernel/entry-common.c | 33 +++++++++++++++++--
 arch/arm64/kernel/process.c | 2 +-
 5 files changed, 59 insertions(+), 4 deletions(-)
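The chosen approach described in the cover letter (a boot-time flag consulted by the 32-bit syscall path) can be pictured with a small userspace sketch. This is not the code from the series: only the name aarch32_enabled() comes from the patch titles. The accepted parameter values ("0"/"1"), the helper names parse_aarch32_param(), compat_syscall_entry() and demo_handler(), and the choice to return an ENOSYS-style error when 32-bit support is off are all assumptions made for illustration; the cover letter does not say what the real handlers do on rejection.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical default; per the cover letter a Kconfig option would choose it. */
static bool aarch32_flag = true;

/*
 * Userspace stand-in for an early-parameter handler. The accepted values
 * ("0"/"1") are an assumption for this sketch. Returns 0 on success and -1
 * for an unrecognised value, leaving the flag untouched in that case.
 */
int parse_aarch32_param(const char *val)
{
	if (val && strcmp(val, "1") == 0)
		aarch32_flag = true;
	else if (val && strcmp(val, "0") == 0)
		aarch32_flag = false;
	else
		return -1;
	return 0;
}

/* Name taken from the patch titles; the body is illustrative only. */
bool aarch32_enabled(void)
{
	return aarch32_flag;
}

/* Placeholder error value for the sketch (38 is ENOSYS on Linux). */
#define SKETCH_ENOSYS 38

typedef long (*compat_handler_t)(long);

/*
 * Hypothetical 32-bit syscall entry: the extra conditional discussed in the
 * cover letter is the single branch at the top of this function.
 */
long compat_syscall_entry(compat_handler_t handler, long arg)
{
	if (!aarch32_enabled())
		return -SKETCH_ENOSYS;
	return handler(arg);
}

/* Trivial handler used only to exercise the entry point. */
long demo_handler(long arg)
{
	return arg + 1;
}
```

With the flag left at its default, compat_syscall_entry(demo_handler, 41) reaches the handler; after parse_aarch32_param("0") the same call is rejected before the handler runs. The one added branch per 32-bit syscall is the cost the cover letter refers to.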
Comments
Hi,

On Wed, Oct 18, 2023 at 01:13:18PM +0200, Andrea della Porta wrote:
> Aarch32 compatibility mode is enabled at compile time through the
> CONFIG_COMPAT Kconfig option. This patchset lets 32-bit support
> (for both processes and syscalls) be enabled at boot time using
> a kernel parameter. Also, it provides a means for distributions
> to set their own default without sacrificing compatibility support,
> that is, users can override the default behaviour through the kernel
> parameter.

I proposed something similar in the past:

https://lkml.kernel.org/linux-fsdevel/20210916131816.8841-1-will@kernel.org/

but the conclusion there (see the reply from Kees) was that it was better
to either use existing seccomp mechanisms or add something to control
which binfmts can be loaded.

Will
On Wed, Oct 18, 2023, at 14:27, Will Deacon wrote:
> Hi,
>
> On Wed, Oct 18, 2023 at 01:13:18PM +0200, Andrea della Porta wrote:
>> Aarch32 compatibility mode is enabled at compile time through the
>> CONFIG_COMPAT Kconfig option. This patchset lets 32-bit support
>> (for both processes and syscalls) be enabled at boot time using
>> a kernel parameter. Also, it provides a means for distributions
>> to set their own default without sacrificing compatibility support,
>> that is, users can override the default behaviour through the kernel
>> parameter.
>
> I proposed something similar in the past:
>
> https://lkml.kernel.org/linux-fsdevel/20210916131816.8841-1-will@kernel.org/
>
> but the conclusion there (see the reply from Kees) was that it was better
> to either use existing seccomp mechanisms or add something to control
> which binfmts can be loaded.

Right, I was going to reply along the same lines here: x86 is
a bit of a special case that needs this, but I believe all the
other architectures already guard the compat syscall execution
on test_thread_flag(TIF_32BIT) that is only set by the compat
binfmt loader.

Doing the reverse is something that has however come up in the
past several times and that could be interesting: In order to
run userspace emulation (qemu-user, fex, ...) we may want to
allow calling syscalls and ioctls for foreign ABIs in a native
task, and at that point having a mechanism to control this
capability globally or per task would be useful as well.

The compat mode (arm32 on arm64) is the easiest case here, but the
same thing could be done for emulating the very subtle architecture
differences (x86-64 on arm64, arm64 on x86_64, arm32 on x86-compat,
or any of the above on riscv or loongarch).

Arnd
On Wed, Oct 18, 2023 at 01:13:18PM +0200, Andrea della Porta wrote:
> Aarch32 compatibility mode is enabled at compile time through the
> CONFIG_COMPAT Kconfig option. This patchset lets 32-bit support
> (for both processes and syscalls) be enabled at boot time using
> a kernel parameter. Also, it provides a means for distributions
> to set their own default without sacrificing compatibility support,
> that is, users can override the default behaviour through the kernel
> parameter.

Can you elaborate on *why* people want such a policy?

> *** Notes about syscall management ***
> The VBAR_EL1 register, which holds the exception table address,
> is set up very early in the boot process, before parse_early_param().
> This means that it's not possible to access the boot parameter before
> setting the register. Also, setting the aforementioned register
> for secondary CPUs is done later in the boot flow.
> Several ways to work around this have been considered, among which:
>
> * resetting VBAR_EL1 to point to one of two vector tables (the
>   former with 32-bit exception handlers enabled and the latter
>   pointing to an unhandled stub, just as if CONFIG_COMPAT is enabled)
>   depending on the proposed boot parameter. This has the disadvantage
>   of producing a somewhat messy patchset involving several lines,
>   has a higher cognitive load since there are at least three places
>   where the register is getting changed (not near to each other),
>   and has implications on other code segments (namely kpti, kvm
>   and vdso), requiring special care.
>
> * patching the vector table contents once the early param is available.
>   This has most of the implications of the previous option
>   (except maybe not impacting other code segments), plus it sounds
>   a little 'hackish'.
>
> The chosen approach involves conditionally executing 32-bit syscalls
> depending on the parameter value.

Why does the compat syscall path need to do anything?

On arm64 it's not possible to issue compat syscalls from a native 64-bit task.
If you prevent the loading of AArch32 binaries, none of the compat syscalls
will be reachable at all.

That's the proper way to implement this, and we already have logic for that as
part of the mismatched AArch32 support.

> This of course results in a little performance loss, but has the following
> advantages:

A performance loss for what relative to what?

How much of a performance loss?

Mark.

> * all the cons of the previously explained alternatives are avoided
> * users of 32-bit apps on a 64-bit kernel are already suffering from
>   performance losses due to 32-bit apps not fully leveraging the 64-bit
>   processor, so they are already aware of this
> * users of 32-bit apps on a 64-bit kernel are believed
>   to be a minority and most of the time there are sources available
>   to be recompiled for 64-bit as a workaround for better performance
>
> It is worth mentioning that users of 64-bit apps are, of course,
> unaffected.
>
> Based on the work from Nikolay Borisov, see:
> Link: https://lkml.org/lkml/2023/6/23/387
>
> Andrea della Porta (4):
>   arm64: Introduce aarch32_enabled()
>   arm64/process: Make loading of 32bit processes depend on
>     aarch32_enabled()
>   arm64/entry-common: Make Aarch32 syscalls' availability depend on
>     aarch32_enabled()
>   arm64: Make Aarch32 emulation boot time configurable
>
>  .../admin-guide/kernel-parameters.txt | 7 ++++
>  arch/arm64/Kconfig | 9 +++++
>  arch/arm64/include/asm/compat.h | 12 +++++++
>  arch/arm64/kernel/entry-common.c | 33 +++++++++++++++++--
>  arch/arm64/kernel/process.c | 2 +-
>  5 files changed, 59 insertions(+), 4 deletions(-)
>
> --
> 2.35.3
>
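Mark's point, that refusing to load AArch32 binaries is enough to make the compat syscalls unreachable, can be illustrated by a loader-side check on the ELF header. The following is a userspace sketch only, not the kernel's binfmt code: may_load_binary() and its allow_aarch32 argument are hypothetical names, the check assumes a little-endian ELF image, and it inspects just the class and machine fields. The constants come from the C library's <elf.h>.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Decide whether a binary may be loaded, based only on its ELF header.
 * e_machine sits at byte offset 18 in both the 32-bit and 64-bit header
 * layouts, so the first 20 bytes are enough for this check.
 * Assumption: the image is little-endian.
 */
bool may_load_binary(const unsigned char *hdr, size_t len, bool allow_aarch32)
{
	unsigned int machine;

	if (len < 20 || memcmp(hdr, ELFMAG, SELFMAG) != 0)
		return false;

	machine = hdr[18] | (hdr[19] << 8);

	/* Native 64-bit binaries are always accepted. */
	if (hdr[EI_CLASS] == ELFCLASS64 && machine == EM_AARCH64)
		return true;

	/* 32-bit ARM binaries are accepted only when the policy allows it. */
	if (hdr[EI_CLASS] == ELFCLASS32 && machine == EM_ARM)
		return allow_aarch32;

	return false;
}
```

In this picture the policy lives entirely at load time: if no 32-bit image is ever accepted, no task is ever marked as a compat task, and the syscall path needs no extra branch.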
On 13:27 Wed 18 Oct , Will Deacon wrote:
> Hi,
>
> On Wed, Oct 18, 2023 at 01:13:18PM +0200, Andrea della Porta wrote:
> > Aarch32 compatibility mode is enabled at compile time through the
> > CONFIG_COMPAT Kconfig option. This patchset lets 32-bit support
> > (for both processes and syscalls) be enabled at boot time using
> > a kernel parameter. Also, it provides a means for distributions
> > to set their own default without sacrificing compatibility support,
> > that is, users can override the default behaviour through the kernel
> > parameter.
>
> I proposed something similar in the past:
>
> https://lkml.kernel.org/linux-fsdevel/20210916131816.8841-1-will@kernel.org/
>
> but the conclusion there (see the reply from Kees) was that it was better
> to either use existing seccomp mechanisms or add something to control
> which binfmts can be loaded.
>
> Will

I see. Seccomp sounds like a really good idea, since just blocking the compat
binfmt would not prevent calls to 32-bit syscalls per se: it's true that ARM64
enforces the transition from A64 to A32 only on exception return, and that the
PSTATE.nRW flag can change only from EL1, but some exploit may arise in the
future to do just that (I'm not aware of any, nor can I come up with a proof of
concept off the top of my head, but I can't exclude it either). So, assuming for
the sake of argument that a switch to A32 is feasible, the further step of
embedding A32 instructions in an A64 ELF executable is a breeze. Hence blocking
the syscalls (and not only the binfmt loading) could prove necessary.
I know all of this is highly speculative right now, but maybe it's worth
thinking about nonetheless.

Andrea
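For readers unfamiliar with the seccomp route mentioned here, a filter can reject any syscall that is not made through the native AArch64 ABI by testing the arch field of struct seccomp_data. The program below is a minimal sketch built from the Linux UAPI headers; the names native_only_insns and native_only_prog are hypothetical, and killing the process (rather than returning an error) is an arbitrary choice for the example. It only constructs the filter and does not install it.

```c
#include <stddef.h>
#include <linux/audit.h>
#include <linux/filter.h>
#include <linux/seccomp.h>

/*
 * Classic-BPF program: load the syscall ABI identifier, allow the call if
 * it is AUDIT_ARCH_AARCH64, otherwise kill the process.
 */
static struct sock_filter native_only_insns[] = {
	/* A = seccomp_data.arch */
	BPF_STMT(BPF_LD | BPF_W | BPF_ABS, offsetof(struct seccomp_data, arch)),
	/* If A == AUDIT_ARCH_AARCH64, skip the next instruction. */
	BPF_JUMP(BPF_JMP | BPF_JEQ | BPF_K, AUDIT_ARCH_AARCH64, 1, 0),
	/* Any other ABI (for example AUDIT_ARCH_ARM): kill. */
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_KILL_PROCESS),
	/* Native ABI: allow. */
	BPF_STMT(BPF_RET | BPF_K, SECCOMP_RET_ALLOW),
};

const struct sock_fprog native_only_prog = {
	.len = sizeof(native_only_insns) / sizeof(native_only_insns[0]),
	.filter = native_only_insns,
};
```

A process would typically install such a program with prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0) followed by prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, &native_only_prog). Installing it on a machine whose native ABI is not AArch64 would kill the caller on its next syscall, which is why the sketch stops at building the program.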
On 14:44 Wed 18 Oct , Arnd Bergmann wrote:
> On Wed, Oct 18, 2023, at 14:27, Will Deacon wrote:
> > Hi,
> >
> > On Wed, Oct 18, 2023 at 01:13:18PM +0200, Andrea della Porta wrote:
> >> Aarch32 compatibility mode is enabled at compile time through the
> >> CONFIG_COMPAT Kconfig option. This patchset lets 32-bit support
> >> (for both processes and syscalls) be enabled at boot time using
> >> a kernel parameter. Also, it provides a means for distributions
> >> to set their own default without sacrificing compatibility support,
> >> that is, users can override the default behaviour through the kernel
> >> parameter.
> >
> > I proposed something similar in the past:
> >
> > https://lkml.kernel.org/linux-fsdevel/20210916131816.8841-1-will@kernel.org/
> >
> > but the conclusion there (see the reply from Kees) was that it was better
> > to either use existing seccomp mechanisms or add something to control
> > which binfmts can be loaded.
>
> Right, I was going to reply along the same lines here: x86 is
> a bit of a special case that needs this, but I believe all the
> other architectures already guard the compat syscall execution
> on test_thread_flag(TIF_32BIT) that is only set by the compat
> binfmt loader.

Are you referring to the fact that x86 can switch at will between 32- and 64-
bit code?
Regarding the TIF_32BIT flag, thanks for the heads-up. I still believe though
that this mechanism can somehow break down in the future, since prohibiting
32-bit executable loading *and* blocking 32-bit compat syscalls are two
separate paths of execution, held together by the architecture prohibiting
a switch to A32 instructions by design. Breaking the first rule and embedding
carefully crafted A32 instructions in an executable is easy, while the difficult
part is finding some 'reentrancy' to be able to do the execution state switch,
as pointed out in https://lore.kernel.org/lkml/ZTD0DAes-J-YQ2eu@apocalypse/.
I agree it's highly speculative and not something to be concerned about right
now, it's just a heads-up, should the need arise in the future.

> Doing the reverse is something that has however come up in the
> past several times and that could be interesting: In order to
> run userspace emulation (qemu-user, fex, ...) we may want to
> allow calling syscalls and ioctls for foreign ABIs in a native
> task, and at that point having a mechanism to control this
> capability globally or per task would be useful as well.
>
> The compat mode (arm32 on arm64) is the easiest case here, but the
> same thing could be done for emulating the very subtle architecture
> differences (x86-64 on arm64, arm64 on x86_64, arm32 on x86-compat,
> or any of the above on riscv or loongarch).
>
> Arnd

Really interesting. Since it's more related to emulation needs (my patch
has another focus due to the fact that A64 can execute A32 natively),
I'll take a look at this separately.

Andrea
On Thu, Oct 19, 2023, at 12:52, Andrea della Porta wrote:
> On 14:44 Wed 18 Oct , Arnd Bergmann wrote:
>> On Wed, Oct 18, 2023, at 14:27, Will Deacon wrote:
>>
>> Right, I was going to reply along the same lines here: x86 is
>> a bit of a special case that needs this, but I believe all the
>> other architectures already guard the compat syscall execution
>> on test_thread_flag(TIF_32BIT) that is only set by the compat
>> binfmt loader.
>
> Are you referring to the fact that x86 can switch at will between 32- and 64-
> bit code?

No.

> Regarding the TIF_32BIT flag, thanks for the heads-up. I still believe though
> that this mechanism can somehow break down in the future, since prohibiting
> 32-bit executable loading *and* blocking 32-bit compat syscalls are two
> separate paths of execution, held together by the architecture prohibiting
> a switch to A32 instructions by design. Breaking the first rule and embedding
> carefully crafted A32 instructions in an executable is easy, while the difficult
> part is finding some 'reentrancy' to be able to do the execution state switch,
> as pointed out in https://lore.kernel.org/lkml/ZTD0DAes-J-YQ2eu@apocalypse/.
> I agree it's highly speculative and not something to be concerned about right
> now, it's just a heads-up, should the need arise in the future.

There are (at least) five separate aspects to compat mode that
are easy to mix up:

1. Instruction decoding -- switching between the modes supported
   by the CPU (A64/A32/T32)
2. Word size -- what happens to the upper 32 bits of a register
   in an arithmetic operation
3. Personality -- which architecture string gets returned by
   the uname syscall (aarch64 vs armv8) as well as the format
   of /proc/cpuinfo
4. System call entry points -- how a process calls into native
   or compat syscalls, or possibly foreign OS emulation
5. Binary format -- elf32 vs elf64 executables

On most architectures with compat mode, 4. and 5. are fundamentally
tied together today: a compat task can only call compat syscalls
and a native task can only call native syscalls. x86 is the exception
here, as it uses different instructions (int80, syscall, sysenter)
and picks the syscall table based on that instruction.

I think 1. and 2. are also always tied to 5. on arm, but this is
not necessarily true for other architectures. 3. used to be tied
to 5. on some architectures in the past, but should be independent
now.

>> Doing the reverse is something that has however come up in the
>> past several times and that could be interesting: In order to
>> run userspace emulation (qemu-user, fex, ...) we may want to
>> allow calling syscalls and ioctls for foreign ABIs in a native
>> task, and at that point having a mechanism to control this
>> capability globally or per task would be useful as well.
>>
>> The compat mode (arm32 on arm64) is the easiest case here, but the
>> same thing could be done for emulating the very subtle architecture
>> differences (x86-64 on arm64, arm64 on x86_64, arm32 on x86-compat,
>> or any of the above on riscv or loongarch).
>
> Really interesting. Since it's more related to emulation needs (my patch
> has another focus due to the fact that A64 can execute A32 natively),
> I'll take a look at this separately.

A64 mode (unlike some other architectures, notably mips64) cannot
execute A32 or T32 instructions without a mode switch, the three
are entirely incompatible on the binary level. Many ARMv8 CPUs
support both Aarch64 mode and Aarch32 (A32/T32), but a lot of
the newer ones (e.g. Apple M1/M2, Cortex-R82 or Cortex-A715) only
do Aarch64 and need user-space emulation to run 32-bit binaries.

Arnd
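The difference Arnd describes between aspects 4 and 5 can be put into a toy model: on most architectures the syscall table follows a per-task flag that only the compat binfmt loader sets, while on x86 it follows the instruction used to enter the kernel. The sketch below is a simplification for illustration, not kernel code. All type and function names are hypothetical, and mapping both int80 and sysenter to the compat table is an assumption about a 64-bit x86 kernel.

```c
#include <stdbool.h>

/* Which syscall table a request is dispatched through. */
enum abi { ABI_NATIVE, ABI_COMPAT };

/* Entry instructions named in the thread for x86. */
enum entry_insn { ENTRY_SYSCALL, ENTRY_INT80, ENTRY_SYSENTER };

/* Stand-in for the per-task state that carries TIF_32BIT. */
struct task {
	bool tif_32bit; /* set only by the compat binfmt loader */
};

/* Most architectures: the table is a property of the task. */
enum abi pick_table_by_task(const struct task *t)
{
	return t->tif_32bit ? ABI_COMPAT : ABI_NATIVE;
}

/* x86-style: the table is a property of the entry instruction. */
enum abi pick_table_by_insn(enum entry_insn insn)
{
	return insn == ENTRY_SYSCALL ? ABI_NATIVE : ABI_COMPAT;
}
```

Under the first rule a task loaded from a 64-bit binary can never reach the compat table, which is why gating the binfmt loader is sufficient there. Under the second rule any task can pick the table at the point of entry, which is why x86 needs a separate switch.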
On 13:52 Wed 18 Oct , Mark Rutland wrote:
> On Wed, Oct 18, 2023 at 01:13:18PM +0200, Andrea della Porta wrote:
> > Aarch32 compatibility mode is enabled at compile time through the
> > CONFIG_COMPAT Kconfig option. This patchset lets 32-bit support
> > (for both processes and syscalls) be enabled at boot time using
> > a kernel parameter. Also, it provides a means for distributions
> > to set their own default without sacrificing compatibility support,
> > that is, users can override the default behaviour through the kernel
> > parameter.
>
> Can you elaborate on *why* people want such a policy?
>

Formerly, the reason was to reduce the kernel attack surface by excluding
compat syscalls, wherever applicable. Much less important but still a point,
I would also say this could be a good chance to get rid of somewhat old and
stale 32-bit libraries and programs, but this is of course debatable.

> > *** Notes about syscall management ***
> > The VBAR_EL1 register, which holds the exception table address,
> > is set up very early in the boot process, before parse_early_param().
> > This means that it's not possible to access the boot parameter before
> > setting the register. Also, setting the aforementioned register
> > for secondary CPUs is done later in the boot flow.
> > Several ways to work around this have been considered, among which:
> >
> > * resetting VBAR_EL1 to point to one of two vector tables (the
> >   former with 32-bit exception handlers enabled and the latter
> >   pointing to an unhandled stub, just as if CONFIG_COMPAT is enabled)
> >   depending on the proposed boot parameter. This has the disadvantage
> >   of producing a somewhat messy patchset involving several lines,
> >   has a higher cognitive load since there are at least three places
> >   where the register is getting changed (not near to each other),
> >   and has implications on other code segments (namely kpti, kvm
> >   and vdso), requiring special care.
> >
> > * patching the vector table contents once the early param is available.
> >   This has most of the implications of the previous option
> >   (except maybe not impacting other code segments), plus it sounds
> >   a little 'hackish'.
> >
> > The chosen approach involves conditionally executing 32-bit syscalls
> > depending on the parameter value.
>
> Why does the compat syscall path need to do anything?

I probably didn't catch your point here: the compat syscalls do not need to do
anything and they do not (they work just as they do right now with
CONFIG_COMPAT alone), except for the conditional instruction that excludes them
at runtime. Of course this conditional *is* doing something and is somewhat
redundant if compat is disabled, but in this scenario I think it's unavoidable.

>
> On arm64 it's not possible to issue compat syscalls from a native 64-bit task.
> If you prevent the loading of AArch32 binaries, none of the compat syscalls
> will be reachable at all.
>
> That's the proper way to implement this, and we already have logic for that as
> part of the mismatched AArch32 support.
>
> > This of course results in a little performance loss, but has the following
> > advantages:
>
> A performance loss for what relative to what?

Of a compat syscall as it is now with CONFIG_COMPAT enabled vs. the patched
syscall handlers, which need a further conditional instruction to check whether
compat is enabled or not.

>
> How much of a performance loss?

I have not taken measurements yet since it was a qualitative consideration more
than a quantitative one, also considering that chances are it would affect only
a very small population of users. The time taken to execute the conditional
instruction is close to negligible compared to any syscall execution.