Message ID: 20230518131013.3366406-1-guoren@kernel.org
Series: riscv: s64ilp32: Running 32-bit Linux kernel on 64-bit supervisor mode
Message
Guo Ren
May 18, 2023, 1:09 p.m. UTC
From: Guo Ren <guoren@linux.alibaba.com>
This patch series adds s64ilp32 support to riscv. The term s64ilp32
means smode-xlen=64 and -mabi=ilp32 (ints, longs, and pointers are all
32-bit), i.e., running a 32-bit Linux kernel in pure 64-bit supervisor
mode. Several 64ilp32 ABIs already exist, such as mips-n32 [1],
arm-aarch64ilp32 [2], and x86-x32 [3], but they all target userspace.
Thus, this should be the first time a 32-bit Linux kernel runs with
the 64ilp32 ABI in supervisor mode (if not, correct me).
Why 32-bit Linux?
=================
The motivation for using a 32-bit Linux kernel is to reduce the memory
footprint and fit the small DDR & cache capacities of low-end parts
(e.g., 64/128MB SiP SoCs).
Here are the 32-bit v.s. 64-bit Linux kernel data type comparison
summary:
                      32-bit     64-bit
sizeof(page):         32 bytes   64 bytes
sizeof(list_head):     8 bytes   16 bytes
sizeof(hlist_head):    8 bytes   16 bytes
sizeof(vm_area):      68 bytes  136 bytes
...
The size of ilp32's long & pointer is just half of lp64's (the rv64
default ABI, where longs and pointers are 64-bit). This significant
difference in data types leads to different memory & cache footprints.
Here is a comparison measurement between s32ilp32, s64ilp32, and
s64lp64 in the same 128MB qemu system environment:
Rootfs:
u32ilp32 - Using the same 32-bit userspace rootfs.ext2 (UXL=32) binary
from buildroot 2023.02-rc3, qemu_riscv32_virt_defconfig
Linux:
s32ilp32 - Linux version 6.3.0-rc1 (124MB)
rv32_defconfig: $(Q)$(MAKE) -f $(srctree)/Makefile
defconfig 32-bit.config
s64lp64 - Linux version 6.3.0-rc1 (126MB)
defconfig: $(Q)$(MAKE) -f $(srctree)/Makefile defconfig
s64ilp32 - Linux version 6.3.0-rc1 (126MB)
rv64ilp32_defconfig: $(Q)$(MAKE) -f $(srctree)/Makefile
defconfig 64ilp32.config
Opensbi:
m64lp64 - (2MB) OpenSBI v1.2-80-g4b28afc98bbe
m32ilp32 - (4MB) OpenSBI v1.2-80-g4b28afc98bbe
+----------------------------------------+--------
| u32ilp32 |
| UXL=32 | Rootfs
+----------------------------------------+--------
| +----------+ +---------+ | +---------+ |
| | s64ilp32 | | s64lp64 | | | s32ilp32| |
| | SXL=64 | | SXL=64 | | | SXL=32 | | Linux
| +----------+ +---------+ | +---------+ |
+----------------------------------------+--------
| +----------------------+ | +---------+ |
| | m64lp64 | | | m32ilp32| |
| | MXL=64 | | | MXL=32 | | Opensbi
| +----------------------+ | +---------+ |
+----------------------------------------+--------
| +----------------------+ | +---------+ |
| | qemu-rv64 | | |qemu-rv32| | HW
| +----------------------+ | +---------+ |
+----------------------------------------+--------
Mem-usage:
(s32ilp32) # free
total used free shared buff/cache available
Mem: 100040 8380 88244 44 3416 88080
(s64lp64) # free
total used free shared buff/cache available
Mem: 91568 11848 75796 44 3924 75952
(s64ilp32) # free
total used free shared buff/cache available
Mem: 101952 8528 90004 44 3420 89816
^^^^^
It's a rough measurement based on the current default configs without
any modification; the 32-bit kernels (s32ilp32, s64ilp32) save more
than 16% of memory compared to 64-bit (s64lp64). But s32ilp32 &
s64ilp32 have similar memory footprints (about 0.33% difference),
meaning s64ilp32 has a good chance of replacing s32ilp32 on 64-bit
machines.
Why s64ilp32?
=============
The current RISC-V has the profiles of RVA20S64, RVA22S64, and RVA23S64
(ongoing) [4], but no RVA**S32 profile exists, nor is one planned. That
means when a vendor wants to produce a 32-bit s-mode RISC-V application
processor, there is no profile to follow. As a result, many cheap riscv
chips have come out that follow the RVA2xS64 profiles instead, such as
Allwinner D1/D1s/F133 [5], SOPHGO CV1800B [6], Canaan Kendryte K230 [7],
and Bouffalo Lab BL808 [8], which are typical cortex-a7/a35/a53 product
scenarios. The D1, CV1800B & BL808 don't support UXL=32 (32-bit U-mode),
so they would need a new u64ilp32 userspace ABI, which currently has no
software ecosystem. Thus, the first landing of s64ilp32 would be on the
Canaan Kendryte K230, whose c908 core has rv64gcv and a compat user mode
(sstatus.uxl=32/64) that can run the existing rv32 userspace software
ecosystem.
Another reason for inventing s64ilp32 is the performance benefit and
the simpler 64-bit CPU hardware design (v.s. s32ilp32).
Why s64ilp32 has better performance?
====================================
Generally speaking, to run 32-bit Linux on a 64-bit processor, we either
build 32-bit s-mode hardware (such as Linux-arm32 on cortex-a53) or use
the old 32ilp32 ABI on the 64-bit machine (such as mips
SYS_SUPPORTS_32BIT_KERNEL). Neither can reuse the performance-related
features and instructions of the 64-bit hardware, such as the 64-bit
ALU, AMO, and LD/SD, which causes significant performance gaps in many
Linux features:
- memcpy/memset/strcmp (s64ilp32 needs half the instruction count and
  has double the load/store bandwidth of s32ilp32.)
- ebpf JIT: ebpf is a 64-bit virtual ISA, which maps poorly to
  s32ilp32.
- Atomic64 (s64ilp32 maps to exactly the same native instructions as
  s64lp64, while s32ilp32 can only use generic_atomic64, a limited
  software tradeoff.)
- 64-bit native arithmetic instructions for the "long long" type.
- cmpxchg_double support for slub (making this the 2nd 32-bit Linux
  port with the feature; the 1st is i386.)
- ...
Compared with the userspace ecosystem, the 32-bit Linux kernel benefits
even more from 64ilp32, because the kernel can't utilize the
floating-point/vector features of the ISA.
Let's look at performance from another perspective (s64ilp32 v.s.
s64lp64). As the first chapter said, an ilp32 pointer is half the size
of an lp64 one, which shrinks the critical data structures (e.g., page,
list, ...). That means a cache of a given capacity can hold twice the
data under ilp32 as under lp64, a natural advantage of 32-bit.
Why s64ilp32 simplifies CPU design?
===================================
Yes, there are many historical examples of running 32-bit Linux on
64-bit hardware, such as arm cortex-a35/a53/a55, which implement 32-bit
EL1/EL2/EL3 hardware modes to support 32-bit Linux. We could follow
Arm's style, but riscv can choose a better way. Compared to UXL=32
alone, MXL=SXL=32 brings many CSR-related hardware functionalities that
take a lot of effort to mix into a 64-bit design. The s64ilp32 kernel
works in MXL=SXL=64 mode, so CPU vendors needn't implement 32-bit
machine and supervisor modes at all.
How does s64ilp32 work?
=======================
From the hardware's view, s64ilp32 is the same as s64lp64 compat mode,
i.e., MXL=SXL=64 + UXL=32. Because s64ilp32 uses CONFIG_32BIT of Linux,
it only supports the u32ilp32 userspace ABI (the current standard rv32
software ecosystem) and can't work with the u64lp64 ABI (I don't want
that complex and useless combination). It may work with u64ilp32 in the
future; for now, s64ilp32 depends on the UXL=32 feature of the hardware.
The 64ilp32 gcc still uses sign-extending lw & auipc to generate
address values, because inserting zero-extend instructions to mask the
upper 32 bits would cause significant code-size and performance
problems. Thus, we invented an OS-level approach to solve the problem:
- When satp=bare and the start physical address is < 2GB, there is no
  sign-extended address problem.
- When satp=bare and the start physical address is > 2GB, we need a
  Zjpm-like hardware extension to mask the high 32 bits.
  (Fortunately, all existing SoCs (D1/D1s/F133, CV1800B, K230, BL808)
  start below 2GB.)
- When satp=sv39, we invent a double mapping to make the sign-extended
  virtual address resolve the same as the zero-extended one.
+--------+ +---------+ +--------+
| | +--| 511:PUD1| | |
| | | +---------+ | |
| | | | 510:PUD0|--+ | |
| | | +---------+ | | |
| | | | | | | |
| | | | | | | |
| | | | | | | |
| | | | INVALID | | | |
| | | | | | | |
| .... | | | | | | .... |
| | | | | | | |
| | | +---------+ | | |
| | +--| 3:PUD1 | | | |
| | | +---------+ | | |
| | | | 2:PUD0 |--+ | |
| | | +---------+ | | |
| | | |1:USR_PUD| | | |
| | | +---------+ | | |
| | | |0:USR_PUD| | | |
+--------+<--+ +---------+ +-->+--------+
PUD1 ^ PGD PUD0
1GB | 4GB 1GB
|
+----------+
| Sv39 PGDP|
+----------+
SATP
The size of xlen was always equal to the pointer/long size before
s64ilp32 emerged. So we introduce a new data type, xlen_t, to handle
CSR-related and callee-save/restore operations.
Some kernel features use 32BIT/64BIT to determine the exact ISA; e.g.,
the ebpf JIT maps to the rv32 ISA when CONFIG_32BIT=y. But s64ilp32
needs the ebpf JIT to target the rv64 ISA even with CONFIG_32BIT=y, so
another config option is needed to distinguish the two cases.
For more details, please review the patch series.
How to run s64ilp32?
====================
GNU toolchain
-------------
git clone https://github.com/Liaoshihua/riscv-gnu-toolchain.git
cd riscv-gnu-toolchain
./configure --prefix="$PWD/opt-rv64-ilp32/" --with-arch=rv64imac --with-abi=ilp32
make linux
export PATH=$PATH:$PWD/opt-rv64-ilp32/bin/
Opensbi
-------
git clone https://github.com/riscv-software-src/opensbi.git
CROSS_COMPILE=riscv64-unknown-linux-gnu- make PLATFORM=generic
Linux kernel
------------
git clone https://github.com/guoren83/linux.git -b s64ilp32
cd linux
make ARCH=riscv CROSS_COMPILE=riscv64-unknown-linux-gnu- rv64ilp32_defconfig
make ARCH=riscv CROSS_COMPILE=riscv64-unknown-linux-gnu- all
Rootfs
------
git clone git://git.busybox.net/buildroot
cd buildroot
make qemu_riscv32_virt_defconfig
make
Qemu
----
git clone https://github.com/plctlab/plct-qemu.git -b plct-s64ilp32-dev
cd plct-qemu
mkdir build
cd build
../configure --target-list="riscv64-softmmu riscv32-softmmu"
make
Run
---
./qemu-system-riscv64 -cpu rv64 -M virt -m 128m -nographic -bios fw_dynamic.bin -kernel Image -drive file=rootfs.ext2,format=raw,id=hd0 -device virtio-blk-device,drive=hd0 -append "rootwait root=/dev/vda ro console=ttyS0 earlycon=sbi" -netdev user,id=net0 -device virtio-net-device,netdev=net0
OpenSBI v1.2-119-gdc1c7db05e07
____ _____ ____ _____
/ __ \ / ____| _ \_ _|
| | | |_ __ ___ _ __ | (___ | |_) || |
| | | | '_ \ / _ \ '_ \ \___ \| _ < | |
| |__| | |_) | __/ | | |____) | |_) || |_
\____/| .__/ \___|_| |_|_____/|___/_____|
| |
|_|
Platform Name : riscv-virtio,qemu
Platform Features : medeleg
Platform HART Count : 1
Platform IPI Device : aclint-mswi
Platform Timer Device : aclint-mtimer @ 10000000Hz
Platform Console Device : uart8250
Platform HSM Device : ---
Platform PMU Device : ---
Platform Reboot Device : sifive_test
Platform Shutdown Device : sifive_test
Platform Suspend Device : ---
Platform CPPC Device : ---
Firmware Base : 0x60000000
Firmware Size : 360 KB
Firmware RW Offset : 0x40000
Runtime SBI Version : 1.0
Domain0 Name : root
Domain0 Boot HART : 0
Domain0 HARTs : 0*
Domain0 Region00 : 0x0000000002000000-0x000000000200ffff M: (I,R,W) S/U: ()
Domain0 Region01 : 0x0000000060040000-0x000000006005ffff M: (R,W) S/U: ()
Domain0 Region02 : 0x0000000060000000-0x000000006003ffff M: (R,X) S/U: ()
Domain0 Region03 : 0x0000000000000000-0xffffffffffffffff M: (R,W,X) S/U: (R,W,X)
Domain0 Next Address : 0x0000000060200000
Domain0 Next Arg1 : 0x0000000067e00000
Domain0 Next Mode : S-mode
Domain0 SysReset : yes
Domain0 SysSuspend : yes
Boot HART ID : 0
Boot HART Domain : root
Boot HART Priv Version : v1.12
Boot HART Base ISA : rv64imafdch
Boot HART ISA Extensions : time,sstc
Boot HART PMP Count : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count : 16
Boot HART MIDELEG : 0x0000000000001666
Boot HART MEDELEG : 0x0000000000f0b509
[ 0.000000] Linux version 6.3.0-rc1-00086-gc8d2fedb997a (guoren@fedora) (riscv64-unknown-linux-gnu-gcc (g5e578a16201f) 13.0.1 20230206 (experimental), GNU ld (GNU Binutils) 2.40.50.20230205) #1 SMP Sun May 14 10:46:42 EDT 2023
[ 0.000000] random: crng init done
[ 0.000000] OF: fdt: Ignoring memory range 0x60000000 - 0x60200000
[ 0.000000] Machine model: riscv-virtio,qemu
[ 0.000000] efi: UEFI not found.
[ 0.000000] OF: reserved mem: 0x60000000..0x6003ffff (256 KiB) map non-reusable mmode_resv1@60000000
[ 0.000000] OF: reserved mem: 0x60040000..0x6005ffff (128 KiB) map non-reusable mmode_resv0@60040000
[ 0.000000] Zone ranges:
[ 0.000000] Normal [mem 0x0000000060200000-0x0000000067ffffff]
[ 0.000000] Movable zone start for each node
[ 0.000000] Early memory node ranges
[ 0.000000] node 0: [mem 0x0000000060200000-0x0000000067ffffff]
[ 0.000000] Initmem setup node 0 [mem 0x0000000060200000-0x0000000067ffffff]
[ 0.000000] On node 0, zone Normal: 512 pages in unavailable ranges
[ 0.000000] SBI specification v1.0 detected
[ 0.000000] SBI implementation ID=0x1 Version=0x10002
[ 0.000000] SBI TIME extension detected
[ 0.000000] SBI IPI extension detected
[ 0.000000] SBI RFENCE extension detected
[ 0.000000] SBI SRST extension detected
[ 0.000000] SBI HSM extension detected
[ 0.000000] riscv: base ISA extensions acdfhim
[ 0.000000] riscv: ELF capabilities acdfim
[ 0.000000] percpu: Embedded 13 pages/cpu s24352 r8192 d20704 u53248
[ 0.000000] Built 1 zonelists, mobility grouping on. Total pages: 31941
[ 0.000000] Kernel command line: rootwait root=/dev/vda ro console=ttyS0 earlycon=sbi norandmaps
[ 0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
[ 0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
[ 0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
[ 0.000000] Virtual kernel memory layout:
[ 0.000000] fixmap : 0x9ce00000 - 0x9d000000 (2048 kB)
[ 0.000000] pci io : 0x9d000000 - 0x9e000000 ( 16 MB)
[ 0.000000] vmemmap : 0x9e000000 - 0xa0000000 ( 32 MB)
[ 0.000000] vmalloc : 0xa0000000 - 0xc0000000 ( 512 MB)
[ 0.000000] lowmem : 0xc0000000 - 0xc7e00000 ( 126 MB)
[ 0.000000] Memory: 97748K/129024K available (8699K kernel code, 8867K rwdata, 4096K rodata, 4204K init, 361K bss, 31276K reserved, 0K cma-reserved)
...
Starting network: udhcpc: started, v1.36.0
udhcpc: broadcasting discover
udhcpc: broadcasting select for 10.0.2.15, server 10.0.2.2
udhcpc: lease of 10.0.2.15 obtained from 10.0.2.2, lease time 86400
deleting routers
adding dns 10.0.2.3
OK
Welcome to Buildroot
buildroot login: root
# cat /proc/cpuinfo
processor : 0
hart : 0
isa : rv64imafdch_zihintpause_zbb_sstc
mmu : sv39
mvendorid : 0x0
marchid : 0x70232
mimpid : 0x70232
# uname -a
Linux buildroot 6.3.0-rc1-00086-gc8d2fedb997a #1 SMP Sun May 14 10:46:42 EDT 2023 riscv32 GNU/Linux
# ls /lib/
ld-linux-riscv32-ilp32d.so.1 libgcc_s.so.1
libanl.so.1 libm.so.6
libatomic.so libnss_dns.so.2
libatomic.so.1 libnss_files.so.2
libatomic.so.1.2.0 libpthread.so.0
libc.so.6 libresolv.so.2
libcrypt.so.1 librt.so.1
libdl.so.2 libutil.so.1
libgcc_s.so modules
# cat /proc/99/maps
0000000055554000-0000000055634000 r-xp 00000000 00000000fe:00 17 /bin/busybox
0000000055634000-0000000055636000 r--p 00000000df000 00000000fe:00 17 /bin/busybox
0000000055636000-0000000055637000 rw-p 00000000e1000 00000000fe:00 17 /bin/busybox
0000000055637000-0000000055659000 rw-p 00000000 00:00 0 [heap]
0000000077e8d000-0000000077fbe000 r-xp 00000000 00000000fe:00 137 /lib/libc.so.6
0000000077fbe000-0000000077fbf000 ---p 00000000131000 00000000fe:00 137 /lib/libc.so.6
0000000077fbf000-0000000077fc1000 r--p 00000000131000 00000000fe:00 137 /lib/libc.so.6
0000000077fc1000-0000000077fc2000 rw-p 00000000133000 00000000fe:00 137 /lib/libc.so.6
0000000077fc2000-0000000077fcc000 rw-p 00000000 00:00 0
0000000077fcc000-0000000077fd4000 r-xp 00000000 00000000fe:00 146 /lib/libresolv.so.2
0000000077fd4000-0000000077fd5000 ---p 000000008000 00000000fe:00 146 /lib/libresolv.so.2
0000000077fd5000-0000000077fd6000 r--p 000000008000 00000000fe:00 146 /lib/libresolv.so.2
0000000077fd6000-0000000077fd7000 rw-p 000000009000 00000000fe:00 146 /lib/libresolv.so.2
0000000077fd7000-0000000077fd9000 rw-p 00000000 00:00 0
0000000077fd9000-0000000077fdb000 r--p 00000000 00:00 0 [vvar]
0000000077fdb000-0000000077fdd000 r-xp 00000000 00:00 0 [vdso]
0000000077fdd000-0000000077ffc000 r-xp 00000000 00000000fe:00 132 /lib/ld-linux-riscv32-ilp32d.so.1
0000000077ffd000-0000000077ffe000 r--p 000000001f000 00000000fe:00 132 /lib/ld-linux-riscv32-ilp32d.so.1
0000000077ffe000-0000000077fff000 rw-p 0000000020000 00000000fe:00 132 /lib/ld-linux-riscv32-ilp32d.so.1
000000007ffde000-000000007ffff000 rw-p 00000000 00:00 0 [stack]
Other resources
===============
OpenEuler riscv32 rootfs
------------------------
The OpenEuler riscv32 rootfs can be downloaded here:
https://repo.tarsier-infra.com/openEuler-RISC-V/obs/archive/rv32/openeuler-image-qemu-riscv32-20221111070036.rootfs.ext4
(Made by Junqiang Wang)
Debian riscv32 rootfs
---------------------
The Debian riscv32 rootfs can be downloaded here:
https://github.com/yuzibo/riscv32
(Made by Bo YU and Han Gao)
Fedora riscv32 rootfs
---------------------
https://fedoraproject.org/wiki/Architectures/RISC-V/RV32
(Made by Wei Fu)
LLVM 64ilp32
------------
git clone https://github.com/luxufan/llvm-project.git -b rv64-ilp32
cd llvm-project
mkdir build && cd build
cmake ../llvm -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD="X86;RISCV" -DLLVM_ENABLE_PROJECTS="clang;lld"
ninja all
(LLVM development status: CC=clang can compile the kernel with LLVM=1,
but the result has not yet booted successfully.)
Patch organization
==================
This series depends on 64ilp32 toolchain patches that are not upstream
yet.
PATCH [0-1] unify vdso32 & compat_vdso
PATCH [2] adds time-related vDSO common flow for vdso32
PATCH [3] adds s64ilp32 support of clocksource driver
PATCH [5] adds s64ilp32 support of irqchip driver
PATCH [4,6-12] add basic data types and compiling framework
PATCH [13] adds MMU_SV39 support
PATCH [14] adds native atomic64
PATCH [15] adds TImode
PATCH [16] adds cmpxchg_double
PATCH [17-19] cleanup kconfig & add defconfig
PATCH [20-21] fix temporary compiler problems
Open issues
===========
Callee-saved register width
---------------------------
For 64-bit ISAs (including 64lp64 and 64ilp32), the callee can't know
the width actually used in a register, so it saves the maximum register
width of the ISA, i.e., xlen. The same rule appears in x86-x32,
mips-n32, and aarch64ilp32, inherited from 64lp64. See PATCH [20].
This has two downsides:
- It differs from 32ilp32's stack frame, and s64ilp32 reuses the
  32ilp32 software stack, so many compatibility problems could appear
  while porting 64ilp32 software.
- It also increases stack usage.
<setup_vm>:
auipc a3,0xff3fb
add a3,a3,1234 # c0000000
li a5,-1
lui a4,0xc0000
addw sp,sp,-96
srl a5,a5,0x20
subw a4,a4,a3
auipc a2,0x111a
add a2,a2,1212 # c1d1f000
sd s0,80(sp)----+
sd s1,72(sp) |
sd s2,64(sp) |
sd s7,24(sp) |
sd s8,16(sp) |
sd s9,8(sp) |-> All <= 32b widths, but occupy 64b
sd ra,88(sp) | stack space.
sd s3,56(sp) | Affect memory footprint & cache
sd s4,48(sp) | performance.
sd s5,40(sp) |
sd s6,32(sp) |
sd s10,0(sp)----+
sll a1,a4,0x20
subw a2,a2,a3
and a4,a4,a5
So here is a proposal for the riscv 64ilp32 ABI:
- Let the compiler prevent the callee from keeping ">32b variables" in
  callee-saved registers. (Q: We need to measure the impact on 64-bit
  variables that live across function calls.)
EF_RISCV_X32
------------
We add an e_flag (EF_RISCV_X32) to distinguish the 32-bit ELF, which
occupies BIT[6] of the e_flags layout.
ELF Header:
Magic: 7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
Class: ELF32
Data: 2's complement, little endian
Version: 1 (current)
OS/ABI: UNIX - System V
ABI Version: 0
Type: REL (Relocatable file)
Machine: RISC-V
Version: 0x1
Entry point address: 0x0
Start of program headers: 0 (bytes into file)
Start of section headers: 24620 (bytes into file)
Flags: 0x21, RVC, X32, soft-float ABI
^^^
64-bit Optimization problem
---------------------------
There is an existing problem in 64ilp32 gcc: it combines two pointers
into one register. Liao is working on the problem; until he finishes,
we can fortunately prevent it with a simple noinline attribute.
struct path {
struct vfsmount *mnt;
struct dentry *dentry;
} __randomize_layout;
struct nameidata {
struct path path;
...
struct path root;
...
} __randomize_layout;
struct nameidata *nd
...
nd->path = nd->root;
6c88 ld a0,24(s1)
^^ // a0 contains two pointers
e088 sd a0,0(s1)
mntget(path->mnt);
// Need "lw a0,0(s1)" or "a0 << 32; a0 >> 32"
2a6150ef jal c01ce946 <mntget> // bug!
Acknowledgements
================
The s64ilp32 work needs the cooperation of many other projects. Thanks
to everyone involved:
- GNU: LiaoShihua <shihua@iscas.ac.cn>,
Jiawei Chen <jiawei@iscas.ac.cn>
- Qemu: Weiwei Li <liweiwei@iscas.ac.cn>
- LLVM: luxufan <luxufan@iscas.ac.cn>,
Chunyu Liao<chunyu@iscas.ac.cn>
- OpenEuler rv32: Junqiang Wang <wangjunqiang@iscas.ac.cn>
- Debian rv32: Bo YU <tsu.yubo@gmail.com>
Han Gao <gaohan@iscas.ac.cn>
- Fedora rv32: Wei Fu <wefu@redhat.com>
References
==========
[1] https://techpubs.jurassic.nl/manuals/0630/developer/Mpro_n32_ABI/sgi_html/index.html
[2] https://wiki.debian.org/Arm64ilp32Port
[3] https://lwn.net/Articles/456731/
[4] https://github.com/riscv/riscv-profiles/releases
[5] https://www.cnx-software.com/2021/10/25/allwinner-d1s-f133-risc-v-processor-64mb-ddr2/
[6] https://milkv.io/duo/
[7] https://twitter.com/tphuang/status/1631308330256801793
[8] https://www.cnx-software.com/2022/12/02/pine64-ox64-sbc-bl808-risc-v-multi-protocol-wisoc-64mb-ram/
Guo Ren (22):
riscv: vdso: Unify vdso32 & compat_vdso into vdso/Makefile
riscv: vdso: Remove compat_vdso/
riscv: vdso: Add time-related vDSO common flow for vdso32
clocksource: riscv: s64ilp32: Use __riscv_xlen instead of CONFIG_32BIT
riscv: s64ilp32: Introduce xlen_t
irqchip: riscv: s64ilp32: Use __riscv_xlen instead of CONFIG_32BIT
riscv: s64ilp32: Add sbi support
riscv: s64ilp32: Add asid support
riscv: s64ilp32: Introduce PTR_L and PTR_S
riscv: s64ilp32: Enable user space runtime environment
riscv: s64ilp32: Add ebpf jit support
riscv: s64ilp32: Add ELF32 support
riscv: s64ilp32: Add ARCH RV64 ILP32 compiling framework
riscv: s64ilp32: Add MMU_SV39 mode support for 32BIT
riscv: s64ilp32: Enable native atomic64
riscv: s64ilp32: Add TImode (128 int) support
riscv: s64ilp32: Implement cmpxchg_double
riscv: s64ilp32: Disable KVM
riscv: Cleanup rv32_defconfig
riscv: s64ilp32: Add rv64ilp32_defconfig
riscv: s64ilp32: Correct the rv64ilp32 stackframe layout
riscv: s64ilp32: Temporary workaround solution to gcc problem
arch/riscv/Kconfig | 36 +++-
arch/riscv/Makefile | 24 ++-
arch/riscv/configs/32-bit.config | 2 -
arch/riscv/configs/64ilp32.config | 2 +
arch/riscv/include/asm/asm.h | 5 +
arch/riscv/include/asm/atomic.h | 6 +
arch/riscv/include/asm/cmpxchg.h | 53 ++++++
arch/riscv/include/asm/cpu_ops_sbi.h | 4 +-
arch/riscv/include/asm/csr.h | 58 +++---
arch/riscv/include/asm/extable.h | 2 +-
arch/riscv/include/asm/page.h | 24 ++-
arch/riscv/include/asm/pgtable-64.h | 42 ++---
arch/riscv/include/asm/pgtable.h | 26 ++-
arch/riscv/include/asm/processor.h | 8 +-
arch/riscv/include/asm/ptrace.h | 96 +++++-----
arch/riscv/include/asm/sbi.h | 24 +--
arch/riscv/include/asm/stacktrace.h | 6 +
arch/riscv/include/asm/timex.h | 10 +-
arch/riscv/include/asm/vdso.h | 34 +++-
arch/riscv/include/asm/vdso/gettimeofday.h | 84 +++++++++
arch/riscv/include/uapi/asm/elf.h | 2 +-
arch/riscv/include/uapi/asm/unistd.h | 1 +
arch/riscv/kernel/Makefile | 3 +-
arch/riscv/kernel/compat_signal.c | 2 +-
arch/riscv/kernel/compat_vdso/.gitignore | 2 -
arch/riscv/kernel/compat_vdso/compat_vdso.S | 8 -
.../kernel/compat_vdso/compat_vdso.lds.S | 3 -
arch/riscv/kernel/compat_vdso/flush_icache.S | 3 -
arch/riscv/kernel/compat_vdso/getcpu.S | 3 -
arch/riscv/kernel/compat_vdso/note.S | 3 -
arch/riscv/kernel/compat_vdso/rt_sigreturn.S | 3 -
arch/riscv/kernel/cpu.c | 4 +-
arch/riscv/kernel/cpu_ops_sbi.c | 4 +-
arch/riscv/kernel/cpufeature.c | 4 +-
arch/riscv/kernel/entry.S | 24 +--
arch/riscv/kernel/head.S | 8 +-
arch/riscv/kernel/process.c | 8 +-
arch/riscv/kernel/sbi.c | 24 +--
arch/riscv/kernel/signal.c | 6 +-
arch/riscv/kernel/traps.c | 4 +-
arch/riscv/kernel/vdso.c | 4 +-
arch/riscv/kernel/vdso/Makefile | 176 ++++++++++++------
..._vdso_offsets.sh => gen_vdso32_offsets.sh} | 2 +-
.../gen_vdso64_offsets.sh} | 2 +-
arch/riscv/kernel/vdso/vgettimeofday.c | 39 +++-
arch/riscv/kernel/vdso32.S | 8 +
arch/riscv/kernel/{vdso/vdso.S => vdso64.S} | 8 +-
arch/riscv/kvm/Kconfig | 1 +
arch/riscv/lib/Makefile | 1 +
arch/riscv/lib/memset.S | 4 +-
arch/riscv/mm/context.c | 16 +-
arch/riscv/mm/fault.c | 13 +-
arch/riscv/mm/init.c | 29 ++-
arch/riscv/net/Makefile | 6 +-
arch/riscv/net/bpf_jit_comp64.c | 10 +-
drivers/clocksource/timer-riscv.c | 2 +-
drivers/irqchip/irq-riscv-intc.c | 4 +-
fs/namei.c | 2 +-
58 files changed, 675 insertions(+), 317 deletions(-)
create mode 100644 arch/riscv/configs/64ilp32.config
delete mode 100644 arch/riscv/kernel/compat_vdso/.gitignore
delete mode 100644 arch/riscv/kernel/compat_vdso/compat_vdso.S
delete mode 100644 arch/riscv/kernel/compat_vdso/compat_vdso.lds.S
delete mode 100644 arch/riscv/kernel/compat_vdso/flush_icache.S
delete mode 100644 arch/riscv/kernel/compat_vdso/getcpu.S
delete mode 100644 arch/riscv/kernel/compat_vdso/note.S
delete mode 100644 arch/riscv/kernel/compat_vdso/rt_sigreturn.S
rename arch/riscv/kernel/vdso/{gen_vdso_offsets.sh => gen_vdso32_offsets.sh} (78%)
rename arch/riscv/kernel/{compat_vdso/gen_compat_vdso_offsets.sh => vdso/gen_vdso64_offsets.sh} (77%)
create mode 100644 arch/riscv/kernel/vdso32.S
rename arch/riscv/kernel/{vdso/vdso.S => vdso64.S} (73%)
Comments
On Thu, 18 May 2023 06:09:51 PDT (-0700), guoren@kernel.org wrote: > From: Guo Ren <guoren@linux.alibaba.com> > > This patch series adds s64ilp32 support to riscv. The term s64ilp32 > means smode-xlen=64 and -mabi=ilp32 (ints, longs, and pointers are all > 32-bit), i.e., running 32-bit Linux kernel on pure 64-bit supervisor > mode. There have been many 64ilp32 abis existing, such as mips-n32 [1], > arm-aarch64ilp32 [2], and x86-x32 [3], but they are all about userspace. > Thus, this should be the first time running a 32-bit Linux kernel with > the 64ilp32 ABI at supervisor mode (If not, correct me). Does anyone actually want this? At a bare minimum we'd need to add it to the psABI, which would presumably also be required on the compiler side of things. It's not even clear anyone wants rv64/ilp32 in userspace, the kernel seems like it'd be even less widely used. > Why 32-bit Linux? > ================= > The motivation for using a 32-bit Linux kernel is to reduce memory > footprint and meet the small capacity of DDR & cache requirement > (e.g., 64/128MB SIP SoC). > > Here are the 32-bit v.s. 64-bit Linux kernel data type comparison > summary: > 32-bit 64-bit > sizeof(page): 32bytes 64bytes > sizeof(list_head): 8bytes 16bytes > sizeof(hlist_head): 8bytes 16bytes > sizeof(vm_area): 68bytes 136bytes > ... > > The size of ilp32's long & pointer is just half of lp64's (rv64 default > abi - longs and pointers are all 64-bit). This significant difference > in data type causes different memory & cache footprint costs. 
> Here is the comparison measurement between s32ilp32, s64ilp32, and
> s64lp64 in the same 128MB qemu system environment:
>
> Rootfs:
>  u32ilp32 - Using the same 32-bit userspace rootfs.ext2 (UXL=32) binary
>             from buildroot 2023.02-rc3, qemu_riscv32_virt_defconfig
>
> Linux:
>  s32ilp32 - Linux version 6.3.0-rc1 (124MB)
>             rv32_defconfig: $(Q)$(MAKE) -f $(srctree)/Makefile defconfig 32-bit.config
>
>  s64lp64  - Linux version 6.3.0-rc1 (126MB)
>             defconfig: $(Q)$(MAKE) -f $(srctree)/Makefile defconfig
>
>  s64ilp32 - Linux version 6.3.0-rc1 (126MB)
>             rv64ilp32_defconfig: $(Q)$(MAKE) -f $(srctree)/Makefile defconfig 64ilp32.config
>
> Opensbi:
>  m64lp64  - (2MB) OpenSBI v1.2-80-g4b28afc98bbe
>  m32ilp32 - (4MB) OpenSBI v1.2-80-g4b28afc98bbe
>
> +----------------------------------------+--------
> | u32ilp32                               |
> | UXL=32                                 | Rootfs
> +----------------------------------------+--------
> | +----------+ +---------+  | +---------+|
> | | s64ilp32 | | s64lp64 |  | | s32ilp32||
> | | SXL=64   | | SXL=64  |  | | SXL=32  || Linux
> | +----------+ +---------+  | +---------+|
> +----------------------------------------+--------
> | +----------------------+  | +---------+|
> | | m64lp64              |  | | m32ilp32||
> | | MXL=64               |  | | MXL=32  || Opensbi
> | +----------------------+  | +---------+|
> +----------------------------------------+--------
> | +----------------------+  | +---------+|
> | | qemu-rv64            |  | |qemu-rv32|| HW
> | +----------------------+  | +---------+|
> +----------------------------------------+--------
>
> Mem-usage:
> (s32ilp32) # free
>               total    used    free  shared  buff/cache  available
> Mem:         100040    8380   88244      44        3416      88080
>
> (s64lp64) # free
>               total    used    free  shared  buff/cache  available
> Mem:          91568   11848   75796      44        3924      75952
>
> (s64ilp32) # free
>               total    used    free  shared  buff/cache  available
> Mem:         101952    8528   90004      44        3420      89816
>              ^^^^^^
>
> It's a rough measurement based on the current default config without
> any modification, and the 32-bit kernels (s32ilp32, s64ilp32) saved
> more than 16% memory compared to 64-bit (s64lp64). But s32ilp32 &
> s64ilp32 have a similar memory footprint (about 0.33% difference),
> meaning s64ilp32 has a big chance to replace s32ilp32 on 64-bit
> machines.
>
> Why s64ilp32?
> =============
> The current RISC-V has the profiles RVA20S64, RVA22S64, and RVA23S64
> (ongoing) [4], but no RVA**S32 profile exists, nor is any planned. That
> means when a vendor wants to produce a 32-bit s-mode RISC-V Application
> Processor, they have no shape to follow. Therefore, many cheap riscv
> chips have come out but follow the RVA2xS64 profiles, such as Allwinner
> D1/D1s/F133 [5], SOPHGO CV1800B [6], Canaan Kendryte k230 [7], and
> Bouffalo Lab BL808 [8], which are typically cortex a7/a35/a53 product
> scenarios. The D1 & CV1800B & BL808 don't support UXL=32 (32-bit
> U-mode), so they would need a new u64ilp32 userspace ABI, which
> currently has no software ecosystem. Thus, the first landing of
> s64ilp32 would be on the Canaan Kendryte k230, whose c908 has rv64gcv
> and compat user mode (sstatus.uxl=32/64), and so can support the
> existing rv32 userspace software ecosystem.
>
> Another reason for inventing s64ilp32 is the performance benefit and
> the simplified 64-bit CPU hardware design (vs. s32ilp32).
>
> Why does s64ilp32 have better performance?
> ==========================================
> Generally speaking, to run a 32-bit Linux kernel on a 64-bit processor
> we would have to build a 32-bit hardware s-mode (such as Linux-arm32 on
> cortex-a53), or only use the old 32ilp32 ABI on a 64-bit machine (such
> as mips SYS_SUPPORTS_32BIT_KERNEL). Neither can reuse the
> performance-related features and instructions of the 64-bit hardware,
> such as the 64-bit ALU, AMO, and LD/SD, which causes significant
> performance gaps for many Linux features:
>
>  - memcpy/memset/strcmp (s64ilp32 has half the instruction count and
>    double the load/store bandwidth of s32ilp32.)
>
>  - ebpf JIT (ebpf is a 64-bit virtual ISA, which is not suitable for
>    mapping to s32ilp32.)
>
>  - Atomic64 (s64ilp32 has the same native instruction mapping as
>    s64lp64, but s32ilp32 can only use generic_atomic64, a tradeoff &
>    limited software solution.)
>
>  - 64-bit native arithmetic instructions for the "long long" type
>
>  - cmpxchg_double support for slub (This makes it the 2nd 32-bit Linux
>    port to support the feature; the 1st is i386.)
>
>  - ...
>
> Compared with the userspace ecosystem, the 32-bit Linux kernel is even
> more eager for 64ilp32's performance improvements, because the kernel
> can't utilize the float-point/vector features of the ISA.
>
> Let's look at performance from another perspective (s64ilp32 vs.
> s64lp64). As the first chapter said, ilp32's pointer size is half of
> lp64's, which reduces the size of the critical data structs (e.g.,
> page, list, ...). That means an ilp32 cache can hold double the data
> of an lp64 one with the same capacity, which is a natural advantage of
> 32-bit.
>
> Why does s64ilp32 simplify CPU design?
> ======================================
> Yes, there are a lot of examples in history of running 32-bit Linux on
> 64-bit hardware, such as arm cortex a35/a53/a55, which implement a
> 32-bit EL1/EL2/EL3 hardware mode to support 32-bit Linux. We could
> follow Arm's style, but riscv can choose a better way. Compared to
> UXL=32, MXL=SXL=32 involves a lot of CSR-related hardware
> functionality, which takes a lot of effort to mix into 64-bit
> hardware. The s64ilp32 works in MXL=SXL=64 mode, so CPU vendors
> needn't implement 32-bit machine and supervisor modes at all.
>
> How does s64ilp32 work?
> =======================
> The s64ilp32 is the same as the s64lp64 compat mode from the hardware
> view, i.e., MXL=SXL=64 + UXL=32. Because s64ilp32 uses CONFIG_32BIT of
> Linux, it only supports the u32ilp32 userspace ABI, the current
> standard rv32 software ecosystem, and it can't work with the u64lp64
> ABI (I don't want that complex and useless stuff). But it may work
> with u64ilp32 in the future; for now, s64ilp32 depends on the UXL=32
> feature of the hardware.
>
> The 64ilp32 gcc still uses sign-extending lw & auipc to generate
> address variables, because inserting zero-extend instructions to mask
> the highest 32 bits would cause significant code size and performance
> problems. Thus, we invented an OS approach to solve the problem:
>  - When satp=bare and the start physical address is < 2GB, there is no
>    sign-extended address problem.
>  - When satp=bare and the start physical address is > 2GB, we need
>    Zjpm-like hardware extensions to mask the high 32 bits.
>    (Fortunately, all existing SoCs (D1/D1s/F133, CV1800B, k230, BL808)
>    have start physical addresses < 2GB.)
>  - When satp=sv39, we invent a double mapping to make the sign-extended
>    virtual address the same as the zero-extended virtual address.
>
>  +--------+    +---------+     +--------+
>  |        | +--| 511:PUD1|     |        |
>  |        | |  +---------+     |        |
>  |        | |  | 510:PUD0|--+  |        |
>  |        | |  +---------+  |  |        |
>  |        | |  |         |  |  |        |
>  |        | |  | INVALID |  |  |        |
>  |        | |  |         |  |  |        |
>  |  ....  | |  |         |  |  |  ....  |
>  |        | |  +---------+  |  |        |
>  |        | +--| 3:PUD1  |  |  |        |
>  |        | |  +---------+  |  |        |
>  |        | |  | 2:PUD0  |--+  |        |
>  |        | |  +---------+  |  |        |
>  |        | |  |1:USR_PUD|  |  |        |
>  |        | |  +---------+  |  |        |
>  |        | |  |0:USR_PUD|  |  |        |
>  +--------+<-+ +---------+  +->+--------+
>    PUD1        ^    PGD          PUD0
>    1GB         |    4GB          1GB
>               |
>         +----------+
>         | Sv39 PGDP|
>         +----------+
>             SATP
>
> The size of xlen always equaled the pointer/long size before s64ilp32
> emerged. So we introduce a new data type - xlen_t - which deals with
> CSR-related and callee-save/restore operations.
>
> Some kernel features use 32BIT/64BIT to determine the exact ISA; e.g.,
> the ebpf JIT maps to the rv32 ISA when CONFIG_32BIT=y. But s64ilp32
> needs the ebpf JIT to map to the rv64 ISA when CONFIG_32BIT=y, so we
> need another config option to distinguish the difference.
>
> For more details, please review the patch series.
>
> How to run s64ilp32?
> ====================
>
> GNU toolchain
> -------------
> git clone https://github.com/Liaoshihua/riscv-gnu-toolchain.git
> cd riscv-gnu-toolchain
> ./configure --prefix="$PWD/opt-rv64-ilp32/" --with-arch=rv64imac --with-abi=ilp32
> make linux
> export PATH=$PATH:$PWD/opt-rv64-ilp32/bin/
>
> Opensbi
> -------
> git clone https://github.com/riscv-software-src/opensbi.git
> CROSS_COMPILE=riscv64-unknown-linux-gnu- make PLATFORM=generic
>
> Linux kernel
> ------------
> git clone https://github.com/guoren83/linux.git -b s64ilp32
> cd linux
> make ARCH=riscv CROSS_COMPILE=riscv64-unknown-linux-gnu- rv64ilp32_defconfig
> make ARCH=riscv CROSS_COMPILE=riscv64-unknown-linux-gnu- all
>
> Rootfs
> ------
> git clone git://git.busybox.net/buildroot
> cd buildroot
> make qemu_riscv32_virt_defconfig
> make
>
> Qemu
> ----
> git clone https://github.com/plctlab/plct-qemu.git -b plct-s64ilp32-dev
> cd plct-qemu
> mkdir build
> cd build
> ../configure --target-list="riscv64-softmmu riscv32-softmmu"
> make
>
> Run
> ---
> ./qemu-system-riscv64 -cpu rv64 -M virt -m 128m -nographic \
>   -bios fw_dynamic.bin -kernel Image \
>   -drive file=rootfs.ext2,format=raw,id=hd0 \
>   -device virtio-blk-device,drive=hd0 \
>   -append "rootwait root=/dev/vda ro console=ttyS0 earlycon=sbi" \
>   -netdev user,id=net0 -device virtio-net-device,netdev=net0
>
> OpenSBI v1.2-119-gdc1c7db05e07
>    ____                    _____ ____ _____
>   / __ \                  / ____|  _ \_   _|
>  | |  | |_ __   ___ _ __ | (___ | |_) || |
>  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
>  | |__| | |_) |  __/ | | |____) | |_) || |_
>   \____/| .__/ \___|_| |_|_____/|___/_____|
>         | |
>         |_|
>
> Platform Name             : riscv-virtio,qemu
> Platform Features         : medeleg
> Platform HART Count       : 1
> Platform IPI Device       : aclint-mswi
> Platform Timer Device     : aclint-mtimer @ 10000000Hz
> Platform Console Device   : uart8250
> Platform HSM Device       : ---
> Platform PMU Device       : ---
> Platform Reboot Device    : sifive_test
> Platform Shutdown Device  : sifive_test
> Platform Suspend Device   :
> ---
> Platform CPPC Device      : ---
> Firmware Base             : 0x60000000
> Firmware Size             : 360 KB
> Firmware RW Offset        : 0x40000
> Runtime SBI Version       : 1.0
>
> Domain0 Name              : root
> Domain0 Boot HART         : 0
> Domain0 HARTs             : 0*
> Domain0 Region00          : 0x0000000002000000-0x000000000200ffff M: (I,R,W) S/U: ()
> Domain0 Region01          : 0x0000000060040000-0x000000006005ffff M: (R,W) S/U: ()
> Domain0 Region02          : 0x0000000060000000-0x000000006003ffff M: (R,X) S/U: ()
> Domain0 Region03          : 0x0000000000000000-0xffffffffffffffff M: (R,W,X) S/U: (R,W,X)
> Domain0 Next Address      : 0x0000000060200000
> Domain0 Next Arg1         : 0x0000000067e00000
> Domain0 Next Mode         : S-mode
> Domain0 SysReset          : yes
> Domain0 SysSuspend        : yes
>
> Boot HART ID              : 0
> Boot HART Domain          : root
> Boot HART Priv Version    : v1.12
> Boot HART Base ISA        : rv64imafdch
> Boot HART ISA Extensions  : time,sstc
> Boot HART PMP Count       : 16
> Boot HART PMP Granularity : 4
> Boot HART PMP Address Bits: 54
> Boot HART MHPM Count      : 16
> Boot HART MIDELEG         : 0x0000000000001666
> Boot HART MEDELEG         : 0x0000000000f0b509
> [    0.000000] Linux version 6.3.0-rc1-00086-gc8d2fedb997a (guoren@fedora) (riscv64-unknown-linux-gnu-gcc (g5e578a16201f) 13.0.1 20230206 (experimental), GNU ld (GNU Binutils) 2.40.50.20230205) #1 SMP Sun May 14 10:46:42 EDT 2023
> [    0.000000] random: crng init done
> [    0.000000] OF: fdt: Ignoring memory range 0x60000000 - 0x60200000
> [    0.000000] Machine model: riscv-virtio,qemu
> [    0.000000] efi: UEFI not found.
> [    0.000000] OF: reserved mem: 0x60000000..0x6003ffff (256 KiB) map non-reusable mmode_resv1@60000000
> [    0.000000] OF: reserved mem: 0x60040000..0x6005ffff (128 KiB) map non-reusable mmode_resv0@60040000
> [    0.000000] Zone ranges:
> [    0.000000]   Normal   [mem 0x0000000060200000-0x0000000067ffffff]
> [    0.000000] Movable zone start for each node
> [    0.000000] Early memory node ranges
> [    0.000000]   node   0: [mem 0x0000000060200000-0x0000000067ffffff]
> [    0.000000] Initmem setup node 0 [mem 0x0000000060200000-0x0000000067ffffff]
> [    0.000000] On node 0, zone Normal: 512 pages in unavailable ranges
> [    0.000000] SBI specification v1.0 detected
> [    0.000000] SBI implementation ID=0x1 Version=0x10002
> [    0.000000] SBI TIME extension detected
> [    0.000000] SBI IPI extension detected
> [    0.000000] SBI RFENCE extension detected
> [    0.000000] SBI SRST extension detected
> [    0.000000] SBI HSM extension detected
> [    0.000000] riscv: base ISA extensions acdfhim
> [    0.000000] riscv: ELF capabilities acdfim
> [    0.000000] percpu: Embedded 13 pages/cpu s24352 r8192 d20704 u53248
> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 31941
> [    0.000000] Kernel command line: rootwait root=/dev/vda ro console=ttyS0 earlycon=sbi norandmaps
> [    0.000000] Dentry cache hash table entries: 16384 (order: 4, 65536 bytes, linear)
> [    0.000000] Inode-cache hash table entries: 8192 (order: 3, 32768 bytes, linear)
> [    0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
> [    0.000000] Virtual kernel memory layout:
> [    0.000000]       fixmap : 0x9ce00000 - 0x9d000000   (2048 kB)
> [    0.000000]       pci io : 0x9d000000 - 0x9e000000   (  16 MB)
> [    0.000000]      vmemmap : 0x9e000000 - 0xa0000000   (  32 MB)
> [    0.000000]      vmalloc : 0xa0000000 - 0xc0000000   ( 512 MB)
> [    0.000000]       lowmem : 0xc0000000 - 0xc7e00000   ( 126 MB)
> [    0.000000] Memory: 97748K/129024K available (8699K kernel code, 8867K rwdata, 4096K rodata, 4204K init, 361K bss, 31276K reserved, 0K cma-reserved)
> ...
> Starting network: udhcpc: started, v1.36.0
> udhcpc: broadcasting discover
> udhcpc: broadcasting select for 10.0.2.15, server 10.0.2.2
> udhcpc: lease of 10.0.2.15 obtained from 10.0.2.2, lease time 86400
> deleting routers
> adding dns 10.0.2.3
> OK
>
> Welcome to Buildroot
> buildroot login: root
> # cat /proc/cpuinfo
> processor       : 0
> hart            : 0
> isa             : rv64imafdch_zihintpause_zbb_sstc
> mmu             : sv39
> mvendorid       : 0x0
> marchid         : 0x70232
> mimpid          : 0x70232
>
> # uname -a
> Linux buildroot 6.3.0-rc1-00086-gc8d2fedb997a #1 SMP Sun May 14 10:46:42 EDT 2023 riscv32 GNU/Linux
> # ls /lib/
> ld-linux-riscv32-ilp32d.so.1  libgcc_s.so.1
> libanl.so.1                   libm.so.6
> libatomic.so                  libnss_dns.so.2
> libatomic.so.1                libnss_files.so.2
> libatomic.so.1.2.0            libpthread.so.0
> libc.so.6                     libresolv.so.2
> libcrypt.so.1                 librt.so.1
> libdl.so.2                    libutil.so.1
> libgcc_s.so                   modules
>
> # cat /proc/99/maps
> 0000000055554000-0000000055634000 r-xp 00000000 fe:00 17   /bin/busybox
> 0000000055634000-0000000055636000 r--p 000df000 fe:00 17   /bin/busybox
> 0000000055636000-0000000055637000 rw-p 000e1000 fe:00 17   /bin/busybox
> 0000000055637000-0000000055659000 rw-p 00000000 00:00 0    [heap]
> 0000000077e8d000-0000000077fbe000 r-xp 00000000 fe:00 137  /lib/libc.so.6
> 0000000077fbe000-0000000077fbf000 ---p 00131000 fe:00 137  /lib/libc.so.6
> 0000000077fbf000-0000000077fc1000 r--p 00131000 fe:00 137  /lib/libc.so.6
> 0000000077fc1000-0000000077fc2000 rw-p 00133000 fe:00 137  /lib/libc.so.6
> 0000000077fc2000-0000000077fcc000 rw-p 00000000 00:00 0
> 0000000077fcc000-0000000077fd4000 r-xp 00000000 fe:00 146  /lib/libresolv.so.2
> 0000000077fd4000-0000000077fd5000 ---p 00008000 fe:00 146  /lib/libresolv.so.2
> 0000000077fd5000-0000000077fd6000 r--p 00008000 fe:00 146  /lib/libresolv.so.2
> 0000000077fd6000-0000000077fd7000 rw-p 00009000 fe:00 146  /lib/libresolv.so.2
> 0000000077fd7000-0000000077fd9000 rw-p 00000000 00:00 0
> 0000000077fd9000-0000000077fdb000 r--p 00000000 00:00 0    [vvar]
> 0000000077fdb000-0000000077fdd000 r-xp 00000000 00:00 0    [vdso]
> 0000000077fdd000-0000000077ffc000 r-xp 00000000 fe:00 132  /lib/ld-linux-riscv32-ilp32d.so.1
> 0000000077ffd000-0000000077ffe000 r--p 0001f000 fe:00 132  /lib/ld-linux-riscv32-ilp32d.so.1
> 0000000077ffe000-0000000077fff000 rw-p 00020000 fe:00 132  /lib/ld-linux-riscv32-ilp32d.so.1
> 000000007ffde000-000000007ffff000 rw-p 00000000 00:00 0    [stack]
>
> Other resources
> ===============
>
> OpenEuler riscv32 rootfs
> ------------------------
> You can download the OpenEuler riscv32 rootfs from here:
> https://repo.tarsier-infra.com/openEuler-RISC-V/obs/archive/rv32/openeuler-image-qemu-riscv32-20221111070036.rootfs.ext4
> (Made by Junqiang Wang)
>
> Debian riscv32 rootfs
> ---------------------
> You can download the Debian riscv32 rootfs from here:
> https://github.com/yuzibo/riscv32
> (Made by Bo YU and Han Gao)
>
> Fedora riscv32 rootfs
> ---------------------
> https://fedoraproject.org/wiki/Architectures/RISC-V/RV32
> (Made by Wei Fu)
>
> LLVM 64ilp32
> ------------
> git clone https://github.com/luxufan/llvm-project.git -b rv64-ilp32
> cd llvm-project
> mkdir build && cd build
> cmake ../llvm -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD="X86;RISCV" -DLLVM_ENABLE_PROJECTS="clang;lld"
> ninja all
>
> (The LLVM development status is that CC=clang can compile the kernel
> with LLVM=1 but has not yet booted successfully.)
>
> Patch organization
> ==================
> This series depends on 64ilp32 toolchain patches that are not upstream
> yet.
>
> PATCH [0-1]    unify vdso32 & compat_vdso
> PATCH [2]      adds time-related vDSO common flow for vdso32
> PATCH [3]      adds s64ilp32 support to the clocksource driver
> PATCH [5]      adds s64ilp32 support to the irqchip driver
> PATCH [4,6-12] add basic data types and the compiling framework
> PATCH [13]     adds MMU_SV39 support
> PATCH [14]     adds native atomic64
> PATCH [15]     adds TImode
> PATCH [16]     adds cmpxchg_double
> PATCH [17-19]  clean up kconfig & add a defconfig
> PATCH [20-21]  fix temporary compiler problems
>
> Open issues
> ===========
>
> Callee-saved register width
> ---------------------------
> For a 64-bit ISA (including 64lp64 and 64ilp32), the callee can't
> determine the exact width in use in a register, so it saves the
> maximum width of the ISA register, i.e., xlen size. We also found this
> rule in x86-x32, mips-n32, and aarch64ilp32, which inherit it from
> 64lp64. See PATCH [20].
>
> Here are two downsides of this:
>  - It causes a difference from 32ilp32's stack frame, and s64ilp32
>    reuses the 32ilp32 software stack. Thus, many additional
>    compatibility problems would appear during the porting of 64ilp32
>    software.
>  - It also increases the stack usage budget.
>
> <setup_vm>:
>    auipc   a3,0xff3fb
>    add     a3,a3,1234 # c0000000
>    li      a5,-1
>    lui     a4,0xc0000
>    addw    sp,sp,-96
>    srl     a5,a5,0x20
>    subw    a4,a4,a3
>    auipc   a2,0x111a
>    add     a2,a2,1212 # c1d1f000
>    sd      s0,80(sp)----+
>    sd      s1,72(sp)    |
>    sd      s2,64(sp)    |
>    sd      s7,24(sp)    |
>    sd      s8,16(sp)    |
>    sd      s9,8(sp)     |-> All <= 32b widths, but occupy 64b
>    sd      ra,88(sp)    |   stack space.
>    sd      s3,56(sp)    |   Affects memory footprint & cache
>    sd      s4,48(sp)    |   performance.
>    sd      s5,40(sp)    |
>    sd      s6,32(sp)    |
>    sd      s10,0(sp)----+
>    sll     a1,a4,0x20
>    subw    a2,a2,a3
>    and     a4,a4,a5
>
> So here is a proposal for the riscv 64ilp32 ABI:
>  - Let the compiler prevent the callee from saving ">32b variables" in
>    callee-saved registers. (Q: We need to measure the influence of 64b
>    variables living across function calls.)
>
> EF_RISCV_X32
> ------------
> We add an e_flag (EF_RISCV_X32) to distinguish the 32-bit ELF, which
> occupies BIT[5] (0x20) of the e_flags layout.
>
> ELF Header:
>   Magic:   7f 45 4c 46 01 01 01 00 00 00 00 00 00 00 00 00
>   Class:                             ELF32
>   Data:                              2's complement, little endian
>   Version:                           1 (current)
>   OS/ABI:                            UNIX - System V
>   ABI Version:                       0
>   Type:                              REL (Relocatable file)
>   Machine:                           RISC-V
>   Version:                           0x1
>   Entry point address:               0x0
>   Start of program headers:          0 (bytes into file)
>   Start of section headers:          24620 (bytes into file)
>   Flags:                             0x21, RVC, X32, soft-float ABI
>                                                 ^^^
>
> 64-bit Optimization problem
> ---------------------------
> There is an existing problem in the 64ilp32 gcc: it combines two
> pointers into one register. Liao is solving that problem. Before he
> finishes the job, we can fortunately prevent it with a simple noinline
> attribute.
>
> struct path {
>         struct vfsmount *mnt;
>         struct dentry *dentry;
> } __randomize_layout;
>
> struct nameidata {
>         struct path     path;
>         ...
>         struct path     root;
>         ...
> } __randomize_layout;
>
> struct nameidata *nd
> ...
>         nd->path = nd->root;
>             6c88      ld   a0,24(s1)   // a0 contains two pointers
>             e088      sd   a0,0(s1)
>         mntget(path->mnt);
>             // Need "lw a0,0(s1)" or "a0 << 32; a0 >> 32"
>             2a6150ef  jal  c01ce946 <mntget>   // bug!
>
> Acknowledgements
> ================
> The s64ilp32 needs many other projects' cooperation.
> Thanks to everyone involved:
>  - GNU:            LiaoShihua <shihua@iscas.ac.cn>,
>                    Jiawei Chen <jiawei@iscas.ac.cn>
>  - Qemu:           Weiwei Li <liweiwei@iscas.ac.cn>
>  - LLVM:           luxufan <luxufan@iscas.ac.cn>,
>                    Chunyu Liao <chunyu@iscas.ac.cn>
>  - OpenEuler rv32: Junqiang Wang <wangjunqiang@iscas.ac.cn>
>  - Debian rv32:    Bo YU <tsu.yubo@gmail.com>,
>                    Han Gao <gaohan@iscas.ac.cn>
>  - Fedora rv32:    Wei Fu <wefu@redhat.com>
>
> References
> ==========
> [1] https://techpubs.jurassic.nl/manuals/0630/developer/Mpro_n32_ABI/sgi_html/index.html
> [2] https://wiki.debian.org/Arm64ilp32Port
> [3] https://lwn.net/Articles/456731/
> [4] https://github.com/riscv/riscv-profiles/releases
> [5] https://www.cnx-software.com/2021/10/25/allwinner-d1s-f133-risc-v-processor-64mb-ddr2/
> [6] https://milkv.io/duo/
> [7] https://twitter.com/tphuang/status/1631308330256801793
> [8] https://www.cnx-software.com/2022/12/02/pine64-ox64-sbc-bl808-risc-v-multi-protocol-wisoc-64mb-ram/
>
> Guo Ren (22):
>   riscv: vdso: Unify vdso32 & compat_vdso into vdso/Makefile
>   riscv: vdso: Remove compat_vdso/
>   riscv: vdso: Add time-related vDSO common flow for vdso32
>   clocksource: riscv: s64ilp32: Use __riscv_xlen instead of CONFIG_32BIT
>   riscv: s64ilp32: Introduce xlen_t
>   irqchip: riscv: s64ilp32: Use __riscv_xlen instead of CONFIG_32BIT
>   riscv: s64ilp32: Add sbi support
>   riscv: s64ilp32: Add asid support
>   riscv: s64ilp32: Introduce PTR_L and PTR_S
>   riscv: s64ilp32: Enable user space runtime environment
>   riscv: s64ilp32: Add ebpf jit support
>   riscv: s64ilp32: Add ELF32 support
>   riscv: s64ilp32: Add ARCH RV64 ILP32 compiling framework
>   riscv: s64ilp32: Add MMU_SV39 mode support for 32BIT
>   riscv: s64ilp32: Enable native atomic64
>   riscv: s64ilp32: Add TImode (128 int) support
>   riscv: s64ilp32: Implement cmpxchg_double
>   riscv: s64ilp32: Disable KVM
>   riscv: Cleanup rv32_defconfig
>   riscv: s64ilp32: Add rv64ilp32_defconfig
>   riscv: s64ilp32: Correct the rv64ilp32 stackframe layout
>   riscv: s64ilp32: Temporary workaround
>     solution to gcc problem
>
>  arch/riscv/Kconfig                            |  36 +++-
>  arch/riscv/Makefile                           |  24 ++-
>  arch/riscv/configs/32-bit.config              |   2 -
>  arch/riscv/configs/64ilp32.config             |   2 +
>  arch/riscv/include/asm/asm.h                  |   5 +
>  arch/riscv/include/asm/atomic.h               |   6 +
>  arch/riscv/include/asm/cmpxchg.h              |  53 ++++++
>  arch/riscv/include/asm/cpu_ops_sbi.h          |   4 +-
>  arch/riscv/include/asm/csr.h                  |  58 +++---
>  arch/riscv/include/asm/extable.h              |   2 +-
>  arch/riscv/include/asm/page.h                 |  24 ++-
>  arch/riscv/include/asm/pgtable-64.h           |  42 ++---
>  arch/riscv/include/asm/pgtable.h              |  26 ++-
>  arch/riscv/include/asm/processor.h            |   8 +-
>  arch/riscv/include/asm/ptrace.h               |  96 +++++-----
>  arch/riscv/include/asm/sbi.h                  |  24 +--
>  arch/riscv/include/asm/stacktrace.h           |   6 +
>  arch/riscv/include/asm/timex.h                |  10 +-
>  arch/riscv/include/asm/vdso.h                 |  34 +++-
>  arch/riscv/include/asm/vdso/gettimeofday.h    |  84 +++++++++
>  arch/riscv/include/uapi/asm/elf.h             |   2 +-
>  arch/riscv/include/uapi/asm/unistd.h          |   1 +
>  arch/riscv/kernel/Makefile                    |   3 +-
>  arch/riscv/kernel/compat_signal.c             |   2 +-
>  arch/riscv/kernel/compat_vdso/.gitignore      |   2 -
>  arch/riscv/kernel/compat_vdso/compat_vdso.S   |   8 -
>  .../kernel/compat_vdso/compat_vdso.lds.S      |   3 -
>  arch/riscv/kernel/compat_vdso/flush_icache.S  |   3 -
>  arch/riscv/kernel/compat_vdso/getcpu.S        |   3 -
>  arch/riscv/kernel/compat_vdso/note.S          |   3 -
>  arch/riscv/kernel/compat_vdso/rt_sigreturn.S  |   3 -
>  arch/riscv/kernel/cpu.c                       |   4 +-
>  arch/riscv/kernel/cpu_ops_sbi.c               |   4 +-
>  arch/riscv/kernel/cpufeature.c                |   4 +-
>  arch/riscv/kernel/entry.S                     |  24 +--
>  arch/riscv/kernel/head.S                      |   8 +-
>  arch/riscv/kernel/process.c                   |   8 +-
>  arch/riscv/kernel/sbi.c                       |  24 +--
>  arch/riscv/kernel/signal.c                    |   6 +-
>  arch/riscv/kernel/traps.c                     |   4 +-
>  arch/riscv/kernel/vdso.c                      |   4 +-
>  arch/riscv/kernel/vdso/Makefile               | 176 ++++++++++++------
>  ..._vdso_offsets.sh => gen_vdso32_offsets.sh} |   2 +-
>  .../gen_vdso64_offsets.sh}                    |   2 +-
>  arch/riscv/kernel/vdso/vgettimeofday.c        |  39 +++-
>  arch/riscv/kernel/vdso32.S                    |   8 +
>  arch/riscv/kernel/{vdso/vdso.S => vdso64.S}   |   8 +-
>  arch/riscv/kvm/Kconfig                        |   1 +
>  arch/riscv/lib/Makefile                       |   1 +
>  arch/riscv/lib/memset.S                       |   4 +-
>  arch/riscv/mm/context.c                       |  16 +-
>  arch/riscv/mm/fault.c                         |  13 +-
>  arch/riscv/mm/init.c                          |  29 ++-
>  arch/riscv/net/Makefile                       |   6 +-
>  arch/riscv/net/bpf_jit_comp64.c               |  10 +-
>  drivers/clocksource/timer-riscv.c             |   2 +-
>  drivers/irqchip/irq-riscv-intc.c              |   4 +-
>  fs/namei.c                                    |   2 +-
>  58 files changed, 675 insertions(+), 317 deletions(-)
>  create mode 100644 arch/riscv/configs/64ilp32.config
>  delete mode 100644 arch/riscv/kernel/compat_vdso/.gitignore
>  delete mode 100644 arch/riscv/kernel/compat_vdso/compat_vdso.S
>  delete mode 100644 arch/riscv/kernel/compat_vdso/compat_vdso.lds.S
>  delete mode 100644 arch/riscv/kernel/compat_vdso/flush_icache.S
>  delete mode 100644 arch/riscv/kernel/compat_vdso/getcpu.S
>  delete mode 100644 arch/riscv/kernel/compat_vdso/note.S
>  delete mode 100644 arch/riscv/kernel/compat_vdso/rt_sigreturn.S
>  rename arch/riscv/kernel/vdso/{gen_vdso_offsets.sh => gen_vdso32_offsets.sh} (78%)
>  rename arch/riscv/kernel/{compat_vdso/gen_compat_vdso_offsets.sh => vdso/gen_vdso64_offsets.sh} (77%)
>  create mode 100644 arch/riscv/kernel/vdso32.S
>  rename arch/riscv/kernel/{vdso/vdso.S => vdso64.S} (73%)
On Thu, May 18, 2023, at 17:38, Palmer Dabbelt wrote:
> On Thu, 18 May 2023 06:09:51 PDT (-0700), guoren@kernel.org wrote:
>> From: Guo Ren <guoren@linux.alibaba.com>
>>
>> This patch series adds s64ilp32 support to riscv. The term s64ilp32
>> means smode-xlen=64 and -mabi=ilp32 (ints, longs, and pointers are all
>> 32-bit), i.e., running 32-bit Linux kernel on pure 64-bit supervisor
>> mode. There have been many 64ilp32 abis existing, such as mips-n32 [1],
>> arm-aarch64ilp32 [2], and x86-x32 [3], but they are all about userspace.
>> Thus, this should be the first time running a 32-bit Linux kernel with
>> the 64ilp32 ABI at supervisor mode (If not, correct me).
>
> Does anyone actually want this?  At a bare minimum we'd need to add it
> to the psABI, which would presumably also be required on the compiler
> side of things.
>
> It's not even clear anyone wants rv64/ilp32 in userspace, the kernel
> seems like it'd be even less widely used.

We have had long discussions about supporting ilp32 userspace on
arm64, and I think almost everyone is glad we never merged it into
the mainline kernel, so we don't have to worry about supporting it
in the future. The cost of supporting an extra user space ABI
is huge, and I'm sure you don't want to go there. The other two
cited examples (mips-n32 and x86-x32) are pretty much unused now
as well, but still have a maintenance burden until they can finally
get removed.

If for some crazy reason you'd still want the 64ilp32 ABI in user
space, running the kernel this way is probably still a bad idea,
but that one is less clear. There is clearly a small memory
penalty of running a 64-bit kernel for larger data structures
(page, inode, task_struct, ...)
and vmlinux, and there is no huge additional maintenance cost on
top of the ABI itself that you'd need either way, but using a
64-bit address space in the kernel has some important advantages
even when running 32-bit userland: processes can use the entire
4GB virtual space, while the kernel can address more than 768MB
of lowmem, and KASLR has more bits to work with for randomization.
On RISCV, some additional features (VMAP_STACK, KASAN, KFENCE, ...)
depend on 64-bit kernels even though they don't strictly need that.

      Arnd
On Thu, 18 May 2023, Palmer Dabbelt wrote:
> On Thu, 18 May 2023 06:09:51 PDT (-0700), guoren@kernel.org wrote:
> > This patch series adds s64ilp32 support to riscv. The term s64ilp32
> > means smode-xlen=64 and -mabi=ilp32 (ints, longs, and pointers are all
> > 32-bit), i.e., running 32-bit Linux kernel on pure 64-bit supervisor
> > mode. There have been many 64ilp32 abis existing, such as mips-n32 [1],
> > arm-aarch64ilp32 [2], and x86-x32 [3], but they are all about userspace.
> > Thus, this should be the first time running a 32-bit Linux kernel with
> > the 64ilp32 ABI at supervisor mode (If not, correct me).
>
> Does anyone actually want this?  At a bare minimum we'd need to add it
> to the psABI, which would presumably also be required on the compiler
> side of things.
>
> It's not even clear anyone wants rv64/ilp32 in userspace, the kernel
> seems like it'd be even less widely used.

We've certainly talked to folks who are interested in RV64 ILP32
userspace with an LP64 kernel.  The motivation is the usual one: to
reduce data size and therefore (ideally) BOM cost.  I think this work,
if it goes forward, would need to go hand in hand with the RVIA psABI
group.

The RV64 ILP32 kernel and ILP32 userspace approach implemented by this
patch is intriguing, but I guess for me, the question is whether it's
worth the extra hassle vs. a pure RV32 kernel & userspace.

- Paul
On Thu, 18 May 2023, Arnd Bergmann wrote:
> We have had long discussions about supporting ilp32 userspace on
> arm64, and I think almost everyone is glad we never merged it into
> the mainline kernel, so we don't have to worry about supporting it
> in the future. The cost of supporting an extra user space ABI
> is huge, and I'm sure you don't want to go there. The other two
> cited examples (mips-n32 and x86-x32) are pretty much unused now
> as well, but still have a maintenance burden until they can finally
> get removed.

There probably hasn't been much pressure to support Aarch64 ILP32 since
ARM still has hardware support for Aarch32.  Will be interesting to see
if that's still the case after ARM drops Aarch32 support for future
designs.

- Paul
On Fri, May 19, 2023, at 02:38, Paul Walmsley wrote:
> On Thu, 18 May 2023, Arnd Bergmann wrote:
>
>> We have had long discussions about supporting ilp32 userspace on
>> arm64, and I think almost everyone is glad we never merged it into
>> the mainline kernel, so we don't have to worry about supporting it
>> in the future. The cost of supporting an extra user space ABI
>> is huge, and I'm sure you don't want to go there. The other two
>> cited examples (mips-n32 and x86-x32) are pretty much unused now
>> as well, but still have a maintenance burden until they can finally
>> get removed.
>
> There probably hasn't been much pressure to support Aarch64 ILP32 since
> ARM still has hardware support for Aarch32. Will be interesting to see
> if that's still the case after ARM drops Aarch32 support for future
> designs.

I think there was some pressure for 64ilp32 from Arm when aarch64
support was originally added, as they always planned to drop aarch32
support eventually, but I don't see that coming back now.

I think the situation is quite different as well: On aarch64, there is
a significant cost in supporting aarch32 userspace because of the
complexity of that particular instruction set, but at the same time
there is also a huge amount of software that is compiled for or written
to support aarch32 software, and nobody wants to replace that.

There are also a lot of existing arm32 chips with guaranteed
availability well into the 2030s, new 32-bit-only chips based on
Cortex-A7 (originally released in 2011) coming out constantly, and even
the latest low-end core (Cortex-A510 r1) supports aarch32. It's
probably going to be several years before that core even shows up in
low-memory systems, and then decades before this stops being available
in SoCs, even in the unlikely case that no future low-end cores support
aarch32-el0 mode (it's already been announced that there are no plans
for future high-end cores with aarch32 mode, but those won't be used in
low-memory configurations anyway).
For RISC-V, I have not seen much interest in Linux userspace for the
existing rv32 mode, so you could argue that there is not much to lose
in abandoning it. On the other hand, the cost of adding rv32 support to
an rv64 core should be very small as all the instructions are already
present in some other encoding, and developers have already spent a
significant amount of work on bringing up rv32 userspace that would all
have to be done again for a new ABI, and you'd end up splitting the
already tiny developer base for 32-bit riscv in two for the existing
rv32 side and a new rv64ilp32 side.

I suppose the answer in both cases is the same though: if a SoC maker
wants to sell a product to users with low memory, they should pick a
CPU core that implements standard 32-bit user space support rather than
making a mess of it and expecting software to work around it.

      Arnd
On Fri, May 19, 2023 at 2:29 AM Arnd Bergmann <arnd@arndb.de> wrote: > > On Thu, May 18, 2023, at 17:38, Palmer Dabbelt wrote: > > On Thu, 18 May 2023 06:09:51 PDT (-0700), guoren@kernel.org wrote: > >> From: Guo Ren <guoren@linux.alibaba.com> > >> > >> This patch series adds s64ilp32 support to riscv. The term s64ilp32 > >> means smode-xlen=64 and -mabi=ilp32 (ints, longs, and pointers are all > >> 32-bit), i.e., running 32-bit Linux kernel on pure 64-bit supervisor > >> mode. There have been many 64ilp32 abis existing, such as mips-n32 [1], > >> arm-aarch64ilp32 [2], and x86-x32 [3], but they are all about userspace. > >> Thus, this should be the first time running a 32-bit Linux kernel with > >> the 64ilp32 ABI at supervisor mode (If not, correct me). > > > > Does anyone actually want this? At a bare minimum we'd need to add it > > to the psABI, which would presumably also be required on the compiler > > side of things. > > > > It's not even clear anyone wants rv64/ilp32 in userspace, the kernel > > seems like it'd be even less widely used. > > We have had long discussions about supporting ilp32 userspace on > arm64, and I think almost everyone is glad we never merged it into > the mainline kernel, so we don't have to worry about supporting it > in the future. The cost of supporting an extra user space ABI > is huge, and I'm sure you don't want to go there. The other two > cited examples (mips-n32 and x86-x32) are pretty much unused now > as well, but still have a maintenance burden until they can finally > get removed. > > If for some crazy reason you'd still want the 64ilp32 ABI in user > space, running the kernel this way is probably still a bad idea, > but that one is less clear. There is clearly a small memory > penalty of running a 64-bit kernel for larger data structures > (page, inode, task_struct, ...) and vmlinux, and there is no I don't think it's a small memory penalty, our measurement is about 16% with defconfig, see "Why 32-bit Linux?" section. 
This patch series doesn't add a 64ilp32 userspace ABI, but it seems you
also don't like running a 32-bit Linux kernel on 64-bit hardware, right?

The motivation of s64ilp32 (running a 32-bit Linux kernel on 64-bit s-mode):
 - The target hardware (Canaan Kendryte k230) only supports MXL=64,
   SXL=64, UXL=64/32.
 - 64-bit Linux + compat 32-bit apps can't satisfy the 64/128MB scenarios.

> huge additional maintenance cost on top of the ABI itself
> that you'd need either way, but using a 64-bit address space
> in the kernel has some important advantages even when running
> 32-bit userland: processes can use the entire 4GB virtual
> space, while the kernel can address more than 768MB of lowmem,
> and KASLR has more bits to work with for randomization. On
> RISCV, some additional features (VMAP_STACK, KASAN, KFENCE,
> ...) depend on 64-bit kernels even though they don't
> strictly need that.

I agree that the 64-bit Linux kernel has more functionality, but:
 - What do you think about Linux on a 64/128MB SoC? Can it afford
   VMAP_STACK, KASAN, KFENCE?
 - I think 32-bit Linux & RTOS have monopolized this market (64/128MB
   scenarios), right?

>
> Arnd
On Fri, May 19, 2023, at 17:31, Guo Ren wrote: > On Fri, May 19, 2023 at 2:29 AM Arnd Bergmann <arnd@arndb.de> wrote: >> On Thu, May 18, 2023, at 17:38, Palmer Dabbelt wrote: >> > On Thu, 18 May 2023 06:09:51 PDT (-0700), guoren@kernel.org wrote: >> >> If for some crazy reason you'd still want the 64ilp32 ABI in user >> space, running the kernel this way is probably still a bad idea, >> but that one is less clear. There is clearly a small memory >> penalty of running a 64-bit kernel for larger data structures >> (page, inode, task_struct, ...) and vmlinux, and there is no > I don't think it's a small memory penalty, our measurement is about > 16% with defconfig, see "Why 32-bit Linux?" section. > > This patch series doesn't add 64ilp32 userspace abi, but it seems you > also don't like to run 32-bit Linux kernel on 64-bit hardware, right? Ok, I'm sorry for missing the important bit here. So if this can still use the normal 32-bit user space, the cost of this patch set is not huge, and it's something that can be beneficial in a few cases, though I suspect most users are still better off running 64-bit kernels. > The motivation of s64ilp32 (running 32-bit Linux kernel on 64-bit s-mode): > - The target hardware (Canaan Kendryte k230) only supports MXL=64, > SXL=64, UXL=64/32. > - The 64-bit Linux + compat 32-bit app can't satisfy the 64/128MB scenarios. > >> huge additional maintenance cost on top of the ABI itself >> that you'd need either way, but using a 64-bit address space >> in the kernel has some important advantages even when running >> 32-bit userland: processes can use the entire 4GB virtual >> space, while the kernel can address more than 768MB of lowmem, >> and KASLR has more bits to work with for randomization. On >> RISCV, some additional features (VMAP_STACK, KASAN, KFENCE, >> ...) depend on 64-bit kernels even though they don't >> strictly need that. 
> > I agree that the 64-bit linux kernel has more functionalities, but: > - What do you think about linux on a 64/128MB SoC? Could it be > affordable to VMAP_STACK, KASAN, KFENCE? I would definitely recommend VMAP_STACK, but that can be implemented and is used on other 32-bit architectures (ppc32, arm32) without a huge cost. The larger virtual user address space can help even on machines with 128MB, though most applications probably don't care at that point. > - I think 32-bit Linux & RTOS have monopolized this market (64/128MB > scenarios), right? The minimum amount of RAM that makes a system usable for Linux is constantly going up, so I think with 64MB, most new projects are already better off running some RTOS kernel instead of Linux. The ones that are still usable today probably won't last a lot of distro upgrades before the bloat catches up with them, but I can see how your patch set can give them a few extra years of updates. For the 256MB+ systems, I would expect the sensitive kernel allocations to be small enough that the series makes little difference. The 128MB systems are the most interesting ones here, and I'm curious to see where you spot most of the memory usage differences, I'll also reply to your initial mail for that. Arnd
On Fri, 19 May 2023 09:53:35 PDT (-0700), Arnd Bergmann wrote: > On Fri, May 19, 2023, at 17:31, Guo Ren wrote: >> On Fri, May 19, 2023 at 2:29 AM Arnd Bergmann <arnd@arndb.de> wrote: >>> On Thu, May 18, 2023, at 17:38, Palmer Dabbelt wrote: >>> > On Thu, 18 May 2023 06:09:51 PDT (-0700), guoren@kernel.org wrote: >>> >>> If for some crazy reason you'd still want the 64ilp32 ABI in user >>> space, running the kernel this way is probably still a bad idea, >>> but that one is less clear. There is clearly a small memory >>> penalty of running a 64-bit kernel for larger data structures >>> (page, inode, task_struct, ...) and vmlinux, and there is no >> I don't think it's a small memory penalty, our measurement is about >> 16% with defconfig, see "Why 32-bit Linux?" section. >> >> This patch series doesn't add 64ilp32 userspace abi, but it seems you >> also don't like to run 32-bit Linux kernel on 64-bit hardware, right? > > Ok, I'm sorry for missing the important bit here. So if this can > still use the normal 32-bit user space, the cost of this patch set > is not huge, and it's something that can be beneficial in a few > cases, though I suspect most users are still better off running > 64-bit kernels. Running a normal 32-bit userspace would require HW support for the 32-bit mode switch for userspace, though (rv32 isn't a subset of rv64, so there's nothing we can do to make those binaries function correctly with uABI). The userspace-only mode switch is a bit simpler than the user+supervisor switch, but it seems like vendors who really want the memory savings would just implement both mode switches. >> The motivation of s64ilp32 (running 32-bit Linux kernel on 64-bit s-mode): >> - The target hardware (Canaan Kendryte k230) only supports MXL=64, >> SXL=64, UXL=64/32. >> - The 64-bit Linux + compat 32-bit app can't satisfy the 64/128MB scenarios. 
>> >>> huge additional maintenance cost on top of the ABI itself >>> that you'd need either way, but using a 64-bit address space >>> in the kernel has some important advantages even when running >>> 32-bit userland: processes can use the entire 4GB virtual >>> space, while the kernel can address more than 768MB of lowmem, >>> and KASLR has more bits to work with for randomization. On >>> RISCV, some additional features (VMAP_STACK, KASAN, KFENCE, >>> ...) depend on 64-bit kernels even though they don't >>> strictly need that. >> >> I agree that the 64-bit linux kernel has more functionalities, but: >> - What do you think about linux on a 64/128MB SoC? Could it be >> affordable to VMAP_STACK, KASAN, KFENCE? > > I would definitely recommend VMAP_STACK, but that can be implemented > and is used on other 32-bit architectures (ppc32, arm32) without a > huge cost. The larger virtual user address space can help even on > machines with 128MB, though most applications probably don't care at > that point. At least having them as an option seems reasonable. Historically we haven't gated new base systems on having every feature the others do, though (!MMU, rv32, etc). >> - I think 32-bit Linux & RTOS have monopolized this market (64/128MB >> scenarios), right? > > The minimum amount of RAM that makes a system usable for Linux is > constantly going up, so I think with 64MB, most new projects are > already better off running some RTOS kernel instead of Linux. > The ones that are still usable today probably won't last a lot > of distro upgrades before the bloat catches up with them, but I > can see how your patch set can give them a few extra years of > updates. We also have 32-bit kernel support. 
Systems that have tens of MB of RAM tend to end up with some memory
technology that doesn't scale to gigabytes these days, and since that's
fixed when the chip is built it seems like those folks would be better
off just having HW support for 32-bit kernels (and maybe not even
bothering with HW support for 64-bit kernels).

> For the 256MB+ systems, I would expect the sensitive kernel
> allocations to be small enough that the series makes little
> difference. The 128MB systems are the most interesting ones
> here, and I'm curious to see where you spot most of the
> memory usage differences, I'll also reply to your initial
> mail for that.

Thanks. I agree we need to see some real systems that benefit from
this, as it's a pretty big support cost. Defconfig sizes alone don't
mean a whole lot, as users on these very constrained systems aren't
likely to run defconfig anyway.

If someone's going to use it then I'm fine taking the code, it just
seems like a very thin set of possible use cases. We've already got
almost no users in RISC-V land, and I've got a feeling this is esoteric
enough to actually have zero.

>
> Arnd
On Thu, May 18, 2023, at 15:09, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>

> Why 32-bit Linux?
> =================
> The motivation for using a 32-bit Linux kernel is to reduce memory
> footprint and meet the small capacity of DDR & cache requirement
> (e.g., 64/128MB SIP SoC).
>
> Here are the 32-bit v.s. 64-bit Linux kernel data type comparison
> summary:
>                       32-bit    64-bit
> sizeof(page):         32bytes   64bytes
> sizeof(list_head):    8bytes    16bytes
> sizeof(hlist_head):   8bytes    16bytes
> sizeof(vm_area):      68bytes   136bytes
> ...
>
> Mem-usage:
> (s32ilp32) # free
>         total   used   free    shared  buff/cache  available
> Mem:    100040  8380   88244   44      3416        88080
>
> (s64lp64) # free
>         total   used   free    shared  buff/cache  available
> Mem:    91568   11848  75796   44      3924        75952
>
> (s64ilp32) # free
>         total   used   free    shared  buff/cache  available
> Mem:    101952  8528   90004   44      3420        89816
>                                                    ^^^^^
>
> It's a rough measurement based on the current default config without any
> modification, and 32-bit (s32ilp32, s64ilp32) saved more than 16% memory
> to 64-bit (s64lp64). But s32ilp32 & s64ilp32 have a similar memory
> footprint (about 0.33% difference), meaning s64ilp32 has a big chance to
> replace s32ilp32 on the 64-bit machine.

I've tried to run the same numbers for the debate about running 32-bit
vs 64-bit arm kernels in the past, though I focused mostly on slightly
larger systems: I looked mainly at the 512MB case, as that is the most
cost-efficient DDR3 memory configuration and fairly common.

What I'd like to understand better in your example is where the 14MB of
memory went. I assume this is for 128MB of total RAM, so we know that
1MB went into additional 'struct page' objects (32 bytes * 32768
pages). It would be good to know where the dynamic allocations went and
if they are reclaimable (e.g. inodes) or non-reclaimable (e.g.
kmalloc-128).
For the vmlinux size, is this already a minimal config that one would
run on a board with 128MB of RAM, or a defconfig that includes a lot of
stuff that is only relevant for other platforms but also grows on
64-bit?

What do you see in /proc/slabinfo, /proc/meminfo, and 'size vmlinux'
for the s64ilp32 and s64lp64 kernels here?

     Arnd
On Sat, May 20, 2023 at 12:54 AM Arnd Bergmann <arnd@arndb.de> wrote: > > On Fri, May 19, 2023, at 17:31, Guo Ren wrote: > > On Fri, May 19, 2023 at 2:29 AM Arnd Bergmann <arnd@arndb.de> wrote: > >> On Thu, May 18, 2023, at 17:38, Palmer Dabbelt wrote: > >> > On Thu, 18 May 2023 06:09:51 PDT (-0700), guoren@kernel.org wrote: > >> > >> If for some crazy reason you'd still want the 64ilp32 ABI in user > >> space, running the kernel this way is probably still a bad idea, > >> but that one is less clear. There is clearly a small memory > >> penalty of running a 64-bit kernel for larger data structures > >> (page, inode, task_struct, ...) and vmlinux, and there is no > > I don't think it's a small memory penalty, our measurement is about > > 16% with defconfig, see "Why 32-bit Linux?" section. > > > > This patch series doesn't add 64ilp32 userspace abi, but it seems you > > also don't like to run 32-bit Linux kernel on 64-bit hardware, right? > > Ok, I'm sorry for missing the important bit here. So if this can > still use the normal 32-bit user space, the cost of this patch set > is not huge, and it's something that can be beneficial in a few > cases, though I suspect most users are still better off running > 64-bit kernels. > > > The motivation of s64ilp32 (running 32-bit Linux kernel on 64-bit s-mode): > > - The target hardware (Canaan Kendryte k230) only supports MXL=64, > > SXL=64, UXL=64/32. > > - The 64-bit Linux + compat 32-bit app can't satisfy the 64/128MB scenarios. > > > >> huge additional maintenance cost on top of the ABI itself > >> that you'd need either way, but using a 64-bit address space > >> in the kernel has some important advantages even when running > >> 32-bit userland: processes can use the entire 4GB virtual > >> space, while the kernel can address more than 768MB of lowmem, > >> and KASLR has more bits to work with for randomization. On > >> RISCV, some additional features (VMAP_STACK, KASAN, KFENCE, > >> ...) 
depend on 64-bit kernels even though they don't
> >> strictly need that.
> >
> > I agree that the 64-bit linux kernel has more functionalities, but:
> > - What do you think about linux on a 64/128MB SoC? Could it be
> > affordable to VMAP_STACK, KASAN, KFENCE?
>
> I would definitely recommend VMAP_STACK, but that can be implemented
> and is used on other 32-bit architectures (ppc32, arm32) without a
> huge cost. The larger virtual user address space can help even on
> machines with 128MB, though most applications probably don't care at
> that point.

Good point, I would support VMAP_STACK in ARCH_RV64ILP32.

> > - I think 32-bit Linux & RTOS have monopolized this market (64/128MB
> > scenarios), right?
>
> The minimum amount of RAM that makes a system usable for Linux is
> constantly going up, so I think with 64MB, most new projects are
> already better off running some RTOS kernel instead of Linux.
> The ones that are still usable today probably won't last a lot
> of distro upgrades before the bloat catches up with them, but I
> can see how your patch set can give them a few extra years of
> updates.

Linux development is much cheaper than RTOS development, so vendors
would first develop a Linux version. If it succeeds in the market, the
vendor will then create a cost-down solution, so their first choice is
to cut down the memory footprint of the first Linux version instead of
moving to an RTOS. With the prices of 128MB DDR3 and 64MB DDR2 getting
closer and closer, 32-bit Linux has more and more opportunities to
replace RTOS.

> For the 256MB+ systems, I would expect the sensitive kernel
> allocations to be small enough that the series makes little
> difference. The 128MB systems are the most interesting ones
> here, and I'm curious to see where you spot most of the
> memory usage differences, I'll also reply to your initial
> mail for that.

Thanks, I also recommend reading the "Why s64ilp32 has better performance?"
section :)

What do you think about running arm32 Linux on Cortex-A35/A53/A55?

>
> Arnd
On Sat, May 20, 2023 at 4:20 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Thu, May 18, 2023, at 15:09, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> > Why 32-bit Linux?
> > =================
> > The motivation for using a 32-bit Linux kernel is to reduce memory
> > footprint and meet the small capacity of DDR & cache requirement
> > (e.g., 64/128MB SIP SoC).
> >
> > Here are the 32-bit v.s. 64-bit Linux kernel data type comparison
> > summary:
> >                       32-bit    64-bit
> > sizeof(page):         32bytes   64bytes
> > sizeof(list_head):    8bytes    16bytes
> > sizeof(hlist_head):   8bytes    16bytes
> > sizeof(vm_area):      68bytes   136bytes
> > ...
> >
> > Mem-usage:
> > (s32ilp32) # free
> >         total   used   free    shared  buff/cache  available
> > Mem:    100040  8380   88244   44      3416        88080
> >
> > (s64lp64) # free
> >         total   used   free    shared  buff/cache  available
> > Mem:    91568   11848  75796   44      3924        75952
> >
> > (s64ilp32) # free
> >         total   used   free    shared  buff/cache  available
> > Mem:    101952  8528   90004   44      3420        89816
> >                                                    ^^^^^
> >
> > It's a rough measurement based on the current default config without any
> > modification, and 32-bit (s32ilp32, s64ilp32) saved more than 16% memory
> > to 64-bit (s64lp64). But s32ilp32 & s64ilp32 have a similar memory
> > footprint (about 0.33% difference), meaning s64ilp32 has a big chance to
> > replace s32ilp32 on the 64-bit machine.
>
> I've tried to run the same numbers for the debate about running
> 32-bit vs 64-bit arm kernels in the past, but focused mostly on
> slightly larger systems, but I looked mainly at the 512MB case,
> as that is the most cost-efficient DDR3 memory configuration
> and fairly common.

512MB is extravagant, in my opinion. In the IPC market, 32/64MB is for
480P/720P/1080p, 128/256MB is for 1080p/2k, and 512/1024MB is for 4K.
Chips with >512MB are less than 5% of the total (I guess). Even in
512MB chips, the additional memory is for the frame buffer, not the
Linux system.
I agree that the >512MB scenarios would be less sensitive to the choice
of a 32- or 64-bit Linux kernel.

>
> What I'd like to understand better in your example is where
> the 14MB of memory went. I assume this is for 128MB of total
> RAM, so we know that 1MB went into additional 'struct page'
> objects (32 bytes * 32768 pages). It would be good to know
> where the dynamic allocations went and if they are reclaimable
> (e.g. inodes) or non-reclaimable (e.g. kmalloc-128).
>
> For the vmlinux size, is this already a minimal config
> that one would run on a board with 128MB of RAM, or a
> defconfig that includes a lot of stuff that is only relevant
> for other platforms but also grows on 64-bit?

It's not a minimal config, it's defconfig, which is why I called it a
rough measurement :) I admit I wanted to exaggerate it a little bit,
but that's the starting point for cutting down memory usage for most
people, right?

During the past year, we have been convincing our customers to use
s64lp64 + u32ilp32, but they can't tolerate even a 1% additional memory
cost in 64MB/128MB scenarios, and so chose cortex-a7/a35, which can run
32-bit Linux. I think it's too early to talk about throwing 32-bit
Linux into the garbage, not only because of the memory footprint but
also because of people's ingrained opinions. Changing their minds takes
a long time.

>
> What do you see in /proc/slabinfo, /proc/meminfo/, and
> 'size vmlinux' for the s64ilp32 and s64lp64 kernels here?

Both s64ilp32 & s64lp64 use the same u32ilp32_rootfs.ext2 binary and
the same opensbi binary. All use an opensbi(2MB) + Linux(126MB) memory
layout.
Here is the result:

s64ilp32:

[    0.000000] Virtual kernel memory layout:
[    0.000000]   fixmap : 0x9ce00000 - 0x9d000000   (2048 kB)
[    0.000000]   pci io : 0x9d000000 - 0x9e000000   (  16 MB)
[    0.000000]  vmemmap : 0x9e000000 - 0xa0000000   (  32 MB)
[    0.000000]  vmalloc : 0xa0000000 - 0xc0000000   ( 512 MB)
[    0.000000]   lowmem : 0xc0000000 - 0xc7e00000   ( 126 MB)
[    0.000000] Memory: 97748K/129024K available (8699K kernel code, 8867K rwdata, 4096K rodata, 4204K init, 361K bss, 31276K reserved, 0K cma-reserved)
...
# free
              total        used        free      shared  buff/cache   available
Mem:         101952        8516       90016          44        3420       89828
Swap:             0           0           0
# cat /proc/meminfo
MemTotal:         101952 kB
MemFree:           90016 kB
MemAvailable:      89836 kB
Buffers:             292 kB
Cached:             2484 kB
SwapCached:            0 kB
Active:             2556 kB
Inactive:            656 kB
Active(anon):         40 kB
Inactive(anon):      440 kB
Active(file):       2516 kB
Inactive(file):      216 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                32 kB
Writeback:             0 kB
AnonPages:           480 kB
Mapped:             1804 kB
Shmem:                44 kB
KReclaimable:        644 kB
Slab:               4536 kB
SReclaimable:        644 kB
SUnreclaim:         3892 kB
KernelStack:         344 kB
PageTables:          112 kB
SecPageTables:         0 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:       50976 kB
Committed_AS:       2040 kB
VmallocTotal:     524288 kB
VmallocUsed:         112 kB
VmallocChunk:          0 kB
Percpu:               64 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
# cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ext4_groupinfo_1k     28     28    144   28    1 : tunables    0    0    0 : slabdata      1      1      0
p9_req_t               0      0    104   39    1 : tunables    0    0    0 : slabdata      0      0      0
UDPv6                  0      0   1088   15    4 : tunables    0    0    0 : slabdata      0      0      0
tw_sock_TCPv6          0      0    200   20    1 : tunables    0    0    0 : slabdata      0      0      0
request_sock_TCPv6     0      0    240   17    1 : tunables    0    0    0 : slabdata      0      0      0
TCPv6                  0      0   2048    8    4 : tunables    0    0    0 : slabdata      0      0      0
bio-72                32     32
128 32 1 : tunables 0 0 0 : slabdata 1 1 0 bfq_io_cq 0 0 1000 8 2 : tunables 0 0 0 : slabdata 0 0 0 bio-184 21 21 192 21 1 : tunables 0 0 0 : slabdata 1 1 0 mqueue_inode_cache 10 10 768 10 2 : tunables 0 0 0 : slabdata 1 1 0 v9fs_inode_cache 0 0 576 14 2 : tunables 0 0 0 : slabdata 0 0 0 nfs4_xattr_cache_cache 0 0 1848 17 8 : tunables 0 0 0 : slabdata 0 0 0 nfs_direct_cache 0 0 152 26 1 : tunables 0 0 0 : slabdata 0 0 0 nfs_read_data 36 36 640 12 2 : tunables 0 0 0 : slabdata 3 3 0 nfs_inode_cache 0 0 832 19 4 : tunables 0 0 0 : slabdata 0 0 0 isofs_inode_cache 0 0 528 15 2 : tunables 0 0 0 : slabdata 0 0 0 fat_inode_cache 0 0 632 25 4 : tunables 0 0 0 : slabdata 0 0 0 fat_cache 0 0 24 170 1 : tunables 0 0 0 : slabdata 0 0 0 jbd2_journal_handle 0 0 48 85 1 : tunables 0 0 0 : slabdata 0 0 0 jbd2_journal_head 0 0 80 51 1 : tunables 0 0 0 : slabdata 0 0 0 ext4_fc_dentry_update 0 0 88 46 1 : tunables 0 0 0 : slabdata 0 0 0 ext4_inode_cache 88 88 984 8 2 : tunables 0 0 0 : slabdata 11 11 0 ext4_allocation_context 36 36 112 36 1 : tunables 0 0 0 : slabdata 1 1 0 ext4_io_end_vec 0 0 24 170 1 : tunables 0 0 0 : slabdata 0 0 0 pending_reservation 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0 extent_status 256 256 32 128 1 : tunables 0 0 0 : slabdata 2 2 0 mbcache 102 102 40 102 1 : tunables 0 0 0 : slabdata 1 1 0 dio 0 0 384 10 1 : tunables 0 0 0 : slabdata 0 0 0 audit_tree_mark 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0 rpc_inode_cache 0 0 576 14 2 : tunables 0 0 0 : slabdata 0 0 0 ip4-frags 0 0 152 26 1 : tunables 0 0 0 : slabdata 0 0 0 RAW 9 9 896 9 2 : tunables 0 0 0 : slabdata 1 1 0 UDP 8 8 960 8 2 : tunables 0 0 0 : slabdata 1 1 0 tw_sock_TCP 0 0 200 20 1 : tunables 0 0 0 : slabdata 0 0 0 request_sock_TCP 0 0 240 17 1 : tunables 0 0 0 : slabdata 0 0 0 TCP 0 0 1920 8 4 : tunables 0 0 0 : slabdata 0 0 0 hugetlbfs_inode_cache 8 8 504 8 1 : tunables 0 0 0 : slabdata 1 1 0 bio-164 42 42 192 21 1 : tunables 0 0 0 : slabdata 2 2 0 ep_head 0 0 8 512 1 : tunables 0 0 0 : 
slabdata 0 0 0 dax_cache 14 14 576 14 2 : tunables 0 0 0 : slabdata 1 1 0 sgpool-128 16 16 2048 8 4 : tunables 0 0 0 : slabdata 2 2 0 sgpool-64 8 8 1024 8 2 : tunables 0 0 0 : slabdata 1 1 0 request_queue 13 13 616 13 2 : tunables 0 0 0 : slabdata 1 1 0 blkdev_ioc 0 0 80 51 1 : tunables 0 0 0 : slabdata 0 0 0 bio-120 64 64 128 32 1 : tunables 0 0 0 : slabdata 2 2 0 biovec-max 40 40 3072 10 8 : tunables 0 0 0 : slabdata 4 4 0 biovec-128 0 0 1536 10 4 : tunables 0 0 0 : slabdata 0 0 0 [19/1691] biovec-64 10 10 768 10 2 : tunables 0 0 0 : slabdata 1 1 0 dmaengine-unmap-2 128 128 32 128 1 : tunables 0 0 0 : slabdata 1 1 0 sock_inode_cache 22 22 704 11 2 : tunables 0 0 0 : slabdata 2 2 0 skbuff_small_head 14 14 576 14 2 : tunables 0 0 0 : slabdata 1 1 0 skbuff_fclone_cache 0 0 448 9 1 : tunables 0 0 0 : slabdata 0 0 0 file_lock_cache 28 28 144 28 1 : tunables 0 0 0 : slabdata 1 1 0 buffer_head 357 357 80 51 1 : tunables 0 0 0 : slabdata 7 7 0 proc_dir_entry 256 256 128 32 1 : tunables 0 0 0 : slabdata 8 8 0 pde_opener 0 0 24 170 1 : tunables 0 0 0 : slabdata 0 0 0 proc_inode_cache 60 60 536 15 2 : tunables 0 0 0 : slabdata 4 4 0 seq_file 42 42 96 42 1 : tunables 0 0 0 : slabdata 1 1 0 sigqueue 85 85 48 85 1 : tunables 0 0 0 : slabdata 1 1 0 bdev_cache 14 14 1152 14 4 : tunables 0 0 0 : slabdata 1 1 0 shmem_inode_cache 637 637 600 13 2 : tunables 0 0 0 : slabdata 49 49 0 kernfs_node_cache 13938 13938 88 46 1 : tunables 0 0 0 : slabdata 303 303 0 inode_cache 360 360 496 8 1 : tunables 0 0 0 : slabdata 45 45 0 dentry 1196 1196 152 26 1 : tunables 0 0 0 : slabdata 46 46 0 names_cache 8 8 4096 8 8 : tunables 0 0 0 : slabdata 1 1 0 net_namespace 0 0 2944 11 8 : tunables 0 0 0 : slabdata 0 0 0 iint_cache 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0 key_jar 105 105 192 21 1 : tunables 0 0 0 : slabdata 5 5 0 uts_namespace 0 0 416 19 2 : tunables 0 0 0 : slabdata 0 0 0 nsproxy 102 102 40 102 1 : tunables 0 0 0 : slabdata 1 1 0 vm_area_struct 255 255 80 51 1 : tunables 0 0 0 : 
slabdata 5 5 0 signal_cache 55 55 704 11 2 : tunables 0 0 0 : slabdata 5 5 0 sighand_cache 60 60 1088 15 4 : tunables 0 0 0 : slabdata 4 4 0 anon_vma_chain 384 384 32 128 1 : tunables 0 0 0 : slabdata 3 3 0 anon_vma 168 168 72 56 1 : tunables 0 0 0 : slabdata 3 3 0 perf_event 0 0 816 10 2 : tunables 0 0 0 : slabdata 0 0 0 maple_node 32 32 256 16 1 : tunables 0 0 0 : slabdata 2 2 0 radix_tree_node 338 338 304 13 1 : tunables 0 0 0 : slabdata 26 26 0 task_group 8 8 512 8 1 : tunables 0 0 0 : slabdata 1 1 0 mm_struct 20 20 768 10 2 : tunables 0 0 0 : slabdata 2 2 0 vmap_area 102 102 40 102 1 : tunables 0 0 0 : slabdata 1 1 0 page->ptl 256 256 16 256 1 : tunables 0 0 0 : slabdata 1 1 0 kmalloc-cg-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-cg-4k 8 8 4096 8 8 : tunables 0 0 0 : slabdata 1 1 0 kmalloc-cg-2k 72 72 2048 8 4 : tunables 0 0 0 : slabdata 9 9 0 kmalloc-cg-1k 32 32 1024 8 2 : tunables 0 0 0 : slabdata 4 4 0 kmalloc-cg-512 32 32 512 8 1 : tunables 0 0 0 : slabdata 4 4 0 kmalloc-cg-256 96 96 256 16 1 : tunables 0 0 0 : slabdata 6 6 0 kmalloc-cg-192 63 63 192 21 1 : tunables 0 0 0 : slabdata 3 3 0 kmalloc-cg-128 160 160 128 32 1 : tunables 0 0 0 : slabdata 5 5 0 kmalloc-cg-64 128 128 64 64 1 : tunables 0 0 0 : slabdata 2 2 0 kmalloc-rcl-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-4k 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-2k 0 0 2048 8 4 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-1k 0 0 1024 8 2 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-512 0 0 512 8 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-256 0 0 256 16 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-192 0 0 192 21 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-128 0 0 128 32 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-64 0 0 64 64 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-8k 12 12 8192 4 8 : tunables 0 0 0 : slabdata 3 3 0 kmalloc-4k 16 16 4096 8 8 : tunables 0 0 0 : slabdata 2 2 0 kmalloc-2k 40 40 2048 8 4 : tunables 0 0 0 : slabdata 5 5 0 
kmalloc-1k            88     88   1024    8    2 : tunables    0    0    0 : slabdata     11     11      0
kmalloc-512          856    856    512    8    1 : tunables    0    0    0 : slabdata    107    107      0
kmalloc-256           64     64    256   16    1 : tunables    0    0    0 : slabdata      4      4      0
kmalloc-192          126    126    192   21    1 : tunables    0    0    0 : slabdata      6      6      0
kmalloc-128         1056   1056    128   32    1 : tunables    0    0    0 : slabdata     33     33      0
kmalloc-64          5302   5312     64   64    1 : tunables    0    0    0 : slabdata     83     83      0
kmem_cache_node      128    128     64   64    1 : tunables    0    0    0 : slabdata      2      2      0
kmem_cache           128    128    128   32    1 : tunables    0    0    0 : slabdata      4      4      0

s64lp64:

[    0.000000] Virtual kernel memory layout:
[    0.000000]   fixmap : 0xff1bfffffee00000 - 0xff1bffffff000000   (2048 kB)
[    0.000000]   pci io : 0xff1bffffff000000 - 0xff1c000000000000   (  16 MB)
[    0.000000]  vmemmap : 0xff1c000000000000 - 0xff20000000000000   (1024 TB)
[    0.000000]  vmalloc : 0xff20000000000000 - 0xff60000000000000   (16384 TB)
[    0.000000]  modules : 0xffffffff01579000 - 0xffffffff80000000   (2026 MB)
[    0.000000]   lowmem : 0xff60000000000000 - 0xff60000008000000   ( 128 MB)
[    0.000000]   kernel : 0xffffffff80000000 - 0xffffffffffffffff   (2047 MB)
[    0.000000] Memory: 89380K/131072K available (8638K kernel code, 4979K rwdata, 4096K rodata, 2191K init, 477K bss, 41692K reserved, 0K cma-reserved)
...
# free
              total        used        free      shared  buff/cache   available
Mem:          91568       11472       76264          48        3832       76376
Swap:             0           0           0
# cat /proc/meminfo
MemTotal:          91568 kB
MemFree:           76220 kB
MemAvailable:      76352 kB
Buffers:             292 kB
Cached:             2488 kB
SwapCached:            0 kB
Active:             2560 kB
Inactive:            656 kB
Active(anon):         44 kB
Inactive(anon):      440 kB
Active(file):       2516 kB
Inactive(file):      216 kB
Unevictable:           0 kB
Mlocked:               0 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                16 kB
Writeback:             0 kB
AnonPages:           480 kB
Mapped:             1804 kB
Shmem:                48 kB
KReclaimable:       1092 kB
Slab:               6900 kB
SReclaimable:       1092 kB
SUnreclaim:         5808 kB
KernelStack:         688 kB
PageTables:          120 kB
SecPageTables:         0 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:       45784 kB
Committed_AS:       2044 kB
VmallocTotal:   17592186044416 kB
VmallocUsed:         904 kB
VmallocChunk:          0 kB
Percpu:               88 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
# cat /proc/slabinfo
slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
ext4_groupinfo_1k     19     19    208   19    1 : tunables    0    0    0 : slabdata      1      1      0
p9_req_t               0      0    176   23    1 : tunables    0    0    0 : slabdata      0      0      0
ip6-frags              0      0    208   19    1 : tunables    0    0    0 : slabdata      0      0      0
UDPv6                  0      0   1472   11    4 : tunables    0    0    0 : slabdata      0      0      0
tw_sock_TCPv6          0      0    264   15    1 : tunables    0    0    0 : slabdata      0      0      0
request_sock_TCPv6     0      0    312   13    1 : tunables    0    0    0 : slabdata      0      0      0
TCPv6                  0      0   2560   12    8 : tunables    0    0    0 : slabdata      0      0      0
bio-96                32     32    128   32    1 : tunables    0    0    0 : slabdata      1      1      0
bfq_io_cq              0      0   1352   12    4 : tunables    0    0    0 : slabdata      0      0      0
bfq_queue              0      0    576   14    2 : tunables    0    0    0 : slabdata      0      0      0
mqueue_inode_cache    14     14   1152   14    4 : tunables    0    0    0 : slabdata      1      1      0
v9fs_inode_cache       0      0    888    9    2 : tunables    0    0    0 : slabdata      0      0      0
nfs4_xattr_cache_cache 0      0   3168   10    8 : tunables    0    0    0 : slabdata      0      0      0
nfs_direct_cache       0      0    264   15    1 : tunables    0    0    0 : slabdata      0      0      0
nfs_commit_data
11 11 704 11 2 : tunables 0 0 0 : slabdata 1 1 0 nfs_read_data 36 36 896 9 2 : tunables 0 0 0 : slabdata 4 4 0 nfs_inode_cache 0 0 1272 25 8 : tunables 0 0 0 : slabdata 0 0 0 isofs_inode_cache 0 0 824 19 4 : tunables 0 0 0 : slabdata 0 0 0 fat_inode_cache 0 0 976 8 2 : tunables 0 0 0 : slabdata 0 0 0 fat_cache 0 0 40 102 1 : tunables 0 0 0 : slabdata 0 0 0 jbd2_journal_head 0 0 144 28 1 : tunables 0 0 0 : slabdata 0 0 0 jbd2_revoke_table_s 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0 ext4_fc_dentry_update 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0 ext4_inode_cache 105 105 1496 21 8 : tunables 0 0 0 : slabdata 5 5 0 ext4_allocation_context 30 30 136 30 1 : tunables 0 0 0 : slabdata 1 1 0 ext4_prealloc_space 34 34 120 34 1 : tunables 0 0 0 : slabdata 1 1 0 ext4_system_zone 102 102 40 102 1 : tunables 0 0 0 : slabdata 1 1 0 ext4_io_end_vec 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0 bio_post_read_ctx 170 170 48 85 1 : tunables 0 0 0 : slabdata 2 2 0 pending_reservation 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0 extent_status 102 102 40 102 1 : tunables 0 0 0 : slabdata 1 1 0 mbcache 0 0 56 73 1 : tunables 0 0 0 : slabdata 0 0 0 dnotify_struct 0 0 32 128 1 : tunables 0 0 0 : slabdata 0 0 0 pid_namespace 0 0 160 25 1 : tunables 0 0 0 : slabdata 0 0 0 posix_timers_cache 0 0 272 15 1 : tunables 0 0 0 : slabdata 0 0 0 rpc_inode_cache 0 0 832 19 4 : tunables 0 0 0 : slabdata 0 0 0 UNIX 12 12 1344 12 4 : tunables 0 0 0 : slabdata 1 1 0 ip4-frags 0 0 224 18 1 : tunables 0 0 0 : slabdata 0 0 0 xfrm_dst_cache 0 0 320 12 1 : tunables 0 0 0 : slabdata 0 0 0 ip_fib_trie 85 85 48 85 1 : tunables 0 0 0 : slabdata 1 1 0 ip_fib_alias 73 73 56 73 1 : tunables 0 0 0 : slabdata 1 1 0 UDP 12 12 1280 12 4 : tunables 0 0 0 : slabdata 1 1 0 [35/1689] tw_sock_TCP 0 0 264 15 1 : tunables 0 0 0 : slabdata 0 0 0 request_sock_TCP 0 0 312 13 1 : tunables 0 0 0 : slabdata 0 0 0 TCP 0 0 2432 13 8 : tunables 0 0 0 : slabdata 0 0 0 hugetlbfs_inode_cache 10 10 784 10 2 : tunables 0 0 0 : 
slabdata 1 1 0 bio-224 48 48 256 16 1 : tunables 0 0 0 : slabdata 3 3 0 ep_head 0 0 16 256 1 : tunables 0 0 0 : slabdata 0 0 0 inotify_inode_mark 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0 dax_cache 8 8 960 8 2 : tunables 0 0 0 : slabdata 1 1 0 sgpool-128 10 10 3072 10 8 : tunables 0 0 0 : slabdata 1 1 0 sgpool-64 10 10 1536 10 4 : tunables 0 0 0 : slabdata 1 1 0 sgpool-16 10 10 384 10 1 : tunables 0 0 0 : slabdata 1 1 0 request_queue 15 15 1040 15 4 : tunables 0 0 0 : slabdata 1 1 0 bio-160 42 42 192 21 1 : tunables 0 0 0 : slabdata 2 2 0 biovec-128 8 8 2048 8 4 : tunables 0 0 0 : slabdata 1 1 0 biovec-64 8 8 1024 8 2 : tunables 0 0 0 : slabdata 1 1 0 user_namespace 0 0 632 25 4 : tunables 0 0 0 : slabdata 0 0 0 uid_cache 84 84 192 21 1 : tunables 0 0 0 : slabdata 4 4 0 dmaengine-unmap-2 64 64 64 64 1 : tunables 0 0 0 : slabdata 1 1 0 sock_inode_cache 24 24 1024 8 2 : tunables 0 0 0 : slabdata 3 3 0 skbuff_small_head 12 12 640 12 2 : tunables 0 0 0 : slabdata 1 1 0 skbuff_fclone_cache 0 0 512 8 1 : tunables 0 0 0 : slabdata 0 0 0 file_lock_cache 17 17 232 17 1 : tunables 0 0 0 : slabdata 1 1 0 fsnotify_mark_connector 0 0 56 73 1 : tunables 0 0 0 : slabdata 0 0 0 pde_opener 0 0 40 102 1 : tunables 0 0 0 : slabdata 0 0 0 proc_inode_cache 57 57 848 19 4 : tunables 0 0 0 : slabdata 3 3 0 seq_file 26 26 152 26 1 : tunables 0 0 0 : slabdata 1 1 0 sigqueue 51 51 80 51 1 : tunables 0 0 0 : slabdata 1 1 0 bdev_cache 18 18 1792 9 4 : tunables 0 0 0 : slabdata 2 2 0 shmem_inode_cache 646 646 936 17 4 : tunables 0 0 0 : slabdata 38 38 0 kernfs_iattrs_cache 0 0 96 42 1 : tunables 0 0 0 : slabdata 0 0 0 kernfs_node_cache 14304 14304 128 32 1 : tunables 0 0 0 : slabdata 447 447 0 filp 84 84 320 12 1 : tunables 0 0 0 : slabdata 7 7 0 inode_cache 360 360 776 10 2 : tunables 0 0 0 : slabdata 36 36 0 dentry 1188 1188 216 18 1 : tunables 0 0 0 : slabdata 66 66 0 names_cache 48 48 4096 8 8 : tunables 0 0 0 : slabdata 6 6 0 net_namespace 0 0 3840 8 8 : tunables 0 0 0 : slabdata 0 0 
0 iint_cache 0 0 152 26 1 : tunables 0 0 0 : slabdata 0 0 0 uts_namespace 0 0 432 9 1 : tunables 0 0 0 : slabdata 0 0 0 nsproxy 56 56 72 56 1 : tunables 0 0 0 : slabdata 1 1 0 vm_area_struct 240 240 136 30 1 : tunables 0 0 0 : slabdata 8 8 0 files_cache 22 22 704 11 2 : tunables 0 0 0 : slabdata 2 2 0 signal_cache 56 56 1152 14 4 : tunables 0 0 0 : slabdata 4 4 0 sighand_cache 57 57 1664 19 8 : tunables 0 0 0 : slabdata 3 3 0 task_struct 55 55 2880 11 8 : tunables 0 0 0 : slabdata 5 5 0 anon_vma 120 120 136 30 1 : tunables 0 0 0 : slabdata 4 4 0 perf_event 0 0 1152 14 4 : tunables 0 0 0 : slabdata 0 0 0 maple_node 304 304 256 16 1 : tunables 0 0 0 : slabdata 19 19 0 radix_tree_node 350 350 584 14 2 : tunables 0 0 0 : slabdata 25 25 0 task_group 10 10 768 10 2 : tunables 0 0 0 : slabdata 1 1 0 mm_struct 22 22 1408 11 4 : tunables 0 0 0 : slabdata 2 2 0 vmap_area 168 168 72 56 1 : tunables 0 0 0 : slabdata 3 3 0 page->ptl 170 170 24 170 1 : tunables 0 0 0 : slabdata 1 1 0 kmalloc-cg-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-cg-4k 24 24 4096 8 8 : tunables 0 0 0 : slabdata 3 3 0 kmalloc-cg-2k 32 32 2048 8 4 : tunables 0 0 0 : slabdata 4 4 0 kmalloc-cg-1k 24 24 1024 8 2 : tunables 0 0 0 : slabdata 3 3 0 kmalloc-cg-512 32 32 512 8 1 : tunables 0 0 0 : slabdata 4 4 0 kmalloc-cg-256 16 16 256 16 1 : tunables 0 0 0 : slabdata 1 1 0 kmalloc-cg-192 147 147 192 21 1 : tunables 0 0 0 : slabdata 7 7 0 kmalloc-cg-128 64 64 128 32 1 : tunables 0 0 0 : slabdata 2 2 0 kmalloc-cg-64 320 320 64 64 1 : tunables 0 0 0 : slabdata 5 5 0 kmalloc-rcl-8k 0 0 8192 4 8 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-4k 0 0 4096 8 8 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-2k 0 0 2048 8 4 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-1k 0 0 1024 8 2 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-512 0 0 512 8 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-256 0 0 256 16 1 : tunables 0 0 0 : slabdata 0 0 0 kmalloc-rcl-192 0 0 192 21 1 : tunables 0 0 0 : slabdata 0 0 0 
kmalloc-rcl-128 320 320 128 32 1 : tunables 0 0 0 : slabdata 10 10 0 kmalloc-rcl-64 64 64 64 64 1 : tunables 0 0 0 : slabdata 1 1 0 kmalloc-8k 12 12 8192 4 8 : tunables 0 0 0 : slabdata 3 3 0 kmalloc-4k 16 16 4096 8 8 : tunables 0 0 0 : slabdata 2 2 0 kmalloc-2k 64 64 2048 8 4 : tunables 0 0 0 : slabdata 8 8 0 kmalloc-1k 840 840 1024 8 2 : tunables 0 0 0 : slabdata 105 105 0 kmalloc-512 144 144 512 8 1 : tunables 0 0 0 : slabdata 18 18 0 kmalloc-256 816 816 256 16 1 : tunables 0 0 0 : slabdata 51 51 0 kmalloc-192 252 252 192 21 1 : tunables 0 0 0 : slabdata 12 12 0 kmalloc-128 480 480 128 32 1 : tunables 0 0 0 : slabdata 15 15 0 kmalloc-64 4912 4928 64 64 1 : tunables 0 0 0 : slabdata 77 77 0 kmem_cache_node 128 128 128 32 1 : tunables 0 0 0 : slabdata 4 4 0 kmem_cache 126 126 192 21 1 : tunables 0 0 0 : slabdata 6 6 0 > > Arnd
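[Editorial note] Dumps like the one above are easier to compare once each cache is collapsed to its resident footprint (num_objs * objsize). A minimal sketch of that arithmetic — the helper name is mine, not from the thread, and the sample lines are copied from the lp64 dump above:

```python
# Summarize a /proc/slabinfo paste: footprint per cache = num_objs * objsize.
def slab_footprints(slabinfo_text):
    totals = {}
    for line in slabinfo_text.splitlines():
        if not line or line.startswith(("slabinfo", "#")):
            continue  # skip the version and column-header lines
        fields = line.split()
        name, num_objs, objsize = fields[0], int(fields[2]), int(fields[3])
        totals[name] = num_objs * objsize  # bytes held resident by this cache
    # largest consumers first
    return dict(sorted(totals.items(), key=lambda kv: -kv[1]))

sample = """\
kernfs_node_cache 14304 14304 128 32 1 : tunables 0 0 0 : slabdata 447 447 0
kmalloc-1k 840 840 1024 8 2 : tunables 0 0 0 : slabdata 105 105 0
dentry 1188 1188 216 18 1 : tunables 0 0 0 : slabdata 66 66 0
"""
top = slab_footprints(sample)
print(top)  # kernfs_node_cache alone is 14304 * 128 bytes, about 1.75 MiB
```

Running this over both kernels' dumps gives the per-cache deltas directly, which is the comparison made later in the thread.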
On Sat, May 20, 2023, at 04:53, Guo Ren wrote:
> On Sat, May 20, 2023 at 4:20 AM Arnd Bergmann <arnd@arndb.de> wrote:
>> On Thu, May 18, 2023, at 15:09, guoren@kernel.org wrote:
>>
>> I've tried to run the same numbers for the debate about running
>> 32-bit vs 64-bit arm kernels in the past, focused mostly on
>> slightly larger systems; I looked mainly at the 512MB case,
>> as that is the most cost-efficient DDR3 memory configuration
>> and fairly common.
> 512MB is extravagant, in my opinion. In the IPC market, 32/64MB is for
> 480p/720p/1080p, 128/256MB is for 1080p/2K, and 512/1024MB is for 4K.
> 512MB chips are less than 5% of the total (I guess). Even in 512MB
> chips, the additional memory is for the frame buffer, not the Linux
> system.

This depends a lot on the target application of course. For
a phone or NAS box, 512MB is probably the lower limit.

What I observe in arch/arm/ devicetree submissions, on board-db.org,
and when looking at industrial Arm board vendor websites is that
512MB is the most common configuration, and I think 1GB is still
more common than 256MB even for 32-bit machines. There is of course
a difference between the number of individual products and the number
of machines shipped in a given configuration, and I guess you have
a good point that the cheapest ones are also the ones that ship
in the highest volume.

>> What I'd like to understand better in your example is where
>> the 14MB of memory went. I assume this is for 128MB of total
>> RAM, so we know that 1MB went into additional 'struct page'
>> objects (32 bytes * 32768 pages). It would be good to know
>> where the dynamic allocations went and if they are reclaimable
>> (e.g. inodes) or non-reclaimable (e.g. kmalloc-128).
>>
>> For the vmlinux size, is this already a minimal config
>> that one would run on a board with 128MB of RAM, or a
>> defconfig that includes a lot of stuff that is only relevant
>> for other platforms but also grows on 64-bit?
> It's not a minimal config, it's defconfig. So I'd say it's a rough
> measurement :)
>
> I admit I wanted to exaggerate it a little bit, but that's the
> starting point for cutting down memory usage for most people, right?
> During the past year, we have been convincing our customers to use
> s64lp64 + u32ilp32, but they can't tolerate even 1% additional memory
> cost in 64MB/128MB scenarios, and then chose cortex-a7/a35, which can
> run 32-bit Linux. I think it's too early to talk about throwing 32-bit
> Linux into the garbage, not only because of the memory footprint
> but also because of people's ingrained opinions. Changing their minds
> takes a long time.
>
>> What do you see in /proc/slabinfo, /proc/meminfo, and
>> 'size vmlinux' for the s64ilp32 and s64lp64 kernels here?
> Both s64ilp32 & s64lp64 use the same u32ilp32_rootfs.ext2 binary and
> the same opensbi binary.
> All are opensbi (2MB) + Linux (126MB) memory layouts.
>
> Here is the result:
>
> s64ilp32:
> [    0.000000] Virtual kernel memory layout:
> [    0.000000]   fixmap : 0x9ce00000 - 0x9d000000 (2048 kB)
> [    0.000000]   pci io : 0x9d000000 - 0x9e000000 (  16 MB)
> [    0.000000]  vmemmap : 0x9e000000 - 0xa0000000 (  32 MB)
> [    0.000000]  vmalloc : 0xa0000000 - 0xc0000000 ( 512 MB)
> [    0.000000]   lowmem : 0xc0000000 - 0xc7e00000 ( 126 MB)
> [    0.000000] Memory: 97748K/129024K available (8699K kernel code,
> 8867K rwdata, 4096K rodata, 4204K init, 361K bss, 31276K reserved, 0K
> cma-reserved)

Ok, so it saves only a little bit on .text/.init/.bss/.rodata, but
there is a 4MB difference in rwdata, and a total of 10.4MB difference
in "reserved" size, which I think includes all of the above plus
the mem_map[] array.
> 89380K/131072K available (8638K kernel code, 4979K rwdata, 4096K
> rodata, 2191K init, 477K bss, 41692K reserved, 0K cma-reserved)

Oddly, I don't see anywhere close to 8MB in a riscv64 defconfig
build (linux-next, gcc-13), so I don't know where that comes
from:

$ size -A build/tmp/vmlinux | sort -k2 -nr | head
Total              13518684
.text               8896058   18446744071562076160
.rodata             2219008   18446744071576748032
.data                933760   18446744071583039488
.bss                 476080   18446744071584092160
.init.text           264718   18446744071572553728
__ksymtab_strings    183986   18446744071579214312
__ksymtab_gpl        122928   18446744071579091384
__ksymtab            109080   18446744071578982304
__bug_table           98352   18446744071583973248

> KReclaimable:        644 kB
> Slab:               4536 kB
> SReclaimable:        644 kB
> SUnreclaim:         3892 kB
> KernelStack:         344 kB

These look like the only notable differences in meminfo:

KReclaimable:       1092 kB
Slab:               6900 kB
SReclaimable:       1092 kB
SUnreclaim:         5808 kB
KernelStack:         688 kB

The largest chunk here is 2MB in non-reclaimable slab allocations,
or a 50% growth of those.

The kernel stacks are doubled as expected, but that's only 344KB,
and similarly for reclaimable slabs.

> # cat /proc/slabinfo
>
> slabinfo - version: 2.1
> # name <active_objs> <num_objs> <objsize> <objperslab>
> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> :
> slabdata <active_slabs> <num_slabs> <sharedavail>
> ext4_groupinfo_1k 28 28 144 28 1 : tunables 0 0
> 0 : slabdata 1 1 0
> p9_req_t 0 0 104 39 1 : tunables 0 0

Did you perhaps miss a few lines while pasting these? It seems
odd that some caches only show up in the ilp32 case (proc_dir_entry,
jbd2_journal_handle, buffer_head, biovec_max, anon_vma_chain, ...) and
some others are only in the lp64 case (UNIX, ext4_prealloc_space,
files_cache, filp, ip_fib_alias, task_struct, uid_cache, ...).
Looking at the ones that are in both and have the largest size
increase, I see

# lp64
1788 kernfs_node_cache 14304 128
 590 shmem_inode_cache 646 936
 272 inode_cache 360 776
 153 ext4_inode_cache 105 1496
 250 dentry 1188 216
 192 names_cache 48 4096
 199 radix_tree_node 350 584
 307 kmalloc-64 4912 64
  60 kmalloc-128 480 128
  47 kmalloc-192 252 192
 204 kmalloc-256 816 256
  72 kmalloc-512 144 512
 840 kmalloc-1k 840 1024

# ilp32
1197 kernfs_node_cache 13938 88
 373 shmem_inode_cache 637 600
 174 inode_cache 360 496
  84 ext4_inode_cache 88 984
 177 dentry 1196 152
  32 names_cache 8 4096
 100 radix_tree_node 338 304
 331 kmalloc-64 5302 64
 132 kmalloc-128 1056 128
  23 kmalloc-192 126 192
  16 kmalloc-256 64 256
 428 kmalloc-512 856 512
  88 kmalloc-1k 88 1024

So sysfs (kernfs_node_cache) has the largest chunk of the
2MB non-reclaimable slab, grown 50% from 1.2MB to 1.8MB.
In some cases, this could be avoided entirely by turning
off sysfs, but most users can't do that.
shmem_inode_cache is probably mostly devtmpfs; the
other inode caches are smaller and likely reclaimable.

It's interesting how the largest slab cache ends up
being the kmalloc-1k cache (840 1K objects) on lp64,
but the kmalloc-512 cache (856 512B objects) on ilp32.
My guess is that the majority of this is from a single
callsite that has an allocation growing just beyond 512B.
This alone seems significant enough to need further
investigation; I would hope we can completely avoid
these by adding a custom slab cache. I don't see this
effect on an arm64 boot though; for me the 512B allocations
are much higher than the 1K ones.

Maybe you can identify the culprit using the boot-time traces
as listed in https://elinux.org/Kernel_dynamic_memory_analysis#Dynamic
That might help everyone running a 64-bit kernel on
low-memory configurations, though it would of course slightly
weaken your argument for an ilp32 kernel ;-)

     Arnd
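[Editorial note] Arnd's point about an allocation "growing just beyond 512B" follows from how the generic kmalloc caches round requests up to fixed size classes. A sketch of that rounding — the size-class list below is the usual generic set, but the exact set depends on kernel configuration:

```python
# Generic kmalloc size classes: 8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, ...
# Each request is served from the smallest cache that fits it.
KMALLOC_SIZES = [8, 16, 32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192]

def kmalloc_bucket(size):
    for s in KMALLOC_SIZES:
        if size <= s:
            return s
    raise ValueError("larger requests fall back to the page allocator")

# An object that is exactly 512 bytes on ilp32 fits the 512B cache; if
# widened pointers/longs push it to, say, 520 bytes on lp64, every object
# suddenly occupies a 1024B slot - nearly 100% per-object overhead.
print(kmalloc_bucket(512), kmalloc_bucket(520))
```

That doubling is consistent with the kmalloc-512 population on ilp32 reappearing as kmalloc-1k on lp64 in the tables above.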
On Sat, May 20, 2023 at 6:13 PM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Sat, May 20, 2023, at 04:53, Guo Ren wrote:
> > On Sat, May 20, 2023 at 4:20 AM Arnd Bergmann <arnd@arndb.de> wrote:
> >> On Thu, May 18, 2023, at 15:09, guoren@kernel.org wrote:
> >>
> >> I've tried to run the same numbers for the debate about running
> >> 32-bit vs 64-bit arm kernels in the past, focused mostly on
> >> slightly larger systems; I looked mainly at the 512MB case,
> >> as that is the most cost-efficient DDR3 memory configuration
> >> and fairly common.
> > 512MB is extravagant, in my opinion. In the IPC market, 32/64MB is for
> > 480p/720p/1080p, 128/256MB is for 1080p/2K, and 512/1024MB is for 4K.
> > 512MB chips are less than 5% of the total (I guess). Even in 512MB
> > chips, the additional memory is for the frame buffer, not the Linux
> > system.
>
> This depends a lot on the target application of course. For
> a phone or NAS box, 512MB is probably the lower limit.
>
> What I observe in arch/arm/ devicetree submissions, on board-db.org,
> and when looking at industrial Arm board vendor websites is that
> 512MB is the most common configuration, and I think 1GB is still
> more common than 256MB even for 32-bit machines. There is of course
> a difference between the number of individual products and the number
> of machines shipped in a given configuration, and I guess you have
> a good point that the cheapest ones are also the ones that ship
> in the highest volume.
>
> >> What I'd like to understand better in your example is where
> >> the 14MB of memory went. I assume this is for 128MB of total
> >> RAM, so we know that 1MB went into additional 'struct page'
> >> objects (32 bytes * 32768 pages). It would be good to know
> >> where the dynamic allocations went and if they are reclaimable
> >> (e.g. inodes) or non-reclaimable (e.g. kmalloc-128).
> >>
> >> For the vmlinux size, is this already a minimal config
> >> that one would run on a board with 128MB of RAM, or a
> >> defconfig that includes a lot of stuff that is only relevant
> >> for other platforms but also grows on 64-bit?
> > It's not a minimal config, it's defconfig. So I'd say it's a rough
> > measurement :)
> >
> > I admit I wanted to exaggerate it a little bit, but that's the
> > starting point for cutting down memory usage for most people, right?
> > During the past year, we have been convincing our customers to use
> > s64lp64 + u32ilp32, but they can't tolerate even 1% additional memory
> > cost in 64MB/128MB scenarios, and then chose cortex-a7/a35, which can
> > run 32-bit Linux. I think it's too early to talk about throwing 32-bit
> > Linux into the garbage, not only because of the memory footprint
> > but also because of people's ingrained opinions. Changing their minds
> > takes a long time.
> >
> >> What do you see in /proc/slabinfo, /proc/meminfo, and
> >> 'size vmlinux' for the s64ilp32 and s64lp64 kernels here?
> > Both s64ilp32 & s64lp64 use the same u32ilp32_rootfs.ext2 binary and
> > the same opensbi binary.
> > All are opensbi (2MB) + Linux (126MB) memory layouts.
> >
> > Here is the result:
> >
> > s64ilp32:
> > [    0.000000] Virtual kernel memory layout:
> > [    0.000000]   fixmap : 0x9ce00000 - 0x9d000000 (2048 kB)
> > [    0.000000]   pci io : 0x9d000000 - 0x9e000000 (  16 MB)
> > [    0.000000]  vmemmap : 0x9e000000 - 0xa0000000 (  32 MB)
> > [    0.000000]  vmalloc : 0xa0000000 - 0xc0000000 ( 512 MB)
> > [    0.000000]   lowmem : 0xc0000000 - 0xc7e00000 ( 126 MB)
> > [    0.000000] Memory: 97748K/129024K available (8699K kernel code,
> > 8867K rwdata, 4096K rodata, 4204K init, 361K bss, 31276K reserved, 0K
> > cma-reserved)
>
> Ok, so it saves only a little bit on .text/.init/.bss/.rodata, but
> there is a 4MB difference in rwdata, and a total of 10.4MB difference
> in "reserved" size, which I think includes all of the above plus
> the mem_map[] array.
>
> > 89380K/131072K available (8638K kernel code, 4979K rwdata, 4096K
> > rodata, 2191K init, 477K bss, 41692K reserved, 0K cma-reserved)
>
> Oddly, I don't see anywhere close to 8MB in a riscv64 defconfig
> build (linux-next, gcc-13), so I don't know where that comes
> from:
>
> $ size -A build/tmp/vmlinux | sort -k2 -nr | head
> Total              13518684
> .text               8896058   18446744071562076160
> .rodata             2219008   18446744071576748032
> .data                933760   18446744071583039488
> .bss                 476080   18446744071584092160
> .init.text           264718   18446744071572553728
> __ksymtab_strings    183986   18446744071579214312
> __ksymtab_gpl        122928   18446744071579091384
> __ksymtab            109080   18446744071578982304
> __bug_table           98352   18446744071583973248
>
> > KReclaimable:        644 kB
> > Slab:               4536 kB
> > SReclaimable:        644 kB
> > SUnreclaim:         3892 kB
> > KernelStack:         344 kB
>
> These look like the only notable differences in meminfo:
>
> KReclaimable:       1092 kB
> Slab:               6900 kB
> SReclaimable:       1092 kB
> SUnreclaim:         5808 kB
> KernelStack:         688 kB
>
> The largest chunk here is 2MB in non-reclaimable slab allocations,
> or a 50% growth of those.
>
> The kernel stacks are doubled as expected, but that's only 344KB,
> and similarly for reclaimable slabs.
>
> > # cat /proc/slabinfo
> >
> > slabinfo - version: 2.1
> > # name <active_objs> <num_objs> <objsize> <objperslab>
> > <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> :
> > slabdata <active_slabs> <num_slabs> <sharedavail>
> > ext4_groupinfo_1k 28 28 144 28 1 : tunables 0 0
> > 0 : slabdata 1 1 0
> > p9_req_t 0 0 104 39 1 : tunables 0 0
>
> Did you perhaps miss a few lines while pasting these? It seems
> odd that some caches only show up in the ilp32 case (proc_dir_entry,
> jbd2_journal_handle, buffer_head, biovec_max, anon_vma_chain, ...) and
> some others are only in the lp64 case (UNIX, ext4_prealloc_space,
> files_cache, filp, ip_fib_alias, task_struct, uid_cache, ...).
>
> Looking at the ones that are in both and have the largest size
> increase, I see
>
> # lp64
> 1788 kernfs_node_cache 14304 128
>  590 shmem_inode_cache 646 936
>  272 inode_cache 360 776
>  153 ext4_inode_cache 105 1496
>  250 dentry 1188 216
>  192 names_cache 48 4096
>  199 radix_tree_node 350 584
>  307 kmalloc-64 4912 64
>   60 kmalloc-128 480 128
>   47 kmalloc-192 252 192
>  204 kmalloc-256 816 256
>   72 kmalloc-512 144 512
>  840 kmalloc-1k 840 1024
>
> # ilp32
> 1197 kernfs_node_cache 13938 88
>  373 shmem_inode_cache 637 600
>  174 inode_cache 360 496
>   84 ext4_inode_cache 88 984
>  177 dentry 1196 152
>   32 names_cache 8 4096
>  100 radix_tree_node 338 304
>  331 kmalloc-64 5302 64
>  132 kmalloc-128 1056 128
>   23 kmalloc-192 126 192
>   16 kmalloc-256 64 256
>  428 kmalloc-512 856 512
>   88 kmalloc-1k 88 1024
>
> So sysfs (kernfs_node_cache) has the largest chunk of the
> 2MB non-reclaimable slab, grown 50% from 1.2MB to 1.8MB.
> In some cases, this could be avoided entirely by turning
> off sysfs, but most users can't do that.
> shmem_inode_cache is probably mostly devtmpfs; the
> other inode caches are smaller and likely reclaimable.
>
> It's interesting how the largest slab cache ends up
> being the kmalloc-1k cache (840 1K objects) on lp64,
> but the kmalloc-512 cache (856 512B objects) on ilp32.
> My guess is that the majority of this is from a single
> callsite that has an allocation growing just beyond 512B.
> This alone seems significant enough to need further
> investigation; I would hope we can completely avoid
> these by adding a custom slab cache. I don't see this
> effect on an arm64 boot though; for me the 512B allocations
> are much higher than the 1K ones.
>
> Maybe you can identify the culprit using the boot-time traces
> as listed in https://elinux.org/Kernel_dynamic_memory_analysis#Dynamic
> That might help everyone running a 64-bit kernel on
> low-memory configurations, though it would of course slightly
> weaken your argument for an ilp32 kernel ;-)

Thanks for the detailed reply; I will try the approaches you mentioned
later. But those cover the traditional CONFIG_32BIT vs. CONFIG_64BIT
comparison.

Besides the detailed analysis data, we also face a perception problem.
For struct page, struct list_head, and variables containing pointers,
the ilp32 versions are significantly smaller than the lp64 ones. That
means ilp32 is smaller than lp64 in people's minds, and this perception
prevents vendors from accepting lp64 as a cost-down solution. They
won't even try, which is what I've seen over these years.

I was an lp64 kernel supporter last year, but I met a lot of arguments
against s64lp64 + u32ilp32. Some people are using arm32 Linux; they
want to stay on 32-bit Linux to ensure their complex C code keeps
working. So our "ilp32 vs. lp64" argument won't reach a conclusion.
Let's look at it from another angle: cache utilization. These
64/128MB SoCs also have limited cache capacities (L1-32KB + L2-128KB,
or only L1-64KB), and list walking and stack saving/restoring are very
common in Linux. What do you think about "32-bit vs. 64-bit" cache
utilization?

>
>      Arnd
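[Editorial note] The pointer-size and cache-utilization effect described above can be put into numbers with a back-of-envelope model — the layouts below are simplified illustrations, not the real kernel structure definitions:

```python
# ilp32 has 4-byte pointers and longs; lp64 has 8-byte ones.
def struct_size(n_pointers, n_longs, other_bytes, ptr_size):
    return (n_pointers + n_longs) * ptr_size + other_bytes

# struct list_head { struct list_head *next, *prev; }
list_head_ilp32 = struct_size(2, 0, 0, 4)   # two 4-byte pointers
list_head_lp64  = struct_size(2, 0, 0, 8)   # two 8-byte pointers

# A 64-byte L1 cache line therefore holds twice as many list nodes,
# so a linked-list walk touches half as many cache lines under ilp32.
nodes_per_line_ilp32 = 64 // list_head_ilp32
nodes_per_line_lp64  = 64 // list_head_lp64
print(list_head_ilp32, list_head_lp64,
      nodes_per_line_ilp32, nodes_per_line_lp64)
```

The same arithmetic applies to struct page and on-stack spill slots, which is why the footprint gap shows up again as a cache-miss gap on the small L1/L2 configurations mentioned.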
On Fri, May 19, 2023 at 8:14 AM Paul Walmsley <paul.walmsley@sifive.com> wrote:
>
> On Thu, 18 May 2023, Palmer Dabbelt wrote:
>
> > On Thu, 18 May 2023 06:09:51 PDT (-0700), guoren@kernel.org wrote:
> >
> > > This patch series adds s64ilp32 support to riscv. The term s64ilp32
> > > means smode-xlen=64 and -mabi=ilp32 (ints, longs, and pointers are
> > > all 32-bit), i.e., running a 32-bit Linux kernel in pure 64-bit
> > > supervisor mode. Many 64ilp32 ABIs already exist, such as mips-n32
> > > [1], arm-aarch64ilp32 [2], and x86-x32 [3], but they are all about
> > > userspace. Thus, this should be the first time running a 32-bit
> > > Linux kernel with the 64ilp32 ABI at supervisor mode (if not,
> > > correct me).
> >
> > Does anyone actually want this? At a bare minimum we'd need to add it
> > to the psABI, which would presumably also be required on the compiler
> > side of things.
> >
> > It's not even clear anyone wants rv64/ilp32 in userspace; the kernel
> > seems like it'd be even less widely used.
>
> We've certainly talked to folks who are interested in RV64 ILP32
> userspace with an LP64 kernel. The motivation is the usual one: to
> reduce data size and therefore (ideally) BOM cost. I think this work,
> if it goes forward, would need to go hand in hand with the RVIA psABI
> group.
>
> The RV64 ILP32 kernel and ILP32 userspace approach implemented by this
> patch is intriguing, but I guess for me, the question is whether it's
> worth the extra hassle vs. a pure RV32 kernel & userspace.

Running a pure RV32 kernel on 64-bit hardware (such as
cortex-a35/a53/a55) is not a sensible choice, because it wastes the
64-bit hardware capabilities, and the hardware designer has to spend
additional resources & time on the 32-bit machine & supervisor modes
(in Arm these are the EL3/EL2/EL1 exception levels). Think about all
the duplicated PMP CSRs, PMU CSRs, and mode switching ... it's
definitely wrong to follow the cortex-a35/a53/a55 way of dealing with
riscv32 on 64-bit hardware.
The chapter "Why s64ilp32 has better performance?" gives the
improvements vs. pure 32-bit; I repeat them here:

 - memcpy/memset/strcmp (s64ilp32 needs half the number of
   instructions and has double the bandwidth per load/store
   instruction compared with s32ilp32.)
 - eBPF (the JIT targets a 64-bit virtual ISA, which can't be mapped
   efficiently by s32ilp32, but s64ilp32 can, just like s64lp64.)
 - Atomic64 (s64ilp32 has exactly the same native instruction mapping
   as s64lp64, while s32ilp32 only uses generic_atomic64, a tradeoff &
   limited software solution.)
 - 64-bit native arithmetic instructions for the "long long" type.
 - riscv s64ilp32 could support cmpxchg_double for slub (it would be
   the 2nd 32-bit Linux port to support the feature; the 1st is i386.)

>
> - Paul
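[Editorial note] The memcpy/memset claim in the first bullet is simple arithmetic on access width; a rough model, assuming aligned buffers and ignoring loop overhead:

```python
# Copying N aligned bytes with XLEN-wide accesses takes one load plus one
# store per XLEN/8 bytes, so a 64-bit data path halves the instruction
# count and doubles the bytes moved per memory access.
def copy_ops(n_bytes, xlen):
    access = xlen // 8               # bytes moved per load or store
    return 2 * (n_bytes // access)   # one load + one store per access

ops32 = copy_ops(4096, 32)   # s32ilp32: 2048 instructions for a 4 KiB copy
ops64 = copy_ops(4096, 64)   # s64ilp32/s64lp64: 1024 instructions
print(ops32, ops64, ops32 // ops64)
```

This is only the straight-line instruction count; real memcpy implementations unroll and handle tails, but the 2x width advantage carries through.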