[2/2] riscv: Disable misaligned access probe when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
| Message ID | 20240131-disable_misaligned_probe_config-v1-2-98d155e9cda8@rivosinc.com |
|---|---|
| State | New |
| Series | riscv: Use CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS to set misaligned access speed |
Commit Message
Charlie Jenkins
Feb. 1, 2024, 6:40 a.m. UTC
When CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is selected, the CPUs can be
assumed to have fast misaligned access without needing to probe.
Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
---
 arch/riscv/include/asm/cpufeature.h  | 7 +++++++
 arch/riscv/kernel/cpufeature.c       | 4 ++++
 arch/riscv/kernel/sys_hwprobe.c      | 4 ++++
 arch/riscv/kernel/traps_misaligned.c | 4 ++++
 4 files changed, 19 insertions(+)
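For context on what this changes for userspace: the value computed by hwprobe_misaligned() is what the riscv_hwprobe() syscall reports under RISCV_HWPROBE_KEY_CPUPERF_0, so a kernel built with CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS would report RISCV_HWPROBE_MISALIGNED_FAST unconditionally. A minimal userspace sketch of querying it, assuming UAPI headers from a hwprobe-capable kernel (v6.4 or later):

```c
/* Hedged sketch: query the misaligned-access performance that
 * hwprobe_misaligned() reports. Build on riscv64 against recent kernel
 * UAPI headers; error handling kept minimal on purpose. */
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/hwprobe.h>	/* struct riscv_hwprobe, RISCV_HWPROBE_* */

int main(void)
{
	struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_CPUPERF_0 };

	/* cpusetsize == 0 and cpus == NULL asks about all online CPUs */
	if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) != 0) {
		perror("riscv_hwprobe");
		return 1;
	}

	if ((pair.value & RISCV_HWPROBE_MISALIGNED_MASK) ==
	    RISCV_HWPROBE_MISALIGNED_FAST)
		puts("misaligned accesses: fast");
	else
		puts("misaligned accesses: emulated/slow/unknown");
	return 0;
}
```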
Comments
On 01/02/2024 07:40, Charlie Jenkins wrote:
> When CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is selected, the cpus can be
> set to have fast misaligned access without needing to probe.
>
> Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
>
> [... full patch quoted; see the diff at the end of this page ...]

Hi Charlie,

Generally, having so many ifdefs in various pieces of code is probably not a good idea.

AFAICT, if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is enabled, the whole misaligned access speed checking could be opted out, which means that everything related to misaligned accesses should probably be moved into its own file and built only when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=n.

> @@ -413,7 +413,9 @@ int handle_misaligned_load(struct pt_regs *regs)
>
>  	perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, addr);
>
> +#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
>  	*this_cpu_ptr(&misaligned_access_speed) = RISCV_HWPROBE_MISALIGNED_EMULATED;
> +#endif

I think that rather than using ifdefery inside this file (traps_misaligned.c), it could be opted out entirely when we have CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, since that implies misaligned accesses are not emulated (at least that is my understanding).

Thanks,

Clément
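To make Clément's suggestion concrete, the usual kernel idiom for this kind of opt-out is a single config gate in the header providing static-inline stubs, with all probing code in one translation unit that Kbuild only compiles when the config is off. A hedged sketch of that shape, where the file name and the non-patch declarations are hypothetical (this is not the code that eventually landed):

```c
/* Sketch of the single-gate idiom (hypothetical layout, not upstream).
 * Assumes <linux/jump_label.h> for the static key machinery. All probing
 * code would live in one file, e.g. unaligned_access_speed.c
 * (hypothetical name), built only when the config is off. */
#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
DECLARE_STATIC_KEY_FALSE(fast_misaligned_access_speed_key);

static __always_inline bool has_fast_misaligned_accesses(void)
{
	return static_branch_likely(&fast_misaligned_access_speed_key);
}

int check_unaligned_access_all_cpus(void);
#else
/* no probing code is built at all; everything resolves at compile time */
static __always_inline bool has_fast_misaligned_accesses(void)
{
	return true;
}

static inline int check_unaligned_access_all_cpus(void) { return 0; }
#endif
```

Since Kbuild selects objects on a symbol being set, such a split would in practice likely want a helper Kconfig symbol that defaults to y when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is unset, rather than conditioning the Makefile on its absence.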
On Thu, Feb 01, 2024 at 02:43:43PM +0100, Clément Léger wrote:
> Hi Charlie,
>
> Generally, having so many ifdefs in various pieces of code is probably
> not a good idea.
>
> AFAICT, if CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS is enabled, the whole
> misaligned access speed checking could be opted out, which means that
> everything related to misaligned accesses should probably be moved into
> its own file and built only when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=n.

I will look into doing something more clever here! I agree it is not very nice to have so many ifdefs scattered.

> I think that rather than using ifdefery inside this file
> (traps_misaligned.c), it could be opted out entirely when we have
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS, since that implies misaligned
> accesses are not emulated (at least that is my understanding).

That's a great idea, I believe that is correct.

- Charlie
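If traps_misaligned.c really were compiled out whenever misaligned accesses are efficient, its entry points would need config-gated stubs so callers in the trap path still build. A hedged sketch of what such stubs could look like, again illustrative rather than the shape that finally landed:

```c
/* Hypothetical stubs for a kernel where traps_misaligned.c is not built:
 * hardware handles misaligned accesses, so the software emulator should
 * never be entered, and the prctl-based emulation control is unavailable. */
#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
static inline int handle_misaligned_load(struct pt_regs *regs)
{
	return -1;	/* no emulation: let the generic fault path run */
}

static inline int handle_misaligned_store(struct pt_regs *regs)
{
	return -1;
}

static inline bool unaligned_ctl_available(void)
{
	return false;
}
#endif
```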
I am a little confused here - I was testing with 6.8-rc1 and it didn't seem to have the behavior of performing the probe (the probe kills boot performance in my application and I've had to patch out the probe in mid-6.x kernels).

Did something get reverted to bring back the probe even when CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y between rc1 and trunk? Or am I misremembering/accidentally patched?

On Thu, Feb 1, 2024 at 11:10 AM Charlie Jenkins <charlie@rivosinc.com> wrote:
> [... previous messages quoted in full; trimmed ...]
On Thu, Feb 01, 2024 at 11:57:04AM -0800, Charles Lohr wrote:
> I am a little confused here - I was testing with 6.8-rc1 and it didn't
> seem to have the behavior of performing the probe (the probe kills
> boot performance in my application and I've had to patch out the probe
> in mid-6.x kernels).
>
> Did something get reverted to bring back the probe even when
> CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y between rc1 and trunk? Or am
> I misremembering/accidentally patched?

After pulling a clean version of 6.8-rc1 and setting CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS I still see the probe happen. Before sending this I looked for a patch that disabled the probe but was unable to find one; if such a patch exists, can you point me to it?

- Charlie
I am very sorry for wasting your time - I did have it patched out in the build system here. I can't wait for this feature to land, so I can enjoy faster boot times without a patch.

Charles

On Thu, Feb 1, 2024 at 12:47 PM Charlie Jenkins <charlie@rivosinc.com> wrote:
> [... previous messages quoted in full; trimmed ...]
On Thu, Feb 01, 2024 at 01:39:53PM -0800, Charles Lohr wrote:
> I am very sorry for wasting your time - I did have it patched out in
> the build system here. I can't wait for this feature to land, so I can
> enjoy faster boot times without a patch.

No worries!

- Charlie
```diff
diff --git a/arch/riscv/include/asm/cpufeature.h b/arch/riscv/include/asm/cpufeature.h
index dfdcca229174..7d8d64783e38 100644
--- a/arch/riscv/include/asm/cpufeature.h
+++ b/arch/riscv/include/asm/cpufeature.h
@@ -137,10 +137,17 @@ static __always_inline bool riscv_cpu_has_extension_unlikely(int cpu, const unsi
 	return __riscv_isa_extension_available(hart_isa[cpu].isa, ext);
 }
 
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 DECLARE_STATIC_KEY_FALSE(fast_misaligned_access_speed_key);
 
 static __always_inline bool has_fast_misaligned_accesses(void)
 {
 	return static_branch_likely(&fast_misaligned_access_speed_key);
 }
+#else
+static __always_inline bool has_fast_misaligned_accesses(void)
+{
+	return true;
+}
+#endif
 #endif
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 89920f84d0a3..d787846c0b68 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -43,10 +43,12 @@ static DECLARE_BITMAP(riscv_isa, RISCV_ISA_EXT_MAX) __read_mostly;
 /* Per-cpu ISA extensions. */
 struct riscv_isainfo hart_isa[NR_CPUS];
 
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 /* Performance information */
 DEFINE_PER_CPU(long, misaligned_access_speed);
 
 static cpumask_t fast_misaligned_access;
+#endif
 
 /**
  * riscv_isa_extension_base() - Get base extension word
@@ -706,6 +708,7 @@ unsigned long riscv_get_elf_hwcap(void)
 	return hwcap;
 }
 
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 static int check_unaligned_access(void *param)
 {
 	int cpu = smp_processor_id();
@@ -946,6 +949,7 @@ static int check_unaligned_access_all_cpus(void)
 }
 
 arch_initcall(check_unaligned_access_all_cpus);
+#endif /* CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS */
 
 void riscv_user_isa_enable(void)
 {
diff --git a/arch/riscv/kernel/sys_hwprobe.c b/arch/riscv/kernel/sys_hwprobe.c
index a7c56b41efd2..3f1a6edfdb08 100644
--- a/arch/riscv/kernel/sys_hwprobe.c
+++ b/arch/riscv/kernel/sys_hwprobe.c
@@ -149,6 +149,7 @@ static bool hwprobe_ext0_has(const struct cpumask *cpus, unsigned long ext)
 
 static u64 hwprobe_misaligned(const struct cpumask *cpus)
 {
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 	int cpu;
 	u64 perf = -1ULL;
 
@@ -168,6 +169,9 @@ static u64 hwprobe_misaligned(const struct cpumask *cpus)
 		return RISCV_HWPROBE_MISALIGNED_UNKNOWN;
 
 	return perf;
+#else
+	return RISCV_HWPROBE_MISALIGNED_FAST;
+#endif
 }
 
 static void hwprobe_one_pair(struct riscv_hwprobe *pair,
diff --git a/arch/riscv/kernel/traps_misaligned.c b/arch/riscv/kernel/traps_misaligned.c
index 8ded225e8c5b..c24f79d769f6 100644
--- a/arch/riscv/kernel/traps_misaligned.c
+++ b/arch/riscv/kernel/traps_misaligned.c
@@ -413,7 +413,9 @@ int handle_misaligned_load(struct pt_regs *regs)
 
 	perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, addr);
 
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 	*this_cpu_ptr(&misaligned_access_speed) = RISCV_HWPROBE_MISALIGNED_EMULATED;
+#endif
 
 	if (!unaligned_enabled)
 		return -1;
@@ -596,6 +598,7 @@ int handle_misaligned_store(struct pt_regs *regs)
 	return 0;
 }
 
+#ifndef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
 bool check_unaligned_access_emulated(int cpu)
 {
 	long *mas_ptr = per_cpu_ptr(&misaligned_access_speed, cpu);
@@ -640,6 +643,7 @@ void unaligned_emulation_finish(void)
 	}
 	unaligned_ctl = true;
 }
+#endif
 
 bool unaligned_ctl_available(void)
 {
```
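As a closing note on why the cpufeature.h hunk matters: with the compile-time `true`, the branch in callers folds away entirely, while on probed kernels it remains a static branch patched after the boot-time probe. A hypothetical caller, purely to illustrate the pattern (not code from the kernel tree):

```c
/* Hypothetical caller pattern: both paths copy the same bytes; the fast
 * path simply uses word-sized, possibly misaligned, loads and stores.
 * With CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y the if() disappears at
 * compile time. Assumes <linux/types.h> for u8. */
static void copy_fast_or_safe(u8 *dst, const u8 *src, size_t n)
{
	size_t i = 0;

	if (has_fast_misaligned_accesses()) {
		/* word-at-a-time even when dst/src are not word-aligned */
		for (; i + sizeof(unsigned long) <= n; i += sizeof(unsigned long))
			*(unsigned long *)(dst + i) =
				*(const unsigned long *)(src + i);
	}
	/* remaining tail, or the whole buffer on slow/emulated systems */
	for (; i < n; i++)
		dst[i] = src[i];
}
```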