From patchwork Tue Aug 1 22:11:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Evan Green X-Patchwork-Id: 12967 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp94705vqx; Tue, 1 Aug 2023 17:08:20 -0700 (PDT) X-Google-Smtp-Source: APBJJlGULdkkTYktQlyN7O5c2kR+6N8tAe/kd3NapC2JsnOEA43Pu53jegWs2ysPd6MoAfo4kaFV X-Received: by 2002:a05:6870:169c:b0:1bb:8d33:8260 with SMTP id j28-20020a056870169c00b001bb8d338260mr18561749oae.56.1690934899975; Tue, 01 Aug 2023 17:08:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690934899; cv=none; d=google.com; s=arc-20160816; b=fZVxOiF5XxH4QqK6fjViLa+j7WkAh/imTJYSOA17ReegyAPmDTtL5UbT9NAhhDamYR eC+8qEcLAZZgAnGrhoyC1YZVGb1JBim8B41U7NXCrbNDZGWTLAZCfj3GyrtoQSy+zfvg l1aFax4aYyQnEBwDjtIEDEWKjtGnt9No1L4BO9jbDtFOt8oscYLEKtuP8oIC0XzecaEK q0/q9+jN4CCFYVNj3LCnJ2wkEZG+7Lkscp7UPIFV0VZyrb5MAjKJygnvI+QgjesEf4UH qRQT9rVfDmQ1wJXob7qQyDsErkUwYamBd+K9DFQ8L9u1Re1rspYLZlHjQcL7C2PxFWPV kHUQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=EphR5J8NlVw9aGESxNNo7nFogyYyjvlP41J+N6zpMN4=; fh=7L1JIjc4dstlwj/b8dGTd4DC+gVdx8r4Y29Tb9eXOQ8=; b=LraihygT4DZEiCIcHTEv29kWCw4E2HlOqJVrRICS7VGUWcZSwZjfsMs/R7vv17XiZ7 ObG7bkWU7PrUCe54Vi7FvNp3QmohVuErervXhjIBVBYq/8wpwj7rdaCdLd3mOqcjb/DU ExyHyKDrr9keCsKbTfHPyLCpfpzEkVc69Ku9T25TGTX9lNm4UKCp/4tby0vylAl/cIJ5 8tGdTZzq+zPbjalELouTdlicqJmfzh1cV42YXLG/yxdrU21L0HwBJuLOxjvR+Do9OGS7 4DGM0khX8T87SPpTPrGhPjxH9RLWGJoO6HDREsQItS/ZUh6n5fDGLZkNMVqDYlo2pwS8 xY8w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@rivosinc-com.20221208.gappssmtp.com header.s=20221208 header.b=y7NW+zie; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z8-20020a637e08000000b005638b026f6bsi9554647pgc.522.2023.08.01.17.08.06; Tue, 01 Aug 2023 17:08:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@rivosinc-com.20221208.gappssmtp.com header.s=20221208 header.b=y7NW+zie; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232182AbjHAWLx (ORCPT + 99 others); Tue, 1 Aug 2023 18:11:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232445AbjHAWLs (ORCPT ); Tue, 1 Aug 2023 18:11:48 -0400 Received: from mail-pl1-x62f.google.com (mail-pl1-x62f.google.com [IPv6:2607:f8b0:4864:20::62f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BAC402122 for ; Tue, 1 Aug 2023 15:11:46 -0700 (PDT) Received: by mail-pl1-x62f.google.com with SMTP id d9443c01a7336-1bb81809ca8so48891415ad.3 for ; Tue, 01 Aug 2023 15:11:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1690927906; x=1691532706; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date:message-id:reply-to; bh=EphR5J8NlVw9aGESxNNo7nFogyYyjvlP41J+N6zpMN4=; b=y7NW+zieDLM/W7+ypsqsCtPiWnnIH04Nh/EnZlNAiWtB1wCcSeS/2P170ExkpuMcat l17yWzv/OURRhmg8ep8l3xYYGWphXO2/LBuJKw64khflFVDpnsB99uJAEJxZzdahvEFa 2mpp6qHBfYF718gM4Eu2uoW7V40xlSnySbdeWewNRXt2JAiq24pH/4XbxyJYHqPvt9aG CbIxHqFnH1IWyAWCBKB/KWDXfTh6RxfEN4uFtLi2VjRCjiXwCIKviU6MiKot4Q8j/sHb luG9DlHBlbQvWXuu0hlqbSP9yR4lffg2dkgu1JU0K9G/8GhaMId+VlauMaMLaJR1iPt4 vM3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1690927906; x=1691532706; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=EphR5J8NlVw9aGESxNNo7nFogyYyjvlP41J+N6zpMN4=; b=WvMAx0WS5LwvNqZrTLC4YwnierJAO8kX+/qqis+Zcj38NQmeLcOxePXvwoMNcs3tZR gE56TVw2zWa3R4CaoYXkLg0Kacwr+wWfOx2qkh1UbyGT9vNWf3pVxAOHtqsGJmPIcd58 /yNG4kfwFXjnGN2xs6BFiKR2BDum05TG2Y/CedmmD3rda0vIFTG+gwa1VDNlduhU/ea+ otEOkEEegx9/rm/XfYIRlng/9pFBAI3dHXn0pl8OvurGIeDvZNb/Nv/K7okDJKHbqW4T fd++kU4PBqhNftXwkdfYFltXvWtON9vReUycdQFHxgPQeXaG+kQ017DwskhmBm0BTqKK 4d7Q== X-Gm-Message-State: ABy/qLaFosczE3j51+8hYAvix9XWJPJG7qdjPPbgnu8RozLwrkGN+S8q MqhsHT6kci2sgMzI6v/mZ/Kwig== X-Received: by 2002:a17:903:120a:b0:1b8:b288:626e with SMTP id l10-20020a170903120a00b001b8b288626emr15998535plh.35.1690927906280; Tue, 01 Aug 2023 15:11:46 -0700 (PDT) Received: from evan.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id u9-20020a17090a410900b00268dac826d4sm19586pjf.0.2023.08.01.15.11.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 01 Aug 2023 15:11:45 -0700 (PDT) From: Evan Green To: Palmer Dabbelt Cc: David Laight , Simon Hosie , Evan Green , Albert Ou , Alexandre Ghiti , Andrew Jones , Andy Chiu , Anup Patel , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Conor Dooley , Greentime Hu , Guo Ren , Heiko Stuebner , Jisheng Zhang , Jonathan Corbet , Ley Foon Tan , Marc Zyngier , Masahiro Yamada , Palmer Dabbelt , Paul Walmsley , Randy Dunlap , Samuel Holland , Sia Jee Heng , Sunil V L , Xianting Tian , linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org Subject: [PATCH v3 0/2] RISC-V: Probe for misaligned access speed Date: Tue, 1 Aug 2023 15:11:35 -0700 Message-Id: <20230801221138.2086734-1-evan@rivosinc.com> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773073753749962615 X-GMAIL-MSGID: 1773073753749962615 The current setting for the hwprobe bit indicating misaligned access speed is controlled by a vendor-specific feature probe function. This is essentially a per-SoC table we have to maintain on behalf of each vendor going forward. Let's convert that instead to something we detect at runtime. We have two assembly routines at the heart of our probe: one that does a bunch of word-sized accesses (without aligning its input buffer), and the other that does byte accesses. If we can move a larger number of bytes using misaligned word accesses than we can with the same amount of time doing byte accesses, then we can declare misaligned accesses as "fast". The tradeoff of reducing this maintenance burden is boot time. We spend 4-6 jiffies per core doing this measurement (0-2 on jiffie edge alignment, and 4 on measurement). The timing loop was based on raid6_choose_gen(), which uses (16+1)*N jiffies (where N is the number of algorithms). By taking only the fastest iteration out of all attempts for use in the comparison, variance between runs is very low. On my THead C906, it looks like this: [ 0.047563] cpu0: Ratio of byte access time to unaligned word access is 4.34, unaligned accesses are fast Several others have chimed in with results on slow machines with the older algorithm, which took all runs into account, including noise like interrupts. Even with this variation, results indicate that in all cases (fast, slow, and emulated) the measured numbers are nowhere near each other (always multiple factors away). Changes in v3: - Fix documentation indentation (Conor) - Rename __copy_..._unaligned() to __riscv_copy_..._unaligned() (Conor) - Renamed c0,c1 to start_cycles, end_cycles (Conor) - Renamed j0,j1 to start_jiffies, now - Renamed check_unaligned_access0() to check_unaligned_access_boot_cpu() (Conor) Changes in v2: - Explain more in the commit message (Conor) - Use a new algorithm that looks for the fastest run (David) - Clarify documentatin further (David and Conor) - Unify around a single word, "unaligned" (Conor) - Align asm operands, and other misc whitespace changes (Conor) Evan Green (2): RISC-V: Probe for unaligned access speed RISC-V: alternative: Remove feature_probe_func Documentation/riscv/hwprobe.rst | 11 ++- arch/riscv/errata/thead/errata.c | 8 --- arch/riscv/include/asm/alternative.h | 5 -- arch/riscv/include/asm/cpufeature.h | 2 + arch/riscv/kernel/Makefile | 1 + arch/riscv/kernel/alternative.c | 19 ----- arch/riscv/kernel/copy-unaligned.S | 71 ++++++++++++++++++ arch/riscv/kernel/copy-unaligned.h | 13 ++++ arch/riscv/kernel/cpufeature.c | 104 +++++++++++++++++++++++++++ arch/riscv/kernel/smpboot.c | 3 +- 10 files changed, 198 insertions(+), 39 deletions(-) create mode 100644 arch/riscv/kernel/copy-unaligned.S create mode 100644 arch/riscv/kernel/copy-unaligned.h