Message ID | 20230206201455.1790329-5-evan@rivosinc.com |
---|---|
State | New |
Headers |
From: Evan Green <evan@rivosinc.com> To: Palmer Dabbelt <palmer@rivosinc.com> Cc: Conor Dooley <conor@kernel.org>, vineetg@rivosinc.com, heiko@sntech.de, slewis@rivosinc.com, Evan Green <evan@rivosinc.com>, Albert Ou <aou@eecs.berkeley.edu>, Krzysztof Kozlowski <krzysztof.kozlowski+dt@linaro.org>, Palmer Dabbelt <palmer@dabbelt.com>, Paul Walmsley <paul.walmsley@sifive.com>, Rob Herring <robh+dt@kernel.org>, devicetree@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org Subject: [PATCH v2 4/6] dt-bindings: Add RISC-V misaligned access performance Date: Mon, 6 Feb 2023 12:14:53 -0800 Message-Id: <20230206201455.1790329-5-evan@rivosinc.com> In-Reply-To: <20230206201455.1790329-1-evan@rivosinc.com> References: <20230206201455.1790329-1-evan@rivosinc.com> |
Series |
RISC-V Hardware Probing User Interface
|
|
Commit Message
Evan Green
Feb. 6, 2023, 8:14 p.m. UTC
From: Palmer Dabbelt <palmer@rivosinc.com>

This key allows device trees to specify the performance of misaligned
accesses to main memory regions from each CPU in the system.

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Evan Green <evan@rivosinc.com>
---

(no changes since v1)

 Documentation/devicetree/bindings/riscv/cpus.yaml | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
Comments
On Mon, 06 Feb 2023 12:14:53 -0800, Evan Green wrote:
> From: Palmer Dabbelt <palmer@rivosinc.com>
>
> This key allows device trees to specify the performance of misaligned
> accesses to main memory regions from each CPU in the system.
>
> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> Signed-off-by: Evan Green <evan@rivosinc.com>
> ---
>
> (no changes since v1)
>
>  Documentation/devicetree/bindings/riscv/cpus.yaml | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>

My bot found errors running 'make DT_CHECKER_FLAGS=-m dt_binding_check'
on your patch (DT_CHECKER_FLAGS is new in v5.13):

yamllint warnings/errors:
./Documentation/devicetree/bindings/riscv/cpus.yaml:91:72: [error] syntax error: mapping values are not allowed here (syntax)

dtschema/dtc warnings/errors:
make[1]: *** Deleting file 'Documentation/devicetree/bindings/riscv/cpus.example.dts'
Documentation/devicetree/bindings/riscv/cpus.yaml:91:72: mapping values are not allowed here
make[1]: *** [Documentation/devicetree/bindings/Makefile:26: Documentation/devicetree/bindings/riscv/cpus.example.dts] Error 1
make[1]: *** Waiting for unfinished jobs....
./Documentation/devicetree/bindings/riscv/cpus.yaml:91:72: mapping values are not allowed here
/builds/robherring/dt-review-ci/linux/Documentation/devicetree/bindings/riscv/cpus.yaml: ignoring, error parsing file
make: *** [Makefile:1508: dt_binding_check] Error 2

doc reference errors (make refcheckdocs):

See https://patchwork.ozlabs.org/project/devicetree-bindings/patch/20230206201455.1790329-5-evan@rivosinc.com

The base for the series is generally the latest rc1. A different dependency
should be noted in *this* patch.

If you already ran 'make dt_binding_check' and didn't see the above
error(s), then make sure 'yamllint' is installed and dt-schema is up to
date:

pip3 install dtschema --upgrade

Please check and re-submit after running the above command yourself. Note
that DT_SCHEMA_FILES can be set to your schema file to speed up checking
your schema. However, it must be unset to test all examples with your schema.
On Mon, Feb 06, 2023 at 12:14:53PM -0800, Evan Green wrote:
> From: Palmer Dabbelt <palmer@rivosinc.com>
>
> This key allows device trees to specify the performance of misaligned
> accesses to main memory regions from each CPU in the system.
>
> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> Signed-off-by: Evan Green <evan@rivosinc.com>
> ---
>
> (no changes since v1)
>
>  Documentation/devicetree/bindings/riscv/cpus.yaml | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> index c6720764e765..2c09bd6f2927 100644
> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> @@ -85,6 +85,21 @@ properties:
>      $ref: "/schemas/types.yaml#/definitions/string"
>      pattern: ^rv(?:64|32)imaf?d?q?c?b?v?k?h?(?:_[hsxz](?:[a-z])+)*$
>
> +  riscv,misaligned-access-performance:
> +    description:
> +      Identifies the performance of misaligned memory accesses to main memory
> +      regions. There are three flavors of unaligned access performance: "emulated"
> +      means that misaligned accesses are emulated via software and thus
> +      extremely slow, "slow" means that misaligned accesses are supported by
> +      hardware but still slower than aligned accesses sequences, and "fast"
> +      means that misaligned accesses are as fast or faster than the
> +      corresponding aligned accesses sequences.
> +    $ref: "/schemas/types.yaml#/definitions/string"
> +    enum:
> +      - emulated
> +      - slow
> +      - fast

I don't think this belongs in DT. (I'm not sure about a userspace
interface either.)

Can't this be tested and determined at runtime? Do misaligned accesses
and compare the performance. We already do this for things like memcpy
or crypto implementation selection.

Rob
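The runtime comparison suggested above can be sketched roughly in userspace C. This is only an illustration under stated assumptions, not how the kernel selects memcpy or crypto implementations: the names `sum_words`, `time_passes`, and `misaligned_cost_ratio`, and the buffer and round counts, are hypothetical. Whether the `memcpy` compiles to a single misaligned load depends on the compiler and target flags, so a real probe would likely need to control that.

```c
#define _POSIX_C_SOURCE 199309L
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

#define PROBE_WORDS  4096  /* 64-bit words read per pass (hypothetical size) */
#define PROBE_ROUNDS 200   /* passes per measurement (hypothetical count) */

/* Backing store is uint64_t so that byte offset 0 is naturally aligned. */
static uint64_t probe_storage[PROBE_WORDS + 1];

/*
 * Sum `words` 64-bit loads starting `offset` bytes into `base`.
 * memcpy keeps the access well-defined C; whether it becomes one
 * (possibly misaligned) load is up to the compiler and target flags.
 */
static uint64_t sum_words(const unsigned char *base, size_t offset, size_t words)
{
    uint64_t sum = 0;
    for (size_t i = 0; i < words; i++) {
        uint64_t w;
        memcpy(&w, base + offset + i * sizeof(w), sizeof(w));
        sum += w;
    }
    return sum;
}

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Time PROBE_ROUNDS passes over the buffer at the given byte offset. */
static uint64_t time_passes(size_t offset)
{
    volatile uint64_t sink = 0;
    uint64_t start = now_ns();
    for (uint64_t r = 0; r < PROBE_ROUNDS; r++) {
        probe_storage[1] = r;  /* change the data so no pass can be hoisted */
        sink += sum_words((const unsigned char *)probe_storage, offset, PROBE_WORDS);
    }
    (void)sink;
    return now_ns() - start;
}

/* Ratio of misaligned to aligned time; 0.0 if the clock was too coarse. */
double misaligned_cost_ratio(void)
{
    uint64_t aligned = time_passes(0);
    uint64_t misaligned = time_passes(1);
    return aligned ? (double)misaligned / (double)aligned : 0.0;
}
```

A caller would compare `misaligned_cost_ratio()` against a threshold of its own choosing. On a platform where firmware emulates misaligned accesses, every load at offset 1 may trap, so the ratio could be very large.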
From: Rob Herring
> Sent: 07 February 2023 17:06
>
> On Mon, Feb 06, 2023 at 12:14:53PM -0800, Evan Green wrote:
> > From: Palmer Dabbelt <palmer@rivosinc.com>
> >
> > This key allows device trees to specify the performance of misaligned
> > accesses to main memory regions from each CPU in the system.
> >
> > Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> > Signed-off-by: Evan Green <evan@rivosinc.com>
> > ---
> >
> > (no changes since v1)
> >
> >  Documentation/devicetree/bindings/riscv/cpus.yaml | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
> b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > index c6720764e765..2c09bd6f2927 100644
> > --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > @@ -85,6 +85,21 @@ properties:
> >      $ref: "/schemas/types.yaml#/definitions/string"
> >      pattern: ^rv(?:64|32)imaf?d?q?c?b?v?k?h?(?:_[hsxz](?:[a-z])+)*$
> >
> > +  riscv,misaligned-access-performance:
> > +    description:
> > +      Identifies the performance of misaligned memory accesses to main memory
> > +      regions. There are three flavors of unaligned access performance: "emulated"
> > +      means that misaligned accesses are emulated via software and thus
> > +      extremely slow, "slow" means that misaligned accesses are supported by
> > +      hardware but still slower than aligned accesses sequences, and "fast"
> > +      means that misaligned accesses are as fast or faster than the
> > +      corresponding aligned accesses sequences.
> > +    $ref: "/schemas/types.yaml#/definitions/string"
> > +    enum:
> > +      - emulated
> > +      - slow
> > +      - fast
>
> I don't think this belongs in DT. (I'm not sure about a userspace
> interface either.)
>
> Can't this be tested and determined at runtime? Do misaligned accesses
> and compare the performance. We already do this for things like memcpy
> or crypto implementation selection.

There is also a long discussion about misaligned accesses
for LoongArch.

Basically if you want to run a common kernel (and userspace)
you have to default to compiling everything with -mno-strict-align
so that the compiler generates byte accesses for anything
marked 'packed' (etc).

Run-time tests can optimise some hot-spots.

In any case 'slow' is probably pointless - unless the accesses
take more than 1 or 2 extra cycles.

Oh, and you really never, ever want to emulate them.

Technically misaligned reads on (some) x86-64 CPUs are slower
than aligned ones, but the difference is marginal.
I've measured two 64-bit misaligned reads every clock.
But it is consistently slower by much less than one clock
per cache line.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Wed, 08 Feb 2023 04:45:10 PST (-0800), David.Laight@ACULAB.COM wrote:
> From: Rob Herring
>> Sent: 07 February 2023 17:06
>>
>> On Mon, Feb 06, 2023 at 12:14:53PM -0800, Evan Green wrote:
>> > From: Palmer Dabbelt <palmer@rivosinc.com>
>> >
>> > This key allows device trees to specify the performance of misaligned
>> > accesses to main memory regions from each CPU in the system.
>> >
>> > Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
>> > Signed-off-by: Evan Green <evan@rivosinc.com>
>> > ---
>> >
>> > (no changes since v1)
>> >
>> >  Documentation/devicetree/bindings/riscv/cpus.yaml | 15 +++++++++++++++
>> >  1 file changed, 15 insertions(+)
>> >
>> > diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
>> b/Documentation/devicetree/bindings/riscv/cpus.yaml
>> > index c6720764e765..2c09bd6f2927 100644
>> > --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
>> > +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
>> > @@ -85,6 +85,21 @@ properties:
>> >      $ref: "/schemas/types.yaml#/definitions/string"
>> >      pattern: ^rv(?:64|32)imaf?d?q?c?b?v?k?h?(?:_[hsxz](?:[a-z])+)*$
>> >
>> > +  riscv,misaligned-access-performance:
>> > +    description:
>> > +      Identifies the performance of misaligned memory accesses to main memory
>> > +      regions. There are three flavors of unaligned access performance: "emulated"
>> > +      means that misaligned accesses are emulated via software and thus
>> > +      extremely slow, "slow" means that misaligned accesses are supported by
>> > +      hardware but still slower than aligned accesses sequences, and "fast"
>> > +      means that misaligned accesses are as fast or faster than the
>> > +      corresponding aligned accesses sequences.
>> > +    $ref: "/schemas/types.yaml#/definitions/string"
>> > +    enum:
>> > +      - emulated
>> > +      - slow
>> > +      - fast
>>
>> I don't think this belongs in DT. (I'm not sure about a userspace
>> interface either.)

[Kind of answered below.]

>> Can't this be tested and determined at runtime? Do misaligned accesses
>> and compare the performance. We already do this for things like memcpy
>> or crypto implementation selection.

We've had a history of broken firmware emulation of misaligned accesses
wreaking havoc. We don't run into concrete bugs there because we avoid
misaligned accesses as much as possible in the kernel, but I'd be worried
that we'd trigger a lot of these when probing for misaligned accesses.

> There is also a long discussion about misaligned accesses
> for LoongArch.
>
> Basically if you want to run a common kernel (and userspace)
> you have to default to compiling everything with -mno-strict-align
> so that the compiler generates byte accesses for anything
> marked 'packed' (etc).
>
> Run-time tests can optimise some hot-spots.
>
> In any case 'slow' is probably pointless - unless the accesses
> take more than 1 or 2 extra cycles.

[Also below.]

> Oh, and you really never, ever want to emulate them.

Unfortunately we're kind of stuck with this one: the specs used to require
that misaligned accesses were supported and thus there's a bunch of
firmwares that emulate them (and various misaligned accesses spread around,
though they're kind of a mess). The specs no longer require this support,
but just dropping it from firmware will break binaries.

There's been some vague plans to dig out of this, but it'd require some sort
of firmware interface additions in order to turn off the emulation and
that's going to take a while. As it stands we've got a bunch of users that
just want to know when they can emit misaligned accesses.

> Technically misaligned reads on (some) x86-64 CPUs are slower
> than aligned ones, but the difference is marginal.
> I've measured two 64-bit misaligned reads every clock.
> But it is consistently slower by much less than one clock
> per cache line.

The "fast" case is explicitly written to catch that flavor of
implementation.

The "slow" one is a bit vaguer, but the general idea is to catch
implementations that end up with some sort of pipeline flush on misaligned
accesses. We've got a lot of very small in-order processors in RISC-V land,
and while I haven't gotten around to benchmarking them all my guess is that
the spec requirement for support ended up with some simple implementations.

FWIW: I checked the c906 RTL and it's setting some exception-related info
on misaligned accesses, but I'd need to actually benchmark one to know for
sure and they're kind of a headache to deal with.

>
> 	David
>
> -
> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
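To make the three classes discussed above concrete, here is a minimal C sketch of how a consumer might map the proposed property string to a decision. The enum, both function names, and the policy of emitting misaligned accesses only for "fast" are assumptions for illustration; the patch defines only the three string values, and how the string would reach userspace is not specified here.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory form of the three values the binding proposes. */
enum misaligned_perf {
    MISALIGNED_UNKNOWN,   /* property absent or value not recognized */
    MISALIGNED_EMULATED,  /* trapped and emulated in software: extremely slow */
    MISALIGNED_SLOW,      /* handled by hardware, slower than aligned accesses */
    MISALIGNED_FAST,      /* as fast as or faster than aligned sequences */
};

/* Map the property string to the enum; unknown strings stay UNKNOWN. */
enum misaligned_perf parse_misaligned_perf(const char *value)
{
    if (value == NULL)
        return MISALIGNED_UNKNOWN;
    if (strcmp(value, "emulated") == 0)
        return MISALIGNED_EMULATED;
    if (strcmp(value, "slow") == 0)
        return MISALIGNED_SLOW;
    if (strcmp(value, "fast") == 0)
        return MISALIGNED_FAST;
    return MISALIGNED_UNKNOWN;
}

/*
 * Assumed policy, not taken from the patch: only "fast" justifies
 * emitting misaligned accesses; everything else uses aligned sequences.
 */
bool prefer_misaligned_accesses(enum misaligned_perf perf)
{
    return perf == MISALIGNED_FAST;
}
```

Treating an absent or unrecognized value the same as "slow" and "emulated" is the conservative choice, since an aligned sequence is correct everywhere.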
On Mon, Feb 06, 2023 at 12:14:53PM -0800, Evan Green wrote:
> From: Palmer Dabbelt <palmer@rivosinc.com>
>
> This key allows device trees to specify the performance of misaligned
> accesses to main memory regions from each CPU in the system.
>
> Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> Signed-off-by: Evan Green <evan@rivosinc.com>
> ---
>
> (no changes since v1)
>
>  Documentation/devicetree/bindings/riscv/cpus.yaml | 15 +++++++++++++++
>  1 file changed, 15 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> index c6720764e765..2c09bd6f2927 100644
> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> @@ -85,6 +85,21 @@ properties:
>      $ref: "/schemas/types.yaml#/definitions/string"
>      pattern: ^rv(?:64|32)imaf?d?q?c?b?v?k?h?(?:_[hsxz](?:[a-z])+)*$
>
> +  riscv,misaligned-access-performance:
> +    description:
> +      Identifies the performance of misaligned memory accesses to main memory
> +      regions. There are three flavors of unaligned access performance: "emulated"

Is the "performance: emulated" the source of the dt_binding_check issues?
And the fix is as simple as:
-    description:
+    description: |
?

> +      means that misaligned accesses are emulated via software and thus
> +      extremely slow, "slow" means that misaligned accesses are supported by
> +      hardware but still slower than aligned accesses sequences, and "fast"
> +      means that misaligned accesses are as fast or faster than the
> +      corresponding aligned accesses sequences.
> +    $ref: "/schemas/types.yaml#/definitions/string"
> +    enum:
> +      - emulated
> +      - slow
> +      - fast
> +
>    # RISC-V requires 'timebase-frequency' in /cpus, so disallow it here
>    timebase-frequency: false
>
> --
> 2.25.1
>
On Tue, Feb 14, 2023 at 1:26 PM Conor Dooley <conor@kernel.org> wrote:
>
> On Mon, Feb 06, 2023 at 12:14:53PM -0800, Evan Green wrote:
> > From: Palmer Dabbelt <palmer@rivosinc.com>
> >
> > This key allows device trees to specify the performance of misaligned
> > accesses to main memory regions from each CPU in the system.
> >
> > Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> > Signed-off-by: Evan Green <evan@rivosinc.com>
> > ---
> >
> > (no changes since v1)
> >
> >  Documentation/devicetree/bindings/riscv/cpus.yaml | 15 +++++++++++++++
> >  1 file changed, 15 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > index c6720764e765..2c09bd6f2927 100644
> > --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > @@ -85,6 +85,21 @@ properties:
> >      $ref: "/schemas/types.yaml#/definitions/string"
> >      pattern: ^rv(?:64|32)imaf?d?q?c?b?v?k?h?(?:_[hsxz](?:[a-z])+)*$
> >
> > +  riscv,misaligned-access-performance:
> > +    description:
> > +      Identifies the performance of misaligned memory accesses to main memory
> > +      regions. There are three flavors of unaligned access performance: "emulated"
>
> Is the performance: emulated the source of the dt_binding_check issues?
> And the fix is as simple as:
> -    description:
> +    description: |
> ?

Yep, I can pass cleanly with that change. Thanks!
On Thu, Feb 09, 2023 at 08:51:22AM -0800, Palmer Dabbelt wrote:
> On Wed, 08 Feb 2023 04:45:10 PST (-0800), David.Laight@ACULAB.COM wrote:
> > From: Rob Herring
> > > Sent: 07 February 2023 17:06
> > >
> > > On Mon, Feb 06, 2023 at 12:14:53PM -0800, Evan Green wrote:
> > > > From: Palmer Dabbelt <palmer@rivosinc.com>
> > > >
> > > > This key allows device trees to specify the performance of misaligned
> > > > accesses to main memory regions from each CPU in the system.
> > > >
> > > > Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
> > > > Signed-off-by: Evan Green <evan@rivosinc.com>
> > > > ---
> > > >
> > > > (no changes since v1)
> > > >
> > > >  Documentation/devicetree/bindings/riscv/cpus.yaml | 15 +++++++++++++++
> > > >  1 file changed, 15 insertions(+)
> > > >
> > > > diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > > b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > > > index c6720764e765..2c09bd6f2927 100644
> > > > --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > > > +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > > > @@ -85,6 +85,21 @@ properties:
> > > >      $ref: "/schemas/types.yaml#/definitions/string"
> > > >      pattern: ^rv(?:64|32)imaf?d?q?c?b?v?k?h?(?:_[hsxz](?:[a-z])+)*$
> > > >
> > > > +  riscv,misaligned-access-performance:
> > > > +    description:
> > > > +      Identifies the performance of misaligned memory accesses to main memory
> > > > +      regions. There are three flavors of unaligned access performance: "emulated"
> > > > +      means that misaligned accesses are emulated via software and thus
> > > > +      extremely slow, "slow" means that misaligned accesses are supported by
> > > > +      hardware but still slower than aligned accesses sequences, and "fast"
> > > > +      means that misaligned accesses are as fast or faster than the
> > > > +      corresponding aligned accesses sequences.
> > > > +    $ref: "/schemas/types.yaml#/definitions/string"
> > > > +    enum:
> > > > +      - emulated
> > > > +      - slow
> > > > +      - fast
> > >
> > > I don't think this belongs in DT. (I'm not sure about a userspace
> > > interface either.)
>
> [Kind of answered below.]
>
> > > Can't this be tested and determined at runtime? Do misaligned accesses
> > > and compare the performance. We already do this for things like memcpy
> > > or crypto implementation selection.
>
> We've had a history of broken firmware emulation of misaligned accesses
> wreaking havoc. We don't run into concrete bugs there because we avoid
> misaligned accesses as much as possible in the kernel, but I'd be worried
> that we'd trigger a lot of these when probing for misaligned accesses.

Then how do you distinguish between emulated and working vs. emulated
and broken? Sounds like the kernel running things would motivate fixing
firmware. :) If not, then broken platforms can disable the check with a
kernel command line flag.

>
> > There is also a long discussion about misaligned accesses
> > for LoongArch.
> >
> > Basically if you want to run a common kernel (and userspace)
> > you have to default to compiling everything with -mno-strict-align
> > so that the compiler generates byte accesses for anything
> > marked 'packed' (etc).
> >
> > Run-time tests can optimise some hot-spots.
> >
> > In any case 'slow' is probably pointless - unless the accesses
> > take more than 1 or 2 extra cycles.
>
> [Also below.]
>
> > Oh, and you really never, ever want to emulate them.
>
> Unfortunately we're kind of stuck with this one: the specs used to require
> that misaligned accesses were supported and thus there's a bunch of
> firmwares that emulate them (and various misaligned accesses spread around,
> though they're kind of a mess). The specs no longer require this support,
> but just dropping it from firmware will break binaries.
>
> There's been some vague plans to dig out of this, but it'd require some sort
> of firmware interface additions in order to turn off the emulation and
> that's going to take a while. As it stands we've got a bunch of users that
> just want to know when they can emit misaligned accesses.
>
> > Technically misaligned reads on (some) x86-64 CPUs are slower
> > than aligned ones, but the difference is marginal.
> > I've measured two 64-bit misaligned reads every clock.
> > But it is consistently slower by much less than one clock
> > per cache line.
>
> The "fast" case is explicitly written to catch that flavor of
> implementation.
>
> The "slow" one is a bit vaguer, but the general idea is to catch
> implementations that end up with some sort of pipeline flush on misaligned
> accesses. We've got a lot of very small in-order processors in RISC-V land,
> and while I haven't gotten around to benchmarking them all my guess is that
> the spec requirement for support ended up with some simple implementations.

If userspace wants to get into microarchitecture-level optimizations, it
should just look at the CPU model. IOW, use the CPU compatible to infer
things rather than continuously adding properties in an ad hoc manner
trying to parameterize everything.

Rob
diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
index c6720764e765..2c09bd6f2927 100644
--- a/Documentation/devicetree/bindings/riscv/cpus.yaml
+++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
@@ -85,6 +85,21 @@ properties:
     $ref: "/schemas/types.yaml#/definitions/string"
     pattern: ^rv(?:64|32)imaf?d?q?c?b?v?k?h?(?:_[hsxz](?:[a-z])+)*$
 
+  riscv,misaligned-access-performance:
+    description:
+      Identifies the performance of misaligned memory accesses to main memory
+      regions. There are three flavors of unaligned access performance: "emulated"
+      means that misaligned accesses are emulated via software and thus
+      extremely slow, "slow" means that misaligned accesses are supported by
+      hardware but still slower than aligned accesses sequences, and "fast"
+      means that misaligned accesses are as fast or faster than the
+      corresponding aligned accesses sequences.
+    $ref: "/schemas/types.yaml#/definitions/string"
+    enum:
+      - emulated
+      - slow
+      - fast
+
   # RISC-V requires 'timebase-frequency' in /cpus, so disallow it here
   timebase-frequency: false
 
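Rob Herring's closing suggestion in the thread, inferring the behavior from the CPU compatible instead of adding a property, could look roughly like the C lookup below. Everything here is hypothetical: the compatible strings are placeholders rather than real devicetree compatibles, the classes assigned to them are invented, and the table and function names are assumptions. Whether firmware emulation is active may also depend on more than the CPU model, so the "emulated" entry is only illustrative.

```c
#include <stddef.h>
#include <string.h>

/* Same three classes as the proposed binding, plus an "unknown" fallback. */
enum misaligned_perf {
    MISALIGNED_UNKNOWN,
    MISALIGNED_EMULATED,
    MISALIGNED_SLOW,
    MISALIGNED_FAST,
};

struct cpu_perf_entry {
    const char *compatible;     /* CPU compatible string from the device tree */
    enum misaligned_perf perf;  /* class a maintainer recorded for that core */
};

/* Placeholder entries: names and classes are invented for illustration. */
static const struct cpu_perf_entry cpu_perf_table[] = {
    { "vendor-a,core-x", MISALIGNED_FAST },
    { "vendor-b,core-y", MISALIGNED_SLOW },
    { "vendor-c,core-z", MISALIGNED_EMULATED },
};

/* Look up a compatible string; cores missing from the table are UNKNOWN. */
enum misaligned_perf perf_from_compatible(const char *compatible)
{
    if (compatible == NULL)
        return MISALIGNED_UNKNOWN;
    for (size_t i = 0; i < sizeof(cpu_perf_table) / sizeof(cpu_perf_table[0]); i++) {
        if (strcmp(cpu_perf_table[i].compatible, compatible) == 0)
            return cpu_perf_table[i].perf;
    }
    return MISALIGNED_UNKNOWN;
}
```

The trade-off of this design is that the table has to be updated for every new core, whereas a dedicated property travels with the device tree.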