From patchwork Sun Nov 13 23:05:15 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?=
X-Patchwork-Id: 19474
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt,
 Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
Subject: [PATCH 1/7] riscv: bitmanip: add orc.b as an unspec
Date: Mon, 14 Nov 2022 00:05:15 +0100
Message-Id: <20221113230521.712693-2-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>
References: <20221113230521.712693-1-christoph.muellner@vrull.eu>

From: Philipp Tomsich

As a basis for optimized string functions (e.g., the by-pieces
implementations), we need orc.b available. This adds orc.b as an
unspec, so we can expand to it.

gcc/ChangeLog:

	* config/riscv/bitmanip.md (orcb<mode>2): Add orc.b as an unspec.
	* config/riscv/riscv.md: Add UNSPEC_ORC_B.
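
Note for readers unfamiliar with orc.b: the sketch below models the
instruction in portable C, assuming the Zbb definition on a 64-bit
register (each result byte is 0xff if the corresponding source byte is
non-zero, and 0x00 otherwise). The names orc_b_model, word_has_nul and
orc_b_selfcheck are hypothetical; they are not part of this patch or of
GCC. The sketch only illustrates why the instruction is handy for
locating a terminating NUL one word at a time.

```c
#include <assert.h>
#include <stdint.h>

/* Portable model of orc.b on RV64 (assumed semantics, see note above):
   every non-zero source byte becomes 0xff, every zero byte stays 0x00.  */
uint64_t
orc_b_model (uint64_t x)
{
  uint64_t result = 0;
  for (int i = 0; i < 8; i++)
    {
      uint64_t byte = (x >> (8 * i)) & 0xff;
      if (byte != 0)
        result |= (uint64_t) 0xff << (8 * i);
    }
  return result;
}

/* A word contains a NUL byte exactly when the orc.b result is not
   all-ones.  Returns 1 if a NUL byte is present, 0 otherwise.  */
int
word_has_nul (uint64_t x)
{
  return orc_b_model (x) != UINT64_MAX;
}

/* Hypothetical self-check showing the intended use.  */
void
orc_b_selfcheck (void)
{
  assert (word_has_nul (0x6162636465666768ULL) == 0); /* "abcdefgh" */
  assert (word_has_nul (0x6162630065666768ULL) == 1); /* embedded NUL */
}
```

A string routine built on this idea would load one word, apply orc.b,
and compare the result against all-ones before falling back to a
byte-wise scan of the word that contains the NUL.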
Signed-off-by: Philipp Tomsich
---
 gcc/config/riscv/bitmanip.md | 8 ++++++++
 gcc/config/riscv/riscv.md    | 3 +++
 2 files changed, 11 insertions(+)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index b44fb9517e7..3dbe6002974 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -242,6 +242,14 @@ (define_insn "rotlsi3_sext"
   "rolw\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
+;; orc.b (or-combine) is added as an unspec for the benefit of the support
+;; for optimized string functions (such as strcmp).
+(define_insn "orcb<mode>2"
+  [(set (match_operand:X 0 "register_operand" "=r")
+        (unspec:X [(match_operand:X 1 "register_operand" "r")] UNSPEC_ORC_B))]
+  "TARGET_ZBB"
+  "orc.b\t%0,%1")
+
 (define_insn "bswap<mode>2"
   [(set (match_operand:X 0 "register_operand" "=r")
         (bswap:X (match_operand:X 1 "register_operand" "r")))]
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 798f7370a08..532289dd178 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -62,6 +62,9 @@ (define_c_enum "unspec" [
 
   ;; Stack tie
   UNSPEC_TIE
+
+  ;; OR-COMBINE
+  UNSPEC_ORC_B
 ])
 
 (define_c_enum "unspecv" [

From patchwork Sun Nov 13 23:05:16 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?=
X-Patchwork-Id: 19475
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt,
 Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
Cc: =?utf-8?q?Christoph_M=C3=BCllner?=
Subject: [PATCH 2/7] riscv: bitmanip/zbb: Add prefix/postfix and enable
 visibility
Date: Mon, 14 Nov 2022 00:05:16 +0100
Message-Id: <20221113230521.712693-3-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>
References: <20221113230521.712693-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

INSNs are usually postfixed by a number representing the argument count.
Given the instructions will be used in a later commit, let's make them
visible, but add a "riscv_" prefix to avoid conflicts with standard
INSNs.

gcc/ChangeLog:

	* config/riscv/bitmanip.md (*<optab>_not<mode>): Rename INSN.
	(riscv_<optab>_not<mode>3): Rename INSN.
	(*xor_not<mode>): Rename INSN.
	(riscv_xor_not<mode>3): Rename INSN.
Signed-off-by: Christoph Müllner
---
 gcc/config/riscv/bitmanip.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 3dbe6002974..d6d94e5cdf8 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -119,7 +119,7 @@ (define_insn "*slliuw"
 
 ;; ZBB extension.
 
-(define_insn "*<optab>_not<mode>"
+(define_insn "riscv_<optab>_not<mode>3"
   [(set (match_operand:X 0 "register_operand" "=r")
         (bitmanip_bitwise:X (not:X (match_operand:X 1 "register_operand" "r"))
                             (match_operand:X 2 "register_operand" "r")))]
@@ -128,7 +128,7 @@ (define_insn "*<optab>_not<mode>"
   [(set_attr "type" "bitmanip")
    (set_attr "mode" "<X:MODE>")])
 
-(define_insn "*xor_not<mode>"
+(define_insn "riscv_xor_not<mode>3"
   [(set (match_operand:X 0 "register_operand" "=r")
         (not:X (xor:X (match_operand:X 1 "register_operand" "r")
                       (match_operand:X 2 "register_operand" "r"))))]

From patchwork Sun Nov 13 23:05:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?=
X-Patchwork-Id: 19473
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt,
 Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
Cc: =?utf-8?q?Christoph_M=C3=BCllner?=
Subject: [PATCH 3/7] riscv: Enable overlap-by-pieces via tune param
Date: Mon, 14 Nov 2022 00:05:17 +0100
Message-Id: <20221113230521.712693-4-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>
References: <20221113230521.712693-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

This patch adds the field overlap_op_by_pieces to the struct
riscv_tune_param, which allows enabling the overlap_op_by_pieces
infrastructure on a per-tuning basis.

gcc/ChangeLog:

	* config/riscv/riscv.cc (struct riscv_tune_param): New field.
	(riscv_overlap_op_by_pieces): New function.
	(TARGET_OVERLAP_OP_BY_PIECES_P): Connect to
	riscv_overlap_op_by_pieces.
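
Note on what "overlapping" means here: the sketch below is a hand-written
C illustration of the idea, not the code GCC emits and not part of this
patch. It assumes a copy of 8 to 16 bytes and a target where unaligned
8-byte accesses are acceptable; the function name
copy_8_to_16_overlapping is hypothetical. Instead of splitting a 15-byte
copy into 8 + 4 + 2 + 1 byte pieces, the tail is covered by a second
8-byte access that starts at offset n - 8 and overlaps the first one.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy N bytes (8 <= N <= 16) using exactly two 8-byte loads and two
   8-byte stores.  The second access starts at offset N - 8, so it
   overlaps the first whenever N < 16.  Both loads happen before either
   store, so the result is correct even though the ranges overlap.  */
void
copy_8_to_16_overlapping (unsigned char *dst, const unsigned char *src,
                          size_t n)
{
  uint64_t head, tail;

  assert (n >= 8 && n <= 16);

  memcpy (&head, src, 8);          /* bytes 0 .. 7 */
  memcpy (&tail, src + n - 8, 8);  /* bytes n-8 .. n-1 */
  memcpy (dst, &head, 8);
  memcpy (dst + n - 8, &tail, 8);
}
```

This mirrors the expectation in the new tests that a 15-byte copy needs
two ld/sd pairs when overlapping is enabled, compared with ld, lw, lhu
and lbu (plus the matching stores) when it is not.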
Signed-off-by: Christoph Müllner --- gcc/config/riscv/riscv.cc | 17 +++++- .../gcc.target/riscv/memcpy-nonoverlapping.c | 54 +++++++++++++++++++ .../gcc.target/riscv/memcpy-overlapping.c | 50 +++++++++++++++++ .../gcc.target/riscv/memset-nonoverlapping.c | 45 ++++++++++++++++ .../gcc.target/riscv/memset-overlapping.c | 43 +++++++++++++++ 5 files changed, 208 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/gcc.target/riscv/memcpy-nonoverlapping.c create mode 100644 gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c create mode 100644 gcc/testsuite/gcc.target/riscv/memset-nonoverlapping.c create mode 100644 gcc/testsuite/gcc.target/riscv/memset-overlapping.c diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index a0c00cfb66f..7357cf51cdf 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -243,6 +243,7 @@ struct riscv_tune_param unsigned short fmv_cost; bool slow_unaligned_access; unsigned int fusible_ops; + bool overlap_op_by_pieces; }; /* Information about one micro-arch we know about. */ @@ -331,6 +332,7 @@ static const struct riscv_tune_param rocket_tune_info = { 8, /* fmv_cost */ true, /* slow_unaligned_access */ RISCV_FUSE_NOTHING, /* fusible_ops */ + false, /* overlap_op_by_pieces */ }; /* Costs to use when optimizing for Sifive 7 Series. */ @@ -346,6 +348,7 @@ static const struct riscv_tune_param sifive_7_tune_info = { 8, /* fmv_cost */ true, /* slow_unaligned_access */ RISCV_FUSE_NOTHING, /* fusible_ops */ + false, /* overlap_op_by_pieces */ }; /* Costs to use when optimizing for T-HEAD c906. */ @@ -361,6 +364,7 @@ static const struct riscv_tune_param thead_c906_tune_info = { 8, /* fmv_cost */ false, /* slow_unaligned_access */ RISCV_FUSE_NOTHING, /* fusible_ops */ + false, /* overlap_op_by_pieces */ }; /* Costs to use when optimizing for size. 
*/ @@ -376,6 +380,7 @@ static const struct riscv_tune_param optimize_size_tune_info = { 8, /* fmv_cost */ false, /* slow_unaligned_access */ RISCV_FUSE_NOTHING, /* fusible_ops */ + false, /* overlap_op_by_pieces */ }; /* Costs to use when optimizing for Ventana Micro VT1. */ @@ -393,7 +398,8 @@ static const struct riscv_tune_param ventana_vt1_tune_info = { ( RISCV_FUSE_ZEXTW | RISCV_FUSE_ZEXTH | /* fusible_ops */ RISCV_FUSE_ZEXTWS | RISCV_FUSE_LDINDEXED | RISCV_FUSE_LUI_ADDI | RISCV_FUSE_AUIPC_ADDI | - RISCV_FUSE_LUI_LD | RISCV_FUSE_AUIPC_LD ) + RISCV_FUSE_LUI_LD | RISCV_FUSE_AUIPC_LD ), + true, /* overlap_op_by_pieces */ }; static tree riscv_handle_fndecl_attribute (tree *, tree, tree, int, bool *); @@ -6444,6 +6450,12 @@ riscv_slow_unaligned_access (machine_mode, unsigned int) return riscv_slow_unaligned_access_p; } +static bool +riscv_overlap_op_by_pieces (void) +{ + return tune_param->overlap_op_by_pieces; +} + /* Implement TARGET_CAN_CHANGE_MODE_CLASS. */ static bool @@ -6974,6 +6986,9 @@ riscv_dwarf_poly_indeterminate_value (unsigned int i, unsigned int *factor, #undef TARGET_SLOW_UNALIGNED_ACCESS #define TARGET_SLOW_UNALIGNED_ACCESS riscv_slow_unaligned_access +#undef TARGET_OVERLAP_OP_BY_PIECES_P +#define TARGET_OVERLAP_OP_BY_PIECES_P riscv_overlap_op_by_pieces + #undef TARGET_SECONDARY_MEMORY_NEEDED #define TARGET_SECONDARY_MEMORY_NEEDED riscv_secondary_memory_needed diff --git a/gcc/testsuite/gcc.target/riscv/memcpy-nonoverlapping.c b/gcc/testsuite/gcc.target/riscv/memcpy-nonoverlapping.c new file mode 100644 index 00000000000..1c99e13fc26 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/memcpy-nonoverlapping.c @@ -0,0 +1,54 @@ +/* { dg-do compile } */ +/* { dg-options "-mcpu=sifive-u74 -march=rv64gc -mabi=lp64" } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Oz" "-Og" } } */ + + +#define COPY_N(N) \ +void copy##N (char *src, char *dst) \ +{ \ + dst = __builtin_assume_aligned (dst, 4096); \ + src = __builtin_assume_aligned (src, 4096); \ + 
__builtin_memcpy (dst, src, N); \ +} + +/* Emits 1x {ld,sd} and 1x {lhu,lbu,sh,sb}. */ +COPY_N(11) + +/* Emits 1x {ld,sd} and 1x {lw,lbu,sw,sb}. */ +COPY_N(13) + +/* Emits 1x {ld,sd} and 1x {lw,lhu,sw,sh}. */ +COPY_N(14) + +/* Emits 1x {ld,sd} and 1x {lw,lhu,lbu,sw,sh,sb}. */ +COPY_N(15) + +/* Emits 2x {ld,sd} and 1x {lhu,lbu,sh,sb}. */ +COPY_N(19) + +/* Emits 2x {ld,sd} and 1x {lw,lhu,lbu,sw,sh,sb}. */ +COPY_N(23) + +/* The by-pieces infrastructure handles up to 24 bytes. + So the code below is emitted via cpymemsi/block_move_straight. */ + +/* Emits 3x {ld,sd} and 1x {lhu,lbu,sh,sb}. */ +COPY_N(27) + +/* Emits 3x {ld,sd} and 1x {lw,lbu,sw,sb}. */ +COPY_N(29) + +/* Emits 3x {ld,sd} and 1x {lw,lhu,lbu,sw,sh,sb}. */ +COPY_N(31) + +/* { dg-final { scan-assembler-times "ld\t" 17 } } */ +/* { dg-final { scan-assembler-times "sd\t" 17 } } */ + +/* { dg-final { scan-assembler-times "lw\t" 6 } } */ +/* { dg-final { scan-assembler-times "sw\t" 6 } } */ + +/* { dg-final { scan-assembler-times "lhu\t" 7 } } */ +/* { dg-final { scan-assembler-times "sh\t" 7 } } */ + +/* { dg-final { scan-assembler-times "lbu\t" 8 } } */ +/* { dg-final { scan-assembler-times "sb\t" 8 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c b/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c new file mode 100644 index 00000000000..ffb7248bfd1 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c @@ -0,0 +1,50 @@ +/* { dg-do compile } */ +/* { dg-options "-mcpu=ventana-vt1 -march=rv64gc -mabi=lp64" } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Oz" "-Og" } } */ + +#define COPY_N(N) \ +void copy##N (char *src, char *dst) \ +{ \ + dst = __builtin_assume_aligned (dst, 4096); \ + src = __builtin_assume_aligned (src, 4096); \ + __builtin_memcpy (dst, src, N); \ +} + +/* Emits 1x {ld,sd} and 1x {lw,sw}. */ +COPY_N(11) + +/* Emits 2x {ld,sd}. */ +COPY_N(13) + +/* Emits 2x {ld,sd}. */ +COPY_N(14) + +/* Emits 2x {ld,sd}. 
*/ +COPY_N(15) + +/* Emits 2x {ld,sd} and 1x {lw,sw}. */ +COPY_N(19) + +/* Emits 3x ld and 3x sd. */ +COPY_N(23) + +/* The by-pieces infrastructure handles up to 24 bytes. + So the code below is emitted via cpymemsi/block_move_straight. */ + +/* Emits 3x {ld,sd} and 1x {lhu,lbu,sh,sb}. */ +COPY_N(27) + +/* Emits 3x {ld,sd} and 1x {lw,lbu,sw,sb}. */ +COPY_N(29) + +/* Emits 3x {ld,sd} and 2x {lw,sw}. */ +COPY_N(31) + +/* { dg-final { scan-assembler-times "ld\t" 21 } } */ +/* { dg-final { scan-assembler-times "sd\t" 21 } } */ + +/* { dg-final { scan-assembler-times "lw\t" 5 } } */ +/* { dg-final { scan-assembler-times "sw\t" 5 } } */ + +/* { dg-final { scan-assembler-times "lbu\t" 2 } } */ +/* { dg-final { scan-assembler-times "sb\t" 2 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/memset-nonoverlapping.c b/gcc/testsuite/gcc.target/riscv/memset-nonoverlapping.c new file mode 100644 index 00000000000..c4311c7a8d0 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/memset-nonoverlapping.c @@ -0,0 +1,45 @@ +/* { dg-do compile } */ +/* { dg-options "-mcpu=sifive-u74 -march=rv64gc -mabi=lp64" } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Oz" "-Og" } } */ + +#define ZERO_N(N) \ +void zero##N (char *dst) \ +{ \ + dst = __builtin_assume_aligned (dst, 4096); \ + __builtin_memset (dst, 0, N); \ +} + +/* Emits 1x sd and 1x {sh,sb}. */ +ZERO_N(11) + +/* Emits 1x sd and 1x {sw,sb}. */ +ZERO_N(13) + +/* Emits 1x sd and 1x {sw,sh}. */ +ZERO_N(14) + +/* Emits 1x sd and 1x {sw,sh,sb}. */ +ZERO_N(15) + +/* Emits 2x sd and 1x {sh,sb}. */ +ZERO_N(19) + +/* Emits 2x sd and 1x {sw,sh,sb}. */ +ZERO_N(23) + +/* The by-pieces infrastructure handles up to 24 bytes. + So the code below is emitted via cpymemsi/block_move_straight. */ + +/* Emits 3x sd and 1x {sh,sb}. */ +ZERO_N(27) + +/* Emits 3x sd and 1x {sw,sb}. */ +ZERO_N(29) + +/* Emits 3x sd and 1x {sw,sh,sb}. 
*/ +ZERO_N(31) + +/* { dg-final { scan-assembler-times "sd\t" 17 } } */ +/* { dg-final { scan-assembler-times "sw\t" 6 } } */ +/* { dg-final { scan-assembler-times "sh\t" 7 } } */ +/* { dg-final { scan-assembler-times "sb\t" 8 } } */ diff --git a/gcc/testsuite/gcc.target/riscv/memset-overlapping.c b/gcc/testsuite/gcc.target/riscv/memset-overlapping.c new file mode 100644 index 00000000000..793766b5262 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/memset-overlapping.c @@ -0,0 +1,43 @@ +/* { dg-do compile } */ +/* { dg-options "-mcpu=ventana-vt1 -march=rv64gc -mabi=lp64" } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Oz" "-Og" } } */ + +#define ZERO_N(N) \ +void zero##N (char *dst) \ +{ \ + dst = __builtin_assume_aligned (dst, 4096); \ + __builtin_memset (dst, 0, N); \ +} + +/* Emits 1x sd and 1x sw. */ +ZERO_N(11) + +/* Emits 2x sd. */ +ZERO_N(13) + +/* Emits 2x sd. */ +ZERO_N(14) + +/* Emits 2x sd. */ +ZERO_N(15) + +/* Emits 2x sd and 1x sw. */ +ZERO_N(19) + +/* Emits 3x sd. */ +ZERO_N(23) + +/* The by-pieces infrastructure handles up to 24 bytes. + So the code below is emitted via cpymemsi/block_move_straight. */ + +/* Emits 3x sd and 1x sw. */ +ZERO_N(27) + +/* Emits 4x sd. */ +ZERO_N(29) + +/* Emits 4x sd. 
*/ +ZERO_N(31) + +/* { dg-final { scan-assembler-times "sd\t" 23 } } */ +/* { dg-final { scan-assembler-times "sw\t" 3 } } */

From patchwork Sun Nov 13 23:05:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 19476
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt,
 Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
Cc: Christoph Müllner
Subject: [PATCH 4/7] riscv: Move riscv_block_move_loop to separate file
Date: Mon, 14 Nov 2022 00:05:18 +0100
Message-Id: <20221113230521.712693-5-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>
References: <20221113230521.712693-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

Let's avoid accumulating too much functionality in a single file, as
this makes the code harder to maintain and extend. So, in order to add
more functionality similar to riscv_block_move_loop, let's move this
function to a separate file.

This patch makes no functional changes. It does modify a single line of
the existing code that check_GNU_style.py complained about.

gcc/ChangeLog:

	* config.gcc: Add new object riscv-string.o.
	* config/riscv/riscv-protos.h (riscv_expand_block_move):
	Remove duplicated prototype and move to new section
	for riscv-string.cc.
	* config/riscv/riscv.cc (riscv_block_move_straight): Remove function.
	(riscv_adjust_block_mem): Likewise.
	(riscv_block_move_loop): Likewise.
	(riscv_expand_block_move): Likewise.
	* config/riscv/riscv.md (cpymemsi): Move to new section
	for riscv-string.cc.
	* config/riscv/t-riscv: Add compile rule for riscv-string.o.
	* config/riscv/riscv-string.cc: New file.

Signed-off-by: Christoph Müllner
---
 gcc/config.gcc                   |   3 +-
 gcc/config/riscv/riscv-protos.h  |   5 +-
 gcc/config/riscv/riscv-string.cc | 194 +++++++++++++++++++++++++++++++
 gcc/config/riscv/riscv.cc        | 155 ------------------------
 gcc/config/riscv/riscv.md        |  28 ++---
 gcc/config/riscv/t-riscv         |   4 +
 6 files changed, 218 insertions(+), 171 deletions(-)
 create mode 100644 gcc/config/riscv/riscv-string.cc

diff --git a/gcc/config.gcc b/gcc/config.gcc index b5eda046033..fc9e582e713 100644 --- a/gcc/config.gcc +++ b/gcc/config.gcc @@ -518,7 +518,8 @@ pru-*-*) ;; riscv*) cpu_type=riscv - extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o riscv-v.o" + extra_objs="riscv-builtins.o riscv-c.o riscv-sr.o riscv-shorten-memrefs.o riscv-selftests.o" + extra_objs="${extra_objs} riscv-string.o riscv-v.o" extra_objs="${extra_objs} riscv-vector-builtins.o riscv-vector-builtins-shapes.o riscv-vector-builtins-bases.o" d_target_objs="riscv-d.o" extra_headers="riscv_vector.h"
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 5a718bb62b4..344515dbaf4 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -62,7 +62,6 @@ extern void riscv_expand_conditional_move (rtx, rtx, rtx, rtx_code, rtx, rtx); #endif extern rtx riscv_legitimize_call_address (rtx); extern void riscv_set_return_address (rtx, rtx); -extern bool riscv_expand_block_move (rtx, rtx, rtx);
extern rtx riscv_return_addr (int, rtx); extern poly_int64 riscv_initial_elimination_offset (int, int); extern void riscv_expand_prologue (void); @@ -70,7 +69,6 @@ extern void riscv_expand_epilogue (int); extern bool riscv_epilogue_uses (unsigned int); extern bool riscv_can_use_return_insn (void); extern rtx riscv_function_value (const_tree, const_tree, enum machine_mode); -extern bool riscv_expand_block_move (rtx, rtx, rtx); extern bool riscv_store_data_bypass_p (rtx_insn *, rtx_insn *); extern rtx riscv_gen_gpr_save_insn (struct riscv_frame_info *); extern bool riscv_gpr_save_operation_p (rtx); @@ -96,6 +94,9 @@ extern bool riscv_hard_regno_rename_ok (unsigned, unsigned); rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt); +/* Routines implemented in riscv-string.cc. */ +extern bool riscv_expand_block_move (rtx, rtx, rtx); + /* Information about one CPU we know about. */ struct riscv_cpu_info { /* This CPU's canonical name. */
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc new file mode 100644 index 00000000000..6882f0be269 --- /dev/null +++ b/gcc/config/riscv/riscv-string.cc @@ -0,0 +1,194 @@ +/* Subroutines used to expand string and block move, clear, + compare and other operations for RISC-V. + Copyright (C) 2011-2022 Free Software Foundation, Inc. + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify it + under the terms of the GNU General Public License as published + by the Free Software Foundation; either version 3, or (at your + option) any later version. + + GCC is distributed in the hope that it will be useful, but WITHOUT + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY + or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public + License for more details. + + You should have received a copy of the GNU General Public License + along with GCC; see the file COPYING3. If not see + <http://www.gnu.org/licenses/>.
*/ + +#define IN_TARGET_CODE 1 + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "rtl.h" +#include "tree.h" +#include "memmodel.h" +#include "tm_p.h" +#include "ira.h" +#include "print-tree.h" +#include "varasm.h" +#include "explow.h" +#include "expr.h" +#include "output.h" +#include "target.h" +#include "predict.h" +#include "optabs.h" + +/* Emit straight-line code to move LENGTH bytes from SRC to DEST. + Assume that the areas do not overlap. */ + +static void +riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length) +{ + unsigned HOST_WIDE_INT offset, delta; + unsigned HOST_WIDE_INT bits; + int i; + enum machine_mode mode; + rtx *regs; + + bits = MAX (BITS_PER_UNIT, + MIN (BITS_PER_WORD, MIN (MEM_ALIGN (src), MEM_ALIGN (dest)))); + + mode = mode_for_size (bits, MODE_INT, 0).require (); + delta = bits / BITS_PER_UNIT; + + /* Allocate a buffer for the temporary registers. */ + regs = XALLOCAVEC (rtx, length / delta); + + /* Load as many BITS-sized chunks as possible. Use a normal load if + the source has enough alignment, otherwise use left/right pairs. */ + for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++) + { + regs[i] = gen_reg_rtx (mode); + riscv_emit_move (regs[i], adjust_address (src, mode, offset)); + } + + /* Copy the chunks to the destination. */ + for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++) + riscv_emit_move (adjust_address (dest, mode, offset), regs[i]); + + /* Mop up any left-over bytes. */ + if (offset < length) + { + src = adjust_address (src, BLKmode, offset); + dest = adjust_address (dest, BLKmode, offset); + move_by_pieces (dest, src, length - offset, + MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), RETURN_BEGIN); + } +} + +/* Helper function for doing a loop-based block operation on memory + reference MEM. Each iteration of the loop will operate on LENGTH + bytes of MEM. 
+ + Create a new base register for use within the loop and point it to + the start of MEM. Create a new memory reference that uses this + register. Store them in *LOOP_REG and *LOOP_MEM respectively. */ + +static void +riscv_adjust_block_mem (rtx mem, unsigned HOST_WIDE_INT length, + rtx *loop_reg, rtx *loop_mem) +{ + *loop_reg = copy_addr_to_reg (XEXP (mem, 0)); + + /* Although the new mem does not refer to a known location, + it does keep up to LENGTH bytes of alignment. */ + *loop_mem = change_address (mem, BLKmode, *loop_reg); + set_mem_align (*loop_mem, MIN (MEM_ALIGN (mem), length * BITS_PER_UNIT)); +} + +/* Move LENGTH bytes from SRC to DEST using a loop that moves BYTES_PER_ITER + bytes at a time. LENGTH must be at least BYTES_PER_ITER. Assume that + the memory regions do not overlap. */ + +static void +riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length, + unsigned HOST_WIDE_INT bytes_per_iter) +{ + rtx label, src_reg, dest_reg, final_src, test; + unsigned HOST_WIDE_INT leftover; + + leftover = length % bytes_per_iter; + length -= leftover; + + /* Create registers and memory references for use within the loop. */ + riscv_adjust_block_mem (src, bytes_per_iter, &src_reg, &src); + riscv_adjust_block_mem (dest, bytes_per_iter, &dest_reg, &dest); + + /* Calculate the value that SRC_REG should have after the last iteration + of the loop. */ + final_src = expand_simple_binop (Pmode, PLUS, src_reg, GEN_INT (length), + 0, 0, OPTAB_WIDEN); + + /* Emit the start of the loop. */ + label = gen_label_rtx (); + emit_label (label); + + /* Emit the loop body. */ + riscv_block_move_straight (dest, src, bytes_per_iter); + + /* Move on to the next block. */ + riscv_emit_move (src_reg, plus_constant (Pmode, src_reg, bytes_per_iter)); + riscv_emit_move (dest_reg, plus_constant (Pmode, dest_reg, bytes_per_iter)); + + /* Emit the loop condition. 
*/ + test = gen_rtx_NE (VOIDmode, src_reg, final_src); + emit_jump_insn (gen_cbranch4 (Pmode, test, src_reg, final_src, label)); + + /* Mop up any left-over bytes. */ + if (leftover) + riscv_block_move_straight (dest, src, leftover); + else + emit_insn (gen_nop ()); +} + +/* Expand a cpymemsi instruction, which copies LENGTH bytes from + memory reference SRC to memory reference DEST. */ + +bool +riscv_expand_block_move (rtx dest, rtx src, rtx length) +{ + if (CONST_INT_P (length)) + { + unsigned HOST_WIDE_INT hwi_length = UINTVAL (length); + unsigned HOST_WIDE_INT factor, align; + + align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD); + factor = BITS_PER_WORD / align; + + if (optimize_function_for_size_p (cfun) + && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false)) + return false; + + if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor)) + { + riscv_block_move_straight (dest, src, INTVAL (length)); + return true; + } + else if (optimize && align >= BITS_PER_WORD) + { + unsigned min_iter_words + = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD; + unsigned iter_words = min_iter_words; + unsigned HOST_WIDE_INT bytes = hwi_length; + unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD; + + /* Lengthen the loop body if it shortens the tail. */ + for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++) + { + unsigned cur_cost = iter_words + words % iter_words; + unsigned new_cost = i + words % i; + if (new_cost <= cur_cost) + iter_words = i; + } + + riscv_block_move_loop (dest, src, bytes, iter_words * UNITS_PER_WORD); + return true; + } + } + return false; +} diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 7357cf51cdf..fab40c6f8dc 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -3949,161 +3949,6 @@ riscv_legitimize_call_address (rtx addr) return addr; } -/* Emit straight-line code to move LENGTH bytes from SRC to DEST. - Assume that the areas do not overlap. 
*/ - -static void -riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length) -{ - unsigned HOST_WIDE_INT offset, delta; - unsigned HOST_WIDE_INT bits; - int i; - enum machine_mode mode; - rtx *regs; - - bits = MAX (BITS_PER_UNIT, - MIN (BITS_PER_WORD, MIN (MEM_ALIGN (src), MEM_ALIGN (dest)))); - - mode = mode_for_size (bits, MODE_INT, 0).require (); - delta = bits / BITS_PER_UNIT; - - /* Allocate a buffer for the temporary registers. */ - regs = XALLOCAVEC (rtx, length / delta); - - /* Load as many BITS-sized chunks as possible. Use a normal load if - the source has enough alignment, otherwise use left/right pairs. */ - for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++) - { - regs[i] = gen_reg_rtx (mode); - riscv_emit_move (regs[i], adjust_address (src, mode, offset)); - } - - /* Copy the chunks to the destination. */ - for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++) - riscv_emit_move (adjust_address (dest, mode, offset), regs[i]); - - /* Mop up any left-over bytes. */ - if (offset < length) - { - src = adjust_address (src, BLKmode, offset); - dest = adjust_address (dest, BLKmode, offset); - move_by_pieces (dest, src, length - offset, - MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), RETURN_BEGIN); - } -} - -/* Helper function for doing a loop-based block operation on memory - reference MEM. Each iteration of the loop will operate on LENGTH - bytes of MEM. - - Create a new base register for use within the loop and point it to - the start of MEM. Create a new memory reference that uses this - register. Store them in *LOOP_REG and *LOOP_MEM respectively. */ - -static void -riscv_adjust_block_mem (rtx mem, unsigned HOST_WIDE_INT length, - rtx *loop_reg, rtx *loop_mem) -{ - *loop_reg = copy_addr_to_reg (XEXP (mem, 0)); - - /* Although the new mem does not refer to a known location, - it does keep up to LENGTH bytes of alignment. 
*/ - *loop_mem = change_address (mem, BLKmode, *loop_reg); - set_mem_align (*loop_mem, MIN (MEM_ALIGN (mem), length * BITS_PER_UNIT)); -} - -/* Move LENGTH bytes from SRC to DEST using a loop that moves BYTES_PER_ITER - bytes at a time. LENGTH must be at least BYTES_PER_ITER. Assume that - the memory regions do not overlap. */ - -static void -riscv_block_move_loop (rtx dest, rtx src, unsigned HOST_WIDE_INT length, - unsigned HOST_WIDE_INT bytes_per_iter) -{ - rtx label, src_reg, dest_reg, final_src, test; - unsigned HOST_WIDE_INT leftover; - - leftover = length % bytes_per_iter; - length -= leftover; - - /* Create registers and memory references for use within the loop. */ - riscv_adjust_block_mem (src, bytes_per_iter, &src_reg, &src); - riscv_adjust_block_mem (dest, bytes_per_iter, &dest_reg, &dest); - - /* Calculate the value that SRC_REG should have after the last iteration - of the loop. */ - final_src = expand_simple_binop (Pmode, PLUS, src_reg, GEN_INT (length), - 0, 0, OPTAB_WIDEN); - - /* Emit the start of the loop. */ - label = gen_label_rtx (); - emit_label (label); - - /* Emit the loop body. */ - riscv_block_move_straight (dest, src, bytes_per_iter); - - /* Move on to the next block. */ - riscv_emit_move (src_reg, plus_constant (Pmode, src_reg, bytes_per_iter)); - riscv_emit_move (dest_reg, plus_constant (Pmode, dest_reg, bytes_per_iter)); - - /* Emit the loop condition. */ - test = gen_rtx_NE (VOIDmode, src_reg, final_src); - emit_jump_insn (gen_cbranch4 (Pmode, test, src_reg, final_src, label)); - - /* Mop up any left-over bytes. */ - if (leftover) - riscv_block_move_straight (dest, src, leftover); - else - emit_insn(gen_nop ()); -} - -/* Expand a cpymemsi instruction, which copies LENGTH bytes from - memory reference SRC to memory reference DEST. 
*/ - -bool -riscv_expand_block_move (rtx dest, rtx src, rtx length) -{ - if (CONST_INT_P (length)) - { - unsigned HOST_WIDE_INT hwi_length = UINTVAL (length); - unsigned HOST_WIDE_INT factor, align; - - align = MIN (MIN (MEM_ALIGN (src), MEM_ALIGN (dest)), BITS_PER_WORD); - factor = BITS_PER_WORD / align; - - if (optimize_function_for_size_p (cfun) - && hwi_length * factor * UNITS_PER_WORD > MOVE_RATIO (false)) - return false; - - if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor)) - { - riscv_block_move_straight (dest, src, INTVAL (length)); - return true; - } - else if (optimize && align >= BITS_PER_WORD) - { - unsigned min_iter_words - = RISCV_MAX_MOVE_BYTES_PER_LOOP_ITER / UNITS_PER_WORD; - unsigned iter_words = min_iter_words; - unsigned HOST_WIDE_INT bytes = hwi_length; - unsigned HOST_WIDE_INT words = bytes / UNITS_PER_WORD; - - /* Lengthen the loop body if it shortens the tail. */ - for (unsigned i = min_iter_words; i < min_iter_words * 2 - 1; i++) - { - unsigned cur_cost = iter_words + words % iter_words; - unsigned new_cost = i + words % i; - if (new_cost <= cur_cost) - iter_words = i; - } - - riscv_block_move_loop (dest, src, bytes, iter_words * UNITS_PER_WORD); - return true; - } - } - return false; -} - /* Print symbolic operand OP, which is part of a HIGH or LO_SUM in context CONTEXT. HI_RELOC indicates a high-part reloc. */ diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 532289dd178..43b97f1181e 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -1872,19 +1872,6 @@ (define_split DONE; }) -(define_expand "cpymemsi" - [(parallel [(set (match_operand:BLK 0 "general_operand") - (match_operand:BLK 1 "general_operand")) - (use (match_operand:SI 2 "")) - (use (match_operand:SI 3 "const_int_operand"))])] - "" -{ - if (riscv_expand_block_move (operands[0], operands[1], operands[2])) - DONE; - else - FAIL; -}) - ;; Expand in-line code to clear the instruction cache between operand[0] and ;; operand[1]. 
(define_expand "clear_cache" @@ -3005,6 +2992,21 @@ (define_insn "riscv_prefetchi_" "prefetch.i\t%a0" ) +;; Expansions from riscv-string.cc + +(define_expand "cpymemsi" + [(parallel [(set (match_operand:BLK 0 "general_operand") + (match_operand:BLK 1 "general_operand")) + (use (match_operand:SI 2 "")) + (use (match_operand:SI 3 "const_int_operand"))])] + "" +{ + if (riscv_expand_block_move (operands[0], operands[1], operands[2])) + DONE; + else + FAIL; +}) + (include "bitmanip.md") (include "sync.md") (include "peephole.md")
diff --git a/gcc/config/riscv/t-riscv b/gcc/config/riscv/t-riscv index 7997db3d898..5cb58a74a53 100644 --- a/gcc/config/riscv/t-riscv +++ b/gcc/config/riscv/t-riscv @@ -63,6 +63,10 @@ riscv-selftests.o: $(srcdir)/config/riscv/riscv-selftests.cc $(COMPILE) $< $(POSTCOMPILE) +riscv-string.o: $(srcdir)/config/riscv/riscv-string.cc + $(COMPILE) $< + $(POSTCOMPILE) + riscv-v.o: $(srcdir)/config/riscv/riscv-v.cc $(COMPILE) $< $(POSTCOMPILE)

From patchwork Sun Nov 13 23:05:19 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 19472
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt,
 Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
Cc: Christoph Müllner
Subject: [PATCH 5/7] riscv: Use by-pieces to do overlapping accesses in
 block_move_straight
Date: Mon, 14 Nov 2022 00:05:19 +0100
Message-Id: <20221113230521.712693-6-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>
References: <20221113230521.712693-1-christoph.muellner@vrull.eu>

From: Christoph Müllner

The current implementation of riscv_block_move_straight() emits as many
load-store pairs of maximum width (e.g. 8 bytes for RV64) as fit into
the given length. The remainder is handed over to move_by_pieces(),
which emits code based on target settings such as slow_unaligned_access
and overlap_op_by_pieces.

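To make the split between full-width pairs and the by-pieces remainder
concrete, here is a minimal stand-alone sketch. It is a hypothetical
model and not GCC code: the names copy_split, split_before, split_after
and check_copy_split_model are invented for illustration. It only
mirrors the two loop bounds that appear in this patch
(offset + delta <= length and offset + 2 * delta <= length) and counts
how many bytes are left over for move_by_pieces().

```c
#include <assert.h>

/* Hypothetical model (not GCC code): the work split for a straight-line
   copy of LENGTH bytes using DELTA-byte accesses.  */
struct copy_split
{
  unsigned long pairs;      /* Full-width load/store pairs emitted directly.  */
  unsigned long remainder;  /* Bytes left over for move_by_pieces().  */
};

/* Loop bound before the patch: offset + delta <= length.  */
static struct copy_split
split_before (unsigned long length, unsigned long delta)
{
  struct copy_split s = { 0, 0 };
  unsigned long offset;

  for (offset = 0; offset + delta <= length; offset += delta)
    s.pairs++;
  s.remainder = length - offset;
  return s;
}

/* Loop bound after the patch: offset + 2 * delta <= length.  */
static struct copy_split
split_after (unsigned long length, unsigned long delta)
{
  struct copy_split s = { 0, 0 };
  unsigned long offset;

  for (offset = 0; offset + 2 * delta <= length; offset += delta)
    s.pairs++;
  s.remainder = length - offset;
  return s;
}

/* Check the model for 27- and 31-byte copies with 8-byte accesses.  */
static void
check_copy_split_model (void)
{
  assert (split_before (27, 8).pairs == 3);
  assert (split_before (27, 8).remainder == 3);
  assert (split_after (27, 8).pairs == 2);
  assert (split_after (27, 8).remainder == 11);
  assert (split_before (31, 8).remainder == 7);
  assert (split_after (31, 8).remainder == 15);
}
```

Under this model, a 27-byte copy with delta = 8 leaves 3 bytes with the
old bound but 11 bytes with the new one. That would be consistent with
the updated memcpy-overlapping.c expectation of 3x {ld,sd} and
1x {lw,sw} for COPY_N(27), assuming move_by_pieces() covers the 11 bytes
with one 8-byte and one overlapping 4-byte access.
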
move_by_pieces() will emit overlapping memory accesses of maximum width
only if the given length exceeds the size of one access (e.g. 15 bytes
for 8-byte accesses).

This patch changes the implementation of riscv_block_move_straight()
such that it preserves a remainder within the interval [delta..2*delta)
instead of [0..delta), so that overlapping memory accesses may be
emitted (if their requirements are met).

gcc/ChangeLog:

	* config/riscv/riscv-string.cc (riscv_block_move_straight):
	Adjust range for emitted load/store pairs.

Signed-off-by: Christoph Müllner
---
 gcc/config/riscv/riscv-string.cc              |  8 ++++----
 .../gcc.target/riscv/memcpy-overlapping.c     | 19 ++++++++-----------
 2 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc index 6882f0be269..1137df475be 100644 --- a/gcc/config/riscv/riscv-string.cc +++ b/gcc/config/riscv/riscv-string.cc @@ -57,18 +57,18 @@ riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length) delta = bits / BITS_PER_UNIT; /* Allocate a buffer for the temporary registers. */ - regs = XALLOCAVEC (rtx, length / delta); + regs = XALLOCAVEC (rtx, length / delta - 1); /* Load as many BITS-sized chunks as possible. Use a normal load if the source has enough alignment, otherwise use left/right pairs. */ - for (offset = 0, i = 0; offset + delta <= length; offset += delta, i++) + for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++) { regs[i] = gen_reg_rtx (mode); riscv_emit_move (regs[i], adjust_address (src, mode, offset)); } /* Copy the chunks to the destination. */ - for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++) riscv_emit_move (adjust_address (dest, mode, offset), regs[i]); /* Mop up any left-over bytes.
*/ @@ -166,7 +166,7 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length) if (hwi_length <= (RISCV_MAX_MOVE_BYTES_STRAIGHT / factor)) { - riscv_block_move_straight (dest, src, INTVAL (length)); + riscv_block_move_straight (dest, src, hwi_length); return true; } else if (optimize && align >= BITS_PER_WORD) diff --git a/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c b/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c index ffb7248bfd1..ef95bfb879b 100644 --- a/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c +++ b/gcc/testsuite/gcc.target/riscv/memcpy-overlapping.c @@ -25,26 +25,23 @@ COPY_N(15) /* Emits 2x {ld,sd} and 1x {lw,sw}. */ COPY_N(19) -/* Emits 3x ld and 3x sd. */ +/* Emits 3x {ld,sd}. */ COPY_N(23) /* The by-pieces infrastructure handles up to 24 bytes. So the code below is emitted via cpymemsi/block_move_straight. */ -/* Emits 3x {ld,sd} and 1x {lhu,lbu,sh,sb}. */ +/* Emits 3x {ld,sd} and 1x {lw,sw}. */ COPY_N(27) -/* Emits 3x {ld,sd} and 1x {lw,lbu,sw,sb}. */ +/* Emits 4x {ld,sd}. */ COPY_N(29) -/* Emits 3x {ld,sd} and 2x {lw,sw}. */ +/* Emits 4x {ld,sd}. 
*/ COPY_N(31) -/* { dg-final { scan-assembler-times "ld\t" 21 } } */ -/* { dg-final { scan-assembler-times "sd\t" 21 } } */ +/* { dg-final { scan-assembler-times "ld\t" 23 } } */ +/* { dg-final { scan-assembler-times "sd\t" 23 } } */ -/* { dg-final { scan-assembler-times "lw\t" 5 } } */ -/* { dg-final { scan-assembler-times "sw\t" 5 } } */ - -/* { dg-final { scan-assembler-times "lbu\t" 2 } } */ -/* { dg-final { scan-assembler-times "sb\t" 2 } } */ +/* { dg-final { scan-assembler-times "lw\t" 3 } } */ +/* { dg-final { scan-assembler-times "sw\t" 3 } } */ From patchwork Sun Nov 13 23:05:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 19477 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1865643wru; Sun, 13 Nov 2022 15:09:14 -0800 (PST) X-Google-Smtp-Source: AA0mqf4BW5ZSPA42/4aNuqA9UPRT3COsOx7RoJ/Yq9YB9UMuXM0PWfSCnpcnNxH/zggQU7ZAE5ov X-Received: by 2002:a17:906:2509:b0:772:e95f:cdce with SMTP id i9-20020a170906250900b00772e95fcdcemr8744553ejb.78.1668380954158; Sun, 13 Nov 2022 15:09:14 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668380954; cv=none; d=google.com; s=arc-20160816; b=Yrsd6uTbD88GiS7dLXqzf+YV3YcvwID91le8TKzvAgjcKeOOG3drl/XsD8DYOxVgBi 0QFN0obq54WvZX3xDMqnyKofsJIQdQuLVrEwV99gg3NmEM47zHBEtL78re2E3cgn1wXG EIbTEhe2wFq7sltiHn7g6eYM2QkpKGGmc5yv26PvSDeAuP/gqBjZr7PiKawt8MBB/+ig b53OCW5bipSCOU87ApkxZb9qkFwedKEL+oCdU+C9TrAc8tQM7RtC/IAPLLCaKX9HLjnP +xt0b5863PMi+1JImtCbt1nPq8jhjufpd/PjpFLBGJvZ7aHT9iQb0TC41fqY7AYoQBDi My1Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:dmarc-filter:delivered-to; 
bh=766OZe45cUEucim0xIwOv0WPrtSuI0J2OaUnLSmIVBo=; b=y/i1KWm+N2/lT6CRt+S2+TkvU1JlRv3UFled9kukbLszXL5CKHiLprlaKi6VHfUlhZ ENfFsuYCWIgifzqUw51HfyKtcsNtmznMaWo6IigS6MkBdLBAgrR4AI43EL+MlyDmJYKo do1nGNLHxxvfx83tqtOt3IuTGTeYqeM2B/8gFXAjSq55OM+GCi77Q6es+oNTQgN4zDGI sbDw1QnmenD+mbX0YE/oL4Srv07dAq6DB7lplA0cMrBxT+RjszHmQKZj05J0OAgy1tKi F51g0p4OPr9H16O9WIv/N54CAUr+oZGA4F0KBHvLuop5qbyPfm66H27YErna8jZO3dJz HpuQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@vrull.eu header.s=google header.b=ih7F7Xt+; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id sy8-20020a1709076f0800b007ae4ed48290si7142200ejc.279.2022.11.13.15.09.13 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 15:09:14 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@vrull.eu header.s=google header.b=ih7F7Xt+; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 1FB893885C23 for ; Sun, 13 Nov 2022 23:06:44 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ej1-x634.google.com (mail-ej1-x634.google.com [IPv6:2a00:1450:4864:20::634]) by sourceware.org (Postfix) with ESMTPS id D956238515D7 for ; Sun, 13 Nov 2022 23:05:32 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org D956238515D7 Authentication-Results: sourceware.org; 
dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-ej1-x634.google.com with SMTP id k2so24476296ejr.2 for ; Sun, 13 Nov 2022 15:05:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=766OZe45cUEucim0xIwOv0WPrtSuI0J2OaUnLSmIVBo=; b=ih7F7Xt+idSp3dcxHHoNtE/ipsrON32Zay+E1aup7PggsvUcP2ESidq1nkFHFrxcwa mKVj8xirM+aJdEObcQZzlMuFYtq7UnVljcuJ3aIbrGTrU0WWDaYHE4CnPv+ujl7f/yVf K0b20u8mfW6r+OK/mpyyVJdXtmbgwnCY9/jCaxFQCiyKVXhJiwsPszKEgXYfwd0tAO0f yzp+sx37haQM/wNd15UFcH0Ie0X4X7mDWQdbNklfiZg1eRMn5BHXiL9BYWBYIZNVJs29 OVq32jjVhKaGXvYNjwKz23G32bEzKzT6MEZbMzBHzpvCYnSZuDTUPWsRLaoNWsd4LVIh 2r9w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=766OZe45cUEucim0xIwOv0WPrtSuI0J2OaUnLSmIVBo=; b=AAdnuBGpjE93UXl9CZshSoJTN+UQIQVY56dbBxYD0XMNW6p+i/LPm+kwUW6Qw14+bZ NZ2y1mo01O844wGgNeZjpyGMIxGLdtd8Z+in1hXUk6gk4+nchEuJ4Q+RyEC9E8RyU5Lg bEwYtIaMlWBmFES2LjQQAP9SiQoA/wERFZGcieHZiHFybUD3eOw0Ca+WIwaUImsAE9js kztc6H6O33RnEeRcn5mLh0gUKW7ppVjShPHsTVZq89jIBwXTHkIwXBCRoSNmLQVT7rll 5W93ljGDRUr3wJZdpSWup6kI5z3vhURl26aKeEakE1/3fgzE2mXeY6wHXaQ/lOpDPClm kLZQ== X-Gm-Message-State: ANoB5pnOteyQ1XbU98UGDs6yqhJ/DteSe7E+V1C71h61NO12IOT6TsQZ K1NR821RJaQUu9N8Kua2BAQChq2ZX78OBkCS X-Received: by 2002:a17:906:2856:b0:7a9:a59c:4be with SMTP id s22-20020a170906285600b007a9a59c04bemr8554809ejc.556.1668380731290; Sun, 13 Nov 2022 15:05:31 -0800 (PST) Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. 
 [62.178.148.172]) by smtp.gmail.com with ESMTPSA id ku3-20020a170907788300b007ae21bbdd3fsm2361281ejc.162.2022.11.13.15.05.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 15:05:30 -0800 (PST)
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng , Jim Wilson , Palmer Dabbelt , Andrew Waterman , Philipp Tomsich , Jeff Law , Vineet Gupta
Cc: =?utf-8?q?Christoph_M=C3=BCllner?=
Subject: [PATCH 6/7] riscv: Add support for strlen inline expansion
Date: Mon, 14 Nov 2022 00:05:20 +0100
Message-Id: <20221113230521.712693-7-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>
References: <20221113230521.712693-1-christoph.muellner@vrull.eu>
MIME-Version: 1.0
X-Spam-Status: No, score=-12.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
Sender: "Gcc-patches"
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1749424227627770442?=
X-GMAIL-MSGID: =?utf-8?q?1749424227627770442?=

From: Christoph Müllner

This patch implements the expansion of the strlen builtin
using Zbb instructions (if available) for aligned strings
using the following sequence:

      li      a3,-1
      addi    a4,a0,8
.L2:  ld      a5,0(a0)
      addi    a0,a0,8
      orc.b   a5,a5
      beq     a5,a3,6 <.L2>
      not     a5,a5
      ctz     a5,a5
      srli    a5,a5,0x3
      add     a0,a0,a5
      sub     a0,a0,a4

This allows calls to strlen() to be inlined, with optimized code
for determining the length of a string.
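For readers who want to follow the instruction sequence above step by step, here is a minimal software model of it in portable C. It assumes an RV64-like setting (8-byte words) and a little-endian host; orc_b() and ctz64() are hypothetical helpers that only imitate the Zbb orc.b and ctz instructions, and strlen_orcb_model() is an illustrative name, not something the patch defines.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of Zbb orc.b: every nonzero byte becomes 0xff, every zero
   byte stays 0x00.  */
static uint64_t
orc_b (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    if ((x >> (8 * i)) & 0xff)
      r |= (uint64_t) 0xff << (8 * i);
  return r;
}

/* Model of ctz: count trailing zero bits; X must be nonzero.  */
static unsigned
ctz64 (uint64_t x)
{
  unsigned n = 0;
  while (!(x & 1))
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Model of the emitted sequence.  S must be 8-byte aligned, and the
   whole word containing the terminating NUL must be readable.  */
static size_t
strlen_orcb_model (const char *s)
{
  const char *p = s;
  uint64_t word, mask;

  assert ((uintptr_t) s % sizeof (uint64_t) == 0);

  do
    {
      memcpy (&word, p, sizeof word);   /* ld    a5,0(a0)  */
      p += sizeof word;                 /* addi  a0,a0,8   */
      mask = orc_b (word);              /* orc.b a5,a5     */
    }
  while (mask == UINT64_MAX);           /* beq   a5,a3,.L2 */

  /* not, ctz, srli 3: index of the first NUL byte within the word.  */
  size_t nul_index = ctz64 (~mask) / 8;

  /* add a0,a0,a5 ; sub a0,a0,a4, where a4 holds s + 8.  */
  return (size_t) (p - s) - sizeof word + nul_index;
}
```

A caller would pass a suitably aligned buffer, for example storage declared as an array of uint64_t and filled via memcpy(). The big-endian variant in the patch uses clz instead of ctz and is not modelled here.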

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (riscv_expand_strlen): New
	prototype.
	* config/riscv/riscv-string.cc (riscv_emit_unlikely_jump): New
	function.
	(GEN_EMIT_HELPER2): New helper macro.
	(GEN_EMIT_HELPER3): New helper macro.
	(do_load_from_addr): New helper function.
	(riscv_expand_strlen_zbb): New function.
	(riscv_expand_strlen): New function.
	* config/riscv/riscv.md (strlen): Invoke expansion functions
	for strlen.

Signed-off-by: Christoph Müllner
---
 gcc/config/riscv/riscv-protos.h               |   1 +
 gcc/config/riscv/riscv-string.cc              | 149 ++++++++++++++++++
 gcc/config/riscv/riscv.md                     |  28 ++++
 .../gcc.target/riscv/zbb-strlen-unaligned.c   |  13 ++
 gcc/testsuite/gcc.target/riscv/zbb-strlen.c   |  18 +++
 5 files changed, 209 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 344515dbaf4..18187e3bd78 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -96,6 +96,7 @@ rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt); /* Routines implemented in riscv-string.c. */ extern bool riscv_expand_block_move (rtx, rtx, rtx); +extern bool riscv_expand_strlen (rtx[]); /* Information about one CPU we know about. */ struct riscv_cpu_info {
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 1137df475be..bf96522b608 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -38,6 +38,81 @@ #include "predict.h" #include "optabs.h" +/* Emit unlikely jump instruction. */ + +static rtx_insn * +riscv_emit_unlikely_jump (rtx insn) +{ + rtx_insn *jump = emit_jump_insn (insn); + add_reg_br_prob_note (jump, profile_probability::very_unlikely ()); + return jump; +} + +/* Emit proper instruction depending on type of dest.
*/ + +#define GEN_EMIT_HELPER2(name) \ +static rtx_insn * \ +do_## name ## 2(rtx dest, rtx src) \ +{ \ + rtx_insn *insn; \ + if (GET_MODE (dest) == DImode) \ + insn = emit_insn (gen_ ## name ## di2 (dest, src)); \ + else \ + insn = emit_insn (gen_ ## name ## si2 (dest, src)); \ + return insn; \ +} + +/* Emit proper instruction depending on type of dest. */ + +#define GEN_EMIT_HELPER3(name) \ +static rtx_insn * \ +do_## name ## 3(rtx dest, rtx src1, rtx src2) \ +{ \ + rtx_insn *insn; \ + if (GET_MODE (dest) == DImode) \ + insn = emit_insn (gen_ ## name ## di3 (dest, src1, src2)); \ + else \ + insn = emit_insn (gen_ ## name ## si3 (dest, src1, src2)); \ + return insn; \ +} + +GEN_EMIT_HELPER3(add) /* do_add3 */ +GEN_EMIT_HELPER3(sub) /* do_sub3 */ +GEN_EMIT_HELPER3(lshr) /* do_lshr3 */ +GEN_EMIT_HELPER2(orcb) /* do_orcb2 */ +GEN_EMIT_HELPER2(one_cmpl) /* do_one_cmpl2 */ +GEN_EMIT_HELPER2(clz) /* do_clz2 */ +GEN_EMIT_HELPER2(ctz) /* do_ctz2 */ +GEN_EMIT_HELPER2(zero_extendqi) /* do_zero_extendqi2 */ + +/* Helper function to load a byte or a Pmode register. + + MODE is the mode to use for the load (QImode or Pmode). + DEST is the destination register for the data. + ADDR_REG is the register that holds the address. + ADDR is the address expression to load from. + + This function returns an rtx containing the register, + where the ADDR is stored. */ + +static rtx +do_load_from_addr (machine_mode mode, rtx dest, rtx addr_reg, rtx addr) +{ + rtx mem = gen_rtx_MEM (mode, addr_reg); + MEM_COPY_ATTRIBUTES (mem, addr); + set_mem_size (mem, GET_MODE_SIZE (mode)); + + if (mode == QImode) + do_zero_extendqi2 (dest, mem); + else if (mode == Pmode) + emit_move_insn (dest, mem); + else + gcc_unreachable (); + + return addr_reg; +} + + /* Emit straight-line code to move LENGTH bytes from SRC to DEST. Assume that the areas do not overlap. 
*/ @@ -192,3 +267,77 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length) } return false; } + +/* If the provided string is aligned, then read XLEN bytes + in a loop and use orc.b to find NUL-bytes. */ + +static bool +riscv_expand_strlen_zbb (rtx result, rtx src, rtx align) +{ + rtx m1, addr, addr_plus_regsz, word, zeros; + rtx loop_label, cond; + + gcc_assert (TARGET_ZBB); + + /* The alignment needs to be known and big enough. */ + if (!CONST_INT_P (align) || UINTVAL (align) < GET_MODE_SIZE (Pmode)) + return false; + + m1 = gen_reg_rtx (Pmode); + addr = copy_addr_to_reg (XEXP (src, 0)); + addr_plus_regsz = gen_reg_rtx (Pmode); + word = gen_reg_rtx (Pmode); + zeros = gen_reg_rtx (Pmode); + + emit_insn (gen_rtx_SET (m1, constm1_rtx)); + do_add3 (addr_plus_regsz, addr, GEN_INT (UNITS_PER_WORD)); + + loop_label = gen_label_rtx (); + emit_label (loop_label); + + /* Load a word and use orc.b to find a zero-byte. */ + do_load_from_addr (Pmode, word, addr, src); + do_add3 (addr, addr, GEN_INT (UNITS_PER_WORD)); + do_orcb2 (word, word); + cond = gen_rtx_EQ (VOIDmode, word, m1); + riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond, + word, m1, loop_label)); + + /* Calculate the return value by counting zero-bits. */ + do_one_cmpl2 (word, word); + if (TARGET_BIG_ENDIAN) + do_clz2 (zeros, word); + else + do_ctz2 (zeros, word); + + do_lshr3 (zeros, zeros, GEN_INT (exact_log2 (BITS_PER_UNIT))); + do_add3 (addr, addr, zeros); + do_sub3 (result, addr, addr_plus_regsz); + + return true; +} + +/* Expand a strlen operation and return true if successful. + Return false if we should let the compiler generate normal + code, probably a strlen call. + + OPERANDS[0] is the target (result). + OPERANDS[1] is the source. + OPERANDS[2] is the search byte (must be 0) + OPERANDS[3] is the alignment in bytes. 
*/ + +bool +riscv_expand_strlen (rtx operands[]) +{ + rtx result = operands[0]; + rtx src = operands[1]; + rtx search_char = operands[2]; + rtx align = operands[3]; + + gcc_assert (search_char == const0_rtx); + + if (TARGET_ZBB) + return riscv_expand_strlen_zbb (result, src, align); + + return false; +} diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index 43b97f1181e..f05c764c3d4 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -65,6 +65,9 @@ (define_c_enum "unspec" [ ;; OR-COMBINE UNSPEC_ORC_B + + ;; ZBB STRLEN + UNSPEC_STRLEN ]) (define_c_enum "unspecv" [ @@ -3007,6 +3010,31 @@ (define_expand "cpymemsi" FAIL; }) +;; Search character in string (generalization of strlen). +;; Argument 0 is the resulting offset +;; Argument 1 is the string +;; Argument 2 is the search character +;; Argument 3 is the alignment + +(define_expand "strlen" + [(set (match_operand:X 0 "register_operand") + (unspec:X [(match_operand:BLK 1 "general_operand") + (match_operand:SI 2 "const_int_operand") + (match_operand:SI 3 "const_int_operand")] + UNSPEC_STRLEN))] + "" +{ + rtx search_char = operands[2]; + + if (optimize_insn_for_size_p () || search_char != const0_rtx) + FAIL; + + if (riscv_expand_strlen (operands)) + DONE; + else + FAIL; +}) + (include "bitmanip.md") (include "sync.md") (include "peephole.md") diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c b/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c new file mode 100644 index 00000000000..39da70a5021 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c @@ -0,0 +1,13 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=rv64gc_zbb -mabi=lp64" } */ +/* { dg-skip-if "" { *-*-* } { "-Os" } } */ + +typedef long unsigned int size_t; + +size_t +my_str_len (const char *s) +{ + return __builtin_strlen (s); +} + +/* { dg-final { scan-assembler-not "orc.b\t" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen.c 
b/gcc/testsuite/gcc.target/riscv/zbb-strlen.c new file mode 100644 index 00000000000..d01b7fc552d --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen.c @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=rv64gc_zbb -mabi=lp64" } */ +/* { dg-skip-if "" { *-*-* } { "-Os" } } */ + +typedef long unsigned int size_t; + +size_t +my_str_len (const char *s) +{ + s = __builtin_assume_aligned (s, 4096); + return __builtin_strlen (s); +} + +/* { dg-final { scan-assembler "orc.b\t" } } */ +/* { dg-final { scan-assembler-not "jalr" } } */ +/* { dg-final { scan-assembler-not "call" } } */ +/* { dg-final { scan-assembler-not "jr" } } */ +/* { dg-final { scan-assembler-not "tail" } } */ From patchwork Sun Nov 13 23:05:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Christoph_M=C3=BCllner?= X-Patchwork-Id: 19478 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1865923wru; Sun, 13 Nov 2022 15:10:18 -0800 (PST) X-Google-Smtp-Source: AA0mqf42WiKUedzGp4FNSONh6OGFE7n1D2FnArzfw0hE75DRW8VcFO5QQAPajLEl58hjJ1PMQcHF X-Received: by 2002:a17:906:2782:b0:78d:77b1:a433 with SMTP id j2-20020a170906278200b0078d77b1a433mr8694455ejc.486.1668381018634; Sun, 13 Nov 2022 15:10:18 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668381018; cv=none; d=google.com; s=arc-20160816; b=nROagi+21wCtKco7FgujIZ0yC9P6zfJB/T7QhHT7gGUtTuPqx7LIqvWil/kTTpqjac RaYlCovT4AIRu5/ArLgXmJ5jeeFXeL1BlcsJ384J9SdTF1h0UjYpgH0bvWrpqjMzv4lw K99iasm2KZg6A15Wnr5jFIzv5AG5MwDOKH4dSWof+MhYqKLm+H6wvQ/IyliRVB/b1zhe XFvi2CmjnTo1kxpdvuJsVaIHeqpupO6XGkfA5dDcmhqaxjUv16i7hrVz8Ra+1DXx8V2M C4aXviAjj/mTTuvC08pn93IsTkrwPTpCRJF5jnzmRjC2mxTNhh6E0bkbUFlwGg+L9fBL i7yA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding 
:mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:dkim-signature:dmarc-filter:delivered-to; bh=NJaE0g4ZEsMhS6Pl/08H9GkVtKqTVvsLy0QFIabwUmw=; b=DtvEb4SBRhPcO7WAl/7hE3JeYUm0FoUYQhNMgFZXboJ0tKBuFPsiC5WrQHYFwhrAA9 jH6oIBZ3yzoBVl9mjJ9Am5VCDHCK19a+HGn5AF/mSATxSmvGhcMGEh1K4AvMRpveVFBE 8ikfr5ZXDYkuFh4zp3vR+RKbnOhVunZOdZ7jZ9d+YwCRx8qqxplTsHigYrFE3tazEqUL ip5ZCPgoGCD9XC2pGYAGFLWQjpzOaZsKiZq2MuiszDnn/UCn50ZwzheSinTHHQ8depJV ZPGwKYD+8bNu3qa1u99QqPVD/u2p3Izu4fl+nA39gLjNeXXtXOe8iwUjQBjWv5YJbps+ mg3A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@vrull.eu header.s=google header.b=qnJq2onv; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id ka23-20020a170907991700b0073083c63edcsi5840811ejc.306.2022.11.13.15.10.18 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 15:10:18 -0800 (PST) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@vrull.eu header.s=google header.b=qnJq2onv; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9802938A8157 for ; Sun, 13 Nov 2022 23:07:19 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-ej1-x634.google.com (mail-ej1-x634.google.com [IPv6:2a00:1450:4864:20::634]) by 
sourceware.org (Postfix) with ESMTPS id 2796E384F00C for ; Sun, 13 Nov 2022 23:05:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 2796E384F00C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu Received: by mail-ej1-x634.google.com with SMTP id t25so24392438ejb.8 for ; Sun, 13 Nov 2022 15:05:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=NJaE0g4ZEsMhS6Pl/08H9GkVtKqTVvsLy0QFIabwUmw=; b=qnJq2onvKgXWprzqROnoVmAZDD4ep3e3sxATS8kkakPaIdk57ITM9tPUuKlRr5oQD/ 6PcYYEsxMz7zSOIzqenDjRUj5coUIceHgkAg3j8nHPp3sdJ7kO4XvjijLiWfO04b/3T5 HqZaCrbyjbyl6GhvhCbYambApdLv06t4mN0MFl0k9dcsRQUDgdw4XXHfeGroP3inXogm VZbPt89faAU5kUgVw9l0cVVvMYoFp2KogsYpHH43YTQowUShjp1aaklvZlkURgecRSFk I4s2K5aKgVAPZyTPwo/ImmuV7A+AN7MQ1higYvqwuA/+0o1/XXnldVgUnFzRf7WrnVuB ZR+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NJaE0g4ZEsMhS6Pl/08H9GkVtKqTVvsLy0QFIabwUmw=; b=GFmBj+bxu350rCEEx+QpFj+7qDISkuofWUScO1AdvcS6PplvzmUaFCV4Lxc93gPOQk DSpYixunCbUxjXaHEaV+Jsuaeze5kz7zM4d9ArRpOKWzKE3AbEEiRGrENPBpQxJ2VDHI YBaXrFr1YEQ4wjoDAglNy3uL8EW5xcyYcRG4EnivEtNwBYpua+646rgGqAtUSdDgS59K +eqZ+6LQZtW+5jbb2hJG5oaXG4KkPDVp8n79oQr9TC2CZSrUIgRza16mIhsYDtg1Khwf DNkgpbbFRo5q8e0ekhmqJLMnm/Ev1tXhGrd4nmIv4XYQ4pB/8uCg8+M2LJ0GSyk9DblF W5Vg== X-Gm-Message-State: ANoB5pkWPI1glJWp+3ORdES7z2Oujf0waUdFf3csk6XaOFQMkJAAZj2L USJEOr7IiPtDDTqZvaWhf6dcpkCu6F9Lb3I7 X-Received: by 2002:a17:906:d9b:b0:7ae:acea:fca6 with SMTP id m27-20020a1709060d9b00b007aeaceafca6mr8775876eji.150.1668380732379; Sun, 13 Nov 2022 15:05:32 -0800 (PST) 
Received: from beast.fritz.box (62-178-148-172.cable.dynamic.surfer.at. [62.178.148.172]) by smtp.gmail.com with ESMTPSA id ku3-20020a170907788300b007ae21bbdd3fsm2361281ejc.162.2022.11.13.15.05.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Nov 2022 15:05:31 -0800 (PST)
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng , Jim Wilson , Palmer Dabbelt , Andrew Waterman , Philipp Tomsich , Jeff Law , Vineet Gupta
Cc: =?utf-8?q?Christoph_M=C3=BCllner?=
Subject: [PATCH 7/7] riscv: Add support for str(n)cmp inline expansion
Date: Mon, 14 Nov 2022 00:05:21 +0100
Message-Id: <20221113230521.712693-8-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>
References: <20221113230521.712693-1-christoph.muellner@vrull.eu>
MIME-Version: 1.0
X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, JMQ_SPF_NEUTRAL, KAM_MANYTO, KAM_SHORT, LIKELY_SPAM_BODY, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
Sender: "Gcc-patches"
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1749424295440773111?=
X-GMAIL-MSGID: =?utf-8?q?1749424295440773111?=

From: Christoph Müllner

This patch implements expansions for the cmpstrsi and the cmpstrnsi
builtins using Zbb instructions (if available).
This allows calls to strcmp() and strncmp() to be inlined.

The expansion basically emits a peeled comparison sequence (i.e. a peeled
comparison loop) which compares XLEN bits per step if possible.
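The idea of comparing XLEN bits per step can be sketched in portable C as follows. This is only a model under stated assumptions: 8-byte words, 8-byte-aligned inputs, and a little-endian host. orc_b() and ctz64() are hypothetical stand-ins for the Zbb instructions, compare_peeled_model() is an illustrative name, and the direct byte extraction at the end replaces the bswap/shift sequence used in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of Zbb orc.b: nonzero bytes become 0xff, zero bytes 0x00.  */
static uint64_t
orc_b (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    if ((x >> (8 * i)) & 0xff)
      r |= (uint64_t) 0xff << (8 * i);
  return r;
}

/* Model of ctz: count trailing zero bits; X must be nonzero.  */
static unsigned
ctz64 (uint64_t x)
{
  unsigned n = 0;
  while (!(x & 1))
    {
      x >>= 1;
      n++;
    }
  return n;
}

/* Compare up to NWORDS 8-byte words of two 8-byte-aligned strings.
   Returns 1 and stores the strcmp-style difference in *RESULT once a
   NUL byte or a mismatch is found.  Returns 0 if all compared words
   were equal and NUL-free, in which case a caller would continue
   (the patch falls back to a libc call at that point).  */
static int
compare_peeled_model (const void *s1, const void *s2, size_t nwords,
                      int *result)
{
  const unsigned char *p1 = s1;
  const unsigned char *p2 = s2;

  for (size_t i = 0; i < nwords; i++)
    {
      uint64_t d1, d2;
      memcpy (&d1, p1 + 8 * i, sizeof d1);
      memcpy (&d2, p2 + 8 * i, sizeof d2);

      uint64_t nonzero = orc_b (d1);
      if (nonzero == UINT64_MAX && d1 == d2)
        continue;               /* no NUL in d1 and no difference */

      /* 0xff in every byte that differs or is NUL in d1.  */
      uint64_t syndrome = orc_b (d1 ^ d2) | ~nonzero;

      /* Bit offset of the first such byte (lowest address).  */
      unsigned shift = ctz64 (syndrome);

      *result = (int) ((d1 >> shift) & 0xff) - (int) ((d2 >> shift) & 0xff);
      return 1;
    }

  return 0;
}
```

The syndrome is never zero when ctz64() is reached, because either the first word contains a NUL byte or the two words differ. The NWORDS bound corresponds loosely to the number of peeled steps the expansion emits.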

The emitted sequence can be controlled by setting the maximum number
of compared bytes (-mstring-compare-inline-limit).

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (riscv_expand_strn_compare): New
	prototype.
	* config/riscv/riscv-string.cc (GEN_EMIT_HELPER3): New helper
	macros.
	(GEN_EMIT_HELPER2): New helper macros.
	(expand_strncmp_zbb_sequence): New function.
	(riscv_emit_str_compare_zbb): New function.
	(riscv_expand_strn_compare): New function.
	* config/riscv/riscv.md (cmpstrnsi): Invoke expansion functions
	for strn_compare.
	(cmpstrsi): Invoke expansion functions for strn_compare.
	* config/riscv/riscv.opt: Add new parameter
	'-mstring-compare-inline-limit'.

Signed-off-by: Christoph Müllner
---
 gcc/config/riscv/riscv-protos.h               |   1 +
 gcc/config/riscv/riscv-string.cc              | 344 ++++++++++++++++++
 gcc/config/riscv/riscv.md                     |  46 +++
 gcc/config/riscv/riscv.opt                    |   5 +
 .../gcc.target/riscv/zbb-strcmp-unaligned.c   |  36 ++
 gcc/testsuite/gcc.target/riscv/zbb-strcmp.c   |  55 +++
 6 files changed, 487 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 18187e3bd78..7f334be333c 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -97,6 +97,7 @@ rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt); /* Routines implemented in riscv-string.c. */ extern bool riscv_expand_block_move (rtx, rtx, rtx); extern bool riscv_expand_strlen (rtx[]); +extern bool riscv_expand_strn_compare (rtx[], int); /* Information about one CPU we know about.
*/ struct riscv_cpu_info { diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc index bf96522b608..f157e04ac0c 100644 --- a/gcc/config/riscv/riscv-string.cc +++ b/gcc/config/riscv/riscv-string.cc @@ -84,6 +84,11 @@ GEN_EMIT_HELPER2(one_cmpl) /* do_one_cmpl2 */ GEN_EMIT_HELPER2(clz) /* do_clz2 */ GEN_EMIT_HELPER2(ctz) /* do_ctz2 */ GEN_EMIT_HELPER2(zero_extendqi) /* do_zero_extendqi2 */ +GEN_EMIT_HELPER3(xor) /* do_xor3 */ +GEN_EMIT_HELPER3(ashl) /* do_ashl3 */ +GEN_EMIT_HELPER2(bswap) /* do_bswap2 */ +GEN_EMIT_HELPER3(riscv_ior_not) /* do_riscv_ior_not3 */ +GEN_EMIT_HELPER3(riscv_and_not) /* do_riscv_and_not3 */ /* Helper function to load a byte or a Pmode register. @@ -268,6 +273,345 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length) return false; } +/* Generate the sequence of compares for strcmp/strncmp using zbb instructions. + BYTES_TO_COMPARE is the number of bytes to be compared. + BASE_ALIGN is the smaller of the alignment of the two strings. + ORIG_SRC1 is the unmodified rtx for the first string. + ORIG_SRC2 is the unmodified rtx for the second string. + DATA1 is the register for loading the first string. + DATA2 is the register for loading the second string. + HAS_NUL is the register holding non-NUL bytes for NUL-bytes in the string. + TARGET is the rtx for the result register (SImode) + EQUALITY_COMPARE_REST if set, then we hand over to libc if string matches. + END_LABEL is the location before the calculation of the result value. + FINAL_LABEL is the location after the calculation of the result value. 
*/ + +static void +expand_strncmp_zbb_sequence (unsigned HOST_WIDE_INT bytes_to_compare, + rtx src1, rtx src2, rtx data1, rtx data2, + rtx target, rtx orc, bool equality_compare_rest, + rtx end_label, rtx final_label) +{ + const unsigned HOST_WIDE_INT p_mode_size = GET_MODE_SIZE (Pmode); + rtx src1_addr = force_reg (Pmode, XEXP (src1, 0)); + rtx src2_addr = force_reg (Pmode, XEXP (src2, 0)); + unsigned HOST_WIDE_INT offset = 0; + + rtx m1 = gen_reg_rtx (Pmode); + emit_insn (gen_rtx_SET (m1, constm1_rtx)); + + /* Generate a compare sequence. */ + while (bytes_to_compare > 0) + { + machine_mode load_mode = QImode; + unsigned HOST_WIDE_INT load_mode_size = 1; + if (bytes_to_compare > 1) + { + load_mode = Pmode; + load_mode_size = p_mode_size; + } + unsigned HOST_WIDE_INT cmp_bytes = 0; + + if (bytes_to_compare >= load_mode_size) + cmp_bytes = load_mode_size; + else + cmp_bytes = bytes_to_compare; + + unsigned HOST_WIDE_INT remain = bytes_to_compare - cmp_bytes; + + /* load_mode_size...bytes we will read + cmp_bytes...bytes we will compare (might be less than load_mode_size) + bytes_to_compare...bytes we will compare (incl. cmp_bytes) + remain...bytes left to compare (excl. cmp_bytes) */ + + rtx addr1 = gen_rtx_PLUS (Pmode, src1_addr, GEN_INT (offset)); + rtx addr2 = gen_rtx_PLUS (Pmode, src2_addr, GEN_INT (offset)); + + do_load_from_addr (load_mode, data1, addr1, src1); + do_load_from_addr (load_mode, data2, addr2, src2); + + if (load_mode_size == 1) + { + /* Special case for comparing just single (last) byte. */ + gcc_assert (remain == 0); + + if (!equality_compare_rest) + { + /* Calculate difference and jump to final_label. */ + rtx result = gen_reg_rtx (Pmode); + do_sub3 (result, data1, data2); + emit_insn (gen_movsi (target, gen_lowpart (SImode, result))); + emit_jump_insn (gen_jump (final_label)); + } + else + { + /* Compare both bytes and jump to final_label if not equal. 
*/ + rtx result = gen_reg_rtx (Pmode); + do_sub3 (result, data1, data2); + emit_insn (gen_movsi (target, gen_lowpart (SImode, result))); + /* Check if str1[i] is NULL. */ + rtx cond1 = gen_rtx_EQ (VOIDmode, data1, const0_rtx); + riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond1, + data1, const0_rtx, final_label)); + /* Check if str1[i] == str2[i]. */ + rtx cond2 = gen_rtx_NE (VOIDmode, data1, data2); + riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond2, + data1, data2, final_label)); + /* Processing will fall through to libc calls. */ + } + } + else + { + /* Eliminate irrelevant data (behind the N-th character). */ + if (bytes_to_compare < p_mode_size) + { + gcc_assert (remain == 0); + /* Set a NUL-byte after the relevant data (behind the string). */ + unsigned long im = 0xffUL; + rtx imask = gen_rtx_CONST_INT (Pmode, im); + rtx m_reg = gen_reg_rtx (Pmode); + emit_insn (gen_rtx_SET (m_reg, imask)); + do_ashl3 (m_reg, m_reg, GEN_INT (cmp_bytes * BITS_PER_UNIT)); + do_riscv_and_not3 (data1, m_reg, data1); + do_riscv_and_not3 (data2, m_reg, data2); + do_orcb2 (orc, data1); + emit_jump_insn (gen_jump (end_label)); + } + else + { + /* Check if data1 contains a NUL character. */ + do_orcb2 (orc, data1); + rtx cond1 = gen_rtx_NE (VOIDmode, orc, m1); + riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond1, orc, m1, + end_label)); + + /* Break out if u1 != u2 */ + rtx cond2 = gen_rtx_NE (VOIDmode, data1, data2); + riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond2, data1, + data2, end_label)); + + /* Fast-exit for complete and equal strings. */ + if (remain == 0 && !equality_compare_rest) + { + /* All compared and everything was equal. */ + emit_insn (gen_rtx_SET (target, gen_rtx_CONST_INT (SImode, 0))); + emit_jump_insn (gen_jump (final_label)); + } + } + } + + offset += cmp_bytes; + bytes_to_compare -= cmp_bytes; + } + /* Processing will fall through to libc calls. */ +} + +/* Emit a string comparison sequence using Zbb instruction. 
+ + OPERANDS[0] is the target (result). + OPERANDS[1] is the first source. + OPERANDS[2] is the second source. + If NO_LENGTH is zero, then: + OPERANDS[3] is the length. + OPERANDS[4] is the alignment in bytes. + If NO_LENGTH is nonzero, then: + OPERANDS[3] is the alignment in bytes. + BYTES_TO_COMPARE is the maximum number of bytes to compare. + EQUALITY_COMPARE_REST defines if str(n)cmp should be called on equality. + */ + +static bool +riscv_emit_str_compare_zbb (rtx operands[], int no_length, + unsigned HOST_WIDE_INT bytes_to_compare, + bool equality_compare_rest) +{ + const unsigned HOST_WIDE_INT p_mode_size = GET_MODE_SIZE (Pmode); + rtx target = operands[0]; + rtx src1 = operands[1]; + rtx src2 = operands[2]; + rtx bytes_rtx = NULL; + rtx align_rtx = operands[3]; + + if (!no_length) + { + bytes_rtx = operands[3]; + align_rtx = operands[4]; + } + + gcc_assert (TARGET_ZBB); + + /* Enable only if we can access at least one XLEN-register. */ + if (bytes_to_compare < p_mode_size) + return false; + + /* Limit to 12-bits (maximum load-offset). */ + if (bytes_to_compare > IMM_REACH) + return false; + + /* We don't support big endian. */ + if (BYTES_BIG_ENDIAN) + return false; + + /* We need to know the alignment. */ + if (!CONST_INT_P (align_rtx)) + return false; + + unsigned HOST_WIDE_INT base_align = UINTVAL (align_rtx); + unsigned HOST_WIDE_INT required_align = p_mode_size; + if (base_align < required_align) + return false; + + rtx data1 = gen_reg_rtx (Pmode); + rtx data2 = gen_reg_rtx (Pmode); + rtx orc = gen_reg_rtx (Pmode); + rtx end_label = gen_label_rtx (); + rtx final_label = gen_label_rtx (); + + /* Generate a sequence of zbb instructions to compare out + to the length specified. */ + expand_strncmp_zbb_sequence (bytes_to_compare, src1, src2, data1, data2, + target, orc, equality_compare_rest, + end_label, final_label); + + if (equality_compare_rest) + { + /* Update pointers past what has been compared already. 
*/ + rtx src1_addr = force_reg (Pmode, XEXP (src1, 0)); + rtx src2_addr = force_reg (Pmode, XEXP (src2, 0)); + unsigned HOST_WIDE_INT offset = bytes_to_compare; + rtx src1 = force_reg (Pmode, + gen_rtx_PLUS (Pmode, src1_addr, GEN_INT (offset))); + rtx src2 = force_reg (Pmode, + gen_rtx_PLUS (Pmode, src2_addr, GEN_INT (offset))); + + /* Construct call to strcmp/strncmp to compare the rest of the string. */ + if (no_length) + { + tree fun = builtin_decl_explicit (BUILT_IN_STRCMP); + emit_library_call_value (XEXP (DECL_RTL (fun), 0), + target, LCT_NORMAL, GET_MODE (target), + src1, Pmode, src2, Pmode); + } + else + { + unsigned HOST_WIDE_INT bytes = UINTVAL (bytes_rtx); + unsigned HOST_WIDE_INT delta = bytes - bytes_to_compare; + gcc_assert (delta > 0); + rtx len_rtx = gen_reg_rtx (Pmode); + emit_move_insn (len_rtx, gen_int_mode (delta, Pmode)); + tree fun = builtin_decl_explicit (BUILT_IN_STRNCMP); + emit_library_call_value (XEXP (DECL_RTL (fun), 0), + target, LCT_NORMAL, GET_MODE (target), + src1, Pmode, src2, Pmode, len_rtx, Pmode); + } + + emit_jump_insn (gen_jump (final_label)); + } + + emit_barrier (); /* No fall-through. */ + + emit_label (end_label); + + /* Convert non-equal bytes into non-NUL bytes. */ + rtx diff = gen_reg_rtx (Pmode); + do_xor3 (diff, data1, data2); + do_orcb2 (diff, diff); + + /* Convert non-equal or NUL-bytes into non-NUL bytes. */ + rtx syndrome = gen_reg_rtx (Pmode); + do_riscv_ior_not3 (syndrome, orc, diff); + + /* Count the number of equal bits from the beginning of the word. */ + rtx shift = gen_reg_rtx (Pmode); + do_ctz2 (shift, syndrome); + + do_bswap2 (data1, data1); + do_bswap2 (data2, data2); + + /* The most-significant-non-zero bit of the syndrome marks either the + first bit that is different, or the top bit of the first zero byte. + Shifting left now will bring the critical information into the + top bits. 
*/ + do_ashl3 (data1, data1, gen_lowpart (QImode, shift)); + do_ashl3 (data2, data2, gen_lowpart (QImode, shift)); + + /* But we need to zero-extend (char is unsigned) the value and then + perform a signed 32-bit subtraction. */ + unsigned int shiftr = p_mode_size * BITS_PER_UNIT - 8; + do_lshr3 (data1, data1, GEN_INT (shiftr)); + do_lshr3 (data2, data2, GEN_INT (shiftr)); + + rtx result = gen_reg_rtx (Pmode); + do_sub3 (result, data1, data2); + emit_insn (gen_movsi (target, gen_lowpart (SImode, result))); + + /* And we are done. */ + emit_label (final_label); + return true; +} + +/* Expand a string compare operation with length, and return + true if successful. Return false if we should let the + compiler generate normal code, probably a strncmp call. + If NO_LENGTH is set, there is no upper bound of the strings. + + OPERANDS[0] is the target (result). + OPERANDS[1] is the first source. + OPERANDS[2] is the second source. + If NO_LENGTH is zero, then: + OPERANDS[3] is the length. + OPERANDS[4] is the alignment in bytes. + If NO_LENGTH is nonzero, then: + OPERANDS[3] is the alignment in bytes. */ + +bool +riscv_expand_strn_compare (rtx operands[], int no_length) +{ + rtx bytes_rtx = NULL; + const unsigned HOST_WIDE_INT compare_max = riscv_string_compare_inline_limit; + unsigned HOST_WIDE_INT compare_length; /* How much to compare inline. */ + bool equality_compare_rest = false; /* Call libc to compare remainder. */ + + if (riscv_string_compare_inline_limit == 0) + return false; + + /* Decide how many bytes to compare inline and what to do if there is + no difference detected at the end of the compared bytes. + We might call libc to continue the comparison. */ + if (no_length) + { + compare_length = compare_max; + equality_compare_rest = true; + } + else + { + /* If we have a length, it must be constant. 
*/ + bytes_rtx = operands[3]; + if (!CONST_INT_P (bytes_rtx)) + return false; + + unsigned HOST_WIDE_INT bytes = UINTVAL (bytes_rtx); + if (bytes <= compare_max) + { + compare_length = bytes; + equality_compare_rest = false; + } + else + { + compare_length = compare_max; + equality_compare_rest = true; + } + } + + if (TARGET_ZBB) + { + return riscv_emit_str_compare_zbb (operands, no_length, compare_length, + equality_compare_rest); + } + + return false; +} + /* If the provided string is aligned, then read XLEN bytes in a loop and use orc.b to find NUL-bytes. */ diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index f05c764c3d4..dce33a4b638 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -3010,6 +3010,52 @@ (define_expand "cpymemsi" FAIL; }) +;; String compare N insn. +;; Argument 0 is the target (result) +;; Argument 1 is the source1 +;; Argument 2 is the source2 +;; Argument 3 is the length +;; Argument 4 is the alignment + +(define_expand "cmpstrnsi" + [(parallel [(set (match_operand:SI 0) + (compare:SI (match_operand:BLK 1) + (match_operand:BLK 2))) + (use (match_operand:SI 3)) + (use (match_operand:SI 4))])] + "" +{ + if (optimize_insn_for_size_p ()) + FAIL; + + if (riscv_expand_strn_compare (operands, 0)) + DONE; + else + FAIL; +}) + +;; String compare insn. +;; Argument 0 is the target (result) +;; Argument 1 is the source1 +;; Argument 2 is the source2 +;; Argument 3 is the alignment + +(define_expand "cmpstrsi" + [(parallel [(set (match_operand:SI 0) + (compare:SI (match_operand:BLK 1) + (match_operand:BLK 2))) + (use (match_operand:SI 3))])] + "" +{ + if (optimize_insn_for_size_p ()) + FAIL; + + if (riscv_expand_strn_compare (operands, 1)) + DONE; + else + FAIL; +}) + ;; Search character in string (generalization of strlen).
;; Argument 0 is the resulting offset ;; Argument 1 is the string diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt index 7c3ca48d1cc..fdf768ae9a7 100644 --- a/gcc/config/riscv/riscv.opt +++ b/gcc/config/riscv/riscv.opt @@ -249,3 +249,8 @@ Enum(isa_spec_class) String(20191213) Value(ISA_SPEC_CLASS_20191213) misa-spec= Target RejectNegative Joined Enum(isa_spec_class) Var(riscv_isa_spec) Init(TARGET_DEFAULT_ISA_SPEC) Set the version of RISC-V ISA spec. + +mstring-compare-inline-limit= +Target Var(riscv_string_compare_inline_limit) Init(64) RejectNegative Joined UInteger Save +Max number of bytes to compare. + diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c b/gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c new file mode 100644 index 00000000000..2126c849e0a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c @@ -0,0 +1,36 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -mstring-compare-inline-limit=64" } */ + +typedef long unsigned int size_t; + +int +my_str_cmp (const char *s1, const char *s2) +{ + return __builtin_strcmp (s1, s2); +} + +int +my_str_cmp_const (const char *s1) +{ + return __builtin_strcmp (s1, "foo"); +} + +int +my_strn_cmp (const char *s1, const char *s2, size_t n) +{ + return __builtin_strncmp (s1, s2, n); +} + +int +my_strn_cmp_const (const char *s1, size_t n) +{ + return __builtin_strncmp (s1, "foo", n); +} + +int +my_strn_cmp_bounded (const char *s1, const char *s2) +{ + return __builtin_strncmp (s1, s2, 42); +} + +/* { dg-final { scan-assembler-not "orc.b\t" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strcmp.c b/gcc/testsuite/gcc.target/riscv/zbb-strcmp.c new file mode 100644 index 00000000000..3465e7ffee3 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-strcmp.c @@ -0,0 +1,55 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zbb -mabi=lp64 -mstring-compare-inline-limit=64" } */ +/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" 
"-Oz" "-Og" } } */ + +typedef long unsigned int size_t; + +/* Emits 8+1 orc.b instructions. */ + +int +my_str_cmp (const char *s1, const char *s2) +{ + s1 = __builtin_assume_aligned (s1, 4096); + s2 = __builtin_assume_aligned (s2, 4096); + return __builtin_strcmp (s1, s2); +} + +/* 8+1 because the backend does not know the size of "foo". */ + +int +my_str_cmp_const (const char *s1) +{ + s1 = __builtin_assume_aligned (s1, 4096); + return __builtin_strcmp (s1, "foo"); +} + +/* Emits 6+1 orc.b instructions. */ + +int +my_strn_cmp (const char *s1, const char *s2) +{ + s1 = __builtin_assume_aligned (s1, 4096); + s2 = __builtin_assume_aligned (s2, 4096); + return __builtin_strncmp (s1, s2, 42); +} + +/* Note expanded because the backend does not know the size of "foo". */ + +int +my_strn_cmp_const (const char *s1, size_t n) +{ + s1 = __builtin_assume_aligned (s1, 4096); + return __builtin_strncmp (s1, "foo", n); +} + +/* Emits 6+1 orc.b instructions. */ + +int +my_strn_cmp_bounded (const char *s1, const char *s2) +{ + s1 = __builtin_assume_aligned (s1, 4096); + s2 = __builtin_assume_aligned (s2, 4096); + return __builtin_strncmp (s1, s2, 42); +} + +/* { dg-final { scan-assembler-times "orc.b\t" 32 } } */