From patchwork Sun Nov 13 23:05:20 2022
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 19477
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt,
 Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
Cc: Christoph Müllner
Subject: [PATCH 6/7] riscv: Add support for strlen inline expansion
Date: Mon, 14 Nov 2022 00:05:20 +0100
Message-Id: <20221113230521.712693-7-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20221113230521.712693-1-christoph.muellner@vrull.eu>
References: <20221113230521.712693-1-christoph.muellner@vrull.eu>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

From: Christoph Müllner

This patch implements the expansion of the strlen builtin for aligned
strings using Zbb instructions (if available). The emitted sequence is:

      li      a3,-1
      addi    a4,a0,8
    .L2:
      ld      a5,0(a0)
      addi    a0,a0,8
      orc.b   a5,a5
      beq     a5,a3,6 <.L2>
      not     a5,a5
      ctz     a5,a5
      srli    a5,a5,0x3
      add     a0,a0,a5
      sub     a0,a0,a4

This allows calls to strlen() to be inlined, with optimized code for
determining the length of a string.

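For readers less familiar with Zbb, the sequence above can be modelled in portable C. This is only an illustrative sketch, not code from the patch: `orc_b` is a software stand-in for the `orc.b` instruction, `strlen_orcb_sketch` is a hypothetical name, the host is assumed to be little-endian, the input is assumed to be 8-byte aligned and readable up to the end of the word containing the NUL, and `__builtin_ctzll` is assumed to be available (GCC or Clang).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Software stand-in for the Zbb orc.b instruction (assumption: this
   emulation is for illustration only).  Each result byte is 0xff if
   the corresponding input byte is non-zero, and 0x00 otherwise.  */
static uint64_t
orc_b (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    if ((x >> (8 * i)) & 0xff)
      r |= (uint64_t) 0xff << (8 * i);
  return r;
}

/* Hypothetical helper mirroring the emitted sequence.  S must be
   8-byte aligned; a little-endian host is assumed.  */
static size_t
strlen_orcb_sketch (const char *s)
{
  const char *p = s;
  uint64_t word;

  do
    {
      memcpy (&word, p, sizeof word);   /* ld    a5,0(a0)   */
      p += sizeof word;                 /* addi  a0,a0,8    */
      word = orc_b (word);              /* orc.b a5,a5      */
    }
  while (word == UINT64_MAX);           /* beq   a5,a3,.L2  */

  word = ~word;                         /* not   a5,a5      */
  /* ctz + srli: index of the first NUL byte within the last word.  */
  size_t zeros = (size_t) __builtin_ctzll (word) >> 3;
  /* add + sub: bytes consumed, minus the one-word overshoot of P.  */
  return (size_t) (p - s) + zeros - sizeof word;
}
```

The loop condition relies on `orc.b` producing all ones exactly when the word contains no NUL byte, which is why a single comparison against -1 suffices. On a big-endian target the patch counts leading zeros (`clz`) instead of trailing zeros.
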
gcc/ChangeLog:

	* config/riscv/riscv-protos.h (riscv_expand_strlen): New
	prototype.
	* config/riscv/riscv-string.cc (riscv_emit_unlikely_jump): New
	function.
	(GEN_EMIT_HELPER2): New helper macro.
	(GEN_EMIT_HELPER3): New helper macro.
	(do_load_from_addr): New helper function.
	(riscv_expand_strlen_zbb): New function.
	(riscv_expand_strlen): New function.
	* config/riscv/riscv.md (strlen<mode>): Invoke expansion
	functions for strlen.

Signed-off-by: Christoph Müllner
---
 gcc/config/riscv/riscv-protos.h               |   1 +
 gcc/config/riscv/riscv-string.cc              | 149 ++++++++++++++++++
 gcc/config/riscv/riscv.md                     |  28 ++++
 .../gcc.target/riscv/zbb-strlen-unaligned.c   |  13 ++
 gcc/testsuite/gcc.target/riscv/zbb-strlen.c   |  18 +++
 5 files changed, 209 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 344515dbaf4..18187e3bd78 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -96,6 +96,7 @@ rtl_opt_pass * make_pass_shorten_memrefs (gcc::context *ctxt);
 
 /* Routines implemented in riscv-string.c.  */
 extern bool riscv_expand_block_move (rtx, rtx, rtx);
+extern bool riscv_expand_strlen (rtx[]);
 
 /* Information about one CPU we know about.  */
 struct riscv_cpu_info {
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 1137df475be..bf96522b608 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -38,6 +38,81 @@
 #include "predict.h"
 #include "optabs.h"
 
+/* Emit unlikely jump instruction.  */
+
+static rtx_insn *
+riscv_emit_unlikely_jump (rtx insn)
+{
+  rtx_insn *jump = emit_jump_insn (insn);
+  add_reg_br_prob_note (jump, profile_probability::very_unlikely ());
+  return jump;
+}
+
+/* Emit proper instruction depending on type of dest.  */
+
+#define GEN_EMIT_HELPER2(name)                                  \
+static rtx_insn *                                               \
+do_## name ## 2(rtx dest, rtx src)                              \
+{                                                               \
+  rtx_insn *insn;                                               \
+  if (GET_MODE (dest) == DImode)                                \
+    insn = emit_insn (gen_ ## name ## di2 (dest, src));         \
+  else                                                          \
+    insn = emit_insn (gen_ ## name ## si2 (dest, src));         \
+  return insn;                                                  \
+}
+
+/* Emit proper instruction depending on type of dest.  */
+
+#define GEN_EMIT_HELPER3(name)                                  \
+static rtx_insn *                                               \
+do_## name ## 3(rtx dest, rtx src1, rtx src2)                   \
+{                                                               \
+  rtx_insn *insn;                                               \
+  if (GET_MODE (dest) == DImode)                                \
+    insn = emit_insn (gen_ ## name ## di3 (dest, src1, src2));  \
+  else                                                          \
+    insn = emit_insn (gen_ ## name ## si3 (dest, src1, src2));  \
+  return insn;                                                  \
+}
+
+GEN_EMIT_HELPER3(add) /* do_add3  */
+GEN_EMIT_HELPER3(sub) /* do_sub3  */
+GEN_EMIT_HELPER3(lshr) /* do_lshr3  */
+GEN_EMIT_HELPER2(orcb) /* do_orcb2  */
+GEN_EMIT_HELPER2(one_cmpl) /* do_one_cmpl2  */
+GEN_EMIT_HELPER2(clz) /* do_clz2  */
+GEN_EMIT_HELPER2(ctz) /* do_ctz2  */
+GEN_EMIT_HELPER2(zero_extendqi) /* do_zero_extendqi2  */
+
+/* Helper function to load a byte or a Pmode register.
+
+   MODE is the mode to use for the load (QImode or Pmode).
+   DEST is the destination register for the data.
+   ADDR_REG is the register that holds the address.
+   ADDR is the address expression to load from.
+
+   This function returns an rtx containing the register,
+   where the ADDR is stored.  */
+
+static rtx
+do_load_from_addr (machine_mode mode, rtx dest, rtx addr_reg, rtx addr)
+{
+  rtx mem = gen_rtx_MEM (mode, addr_reg);
+  MEM_COPY_ATTRIBUTES (mem, addr);
+  set_mem_size (mem, GET_MODE_SIZE (mode));
+
+  if (mode == QImode)
+    do_zero_extendqi2 (dest, mem);
+  else if (mode == Pmode)
+    emit_move_insn (dest, mem);
+  else
+    gcc_unreachable ();
+
+  return addr_reg;
+}
+
+
 /* Emit straight-line code to move LENGTH bytes from SRC to DEST.
    Assume that the areas do not overlap.  */
 
@@ -192,3 +267,77 @@ riscv_expand_block_move (rtx dest, rtx src, rtx length)
     }
   return false;
 }
+
+/* If the provided string is aligned, then read XLEN bytes
+   in a loop and use orc.b to find NUL-bytes.  */
+
+static bool
+riscv_expand_strlen_zbb (rtx result, rtx src, rtx align)
+{
+  rtx m1, addr, addr_plus_regsz, word, zeros;
+  rtx loop_label, cond;
+
+  gcc_assert (TARGET_ZBB);
+
+  /* The alignment needs to be known and big enough.  */
+  if (!CONST_INT_P (align) || UINTVAL (align) < GET_MODE_SIZE (Pmode))
+    return false;
+
+  m1 = gen_reg_rtx (Pmode);
+  addr = copy_addr_to_reg (XEXP (src, 0));
+  addr_plus_regsz = gen_reg_rtx (Pmode);
+  word = gen_reg_rtx (Pmode);
+  zeros = gen_reg_rtx (Pmode);
+
+  emit_insn (gen_rtx_SET (m1, constm1_rtx));
+  do_add3 (addr_plus_regsz, addr, GEN_INT (UNITS_PER_WORD));
+
+  loop_label = gen_label_rtx ();
+  emit_label (loop_label);
+
+  /* Load a word and use orc.b to find a zero-byte.  */
+  do_load_from_addr (Pmode, word, addr, src);
+  do_add3 (addr, addr, GEN_INT (UNITS_PER_WORD));
+  do_orcb2 (word, word);
+  cond = gen_rtx_EQ (VOIDmode, word, m1);
+  riscv_emit_unlikely_jump (gen_cbranch4 (Pmode, cond,
+                                          word, m1, loop_label));
+
+  /* Calculate the return value by counting zero-bits.  */
+  do_one_cmpl2 (word, word);
+  if (TARGET_BIG_ENDIAN)
+    do_clz2 (zeros, word);
+  else
+    do_ctz2 (zeros, word);
+
+  do_lshr3 (zeros, zeros, GEN_INT (exact_log2 (BITS_PER_UNIT)));
+  do_add3 (addr, addr, zeros);
+  do_sub3 (result, addr, addr_plus_regsz);
+
+  return true;
+}
+
+/* Expand a strlen operation and return true if successful.
+   Return false if we should let the compiler generate normal
+   code, probably a strlen call.
+
+   OPERANDS[0] is the target (result).
+   OPERANDS[1] is the source.
+   OPERANDS[2] is the search byte (must be 0)
+   OPERANDS[3] is the alignment in bytes.  */
+
+bool
+riscv_expand_strlen (rtx operands[])
+{
+  rtx result = operands[0];
+  rtx src = operands[1];
+  rtx search_char = operands[2];
+  rtx align = operands[3];
+
+  gcc_assert (search_char == const0_rtx);
+
+  if (TARGET_ZBB)
+    return riscv_expand_strlen_zbb (result, src, align);
+
+  return false;
+}
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 43b97f1181e..f05c764c3d4 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -65,6 +65,9 @@ (define_c_enum "unspec" [
 
   ;; OR-COMBINE
   UNSPEC_ORC_B
+
+  ;; ZBB STRLEN
+  UNSPEC_STRLEN
 ])
 
 (define_c_enum "unspecv" [
@@ -3007,6 +3010,31 @@ (define_expand "cpymemsi"
     FAIL;
 })
 
+;; Search character in string (generalization of strlen).
+;; Argument 0 is the resulting offset
+;; Argument 1 is the string
+;; Argument 2 is the search character
+;; Argument 3 is the alignment
+
+(define_expand "strlen<mode>"
+  [(set (match_operand:X 0 "register_operand")
+        (unspec:X [(match_operand:BLK 1 "general_operand")
+                   (match_operand:SI 2 "const_int_operand")
+                   (match_operand:SI 3 "const_int_operand")]
+                  UNSPEC_STRLEN))]
+  ""
+{
+  rtx search_char = operands[2];
+
+  if (optimize_insn_for_size_p () || search_char != const0_rtx)
+    FAIL;
+
+  if (riscv_expand_strlen (operands))
+    DONE;
+  else
+    FAIL;
+})
+
 (include "bitmanip.md")
 (include "sync.md")
 (include "peephole.md")
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c b/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
new file mode 100644
index 00000000000..39da70a5021
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc_zbb -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-Os" } } */
+
+typedef long unsigned int size_t;
+
+size_t
+my_str_len (const char *s)
+{
+  return __builtin_strlen (s);
+}
+
+/* { dg-final { scan-assembler-not "orc.b\t" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/zbb-strlen.c b/gcc/testsuite/gcc.target/riscv/zbb-strlen.c
new file mode 100644
index 00000000000..d01b7fc552d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/zbb-strlen.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -march=rv64gc_zbb -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-Os" } } */
+
+typedef long unsigned int size_t;
+
+size_t
+my_str_len (const char *s)
+{
+  s = __builtin_assume_aligned (s, 4096);
+  return __builtin_strlen (s);
+}
+
+/* { dg-final { scan-assembler "orc.b\t" } } */
+/* { dg-final { scan-assembler-not "jalr" } } */
+/* { dg-final { scan-assembler-not "call" } } */
+/* { dg-final { scan-assembler-not "jr" } } */
+/* { dg-final { scan-assembler-not "tail" } } */