From patchwork Fri Dec 1 15:21:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Robin Dapp
X-Patchwork-Id: 172541
Message-ID: <072e8569-e08b-4a22-adb5-64e888bd471b@gmail.com>
Date: Fri, 1 Dec 2023 16:21:47 +0100
User-Agent: Mozilla Thunderbird
Cc: rdapp.gcc@gmail.com
Content-Language: en-US
To: gcc-patches, palmer, Kito Cheng, jeffreyalaw, "juzhe.zhong@rivai.ai"
From: Robin Dapp
Subject: [PATCH] RISC-V: Add vectorized strlen.

Hi,

this patch implements a vectorized strlen by re-using and slightly
adjusting the rawmemchr implementation.
Rawmemchr returns the address of the needle while strlen returns the
difference between needle address and start address.  As before,
strlen expansion is guarded by -minline-strlen.

While testing with -minline-strlen I encountered a vsetvl problem in
memcpy-chk.c where we didn't insert a vsetvl at the proper spot (after
a setjmp).  This needs to be fixed separately and I figured I'd post
this patch as-is.

Regards
 Robin

gcc/ChangeLog:

	* config/riscv/riscv-protos.h (expand_rawmemchr): Add strlen
	parameter.
	* config/riscv/riscv-string.cc (riscv_expand_strlen): Call
	rawmemchr.
	(expand_rawmemchr): Add strlen handling.
	* config/riscv/riscv.md: Add TARGET_VECTOR to strlen expander.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/builtin/strlen-run.c: New test.
	* gcc.target/riscv/rvv/autovec/builtin/strlen.c: New test.
---
 gcc/config/riscv/riscv-protos.h               |  2 +-
 gcc/config/riscv/riscv-string.cc              | 41 ++++++++++++++-----
 gcc/config/riscv/riscv.md                     |  8 +---
 .../riscv/rvv/autovec/builtin/strlen-run.c    | 37 +++++++++++++++++
 .../riscv/rvv/autovec/builtin/strlen.c        | 12 ++++++
 5 files changed, 83 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen-run.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen.c

diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 695ee24ad6f..c94c82a9973 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -557,7 +557,7 @@ void expand_cond_unop (unsigned, rtx *);
 void expand_cond_binop (unsigned, rtx *);
 void expand_cond_ternop (unsigned, rtx *);
 void expand_popcount (rtx *);
-void expand_rawmemchr (machine_mode, rtx, rtx, rtx);
+void expand_rawmemchr (machine_mode, rtx, rtx, rtx, bool = false);
 void emit_vec_extract (rtx, rtx, poly_int64);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 594ff49fc5a..6cde1bf89a0 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -588,9 +588,16 @@ riscv_expand_strlen_scalar (rtx result, rtx src, rtx align)
 bool
 riscv_expand_strlen (rtx result, rtx src, rtx search_char, rtx align)
 {
+  if (TARGET_VECTOR && stringop_strategy & STRATEGY_VECTOR)
+    {
+      riscv_vector::expand_rawmemchr (E_QImode, result, src, search_char,
+                                      /* strlen */ true);
+      return true;
+    }
+
   gcc_assert (search_char == const0_rtx);
 
-  if (TARGET_ZBB || TARGET_XTHEADBB)
+  if ((TARGET_ZBB || TARGET_XTHEADBB) && stringop_strategy & STRATEGY_SCALAR)
     return riscv_expand_strlen_scalar (result, src, align);
 
   return false;
@@ -979,12 +986,13 @@ expand_block_move (rtx dst_in, rtx src_in, rtx length_in)
 }
 
 
-/* Implement rawmemchr using vector instructions.
+/* Implement rawmemchr and strlen using vector instructions.
    It can be assumed that the needle is in the haystack, otherwise the
    behavior is undefined.  */
 
 void
-expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
+expand_rawmemchr (machine_mode mode, rtx dst, rtx haystack, rtx needle,
+                  bool strlen)
 {
   /*
     rawmemchr:
@@ -1005,6 +1013,9 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
   */
   gcc_assert (TARGET_VECTOR);
 
+  if (strlen)
+    gcc_assert (mode == E_QImode);
+
   unsigned int isize = GET_MODE_SIZE (mode).to_constant ();
   int lmul = TARGET_MAX_LMUL;
   poly_int64 nunits = exact_div (BYTES_PER_RISCV_VECTOR * lmul, isize);
@@ -1028,12 +1039,13 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
      return a pointer to the matching byte.  */
   unsigned int shift = exact_log2 (GET_MODE_SIZE (mode).to_constant ());
 
-  rtx src_addr = copy_addr_to_reg (XEXP (src, 0));
+  rtx src_addr = copy_addr_to_reg (XEXP (haystack, 0));
+  rtx start_addr = copy_addr_to_reg (XEXP (haystack, 0));
 
   rtx loop = gen_label_rtx ();
   emit_label (loop);
 
-  rtx vsrc = change_address (src, vmode, src_addr);
+  rtx vsrc = change_address (haystack, vmode, src_addr);
 
   /* Bump the pointer.  */
   rtx step = gen_reg_rtx (Pmode);
@@ -1052,8 +1064,8 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
   emit_insn (gen_read_vldi_zero_extend (cnt));
 
   /* Compare needle with haystack and store in a mask.  */
-  rtx eq = gen_rtx_EQ (mask_mode, gen_const_vec_duplicate (vmode, pat), vec);
-  rtx vmsops[] = {mask, eq, vec, pat};
+  rtx eq = gen_rtx_EQ (mask_mode, gen_const_vec_duplicate (vmode, needle), vec);
+  rtx vmsops[] = {mask, eq, vec, needle};
   emit_nonvlmax_insn (code_for_pred_eqne_scalar (vmode),
                       riscv_vector::COMPARE_OP, vmsops, cnt);
 
@@ -1066,9 +1078,18 @@ expand_rawmemchr (machine_mode mode, rtx dst, rtx src, rtx pat)
   rtx test = gen_rtx_LT (VOIDmode, end, const0_rtx);
   emit_jump_insn (gen_cbranch4 (Pmode, test, end, const0_rtx, loop));
 
-  /* We found something at SRC + END * [1,2,4,8].  */
-  emit_insn (gen_rtx_SET (end, gen_rtx_ASHIFT (Pmode, end, GEN_INT (shift))));
-  emit_insn (gen_rtx_SET (dst, gen_rtx_PLUS (Pmode, src_addr, end)));
+  if (strlen)
+    {
+      /* For strlen, return the length.  */
+      emit_insn (gen_rtx_SET (dst, gen_rtx_PLUS (Pmode, src_addr, end)));
+      emit_insn (gen_rtx_SET (dst, gen_rtx_MINUS (Pmode, dst, start_addr)));
+    }
+  else
+    {
+      /* For rawmemchr, return the position at SRC + END * [1,2,4,8].  */
+      emit_insn (gen_rtx_SET (end, gen_rtx_ASHIFT (Pmode, end, GEN_INT (shift))));
+      emit_insn (gen_rtx_SET (dst, gen_rtx_PLUS (Pmode, src_addr, end)));
+    }
 }
 
 }
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 6056391c6dc..54015eed57c 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3747,13 +3747,9 @@ (define_expand "strlen"
           (match_operand:SI 2 "const_int_operand")
           (match_operand:SI 3 "const_int_operand")]
           UNSPEC_STRLEN))]
-  "riscv_inline_strlen && !optimize_size && (TARGET_ZBB || TARGET_XTHEADBB)"
+  "riscv_inline_strlen && !optimize_size
+    && (TARGET_ZBB || TARGET_XTHEADBB || TARGET_VECTOR)"
 {
-  rtx search_char = operands[2];
-
-  if (search_char != const0_rtx)
-    FAIL;
-
   if (riscv_expand_strlen (operands[0], operands[1], operands[2], operands[3]))
     DONE;
   else
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen-run.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen-run.c
new file mode 100644
index 00000000000..d29297a5f86
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen-run.c
@@ -0,0 +1,37 @@
+/* { dg-do run } */
+/* { dg-additional-options "-O3 -minline-strlen" } */
+
+int
+__attribute__ ((noipa))
+foo (const char *s)
+{
+  return __builtin_strlen (s);
+}
+
+int
+__attribute__ ((noipa))
+foo2 (const char *s)
+{
+  int n = 0;
+  while (*s++ != '\0')
+    {
+      asm volatile ("");
+      n++;
+    }
+  return n;
+}
+
+#define SZ 10
+
+int main ()
+{
+  const char *s[SZ]
+    = {"", "asdf", "0", "\0", "!@#$%***m1123fdnmoi43",
+       "a", "z", "1", "9", "12345678901234567889012345678901234567890"};
+
+  for (int i = 0; i < SZ; i++)
+    {
+      if (foo (s[i]) != foo2 (s[i]))
+        __builtin_abort ();
+    }
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen.c
new file mode 100644
index 00000000000..0c6cca63ebf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/builtin/strlen.c
@@ -0,0 +1,12 @@
+/* { dg-do compile { target { riscv_v } } } */
+/* { dg-additional-options "-O3 -minline-strlen" } */
+
+int
+__attribute__ ((noipa))
+foo (const char *s)
+{
+  return __builtin_strlen (s);
+}
+
+/* { dg-final { scan-assembler-times "vle8ff" 1 } } */
+/* { dg-final { scan-assembler-times "vfirst.m" 1 } } */
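
Note on the rawmemchr/strlen relationship: the cover text says rawmemchr
returns the address of the needle while strlen returns the difference
between that address and the start address.  A scalar C model of this
relationship might look like the sketch below.  It is only an
illustration and not the RTL the expander emits: model_rawmemchr and
model_strlen are hypothetical names that do not appear in the patch, and
the byte-wise loop merely stands in for the vector loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar stand-in for rawmemchr (not part of the patch):
   return the address of the first byte equal to NEEDLE.  As with the
   real function, the needle is assumed to be present in the haystack;
   otherwise the behavior is undefined.  */
const char *
model_rawmemchr (const char *haystack, char needle)
{
  assert (haystack != NULL);
  while (*haystack != needle)
    haystack++;
  return haystack;
}

/* strlen expressed as "needle address minus start address", with the
   terminating NUL byte as the needle.  */
size_t
model_strlen (const char *s)
{
  return (size_t) (model_rawmemchr (s, '\0') - s);
}
```

Under these assumptions model_strlen ("asdf") evaluates to 4 and
model_strlen ("") to 0, which corresponds to the foo/foo2 comparison in
strlen-run.c.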