From patchwork Wed Sep 6 16:07:32 2023
X-Patchwork-Submitter: Christoph Müllner
X-Patchwork-Id: 13793
From: Christoph Muellner
To: gcc-patches@gcc.gnu.org, Kito Cheng, Jim Wilson, Palmer Dabbelt, Andrew Waterman, Philipp Tomsich, Jeff Law, Vineet Gupta
Subject: [PATCH v2 0/2] riscv: Introduce strlen/strcmp/strncmp inline expansion
Date: Wed, 6 Sep 2023 18:07:32 +0200
Message-ID: <20230906160734.2422522-1-christoph.muellner@vrull.eu>
X-Mailer: git-send-email 2.41.0

From: Christoph Müllner

This series introduces strlen/strcmp/strncmp inline expansion for Zbb/XTheadBb.

In recent months, both glibc and the Linux kernel have merged changes for optimized string processing on RISC-V. The instruction that enables these optimized string routines is Zbb's orc.b (or T-Head's th.tstnbz).
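
To illustrate why orc.b is useful for string routines, here is a minimal sketch in portable C that emulates the instruction on a 64-bit value and uses it to test a whole word for a NUL byte. The helper names orc_b_emulated and word_has_nul are hypothetical and exist only for this illustration; they are not taken from the patches. As far as I understand, th.tstnbz yields the inverted mask (0xff for zero bytes), so the same idea applies with the comparison flipped.

```c
#include <stdint.h>

/* Software emulation of Zbb's orc.b for a 64-bit value (illustrative only):
   every byte of the result is 0xff if the corresponding input byte is
   non-zero, and 0x00 if the input byte is zero.  */
static uint64_t orc_b_emulated(uint64_t x)
{
    uint64_t result = 0;
    for (int i = 0; i < 8; i++) {
        if ((x >> (8 * i)) & 0xff)
            result |= (uint64_t)0xff << (8 * i);
    }
    return result;
}

/* A word contains at least one NUL byte exactly when the orc.b result
   is not all ones.  */
static int word_has_nul(uint64_t x)
{
    return orc_b_emulated(x) != UINT64_MAX;
}
```

With a single instruction doing the work of orc_b_emulated, a string routine can check eight bytes for the terminator with one operation and one compare instead of eight byte loads and compares.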

This series attempts to add optimized string processing to GCC with the following properties:
* strlen: inline a loop if the string is xlen-aligned
* strcmp/strncmp: inline a peeled comparison loop sequence if both strings are xlen-aligned

I already posted the idea in a previous series last November (which is why this series is called 'v2'):
* https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605996.html
* https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605998.html

The comments received back then have been addressed. In addition, the str(n)cmp patch has been restructured to make the code easier to digest. In total, the following changes were made:
* Address Jeff's comments for the strlen patch
* Change str(n)cmp flags according to Kito's comments
* Ensure that all flags are documented
* Break str(n)cmp expansion into several functions
* Add support for XTheadBb's th.tstnbz

I have not introduced "-minline-str[n]cmp=[bitmanip|vector|auto]" or "-mstringop-strategy=alg" because we only have one bitmanip/scalar expansion. It is still possible to add this in the future (or to skip it and decide based on mtune).

By default, all optimizations are disabled, so there should be no risk of regressions.
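
The strlen and str(n)cmp properties listed above can be sketched in portable C as follows. This is not the RTL emitted by the patches, only a rough model of the word-at-a-time idea under stated assumptions: xlen is 64 bits, the inputs are 8-byte aligned, and the buffers are padded to a multiple of 8 bytes so that whole-word reads stay inside the object. The names orc_b_emulated, strlen_aligned_sketch and strcmp_aligned_sketch, as well as the limit parameter (modelling how many bytes are compared inline before falling back to the library call), are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Emulation of orc.b: 0xff per non-zero input byte, 0x00 per zero byte.  */
static uint64_t orc_b_emulated(uint64_t x)
{
    uint64_t result = 0;
    for (int i = 0; i < 8; i++)
        if ((x >> (8 * i)) & 0xff)
            result |= (uint64_t)0xff << (8 * i);
    return result;
}

/* strlen sketch: scan one aligned 64-bit word per iteration.
   Assumes s is 8-byte aligned and its buffer is padded to 8 bytes.  */
static size_t strlen_aligned_sketch(const char *s)
{
    for (size_t off = 0;; off += sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, s + off, sizeof word);
        if (orc_b_emulated(word) != UINT64_MAX) {
            /* The terminator is in this word; locate it bytewise.
               Real code would likely use a count-zeros instruction.  */
            for (size_t i = 0; i < sizeof word; i++)
                if (s[off + i] == '\0')
                    return off + i;
        }
    }
}

/* strcmp sketch: compare up to `limit` bytes word by word, then fall
   back to the library routine.  Same alignment/padding assumptions.  */
static int strcmp_aligned_sketch(const char *a, const char *b, size_t limit)
{
    size_t off = 0;
    while (off + sizeof(uint64_t) <= limit) {
        uint64_t wa, wb;
        memcpy(&wa, a + off, sizeof wa);
        memcpy(&wb, b + off, sizeof wb);
        if (wa != wb || orc_b_emulated(wa) != UINT64_MAX) {
            /* Difference or terminator in this word: resolve bytewise.  */
            for (size_t i = 0; i < sizeof wa; i++) {
                unsigned char ca = (unsigned char)a[off + i];
                unsigned char cb = (unsigned char)b[off + i];
                if (ca != cb || ca == 0)
                    return (int)ca - (int)cb;
            }
        }
        off += sizeof(uint64_t);
    }
    /* Inline budget exhausted with all words equal: use the library.  */
    return strcmp(a + off, b + off);
}
```

The bytewise resolution step keeps the sketch independent of endianness. The alignment requirement matters because an aligned word load cannot cross a page boundary, which is what makes reading a few bytes beyond the terminator safe on real hardware.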

Testing was done using the following strategy:
* Enablement/flag tests are part of the patches
* Correctness was tested using qemu-user with glibc's string tests compiled for:
** rv64gc (baseline) QEMU_CPU=rv64
** rv64gc_zbb (limit=64) QEMU_CPU=rv64,zbb=false (must fail)
** rv64gc_zbb (limit=64) QEMU_CPU=rv64,zbb=true
** rv64gc_zbb (limit=32) QEMU_CPU=rv64,zbb=true
** rv64gc_xtheadbb (limit=64) QEMU_CPU=rv64 (must fail)
** rv64gc_xtheadbb (limit=64) QEMU_CPU=thead-c906
** rv64gc_xtheadbb (limit=8) QEMU_CPU=thead-c906
** rv32gc_zbb (limit=64) QEMU_CPU=rv32,zbb=true
* SPEC CPU 2017 intrate base/peak with LTO

Christoph Müllner (2):
  riscv: Add support for strlen inline expansion
  riscv: Add support for str(n)cmp inline expansion

 gcc/config.gcc                                |   3 +-
 gcc/config/riscv/bitmanip.md                  |   2 +-
 gcc/config/riscv/riscv-protos.h               |   4 +
 gcc/config/riscv/riscv-string.cc              | 594 ++++++++++++++++++
 gcc/config/riscv/riscv.md                     |  72 ++-
 gcc/config/riscv/riscv.opt                    |  16 +
 gcc/config/riscv/t-riscv                      |   6 +
 gcc/config/riscv/thead.md                     |   9 +-
 gcc/doc/invoke.texi                           |  29 +-
 gcc/emit-rtl.cc                               |  24 +
 gcc/rtl.h                                     |   2 +
 .../gcc.target/riscv/xtheadbb-strcmp.c        |  57 ++
 .../riscv/xtheadbb-strlen-unaligned.c         |  14 +
 .../gcc.target/riscv/xtheadbb-strlen.c        |  19 +
 .../gcc.target/riscv/zbb-strcmp-disabled-2.c  |  38 ++
 .../gcc.target/riscv/zbb-strcmp-disabled.c    |  38 ++
 .../gcc.target/riscv/zbb-strcmp-limit.c       |  57 ++
 .../gcc.target/riscv/zbb-strcmp-unaligned.c   |  38 ++
 gcc/testsuite/gcc.target/riscv/zbb-strcmp.c   |  57 ++
 .../gcc.target/riscv/zbb-strlen-disabled-2.c  |  15 +
 .../gcc.target/riscv/zbb-strlen-disabled.c    |  15 +
 .../gcc.target/riscv/zbb-strlen-unaligned.c   |  14 +
 gcc/testsuite/gcc.target/riscv/zbb-strlen.c   |  19 +
 23 files changed, 1137 insertions(+), 5 deletions(-)
 create mode 100644 gcc/config/riscv/riscv-string.cc
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadbb-strcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadbb-strlen-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadbb-strlen.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-disabled-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-disabled.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-limit.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strcmp.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-disabled.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen-unaligned.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-strlen.c