From patchwork Wed May 24 05:47:07 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexandre Oliva
X-Patchwork-Id: 98299
To: gcc-patches@gcc.gnu.org, "H.J. Lu"
Cc: Jan Hubicka, Uros Bizjak
Subject: [PATCH] [x86] reenable dword MOVE_MAX for better memmove inlining
Organization: Free thinker, does not speak for AdaCore
Date: Wed, 24 May 2023 02:47:07 -0300
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/27.1 (gnu/linux)
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Alexandre Oliva via Gcc-patches
From: Alexandre Oliva
Reply-To: Alexandre Oliva

MOVE_MAX on x86* used to accept up to 16 bytes, even without SSE,
which enabled inlining of small memmove by loading and then storing
the entire range.  After the "x86: Update piecewise move and store"
r12-2666 change, memmove of more than 4 bytes would not be inlined in
gimple_fold_builtin_memory_op, failing the expectations of a few
tests.

I can see how lowering it for MOVE_MAX_PIECES can get us better
codegen decisions overall, but surely inlining memmove with 2 32-bit
loads and stores is better than an out-of-line call that requires
setting up 3 arguments.  I suppose even 3 or 4 could do better.  But
maybe it is gimple_fold_builtin_memory_op that needs tweaking?

Anyhow, this patch raises MOVE_MAX back a little for non-SSE targets,
while preserving the new value for MOVE_MAX_PIECES.

Bootstrapped on x86_64-linux-gnu.
Also tested on ppc- and x86-vx7r2 with gcc-12.


for  gcc/ChangeLog

	* config/i386/i386.h (MOVE_MAX): Rename to...
	(MOVE_MAX_VEC): ... this.  Add NONVEC parameter, and use it as
	the last resort, instead of UNITS_PER_WORD.
	(MOVE_MAX): Reintroduce in terms of MOVE_MAX_VEC, with
	2*UNITS_PER_WORD.
	(MOVE_MAX_PIECES): Likewise, but with UNITS_PER_WORD.
---
 gcc/config/i386/i386.h |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index c7439f89bdf92..5293a332a969a 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -1801,7 +1801,9 @@ typedef struct ix86_args {
    is the number of bytes at a time which we can move efficiently.
    MOVE_MAX_PIECES defaults to MOVE_MAX.  */
 
-#define MOVE_MAX \
+#define MOVE_MAX MOVE_MAX_VEC (2 * UNITS_PER_WORD)
+#define MOVE_MAX_PIECES MOVE_MAX_VEC (UNITS_PER_WORD)
+#define MOVE_MAX_VEC(NONVEC) \
   ((TARGET_AVX512F \
     && (ix86_move_max == PVW_AVX512 \
         || ix86_store_max == PVW_AVX512)) \
@@ -1813,7 +1815,7 @@ typedef struct ix86_args {
       : ((TARGET_SSE2 \
           && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL \
           && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
-         ? 16 : UNITS_PER_WORD)))
+         ? 16 : (NONVEC))))
 
 /* STORE_MAX_PIECES is the number of bytes at a time that we can store
    efficiently.  Allow 16/32/64 bytes only if inter-unit move is enabled
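
For illustration only, here is a minimal standalone sketch of the idea
behind the patch.  It is not GCC code: the MODEL_* macros and
move8_by_words are hypothetical names, and the flags are hard-coded to
model a 32-bit target with no usable vector moves.  It shows how passing
a different NONVEC fallback gives MOVE_MAX and MOVE_MAX_PIECES different
values, and what an 8-byte memmove done with two 32-bit loads followed
by two 32-bit stores looks like.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the target conditions tested in i386.h.
   Assumption: 4-byte words and no 16-byte vector moves available.  */
#define MODEL_UNITS_PER_WORD 4
#define MODEL_HAVE_VEC16 0

/* Same shape as the patched macro, reduced to one vector tier: the
   vector size wins when available, otherwise the caller-supplied
   NONVEC value is the last resort.  */
#define MODEL_MOVE_MAX_VEC(NONVEC) (MODEL_HAVE_VEC16 ? 16 : (NONVEC))
#define MODEL_MOVE_MAX MODEL_MOVE_MAX_VEC (2 * MODEL_UNITS_PER_WORD)
#define MODEL_MOVE_MAX_PIECES MODEL_MOVE_MAX_VEC (MODEL_UNITS_PER_WORD)

/* With these settings the two limits differ: 8 versus 4 bytes.  */
_Static_assert (MODEL_MOVE_MAX == 8, "double-word fallback");
_Static_assert (MODEL_MOVE_MAX_PIECES == 4, "single-word fallback");

/* Move 8 bytes the way an inlined memmove could on such a target:
   load the entire range into two 32-bit temporaries first, and only
   then store.  Because every load precedes every store, overlapping
   source and destination ranges are handled correctly.  The fixed-size
   memcpy calls are the portable spelling of unaligned word accesses.  */
static void
move8_by_words (void *dst, const void *src)
{
  uint32_t lo, hi;
  memcpy (&lo, src, 4);
  memcpy (&hi, (const char *) src + 4, 4);
  memcpy (dst, &lo, 4);
  memcpy ((char *) dst + 4, &hi, 4);
}
```

Calling move8_by_words (buf + 2, buf) on a buffer holding
"abcdefghijkl" leaves "ababcdefghkl" in it, the same result memmove
(buf + 2, buf, 8) would give.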