From patchwork Wed Jan 3 15:42:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: xndcn
X-Patchwork-Id: 184784
From: xndcn
Date: Wed, 3 Jan 2024 23:42:58 +0800
Subject: Ping: [PATCH] enable ATOMIC_COMPARE_EXCHANGE opt for floating type or types contains padding
To: gcc-patches@gcc.gnu.org, jakub@redhat.com
List-Id: Gcc-patches mailing list

Hi,

I am new to this and would really appreciate your advice.

I noticed PR71716, and I would like to enable the ATOMIC_COMPARE_EXCHANGE
internal-fn optimization for floating-point types and for types that contain
padding (e.g., long double). Please correct me if I have made any mistakes.
Thanks!

Firstly, regarding the concern about sNaN float/double values: this seems to
work well, and it should already be covered by
testsuite/gcc.dg/atomic/c11-atomic-exec-5.c.

Secondly, since ATOMIC_COMPARE_EXCHANGE is only enabled when the expected var
is addressable only because of the call, the padding bits cannot be modified
by any other stmts. So we can save all the bits after the
ATOMIC_COMPARE_EXCHANGE call and extract the padding bits.
After the first iteration, the extracted padding bits can be mixed with the
expected var.

Bootstrapped/regtested on x86_64-linux. I did some benchmarks, and there is a
significant speedup for float/double types, while there is no regression for
the long double type.

Thanks,
xndcn

gcc/ChangeLog:

	* gimple-fold.cc (optimize_atomic_compare_exchange_p): Enable for
	SCALAR_FLOAT_TYPE_P type of expected var, or if TYPE_PRECISION is
	different from mode's bitsize.
	(fold_builtin_atomic_compare_exchange): If TYPE_PRECISION is
	different from mode's bitsize, try to keep track of all the bits
	and mix them with VIEW_CONVERT_EXPR(expected).

Signed-off-by: xndcn
---
 gcc/gimple-fold.cc | 77 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 71 insertions(+), 6 deletions(-)

diff --git a/gcc/gimple-fold.cc b/gcc/gimple-fold.cc
index cb4b57250..321ff4f41 100644
--- a/gcc/gimple-fold.cc
+++ b/gcc/gimple-fold.cc
@@ -5306,12 +5306,7 @@ optimize_atomic_compare_exchange_p (gimple *stmt)
       || !auto_var_in_fn_p (TREE_OPERAND (expected, 0), current_function_decl)
       || TREE_THIS_VOLATILE (etype)
       || VECTOR_TYPE_P (etype)
-      || TREE_CODE (etype) == COMPLEX_TYPE
-      /* Don't optimize floating point expected vars, VIEW_CONVERT_EXPRs
-         might not preserve all the bits. See PR71716. */
-      || SCALAR_FLOAT_TYPE_P (etype)
-      || maybe_ne (TYPE_PRECISION (etype),
-                   GET_MODE_BITSIZE (TYPE_MODE (etype))))
+      || TREE_CODE (etype) == COMPLEX_TYPE)
     return false;
 
   tree weak = gimple_call_arg (stmt, 3);
@@ -5350,8 +5345,10 @@ fold_builtin_atomic_compare_exchange (gimple_stmt_iterator *gsi)
   tree itype = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (parmt)));
   tree ctype = build_complex_type (itype);
   tree expected = TREE_OPERAND (gimple_call_arg (stmt, 1), 0);
+  tree etype = TREE_TYPE (expected);
   bool throws = false;
   edge e = NULL;
+  tree allbits = NULL_TREE;
   gimple *g = gimple_build_assign (make_ssa_name (TREE_TYPE (expected)),
                                    expected);
   gsi_insert_before (gsi, g, GSI_SAME_STMT);
@@ -5362,6 +5359,67 @@ fold_builtin_atomic_compare_exchange (gimple_stmt_iterator *gsi)
                                build1 (VIEW_CONVERT_EXPR, itype,
                                        gimple_assign_lhs (g)));
       gsi_insert_before (gsi, g, GSI_SAME_STMT);
+
+      // VIEW_CONVERT_EXPRs might not preserve all the bits. See PR71716.
+      // so we have to keep track all bits here.
+      if (maybe_ne (TYPE_PRECISION (etype),
+                    GET_MODE_BITSIZE (TYPE_MODE (etype))))
+        {
+          gimple_stmt_iterator cgsi
+            = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
+          allbits = create_tmp_var (itype);
+          // allbits is initialized to 0, which can be ignored first time
+          gimple *init_stmt
+            = gimple_build_assign (allbits, build_int_cst (itype, 0));
+          gsi_insert_before (&cgsi, init_stmt, GSI_SAME_STMT);
+          tree maskbits = create_tmp_var (itype);
+          // maskbits is initialized to full 1 (0xFFF...)
+          init_stmt = gimple_build_assign (maskbits, build1 (BIT_NOT_EXPR,
+                                                             itype, allbits));
+          gsi_insert_before (&cgsi, init_stmt, GSI_SAME_STMT);
+
+          // g = g & maskbits
+          g = gimple_build_assign (make_ssa_name (itype),
+                                   build2 (BIT_AND_EXPR, itype,
+                                           gimple_assign_lhs (g), maskbits));
+          gsi_insert_before (gsi, g, GSI_SAME_STMT);
+
+          gimple *def_mask = gimple_build_assign (
+            make_ssa_name (itype),
+            build2 (LSHIFT_EXPR, itype, build_int_cst (itype, 1),
+                    build_int_cst (itype, TYPE_PRECISION (etype))));
+          gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+          def_mask = gimple_build_assign (make_ssa_name (itype),
+                                          build2 (MINUS_EXPR, itype,
+                                                  gimple_assign_lhs (def_mask),
+                                                  build_int_cst (itype, 1)));
+          gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+          // maskbits = (1 << TYPE_PRECISION (etype)) - 1
+          def_mask = gimple_build_assign (maskbits, SSA_NAME,
+                                          gimple_assign_lhs (def_mask));
+          gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+
+          // paddingbits = (~maskbits) & allbits
+          def_mask
+            = gimple_build_assign (make_ssa_name (itype),
+                                   build1 (BIT_NOT_EXPR, itype,
+                                           gimple_assign_lhs (def_mask)));
+          gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+          def_mask
+            = gimple_build_assign (make_ssa_name (itype),
+                                   build2 (BIT_AND_EXPR, itype, allbits,
+                                           gimple_assign_lhs (def_mask)));
+          gsi_insert_before (gsi, def_mask, GSI_SAME_STMT);
+
+          // g = g | paddingbits, i.e.,
+          // g = (VIEW_CONVERT_EXPR(expected) & maskbits)
+          //       | (allbits &(~maskbits))
+          g = gimple_build_assign (make_ssa_name (itype),
+                                   build2 (BIT_IOR_EXPR, itype,
+                                           gimple_assign_lhs (g),
+                                           gimple_assign_lhs (def_mask)));
+          gsi_insert_before (gsi, g, GSI_SAME_STMT);
+        }
     }
   int flag = (integer_onep (gimple_call_arg (stmt, 3)) ? 256 : 0)
              + int_size_in_bytes (itype);
@@ -5410,6 +5468,13 @@ fold_builtin_atomic_compare_exchange (gimple_stmt_iterator *gsi)
   gsi_insert_after (gsi, g, GSI_NEW_STMT);
   if (!useless_type_conversion_p (TREE_TYPE (expected), itype))
     {
+      // save all bits here
+      if (maybe_ne (TYPE_PRECISION (etype),
+                    GET_MODE_BITSIZE (TYPE_MODE (etype))))
+        {
+          g = gimple_build_assign (allbits, SSA_NAME, gimple_assign_lhs (g));
+          gsi_insert_after (gsi, g, GSI_NEW_STMT);
+        }
       g = gimple_build_assign (make_ssa_name (TREE_TYPE (expected)),
                                VIEW_CONVERT_EXPR,
                                build1 (VIEW_CONVERT_EXPR, TREE_TYPE (expected),
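
For readers who want to follow the bit mixing without reading the GIMPLE
builders, here is a minimal standalone sketch of the same arithmetic. It is
not part of the patch and does not use any GCC internals. The struct
padding_tracker, its members prepare and record, and the 64-bit container are
all hypothetical names and sizes chosen only for illustration. The real code
works on itype and TYPE_PRECISION (etype), e.g. an 80-bit long double held in
a 128-bit integer type on x86_64.

```cpp
#include <cstdint>

// Hypothetical model, not GCC code: a 64-bit container whose low
// `precision` bits hold the value and whose high bits are padding.
struct padding_tracker
{
  unsigned precision;        // number of value bits, assumed to be < 64
  std::uint64_t allbits;     // last full bit pattern observed in memory
  std::uint64_t maskbits;    // all ones until the first attempt has run

  explicit padding_tracker (unsigned p)
    : precision (p), allbits (0), maskbits (~std::uint64_t{0})
  {}

  // Build the integer image handed to the compare-exchange, following the
  // order of operations in the patch:
  //   g = g & maskbits
  //   maskbits = (1 << precision) - 1
  //   paddingbits = (~maskbits) & allbits
  //   g = g | paddingbits
  std::uint64_t prepare (std::uint64_t expected_bits)
  {
    std::uint64_t g = expected_bits & maskbits;
    maskbits = (std::uint64_t{1} << precision) - 1;
    std::uint64_t paddingbits = ~maskbits & allbits;
    return g | paddingbits;
  }

  // Corresponds to "save all bits here" after the compare-exchange.
  void record (std::uint64_t observed_bits) { allbits = observed_bits; }
};
```

On the first attempt maskbits is still all ones and allbits is 0, so the
image of expected is passed through unchanged. On later attempts the value
bits come from expected and the padding bits come from the pattern saved
after the previous attempt.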