From patchwork Mon Jul 17 09:02:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sylvain Noiry
X-Patchwork-Id: 121137
To: gcc-patches@gcc.gnu.org
Cc: Sylvain Noiry
Subject: [PATCH 1/9] Native complex operations: Conditional lowering
Date: Mon, 17 Jul 2023 11:02:42 +0200
Message-ID: <20230717090250.4645-2-snoiry@kalrayinc.com>
In-Reply-To: <20230717090250.4645-1-snoiry@kalrayinc.com>
References: <20230717090250.4645-1-snoiry@kalrayinc.com>
From: Sylvain Noiry
Reply-To: Sylvain Noiry

Allow the cplxlower pass to identify if an operation does not need to be
lowered through optabs. In this case, lowering is not performed. The cplxlower
pass now has to handle a mix of lowered and non-lowered operations. A quick
access to both parts of a complex constant is also implemented.

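As an illustration only (not part of the patch), the decision described above
can be modelled outside GCC. This is a minimal C++ sketch under stated
assumptions: Lattice, Op, Mode, NativeTable and keep_native are hypothetical
names, and the table lookup merely stands in for the optab query. It ignores
the constant-component check that the patch also performs.

```cpp
#include <cassert>
#include <set>
#include <utility>

// Hypothetical stand-ins for the pass's lattice values, the operation
// codes and the complex machine modes.  None of these are GCC types.
enum class Lattice { Uninitialized, OnlyReal, OnlyImag, Varying };
enum class Op { Plus, Minus, Mult, Div };
enum class Mode { SC, DC };

// Hypothetical table of (operation, complex mode) pairs for which the
// backend provides a pattern; it plays the role of the optab lookup.
using NativeTable = std::set<std::pair<Op, Mode>>;

// Returns true when the operation may stay a single complex operation,
// false when it should be split into real and imaginary scalar parts.
bool keep_native(const NativeTable &table, Op op, Mode mode,
                 Lattice a, Lattice b)
{
  // Only fully varying operands are candidates.  B is Uninitialized
  // for unary operations, which is accepted as well.
  if (a != Lattice::Varying)
    return false;
  if (b != Lattice::Varying && b != Lattice::Uninitialized)
    return false;

  // Otherwise keep the operation only if the target advertises it.
  return table.count({op, mode}) != 0;
}
```

With a table that contains only (Plus, SC), a varying addition in mode SC is
kept as is, while a multiplication in the same mode falls back to the usual
scalar lowering.
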
gcc/lto/ChangeLog:

	* lto-common.cc (compare_tree_sccs_1): Handle both parts of a
	complex constant

gcc/ChangeLog:

	* coretypes.h: Add enum for complex parts
	* gensupport.cc (match_pattern): Add complex types
	* lto-streamer-out.cc (DFS::DFS_write_tree_body):
	(hash_tree): Handle both parts of a complex constant
	* tree-complex.cc (get_component_var): Support handling of
	both parts of a complex
	(get_component_ssa_name): Likewise
	(set_component_ssa_name): Likewise
	(extract_component): Likewise
	(update_complex_components): Likewise
	(update_complex_components_on_edge): Likewise
	(update_complex_assignment): Likewise
	(update_phi_components): Likewise
	(expand_complex_move): Likewise
	(expand_complex_asm): Update with complex_part_t
	(complex_component_cst_p): New: check if a complex
	component is a constant
	(target_native_complex_operation): New: Check if complex
	operation is supported natively by the backend, through
	the optab
	(expand_complex_operations_1): Conditionally lowered ops
	(tree_lower_complex): Support handling of both parts of
	a complex
	* tree-core.h (struct GTY): Add field for both parts of
	the tree_complex struct
	* tree-streamer-in.cc (lto_input_ts_complex_tree_pointers):
	Handle both parts of a complex constant
	* tree-streamer-out.cc (write_ts_complex_tree_pointers):
	Likewise
	* tree.cc (build_complex): Likewise
	* tree.h (class auto_suppress_location_wrappers):
	(type_has_mode_precision_p): Add special case for complex
---
 gcc/coretypes.h          |   9 +
 gcc/gensupport.cc        |   2 +
 gcc/lto-streamer-out.cc  |   2 +
 gcc/lto/lto-common.cc    |   2 +
 gcc/tree-complex.cc      | 434 +++++++++++++++++++++++++++++----------
 gcc/tree-core.h          |   1 +
 gcc/tree-streamer-in.cc  |   1 +
 gcc/tree-streamer-out.cc |   1 +
 gcc/tree.cc              |   8 +
 gcc/tree.h               |  15 +-
 10 files changed, 363 insertions(+), 112 deletions(-)

diff --git a/gcc/coretypes.h b/gcc/coretypes.h
index ca8837cef67..a000c104b53 100644
--- a/gcc/coretypes.h
+++ b/gcc/coretypes.h
@@ -443,6 +443,15 @@ enum optimize_size_level
   OPTIMIZE_SIZE_MAX
 };
 
+/* part of a complex */
+
+typedef enum
+{
+  REAL_P = 0,
+  IMAG_P = 1,
+  BOTH_P = 2
+} complex_part_t;
+
 /* Support for user-provided GGC and PCH markers.  The first parameter
    is a pointer to a pointer, the second either NULL if the pointer to
    pointer points into a GC object or the actual pointer address if
diff --git a/gcc/gensupport.cc b/gcc/gensupport.cc
index 959d1d9c83c..9aa2ba69fcd 100644
--- a/gcc/gensupport.cc
+++ b/gcc/gensupport.cc
@@ -3746,9 +3746,11 @@ match_pattern (optab_pattern *p, const char *name, const char *pat)
                     break;
                 if (*p == 0
                     && (! force_int || mode_class[i] == MODE_INT
+                        || mode_class[i] == MODE_COMPLEX_INT
                         || mode_class[i] == MODE_VECTOR_INT)
                     && (! force_partial_int
                         || mode_class[i] == MODE_INT
+                        || mode_class[i] == MODE_COMPLEX_INT
                         || mode_class[i] == MODE_PARTIAL_INT
                         || mode_class[i] == MODE_VECTOR_INT)
                     && (! force_float
diff --git a/gcc/lto-streamer-out.cc b/gcc/lto-streamer-out.cc
index 5ffa8954022..38c48e44867 100644
--- a/gcc/lto-streamer-out.cc
+++ b/gcc/lto-streamer-out.cc
@@ -985,6 +985,7 @@ DFS::DFS_write_tree_body (struct output_block *ob,
     {
       DFS_follow_tree_edge (TREE_REALPART (expr));
       DFS_follow_tree_edge (TREE_IMAGPART (expr));
+      DFS_follow_tree_edge (TREE_COMPLEX_BOTH_PARTS (expr));
     }
 
   if (CODE_CONTAINS_STRUCT (code, TS_DECL_MINIMAL))
@@ -1417,6 +1418,7 @@ hash_tree (struct streamer_tree_cache_d *cache, hash_map *map,
     {
       visit (TREE_REALPART (t));
       visit (TREE_IMAGPART (t));
+      visit (TREE_COMPLEX_BOTH_PARTS (t));
     }
 
   if (CODE_CONTAINS_STRUCT (code, TS_DECL_MINIMAL))
diff --git a/gcc/lto/lto-common.cc b/gcc/lto/lto-common.cc
index 703e665b698..f647ee62f9e 100644
--- a/gcc/lto/lto-common.cc
+++ b/gcc/lto/lto-common.cc
@@ -1408,6 +1408,8 @@ compare_tree_sccs_1 (tree t1, tree t2, tree **map)
     {
       compare_tree_edges (TREE_REALPART (t1), TREE_REALPART (t2));
       compare_tree_edges (TREE_IMAGPART (t1), TREE_IMAGPART (t2));
+      compare_tree_edges (TREE_COMPLEX_BOTH_PARTS (t1),
+                          TREE_COMPLEX_BOTH_PARTS (t2));
     }
 
   if (CODE_CONTAINS_STRUCT (code, TS_DECL_MINIMAL))
diff --git a/gcc/tree-complex.cc b/gcc/tree-complex.cc
index 688fe13989c..63753e4acf4 100644
--- a/gcc/tree-complex.cc
+++ b/gcc/tree-complex.cc
@@ -42,6 +42,10 @@ along with GCC; see the file COPYING3.  If not see
 #include "cfganal.h"
 #include "gimple-fold.h"
 #include "diagnostic-core.h"
+#include "target.h"
+#include "memmodel.h"
+#include "optabs-tree.h"
+#include "internal-fn.h"
 
 
 /* For each complex ssa name, a lattice value.  We're interested in finding
@@ -74,7 +78,7 @@ static vec complex_lattice_values;
    the hashtable.  */
 static int_tree_htab_type *complex_variable_components;
 
-/* For each complex SSA_NAME, a pair of ssa names for the components.  */
+/* For each complex SSA_NAME, three ssa names for the components.  */
 static vec complex_ssa_name_components;
 
 /* Vector of PHI triplets (original complex PHI and corresponding real and
@@ -476,17 +480,27 @@ create_one_component_var (tree type, tree orig, const char *prefix,
 /* Retrieve a value for a complex component of VAR.  */
 
 static tree
-get_component_var (tree var, bool imag_p)
+get_component_var (tree var, complex_part_t part)
 {
-  size_t decl_index = DECL_UID (var) * 2 + imag_p;
+  size_t decl_index = DECL_UID (var) * 3 + part;
   tree ret = cvc_lookup (decl_index);
 
   if (ret == NULL)
     {
-      ret = create_one_component_var (TREE_TYPE (TREE_TYPE (var)), var,
-                                      imag_p ? "CI" : "CR",
-                                      imag_p ? "$imag" : "$real",
-                                      imag_p ? IMAGPART_EXPR : REALPART_EXPR);
+      switch (part)
+        {
+        case REAL_P:
+          ret = create_one_component_var (TREE_TYPE (TREE_TYPE (var)), var,
+                                          "CR", "$real", REALPART_EXPR);
+          break;
+        case IMAG_P:
+          ret = create_one_component_var (TREE_TYPE (TREE_TYPE (var)), var,
+                                          "CI", "$imag", IMAGPART_EXPR);
+          break;
+        case BOTH_P:
+          ret = var;
+          break;
+        }
       cvc_insert (decl_index, ret);
     }
 
@@ -496,13 +510,15 @@ get_component_var (tree var, bool imag_p)
 /* Retrieve a value for a complex component of SSA_NAME.  */
 
 static tree
-get_component_ssa_name (tree ssa_name, bool imag_p)
+get_component_ssa_name (tree ssa_name, complex_part_t part)
 {
   complex_lattice_t lattice = find_lattice_value (ssa_name);
   size_t ssa_name_index;
   tree ret;
 
-  if (lattice == (imag_p ? ONLY_REAL : ONLY_IMAG))
+  if (((lattice == ONLY_IMAG) && (part == REAL_P))
+      || ((lattice == ONLY_REAL) && (part == IMAG_P)))
+
     {
       tree inner_type = TREE_TYPE (TREE_TYPE (ssa_name));
       if (SCALAR_FLOAT_TYPE_P (inner_type))
@@ -511,14 +527,33 @@ get_component_ssa_name (tree ssa_name, bool imag_p)
         return build_int_cst (inner_type, 0);
     }
 
-  ssa_name_index = SSA_NAME_VERSION (ssa_name) * 2 + imag_p;
+  if (part == BOTH_P)
+    return ssa_name;
+
+  ssa_name_index = SSA_NAME_VERSION (ssa_name) * 3 + part;
+
+  /* increase size of dynamic array if needed */
+  if (ssa_name_index >= complex_ssa_name_components.length ())
+    {
+      complex_ssa_name_components.safe_grow_cleared
+        (2 * complex_ssa_name_components.length (), true);
+      complex_lattice_values.safe_grow_cleared
+        (2 * complex_lattice_values.length (), true);
+    }
+
   ret = complex_ssa_name_components[ssa_name_index];
   if (ret == NULL)
     {
       if (SSA_NAME_VAR (ssa_name))
-        ret = get_component_var (SSA_NAME_VAR (ssa_name), imag_p);
+        ret = get_component_var (SSA_NAME_VAR (ssa_name), part);
       else
-        ret = TREE_TYPE (TREE_TYPE (ssa_name));
+        {
+          if (part == BOTH_P)
+            ret = TREE_TYPE (ssa_name);
+          else
+            ret = TREE_TYPE (TREE_TYPE (ssa_name));
+        }
+
       ret = make_ssa_name (ret);
 
       /* Copy some properties from the original.  In particular, whether it
@@ -542,7 +577,7 @@ get_component_ssa_name (tree ssa_name, bool imag_p)
    gimple_seq of stuff that needs doing.  */
 
 static gimple_seq
-set_component_ssa_name (tree ssa_name, bool imag_p, tree value)
+set_component_ssa_name (tree ssa_name, complex_part_t part, tree value)
 {
   complex_lattice_t lattice = find_lattice_value (ssa_name);
   size_t ssa_name_index;
@@ -553,14 +588,24 @@ set_component_ssa_name (tree ssa_name, bool imag_p, tree value)
   /* We know the value must be zero, else there's a bug in our lattice
      analysis.  But the value may well be a variable known to contain
      zero.  We should be safe ignoring it.  */
-  if (lattice == (imag_p ? ONLY_REAL : ONLY_IMAG))
+  if (((lattice == ONLY_IMAG) && (part == REAL_P))
+      || ((lattice == ONLY_REAL) && (part == IMAG_P)))
     return NULL;
 
   /* If we've already assigned an SSA_NAME to this component, then this
      means that our walk of the basic blocks found a use before the set.
      This is fine.  Now we should create an initialization for the value
      we created earlier.  */
-  ssa_name_index = SSA_NAME_VERSION (ssa_name) * 2 + imag_p;
+  ssa_name_index = SSA_NAME_VERSION (ssa_name) * 3 + part;
+
+  /* increase size of dynamic array if needed */
+  if (ssa_name_index >= complex_ssa_name_components.length ())
+    {
+      size_t old_size = complex_ssa_name_components.length ();
+      complex_ssa_name_components.safe_grow (2 * old_size, true);
+      complex_lattice_values.safe_grow (2 * old_size, true);
+    }
+
   comp = complex_ssa_name_components[ssa_name_index];
   if (comp)
     ;
@@ -584,7 +629,7 @@ set_component_ssa_name (tree ssa_name, bool imag_p, tree value)
           && (!SSA_NAME_VAR (value) || DECL_IGNORED_P (SSA_NAME_VAR (value)))
           && !DECL_IGNORED_P (SSA_NAME_VAR (ssa_name)))
         {
-          comp = get_component_var (SSA_NAME_VAR (ssa_name), imag_p);
+          comp = get_component_var (SSA_NAME_VAR (ssa_name), part);
           replace_ssa_name_symbol (value, comp);
         }
 
@@ -595,7 +640,7 @@ set_component_ssa_name (tree ssa_name, bool imag_p, tree value)
   /* Finally, we need to stabilize the result by installing the value into
      a new ssa name.  */
   else
-    comp = get_component_ssa_name (ssa_name, imag_p);
+    comp = get_component_ssa_name (ssa_name, part);
 
   /* Do all the work to assign VALUE to COMP.  */
   list = NULL;
@@ -612,13 +657,14 @@ set_component_ssa_name (tree ssa_name, bool imag_p, tree value)
    Emit any new code before gsi.  */
 
 static tree
-extract_component (gimple_stmt_iterator *gsi, tree t, bool imagpart_p,
+extract_component (gimple_stmt_iterator * gsi, tree t, complex_part_t part,
                    bool gimple_p, bool phiarg_p = false)
 {
   switch (TREE_CODE (t))
     {
     case COMPLEX_CST:
-      return imagpart_p ? TREE_IMAGPART (t) : TREE_REALPART (t);
+      return (part == BOTH_P) ? t : (part == IMAG_P) ?
+        TREE_IMAGPART (t) : TREE_REALPART (t);
 
     case COMPLEX_EXPR:
       gcc_unreachable ();
@@ -629,7 +675,7 @@ extract_component (gimple_stmt_iterator *gsi, tree t, bool imagpart_p,
         t = unshare_expr (t);
         TREE_TYPE (t) = inner_type;
         TREE_OPERAND (t, 1) = TYPE_SIZE (inner_type);
-        if (imagpart_p)
+        if (part == IMAG_P)
           TREE_OPERAND (t, 2) = size_binop (PLUS_EXPR, TREE_OPERAND (t, 2),
                                             TYPE_SIZE (inner_type));
         if (gimple_p)
@@ -646,10 +692,11 @@ extract_component (gimple_stmt_iterator *gsi, tree t, bool imagpart_p,
     case VIEW_CONVERT_EXPR:
     case MEM_REF:
       {
-        tree inner_type = TREE_TYPE (TREE_TYPE (t));
-
-        t = build1 ((imagpart_p ? IMAGPART_EXPR : REALPART_EXPR),
-                    inner_type, unshare_expr (t));
+        if (part == BOTH_P)
+          t = unshare_expr (t);
+        else
+          t = build1 (((part == IMAG_P) ? IMAGPART_EXPR : REALPART_EXPR),
+                      (TREE_TYPE (TREE_TYPE (t))), unshare_expr (t));
 
         if (gimple_p)
           t = force_gimple_operand_gsi (gsi, t, true, NULL, true,
@@ -659,10 +706,12 @@ extract_component (gimple_stmt_iterator *gsi, tree t, bool imagpart_p,
       }
 
     case SSA_NAME:
-      t = get_component_ssa_name (t, imagpart_p);
-      if (TREE_CODE (t) == SSA_NAME && SSA_NAME_DEF_STMT (t) == NULL)
-        gcc_assert (phiarg_p);
-      return t;
+      {
+        t = get_component_ssa_name (t, part);
+        if (TREE_CODE (t) == SSA_NAME && SSA_NAME_DEF_STMT (t) == NULL)
+          gcc_assert (phiarg_p);
+        return t;
+      }
 
     default:
       gcc_unreachable ();
@@ -673,18 +722,29 @@ extract_component (gimple_stmt_iterator *gsi, tree t, bool imagpart_p,
 
 static void
 update_complex_components (gimple_stmt_iterator *gsi, gimple *stmt, tree r,
-                           tree i)
+                           tree i, tree b = NULL)
 {
   tree lhs;
   gimple_seq list;
 
+  gcc_assert (b || (r && i));
   lhs = gimple_get_lhs (stmt);
+  if (!b)
+    b = lhs;
+  if (!r)
+    r = build1 (REALPART_EXPR, TREE_TYPE (TREE_TYPE (b)), unshare_expr (b));
+  if (!i)
+    i = build1 (IMAGPART_EXPR, TREE_TYPE (TREE_TYPE (b)), unshare_expr (b));
+
+  list = set_component_ssa_name (lhs, REAL_P, r);
+  if (list)
+    gsi_insert_seq_after (gsi, list, GSI_CONTINUE_LINKING);
 
-  list = set_component_ssa_name (lhs, false, r);
+  list = set_component_ssa_name (lhs, IMAG_P, i);
   if (list)
     gsi_insert_seq_after (gsi, list, GSI_CONTINUE_LINKING);
 
-  list = set_component_ssa_name (lhs, true, i);
+  list = set_component_ssa_name (lhs, BOTH_P, b);
   if (list)
     gsi_insert_seq_after (gsi, list, GSI_CONTINUE_LINKING);
 }
@@ -694,11 +754,11 @@ update_complex_components_on_edge (edge e, tree lhs, tree r, tree i)
 {
   gimple_seq list;
 
-  list = set_component_ssa_name (lhs, false, r);
+  list = set_component_ssa_name (lhs, REAL_P, r);
   if (list)
     gsi_insert_seq_on_edge (e, list);
 
-  list = set_component_ssa_name (lhs, true, i);
+  list = set_component_ssa_name (lhs, IMAG_P, i);
   if (list)
     gsi_insert_seq_on_edge (e, list);
 }
@@ -707,19 +767,24 @@ update_complex_components_on_edge (edge e, tree lhs, tree r, tree i)
 /* Update an assignment to a complex variable in place.  */
 
 static void
-update_complex_assignment (gimple_stmt_iterator *gsi, tree r, tree i)
+update_complex_assignment (gimple_stmt_iterator * gsi, tree r, tree i,
+                           tree b = NULL)
 {
   gimple *old_stmt = gsi_stmt (*gsi);
-  gimple_assign_set_rhs_with_ops (gsi, COMPLEX_EXPR, r, i);
+  if (b == NULL)
+    gimple_assign_set_rhs_with_ops (gsi, COMPLEX_EXPR, r, i);
+  else
+    /* dummy assignment, but pr45569.C fails if removed */
+    gimple_assign_set_rhs_from_tree (gsi, b);
+
   gimple *stmt = gsi_stmt (*gsi);
   update_stmt (stmt);
   if (maybe_clean_or_replace_eh_stmt (old_stmt, stmt))
     bitmap_set_bit (need_eh_cleanup, gimple_bb (stmt)->index);
 
-  update_complex_components (gsi, gsi_stmt (*gsi), r, i);
+  update_complex_components (gsi, gsi_stmt (*gsi), r, i, b);
 }
 
-
 /* Generate code at the entry point of the function to initialize the
    component variables for a complex parameter.  */
 
@@ -768,7 +833,8 @@ update_phi_components (basic_block bb)
 
           for (j = 0; j < 2; j++)
             {
-              tree l = get_component_ssa_name (gimple_phi_result (phi), j > 0);
+              tree l = get_component_ssa_name (gimple_phi_result (phi),
+                                               (complex_part_t) j);
               if (TREE_CODE (l) == SSA_NAME)
                 p[j] = create_phi_node (l, bb);
             }
@@ -779,7 +845,9 @@ update_phi_components (basic_block bb)
               for (j = 0; j < 2; j++)
                 if (p[j])
                   {
-                    comp = extract_component (NULL, arg, j > 0, false, true);
+                    comp =
+                      extract_component (NULL, arg, (complex_part_t) j, false,
+                                         true);
                     if (TREE_CODE (comp) == SSA_NAME
                         && SSA_NAME_DEF_STMT (comp) == NULL)
                       {
@@ -809,13 +877,14 @@ update_phi_components (basic_block bb)
     }
 }
 
+
 /* Expand a complex move to scalars.  */
 
 static void
 expand_complex_move (gimple_stmt_iterator *gsi, tree type)
 {
   tree inner_type = TREE_TYPE (type);
-  tree r, i, lhs, rhs;
+  tree r, i, b, lhs, rhs;
   gimple *stmt = gsi_stmt (*gsi);
 
   if (is_gimple_assign (stmt))
@@ -862,16 +931,13 @@ expand_complex_move (gimple_stmt_iterator *gsi, tree type)
       else
         {
           if (gimple_assign_rhs_code (stmt) != COMPLEX_EXPR)
-            {
-              r = extract_component (gsi, rhs, 0, true);
-              i = extract_component (gsi, rhs, 1, true);
-            }
+            update_complex_assignment (gsi, NULL, NULL,
+                                       extract_component (gsi, rhs,
+                                                          BOTH_P, true));
           else
-            {
-              r = gimple_assign_rhs1 (stmt);
-              i = gimple_assign_rhs2 (stmt);
-            }
-          update_complex_assignment (gsi, r, i);
+            update_complex_assignment (gsi,
+                                       gimple_assign_rhs1 (stmt),
+                                       gimple_assign_rhs2 (stmt), NULL);
         }
     }
   else if (rhs
@@ -883,24 +949,18 @@ expand_complex_move (gimple_stmt_iterator *gsi, tree type)
       location_t loc;
 
       loc = gimple_location (stmt);
-      r = extract_component (gsi, rhs, 0, false);
-      i = extract_component (gsi, rhs, 1, false);
-
-      x = build1 (REALPART_EXPR, inner_type, unshare_expr (lhs));
-      t = gimple_build_assign (x, r);
-      gimple_set_location (t, loc);
-      gsi_insert_before (gsi, t, GSI_SAME_STMT);
+      b = extract_component (gsi, rhs, BOTH_P, false);
 
       if (stmt == gsi_stmt (*gsi))
         {
-          x = build1 (IMAGPART_EXPR, inner_type, unshare_expr (lhs));
+          x = unshare_expr (lhs);
           gimple_assign_set_lhs (stmt, x);
-          gimple_assign_set_rhs1 (stmt, i);
+          gimple_assign_set_rhs1 (stmt, b);
         }
       else
         {
-          x = build1 (IMAGPART_EXPR, inner_type, unshare_expr (lhs));
-          t = gimple_build_assign (x, i);
+          x = unshare_expr (lhs);
+          t = gimple_build_assign (x, b);
           gimple_set_location (t, loc);
           gsi_insert_before (gsi, t, GSI_SAME_STMT);
 
@@ -1641,26 +1701,88 @@ expand_complex_asm (gimple_stmt_iterator *gsi)
                 }
               /* Make sure to not ICE later, see PR105165.  */
               tree zero = build_zero_cst (TREE_TYPE (TREE_TYPE (op)));
-              set_component_ssa_name (op, false, zero);
-              set_component_ssa_name (op, true, zero);
+              set_component_ssa_name (op, REAL_P, zero);
+              set_component_ssa_name (op, IMAG_P, zero);
+              set_component_ssa_name (op, BOTH_P, zero);
               continue;
             }
           tree type = TREE_TYPE (op);
           tree inner_type = TREE_TYPE (type);
           tree r = build1 (REALPART_EXPR, inner_type, op);
           tree i = build1 (IMAGPART_EXPR, inner_type, op);
-          gimple_seq list = set_component_ssa_name (op, false, r);
+          tree b = op;
+          gimple_seq list = set_component_ssa_name (op, REAL_P, r);
 
           if (list)
             gsi_insert_seq_after (gsi, list, GSI_CONTINUE_LINKING);
 
-          list = set_component_ssa_name (op, true, i);
+          list = set_component_ssa_name (op, IMAG_P, i);
+          if (list)
+            gsi_insert_seq_after (gsi, list, GSI_CONTINUE_LINKING);
+
+          list = set_component_ssa_name (op, BOTH_P, b);
           if (list)
             gsi_insert_seq_after (gsi, list, GSI_CONTINUE_LINKING);
         }
     }
 }
 
+/* Returns true if a complex component is a constant */
+
+static bool
+complex_component_cst_p (tree cplx, complex_part_t part)
+{
+  switch (TREE_CODE (cplx))
+    {
+    case COMPLEX_CST:
+      return true;
+
+    case SSA_NAME:
+      {
+        size_t ssa_name_index = SSA_NAME_VERSION (cplx) * 3 + part;
+        tree val = complex_ssa_name_components[ssa_name_index];
+        return (val) ? CONSTANT_CLASS_P (val) : false;
+      }
+
+    default:
+      return false;
+    }
+}
+
+/* Returns true if the target supports a particular complex operation natively */
+
+static bool
+target_native_complex_operation (enum tree_code code, tree type,
+                                 tree inner_type, tree ac, tree bc,
+                                 complex_lattice_t al, complex_lattice_t bl)
+{
+  /* Native complex instructions are currently only used when both operands are varying,
+     but a finer grain approach may be interesting */
+  if ((al != VARYING) || ((bl != VARYING) && (bl != UNINITIALIZED)))
+    return false;
+
+  /* do not use native operations when a part of the result is constant */
+  if ((bl == UNINITIALIZED)
+      && (complex_component_cst_p (ac, REAL_P)
+          || complex_component_cst_p (ac, IMAG_P)))
+    return false;
+  else if ((bl != UNINITIALIZED)
+           &&
+           ((complex_component_cst_p (ac, REAL_P)
+             && complex_component_cst_p (bc, REAL_P))
+            || (complex_component_cst_p (ac, IMAG_P)
+                && complex_component_cst_p (bc, IMAG_P))))
+    return false;
+
+  optab op = optab_for_tree_code (code, inner_type, optab_default);
+
+  /* no need to search if operation is not in the optab */
+  if (op == unknown_optab)
+    return false;
+
+  return optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing;
+}
+
 /* Process one statement.  If we identify a complex operation, expand it.  */
 
 static void
@@ -1729,14 +1851,17 @@ expand_complex_operations_1 (gimple_stmt_iterator *gsi)
               && TREE_CODE (lhs) == SSA_NAME)
             {
               rhs = gimple_assign_rhs1 (stmt);
+              enum tree_code rhs_code = gimple_assign_rhs_code (stmt);
               rhs = extract_component (gsi, TREE_OPERAND (rhs, 0),
-                                       gimple_assign_rhs_code (stmt)
-                                       == IMAGPART_EXPR,
-                                       false);
+                                       (rhs_code == IMAGPART_EXPR) ? IMAG_P
+                                       : (rhs_code == REALPART_EXPR) ? REAL_P
+                                       : BOTH_P, false);
               gimple_assign_set_rhs_from_tree (gsi, rhs);
               stmt = gsi_stmt (*gsi);
               update_stmt (stmt);
             }
+          else if (is_gimple_call (stmt))
+            return;
         }
       return;
     }
@@ -1755,19 +1880,6 @@ expand_complex_operations_1 (gimple_stmt_iterator *gsi)
       bc = gimple_cond_rhs (stmt);
     }
 
-  ar = extract_component (gsi, ac, false, true);
-  ai = extract_component (gsi, ac, true, true);
-
-  if (ac == bc)
-    br = ar, bi = ai;
-  else if (bc)
-    {
-      br = extract_component (gsi, bc, 0, true);
-      bi = extract_component (gsi, bc, 1, true);
-    }
-  else
-    br = bi = NULL_TREE;
-
   al = find_lattice_value (ac);
   if (al == UNINITIALIZED)
     al = VARYING;
@@ -1783,44 +1895,142 @@ expand_complex_operations_1 (gimple_stmt_iterator *gsi)
         bl = VARYING;
     }
 
-  switch (code)
+  if (target_native_complex_operation
+      (code, type, inner_type, ac, bc, al, bl))
     {
-    case PLUS_EXPR:
-    case MINUS_EXPR:
-      expand_complex_addition (gsi, inner_type, ar, ai, br, bi, code, al, bl);
-      break;
+      tree ab, bb, rb;
+      gimple_seq stmts = NULL;
+      location_t loc = gimple_location (gsi_stmt (*gsi));
+
+      ab = extract_component (gsi, ac, BOTH_P, true);
+      if (ac == bc)
+        bb = ab;
+      else if (bc)
+        {
+          bb = extract_component (gsi, bc, BOTH_P, true);
+        }
+      else
+        bb = NULL_TREE;
 
-    case MULT_EXPR:
-      expand_complex_multiplication (gsi, type, ar, ai, br, bi, al, bl);
-      break;
+      switch (code)
+        {
+        case PLUS_EXPR:
+        case MINUS_EXPR:
+        case MULT_EXPR:
+          rb = gimple_build (&stmts, loc, code, type, ab, bb);
+          gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+          update_complex_assignment (gsi, NULL, NULL, rb);
+          break;
 
-    case TRUNC_DIV_EXPR:
-    case CEIL_DIV_EXPR:
-    case FLOOR_DIV_EXPR:
-    case ROUND_DIV_EXPR:
-    case RDIV_EXPR:
-      expand_complex_division (gsi, type, ar, ai, br, bi, code, al, bl);
-      break;
+        case NEGATE_EXPR:
+        case CONJ_EXPR:
+          rb = gimple_build (&stmts, loc, code, type, ab);
+          gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+          update_complex_assignment (gsi, NULL, NULL, rb);
+          break;
 
-    case NEGATE_EXPR:
-      expand_complex_negation (gsi, inner_type, ar, ai);
-      break;
+        case EQ_EXPR:
+        case NE_EXPR:
+          /* FIXME */
+          {
+            gimple *stmt = gsi_stmt (*gsi);
+            rb = gimple_build (&stmts, loc, code, type, ab, bb);
+            switch (gimple_code (stmt))
+              {
+              case GIMPLE_RETURN:
+                {
+                  greturn *return_stmt = as_a < greturn * >(stmt);
+                  gimple_return_set_retval (return_stmt,
+                                            fold_convert (type, rb));
+                }
+                break;
 
-    case CONJ_EXPR:
-      expand_complex_conjugate (gsi, inner_type, ar, ai);
-      break;
+              case GIMPLE_ASSIGN:
+                update_complex_assignment (gsi, NULL, NULL, rb);
+                gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT);
+                break;
 
-    case EQ_EXPR:
-    case NE_EXPR:
-      expand_complex_comparison (gsi, ar, ai, br, bi, code);
-      break;
+              case GIMPLE_COND:
+                {
+                  gcond *cond_stmt = as_a < gcond * >(stmt);
+                  gimple_cond_set_code (cond_stmt, EQ_EXPR);
+                  gimple_cond_set_lhs (cond_stmt, rb);
+                  gimple_cond_set_rhs (cond_stmt, boolean_true_node);
+                }
+                break;
 
-    default:
-      gcc_unreachable ();
+              default:
+                break;
+              }
+            break;
+          }
+
+
+          /* not supported yet */
+        case TRUNC_DIV_EXPR:
+        case CEIL_DIV_EXPR:
+        case FLOOR_DIV_EXPR:
+        case ROUND_DIV_EXPR:
+        case RDIV_EXPR:
+
+        default:
+          gcc_unreachable ();
+        }
+      return;
     }
-}
 
+  ar = extract_component (gsi, ac, REAL_P, true);
+  ai = extract_component (gsi, ac, IMAG_P, true);
+
+  if (ac == bc)
+    br = ar, bi = ai;
+  else if (bc)
+    {
+      br = extract_component (gsi, bc, REAL_P, true);
+      bi = extract_component (gsi, bc, IMAG_P, true);
+    }
+  else
+    br = bi = NULL_TREE;
+
+  switch (code)
+    {
+    case PLUS_EXPR:
+    case MINUS_EXPR:
+      expand_complex_addition (gsi, inner_type, ar, ai, br, bi, code, al,
+                               bl);
+      break;
+
+    case MULT_EXPR:
+      expand_complex_multiplication (gsi, type, ar, ai, br, bi, al, bl);
+      break;
+
+    case TRUNC_DIV_EXPR:
+    case CEIL_DIV_EXPR:
+    case FLOOR_DIV_EXPR:
+    case ROUND_DIV_EXPR:
+    case RDIV_EXPR:
+      expand_complex_division (gsi, type, ar, ai, br, bi, code, al, bl);
+      break;
+
+    case NEGATE_EXPR:
+      expand_complex_negation (gsi, inner_type, ar, ai);
+      break;
+
+    case CONJ_EXPR:
+      expand_complex_conjugate (gsi, inner_type, ar, ai);
+      break;
+
+    case EQ_EXPR:
+    case NE_EXPR:
+      expand_complex_comparison (gsi, ar, ai, br, bi, code);
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
 
 /* Entry point for complex operation lowering during optimization.  */
 
 static unsigned int
@@ -1845,8 +2055,8 @@ tree_lower_complex (void)
 
   complex_variable_components = new int_tree_htab_type (10);
 
-  complex_ssa_name_components.create (2 * num_ssa_names);
-  complex_ssa_name_components.safe_grow_cleared (2 * num_ssa_names, true);
+  complex_ssa_name_components.create (3 * num_ssa_names);
+  complex_ssa_name_components.safe_grow_cleared (3 * num_ssa_names, true);
 
   update_parameter_components ();
 
@@ -1879,7 +2089,9 @@ tree_lower_complex (void)
                       || is_gimple_min_invariant (op))
                     continue;
                   tree arg = gimple_phi_arg_def (phis_to_revisit[j], l);
-                  op = extract_component (NULL, arg, k > 0, false, false);
+                  op =
+                    extract_component (NULL, arg, (complex_part_t) k, false,
+                                       false);
                   SET_PHI_ARG_DEF (phi, l, op);
                 }
             }
diff --git a/gcc/tree-core.h b/gcc/tree-core.h
index 668808a29d0..da6daf99fc1 100644
--- a/gcc/tree-core.h
+++ b/gcc/tree-core.h
@@ -1486,6 +1486,7 @@ struct GTY(()) tree_complex {
   struct tree_typed typed;
   tree real;
   tree imag;
+  tree both;
 };
 
 struct GTY(()) tree_vector {
diff --git a/gcc/tree-streamer-in.cc b/gcc/tree-streamer-in.cc
index 5bead0c3c6a..a1fa2cb9eea 100644
--- a/gcc/tree-streamer-in.cc
+++ b/gcc/tree-streamer-in.cc
@@ -695,6 +695,7 @@ lto_input_ts_complex_tree_pointers (class lto_input_block *ib,
 {
   TREE_REALPART (expr) = stream_read_tree_ref (ib, data_in);
   TREE_IMAGPART (expr) = stream_read_tree_ref (ib, data_in);
+  TREE_COMPLEX_BOTH_PARTS (expr) = stream_read_tree_ref (ib, data_in);
 }
 
 
diff --git a/gcc/tree-streamer-out.cc b/gcc/tree-streamer-out.cc
index ff9694e17dd..be7314ef748 100644
--- a/gcc/tree-streamer-out.cc
+++ b/gcc/tree-streamer-out.cc
@@ -592,6 +592,7 @@ write_ts_complex_tree_pointers (struct output_block *ob, tree expr)
 {
stream_write_tree_ref (ob, TREE_REALPART (expr)); stream_write_tree_ref (ob, TREE_IMAGPART (expr)); + stream_write_tree_ref (ob, TREE_COMPLEX_BOTH_PARTS (expr)); } diff --git a/gcc/tree.cc b/gcc/tree.cc index 420857b110c..2bc1b0d1e3f 100644 --- a/gcc/tree.cc +++ b/gcc/tree.cc @@ -2497,6 +2497,14 @@ build_complex (tree type, tree real, tree imag) tree t = make_node (COMPLEX_CST); + /* represent both parts as a constant vector */ + tree vector_type = build_vector_type (TREE_TYPE (real), 2); + tree_vector_builder v (vector_type, 1, 2); + v.quick_push (real); + v.quick_push (imag); + tree both = v.build (); + + TREE_COMPLEX_BOTH_PARTS (t) = both; TREE_REALPART (t) = real; TREE_IMAGPART (t) = imag; TREE_TYPE (t) = type ? type : build_complex_type (TREE_TYPE (real)); diff --git a/gcc/tree.h b/gcc/tree.h index fa02e2907a1..28716b53120 100644 --- a/gcc/tree.h +++ b/gcc/tree.h @@ -634,6 +634,12 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int, /* Nonzero if TYPE represents a complex floating-point type. */ +#define COMPLEX_INTEGER_TYPE_P(TYPE) \ + (TREE_CODE (TYPE) == COMPLEX_TYPE \ + && TREE_CODE (TREE_TYPE (TYPE)) == INTEGER_TYPE) + +/* Nonzero if TYPE represents a complex floating-point type. */ + #define COMPLEX_FLOAT_TYPE_P(TYPE) \ (TREE_CODE (TYPE) == COMPLEX_TYPE \ && TREE_CODE (TREE_TYPE (TYPE)) == REAL_TYPE) @@ -1155,6 +1161,7 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int, /* In a COMPLEX_CST node. */ #define TREE_REALPART(NODE) (COMPLEX_CST_CHECK (NODE)->complex.real) #define TREE_IMAGPART(NODE) (COMPLEX_CST_CHECK (NODE)->complex.imag) +#define TREE_COMPLEX_BOTH_PARTS(NODE) (COMPLEX_CST_CHECK (NODE)->complex.both) /* In a VECTOR_CST node. See generic.texi for details. 
*/ #define VECTOR_CST_NELTS(NODE) (TYPE_VECTOR_SUBPARTS (TREE_TYPE (NODE))) @@ -2214,6 +2221,8 @@ class auto_suppress_location_wrappers (as_a (TYPE_CHECK (NODE)->type_common.mode)) #define SCALAR_FLOAT_TYPE_MODE(NODE) \ (as_a (TYPE_CHECK (NODE)->type_common.mode)) +#define COMPLEX_TYPE_MODE(NODE) \ + (as_a (TYPE_CHECK (NODE)->type_common.mode)) #define SET_TYPE_MODE(NODE, MODE) \ (TYPE_CHECK (NODE)->type_common.mode = (MODE)) @@ -6646,7 +6655,11 @@ extern const builtin_structptr_type builtin_structptr_types[6]; inline bool type_has_mode_precision_p (const_tree t) { - return known_eq (TYPE_PRECISION (t), GET_MODE_PRECISION (TYPE_MODE (t))); + if (TREE_CODE (t) == COMPLEX_TYPE) + return known_eq (2*TYPE_PRECISION (TREE_TYPE(t)), + GET_MODE_PRECISION (TYPE_MODE (t))); + else + return known_eq (TYPE_PRECISION (t), GET_MODE_PRECISION (TYPE_MODE (t))); } /* Helper functions for fndecl_built_in_p. */ From patchwork Mon Jul 17 09:02:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sylvain Noiry X-Patchwork-Id: 121138 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c923:0:b0:3e4:2afc:c1 with SMTP id j3csp990657vqt; Mon, 17 Jul 2023 02:06:35 -0700 (PDT) X-Google-Smtp-Source: APBJJlGA5xf3cSZq5fWiYdO29QZtXMQ6jvDTaHA2sUa4bZyaW033imqxX2rQiUES1Ft8ellnwtDX X-Received: by 2002:a17:907:1186:b0:991:e12e:9858 with SMTP id uz6-20020a170907118600b00991e12e9858mr10071495ejb.64.1689584794863; Mon, 17 Jul 2023 02:06:34 -0700 (PDT) Received: from server2.sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id j22-20020a170906411600b00993cdec0182si3578884ejk.143.2023.07.17.02.06.34 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Jul 2023 02:06:34 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=K3EIBOYM; arc=fail (signature failed); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 21A453858D33 for ; Mon, 17 Jul 2023 09:05:50 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 21A453858D33 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1689584750; bh=rGjt6WP1YdrRth348s3e9PetNdts9K+0WG+K4wzVFCQ=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=K3EIBOYMePxSY3GwvDEn0DaMrhajSOCdtKefrnQK0FECXRko6lLhBKQD0ESqa6HwH EF2K2NSwNNS+5hXSy5tKeyCXmmXUiNivtoyQywyUDsXqUWJMRcDZKcM3ss+kxibzin lvHkg+Db9CcNRaF6RezfZ0vRNZBIZUZ0s2PScGXE= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpout140.security-mail.net (smtpout140.security-mail.net [85.31.212.148]) by sourceware.org (Postfix) with ESMTPS id AE6503858436 for ; Mon, 17 Jul 2023 09:03:39 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AE6503858436 Received: from localhost (fx408.security-mail.net [127.0.0.1]) by fx408.security-mail.net (Postfix) with ESMTP id AF86B322A0A for ; Mon, 
17 Jul 2023 11:03:38 +0200 (CEST) Received: from fx408 (fx408.security-mail.net [127.0.0.1]) by fx408.security-mail.net (Postfix) with ESMTP id 8DCBB32297C for ; Mon, 17 Jul 2023 11:03:38 +0200 (CEST) Received: from FRA01-PR2-obe.outbound.protection.outlook.com (mail-pr2fra01lp0100.outbound.protection.outlook.com [104.47.24.100]) by fx408.security-mail.net (Postfix) with ESMTPS id 126B8322A25 for ; Mon, 17 Jul 2023 11:03:38 +0200 (CEST) Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21) by PAZP264MB3040.FRAP264.PROD.OUTLOOK.COM (2603:10a6:102:1e7::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6588.32; Mon, 17 Jul 2023 09:03:36 +0000 Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9]) by MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9%4]) with mapi id 15.20.6588.031; Mon, 17 Jul 2023 09:03:36 +0000 X-Virus-Scanned: E-securemail Secumail-id: <12739.64b503ea.11b5d.0> ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=EKqjh/NbO1oC2YWEOpGTzuzrHAVltTMAbhhyeD38x4sxtgQEpHMriAHbKBZ3Jrj4uZf9PzlEf0BTOqnOccu0zhnosnp6uKZ4gX6m0F4/o4bwRnyA2fSewMvmvPO9jDvb1RdT4V6/wQtrPpYhebkn3q7xlDZGp8LXqWlh0htTaQriTZRR0LwUlphAho2u0ztMi6GaokHTVcvJQsJMagiR+lwZw8wzq1iiZL/ssWLJdplKsI8bowGP99Cq4Kermqv/HBhiEmpRkhVRBqnodiXU230ymxAkf/9p/OQSZTIpYOlevK5uwV7LIOgDQ6bKvAvWQGXaKc95XBez3tSRsR6R/w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=rGjt6WP1YdrRth348s3e9PetNdts9K+0WG+K4wzVFCQ=; 
b=er1PRg5YBH1lkB2nscxbMH1fOIep8PQhqFqGuGrrWk/Rx3Y97oRZqaMSd07erlPzeHINDL7Hzl/xSr/SADbXWPr4RNjAOCdJ9pnXxaACY8q1CRMBr+kRD5lvWPmx5opbI/xsBiytAIdKEaJfgF+xbur1jeCxXrpxLc3M/RSsd5uFYPHdbVeMHxXLzityVyp/B18lwPwtG0dZ43K0dfD6ejeVVMV5LSjecHmuaR5T+xpzTLiQtEFdGLKrFxzrlOPaQbmEhUeEof4iMh/koGoy+/xGBvApCmMULZ92YPp36bdAU1QeK1yKpdEmxnuNCoetFf0UoeVj+nKUYw3TD4yyNQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=kalrayinc.com; dmarc=pass action=none header.from=kalrayinc.com; dkim=pass header.d=kalrayinc.com; arc=none To: gcc-patches@gcc.gnu.org Cc: Sylvain Noiry Subject: [PATCH 2/9] Native complex operations: Move functions to hooks Date: Mon, 17 Jul 2023 11:02:43 +0200 Message-ID: <20230717090250.4645-3-snoiry@kalrayinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230717090250.4645-1-snoiry@kalrayinc.com> References: <20230717090250.4645-1-snoiry@kalrayinc.com> X-ClientProxiedBy: LO2P265CA0311.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a5::35) To MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: MR2P264MB0113:EE_|PAZP264MB3040:EE_ X-MS-Office365-Filtering-Correlation-Id: 429b0d4a-0cf3-4790-0bfd-08db86a4b47a X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: BLgkggVo6HvgZUWCEFZanitfE/sO/AeFzijHWxOWWtG8tWnYP+nz1o0NOOp+u46CfYSDL9uPFG3S8ITLZQmg2ZdCbdfSBmny/BvRQiF1YWhdLuT87+IeqROWZ/jhlLzvabjSIsxZ57j06uCzv6yvTvFHcRkw9aww8h56m/oYNc+27opH1XtTwfP+w5YLmExt3XIRZU/J3jd2VBYFTSrfuNyTNLMLaPl09HcH58UY6P8DUfD1S9gUGqlg5oOiMIOxTswB8Hpy2DW4FzD8dbvgIayfsMlno0NrfmzUYNdcFlvQ0lcCd35oZku0z0Jz597BLUL7QVUqobEis4Xu8KMlqHR49LOnFbwwXffX80YjrKo8/jFFODsx5IGfAiOWKGwRKEGY105jkd6y29vDmF0jZ+yEv3x2Md0O9ObG3gHV0+r5lokya5iMfz/JIZy/jpv8gHHoKVxVKFna7rqlvjWSbCNX2Mk/Zjb7ZEBaFHq9I+wTEBQAVwaD8oW1DdyE52aAmnxMXWRdqWMa9SCruurH6ympGwxyHyJoO1qrXvFYTMlujw1wDMQEG0CzcHheVl5L X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; 
SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM; PTR:; CAT:NONE; SFS:(13230028)(4636009)(396003)(366004)(346002)(136003)(39850400004)(376002)(451199021)(478600001)(6486002)(6666004)(186003)(1076003)(6506007)(26005)(6512007)(107886003)(2906002)(30864003)(41300700001)(316002)(6916009)(4326008)(66476007)(66946007)(66556008)(5660300002)(36756003)(8676002)(8936002)(38100700002)(86362001)(2616005)(83380400001); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: DKElO8pK74u//uRXgejtmqQOxyAjliKO7mAqdA97n0cOcdTroQKicUNKoOs47P6Qzdv9NFXMHC33GuqhIYtleRmbjBOAUtxLOP5TmJF0I1xDftu/PxRWBd+Go5UWrDO9x4ivPz5dYZ531Pb5ertVAS5kWU8GpAb2jtl/MzCUZZb+i9L8OWHEZn/ymYuFz6MpzTNZWXlVTgGm81zfFwmBRQXhAhgPRRy6AahHx1NobKH0w35KyrCNsAM4T20U6KYwmIlz2gORGuJcOo7UGXZcBEoJNeslmBPiFNjMTlCnQ9z87/PYJiFV78JkkyphXnkzMI2+2v1NMxRH7ELHj98SZuuGw45ik1NF1MNE/6WWOpv9O59wVsF4nlZTJgRE3OfGuTZctjV3wxFj/l671nt5Q6bT15mRyv0ky0dAeLVBsOgweEiY+P3rFwYJPzeWH/NG4dkyGs9JPdguhlVnbaySzrBbY8RR6BtEkt8gjNPxlbKhj0l7ehkZN1RHhbtLVPtnxQwMPEKIv1mD91VF0KhL33cxL1xsyxLOeP0KKOCDY8iLQVKDo1vHy70p6vN1bjsMgeKAsa6K8PS9vz22jF7sfOwwhsjWoJjcXGISlxozcAwzOTk+VaySNqTRfbLIJP5hspok+NQrOl1O9kPocS/4XwpVbdHA7p+56IuDycL+DjiJgqAso+vhz+EfxCYK388rsX6WiLGPzx2lhkcJRdsm+zpJswsfw4+xtS++HDn2EB+ubOCPdWV0J7u9NGngrfGxjCBysvVqZi9h/GYYIfHboX1V4R5ss9Y8eoE/SQ9D7k2VmkLqnAHBnt8WKKOwmsiFeKhE6RRDqi/dcmvYPp95Y6Vt9d1dfvoGb0OBDCJTm6cXYOnLHbO9i5OOLKF+Qa+qIaAZvmgBlBAUrdHL/crsAqaxrXiQqIzph/i9lSGFb3Hlox5lX3wuIib5NcQc1b7L 
ZTMkORBsdQ8gCAcr9yfzgC3wuBX77xAqtk6IEdhLGOsPH9YojWlIAtwKfg918Y77JEmZmSvZE0zJsGlDuENGYPLQZ8wLNmLKSttQZsoz7NytRxvOQqIEYf6fWkVshP5rR7vCzZQfSr35zTO6vgDNhkdGe1T6EFD8aBbO1kqoPAi94BH2sTiGxaeXWrcyQEDr3zoX0i/rL2SO9LhVobgy/p9UK8z5NAIMw3ja2gJx19PLU/FdIaUv8E/dQpDGjl8Rg756DnFDKV5L9PZNCE5lDvsn04bbP4Om91bpvWaZBtGyqQly43K3y93Dsaa+e/wrlyCSbDh0yeGrQ9OcIxttvXjmZbxSre0XjRKYsiyc7tUPZ1E3A9uOxdC0rY5cE+v/akEMVqVPK2cxkytu5rV85a4MrUam32fAevf4AmG7696EZezQaJGt23Y0IDILapi5tT/kLff2s9iG3crIC4fw5Ezro1ICT5mrkrP0ODiKA/dXUHp14ylmggWDJtHb8ez+WLAKyXuIJqSiBKZBaaNS+zkQJVgWGDOLrno3AvbsHzHTk3eU0p6J/SlGngcIRoorVEzy5IBFYoF1SY+zETZ/FBMX8xrVtP4NuwVlaE/w80c4ZMq6hNQfQgOOQ3uQHj7e X-OriginatorOrg: kalrayinc.com X-MS-Exchange-CrossTenant-Network-Message-Id: 429b0d4a-0cf3-4790-0bfd-08db86a4b47a X-MS-Exchange-CrossTenant-AuthSource: MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 17 Jul 2023 09:03:36.4854 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 8931925d-7620-4a64-b7fe-20afd86363d3 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: 0Fu1WNhoPMN0bCo+yUmqu4ygOjKgniQUgxvR8u6V5uwjffTDJOh+edbMXjBzxEPrGUBaU2+lsyaofkJ8+evudw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAZP264MB3040 X-ALTERMIMEV2_out: done X-Spam-Status: No, score=-10.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_STOCKGEN, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Sylvain Noiry via Gcc-patches From: Sylvain Noiry Reply-To: Sylvain Noiry Errors-To: 
gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
Sender: "Gcc-patches"
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1771658065971247996
X-GMAIL-MSGID: 1771658065971247996

Move read_complex_part and write_complex_part to target hooks. Their
signatures also change, because the argument part now has type
complex_part_t. Calls to these functions are updated accordingly.

gcc/ChangeLog:

	* target.def: Define hooks for read_complex_part and
	write_complex_part
	* targhooks.cc (default_read_complex_part): New: default
	implementation of read_complex_part
	(default_write_complex_part): New: default implementation of
	write_complex_part
	* targhooks.h: Add default_read_complex_part and
	default_write_complex_part
	* doc/tm.texi: Document the new TARGET_READ_COMPLEX_PART and
	TARGET_WRITE_COMPLEX_PART hooks
	* doc/tm.texi.in: Add TARGET_READ_COMPLEX_PART and
	TARGET_WRITE_COMPLEX_PART
	* expr.cc (write_complex_part): Call TARGET_WRITE_COMPLEX_PART hook
	(read_complex_part): Call TARGET_READ_COMPLEX_PART hook
	* expr.h: Update function signatures of read_complex_part and
	write_complex_part
	* builtins.cc (expand_ifn_atomic_compare_exchange_into_call): Update
	calls to read_complex_part and write_complex_part
	(expand_ifn_atomic_compare_exchange): Likewise
	* expmed.cc (flip_storage_order): Likewise
	* expr.cc (clear_storage_hints): Likewise
	(emit_move_complex_push): Likewise
	(emit_move_complex_parts): Likewise
	(expand_assignment): Likewise
	(expand_expr_real_2): Likewise
	(expand_expr_real_1): Likewise
	(const_vector_from_tree): Likewise
	* internal-fn.cc (expand_arith_set_overflow): Likewise
	(expand_arith_overflow_result_store): Likewise
	(expand_addsub_overflow): Likewise
	(expand_neg_overflow): Likewise
	(expand_mul_overflow): Likewise
	(expand_arith_overflow): Likewise
	(expand_UADDC): Likewise
---
 gcc/builtins.cc | 8 +--
 gcc/doc/tm.texi | 10 +++
 gcc/doc/tm.texi.in | 4 ++
 gcc/expmed.cc | 4 +-
 gcc/expr.cc | 164 +++++++++------------------------------------
 gcc/expr.h | 5
+- gcc/internal-fn.cc | 20 +++--- gcc/target.def | 18 +++++ gcc/targhooks.cc | 139 ++++++++++++++++++++++++++++++++++++++ gcc/targhooks.h | 5 ++ 10 files changed, 224 insertions(+), 153 deletions(-) diff --git a/gcc/builtins.cc b/gcc/builtins.cc index 6dff5214ff8..37da6bcae6f 100644 --- a/gcc/builtins.cc +++ b/gcc/builtins.cc @@ -6347,8 +6347,8 @@ expand_ifn_atomic_compare_exchange_into_call (gcall *call, machine_mode mode) if (GET_MODE (boolret) != mode) boolret = convert_modes (mode, GET_MODE (boolret), boolret, 1); x = force_reg (mode, x); - write_complex_part (target, boolret, true, true); - write_complex_part (target, x, false, false); + write_complex_part (target, boolret, IMAG_P, true); + write_complex_part (target, x, REAL_P, false); } } @@ -6403,8 +6403,8 @@ expand_ifn_atomic_compare_exchange (gcall *call) rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); if (GET_MODE (boolret) != mode) boolret = convert_modes (mode, GET_MODE (boolret), boolret, 1); - write_complex_part (target, boolret, true, true); - write_complex_part (target, oldval, false, false); + write_complex_part (target, boolret, IMAG_P, true); + write_complex_part (target, oldval, REAL_P, false); } } diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 95ba56e05ae..87997b76338 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -4605,6 +4605,16 @@ to return a nonzero value when it is required, the compiler will run out of spill registers and print a fatal error message. @end deftypefn +@deftypefn {Target Hook} rtx TARGET_READ_COMPLEX_PART (rtx @var{cplx}, complex_part_t @var{part}) +This hook should return the rtx representing the specified @var{part} of the complex given by @var{cplx}. + @var{part} can be the real part, the imaginary part, or both of them. 
+@end deftypefn + +@deftypefn {Target Hook} void TARGET_WRITE_COMPLEX_PART (rtx @var{cplx}, rtx @var{val}, complex_part_t @var{part}, bool @var{undefined_p}) +This hook should move the rtx value given by @var{val} to the specified @var{part} of the complex given by @var{cplx}. + @var{part} can be the real part, the imaginary part, or both of them. +@end deftypefn + @node Scalar Return @subsection How Scalar Function Values Are Returned @cindex return values in registers diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 4ac96dc357d..efbf972e6a7 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -3390,6 +3390,10 @@ stack. @hook TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P +@hook TARGET_READ_COMPLEX_PART + +@hook TARGET_WRITE_COMPLEX_PART + @node Scalar Return @subsection How Scalar Function Values Are Returned @cindex return values in registers diff --git a/gcc/expmed.cc b/gcc/expmed.cc index fbd4ce2d42f..2f787cc28f9 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -394,8 +394,8 @@ flip_storage_order (machine_mode mode, rtx x) if (COMPLEX_MODE_P (mode)) { - rtx real = read_complex_part (x, false); - rtx imag = read_complex_part (x, true); + rtx real = read_complex_part (x, REAL_P); + rtx imag = read_complex_part (x, IMAG_P); real = flip_storage_order (GET_MODE_INNER (mode), real); imag = flip_storage_order (GET_MODE_INNER (mode), imag); diff --git a/gcc/expr.cc b/gcc/expr.cc index fff09dc9951..e1a0892b4d9 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -3480,8 +3480,8 @@ clear_storage_hints (rtx object, rtx size, enum block_op_methods method, zero = CONST0_RTX (GET_MODE_INNER (mode)); if (zero != NULL) { - write_complex_part (object, zero, 0, true); - write_complex_part (object, zero, 1, false); + write_complex_part (object, zero, REAL_P, true); + write_complex_part (object, zero, IMAG_P, false); return NULL; } } @@ -3646,126 +3646,18 @@ set_storage_via_setmem (rtx object, rtx size, rtx val, unsigned int align, If UNDEFINED_P then the value in CPLX is
currently undefined. */ void -write_complex_part (rtx cplx, rtx val, bool imag_p, bool undefined_p) +write_complex_part (rtx cplx, rtx val, complex_part_t part, bool undefined_p) { - machine_mode cmode; - scalar_mode imode; - unsigned ibitsize; - - if (GET_CODE (cplx) == CONCAT) - { - emit_move_insn (XEXP (cplx, imag_p), val); - return; - } - - cmode = GET_MODE (cplx); - imode = GET_MODE_INNER (cmode); - ibitsize = GET_MODE_BITSIZE (imode); - - /* For MEMs simplify_gen_subreg may generate an invalid new address - because, e.g., the original address is considered mode-dependent - by the target, which restricts simplify_subreg from invoking - adjust_address_nv. Instead of preparing fallback support for an - invalid address, we call adjust_address_nv directly. */ - if (MEM_P (cplx)) - { - emit_move_insn (adjust_address_nv (cplx, imode, - imag_p ? GET_MODE_SIZE (imode) : 0), - val); - return; - } - - /* If the sub-object is at least word sized, then we know that subregging - will work. This special case is important, since store_bit_field - wants to operate on integer modes, and there's rarely an OImode to - correspond to TCmode. */ - if (ibitsize >= BITS_PER_WORD - /* For hard regs we have exact predicates. Assume we can split - the original object if it spans an even number of hard regs. - This special case is important for SCmode on 64-bit platforms - where the natural size of floating-point regs is 32-bit. */ - || (REG_P (cplx) - && REGNO (cplx) < FIRST_PSEUDO_REGISTER - && REG_NREGS (cplx) % 2 == 0)) - { - rtx part = simplify_gen_subreg (imode, cplx, cmode, - imag_p ? GET_MODE_SIZE (imode) : 0); - if (part) - { - emit_move_insn (part, val); - return; - } - else - /* simplify_gen_subreg may fail for sub-word MEMs. */ - gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD); - } - - store_bit_field (cplx, ibitsize, imag_p ? 
ibitsize : 0, 0, 0, imode, val, - false, undefined_p); + targetm.write_complex_part (cplx, val, part, undefined_p); } /* Extract one of the components of the complex value CPLX. Extract the real part if IMAG_P is false, and the imaginary part if it's true. */ rtx -read_complex_part (rtx cplx, bool imag_p) -{ - machine_mode cmode; - scalar_mode imode; - unsigned ibitsize; - - if (GET_CODE (cplx) == CONCAT) - return XEXP (cplx, imag_p); - - cmode = GET_MODE (cplx); - imode = GET_MODE_INNER (cmode); - ibitsize = GET_MODE_BITSIZE (imode); - - /* Special case reads from complex constants that got spilled to memory. */ - if (MEM_P (cplx) && GET_CODE (XEXP (cplx, 0)) == SYMBOL_REF) - { - tree decl = SYMBOL_REF_DECL (XEXP (cplx, 0)); - if (decl && TREE_CODE (decl) == COMPLEX_CST) - { - tree part = imag_p ? TREE_IMAGPART (decl) : TREE_REALPART (decl); - if (CONSTANT_CLASS_P (part)) - return expand_expr (part, NULL_RTX, imode, EXPAND_NORMAL); - } - } - - /* For MEMs simplify_gen_subreg may generate an invalid new address - because, e.g., the original address is considered mode-dependent - by the target, which restricts simplify_subreg from invoking - adjust_address_nv. Instead of preparing fallback support for an - invalid address, we call adjust_address_nv directly. */ - if (MEM_P (cplx)) - return adjust_address_nv (cplx, imode, - imag_p ? GET_MODE_SIZE (imode) : 0); - - /* If the sub-object is at least word sized, then we know that subregging - will work. This special case is important, since extract_bit_field - wants to operate on integer modes, and there's rarely an OImode to - correspond to TCmode. */ - if (ibitsize >= BITS_PER_WORD - /* For hard regs we have exact predicates. Assume we can split - the original object if it spans an even number of hard regs. - This special case is important for SCmode on 64-bit platforms - where the natural size of floating-point regs is 32-bit. 
*/ - || (REG_P (cplx) - && REGNO (cplx) < FIRST_PSEUDO_REGISTER - && REG_NREGS (cplx) % 2 == 0)) - { - rtx ret = simplify_gen_subreg (imode, cplx, cmode, - imag_p ? GET_MODE_SIZE (imode) : 0); - if (ret) - return ret; - else - /* simplify_gen_subreg may fail for sub-word MEMs. */ - gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD); - } - - return extract_bit_field (cplx, ibitsize, imag_p ? ibitsize : 0, - true, NULL_RTX, imode, imode, false, NULL); +read_complex_part (rtx cplx, complex_part_t part) +{ + return targetm.read_complex_part (cplx, part); } /* A subroutine of emit_move_insn_1. Yet another lowpart generator. @@ -3936,9 +3828,10 @@ emit_move_complex_push (machine_mode mode, rtx x, rtx y) } emit_move_insn (gen_rtx_MEM (submode, XEXP (x, 0)), - read_complex_part (y, imag_first)); + read_complex_part (y, (imag_first) ? IMAG_P : REAL_P)); return emit_move_insn (gen_rtx_MEM (submode, XEXP (x, 0)), - read_complex_part (y, !imag_first)); + read_complex_part (y, + (imag_first) ? REAL_P : IMAG_P)); } /* A subroutine of emit_move_complex. 
Perform the move from Y to X @@ -3954,8 +3847,8 @@ emit_move_complex_parts (rtx x, rtx y) && REG_P (x) && !reg_overlap_mentioned_p (x, y)) emit_clobber (x); - write_complex_part (x, read_complex_part (y, false), false, true); - write_complex_part (x, read_complex_part (y, true), true, false); + write_complex_part (x, read_complex_part (y, REAL_P), REAL_P, true); + write_complex_part (x, read_complex_part (y, IMAG_P), IMAG_P, false); return get_last_insn (); } @@ -5812,9 +5705,9 @@ expand_assignment (tree to, tree from, bool nontemporal) if (from_rtx) { emit_move_insn (XEXP (to_rtx, 0), - read_complex_part (from_rtx, false)); + read_complex_part (from_rtx, REAL_P)); emit_move_insn (XEXP (to_rtx, 1), - read_complex_part (from_rtx, true)); + read_complex_part (from_rtx, IMAG_P)); } else { @@ -5836,14 +5729,16 @@ expand_assignment (tree to, tree from, bool nontemporal) concat_store_slow:; rtx temp = assign_stack_temp (GET_MODE (to_rtx), GET_MODE_SIZE (GET_MODE (to_rtx))); - write_complex_part (temp, XEXP (to_rtx, 0), false, true); - write_complex_part (temp, XEXP (to_rtx, 1), true, false); + write_complex_part (temp, XEXP (to_rtx, 0), REAL_P, true); + write_complex_part (temp, XEXP (to_rtx, 1), IMAG_P, false); result = store_field (temp, bitsize, bitpos, bitregion_start, bitregion_end, mode1, from, get_alias_set (to), nontemporal, reversep); - emit_move_insn (XEXP (to_rtx, 0), read_complex_part (temp, false)); - emit_move_insn (XEXP (to_rtx, 1), read_complex_part (temp, true)); + emit_move_insn (XEXP (to_rtx, 0), + read_complex_part (temp, REAL_P)); + emit_move_insn (XEXP (to_rtx, 1), + read_complex_part (temp, IMAG_P)); } } /* For calls to functions returning variable length structures, if TO_RTX @@ -10322,8 +10217,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, complex_expr_swap_order: /* Move the imaginary (op1) and real (op0) parts to their location. 
*/ - write_complex_part (target, op1, true, true); - write_complex_part (target, op0, false, false); + write_complex_part (target, op1, IMAG_P, true); + write_complex_part (target, op0, REAL_P, false); return target; } @@ -10352,8 +10247,8 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode, } /* Move the real (op0) and imaginary (op1) parts to their location. */ - write_complex_part (target, op0, false, true); - write_complex_part (target, op1, true, false); + write_complex_part (target, op0, REAL_P, true); + write_complex_part (target, op1, IMAG_P, false); return target; @@ -11508,7 +11403,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, rtx parts[2]; for (int i = 0; i < 2; i++) { - rtx op = read_complex_part (op0, i != 0); + rtx op = + read_complex_part (op0, (i != 0) ? IMAG_P : REAL_P); if (GET_CODE (op) == SUBREG) op = force_reg (GET_MODE (op), op); temp = gen_lowpart_common (GET_MODE_INNER (mode1), op); @@ -12106,11 +12002,11 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, case REALPART_EXPR: op0 = expand_normal (treeop0); - return read_complex_part (op0, false); + return read_complex_part (op0, REAL_P); case IMAGPART_EXPR: op0 = expand_normal (treeop0); - return read_complex_part (op0, true); + return read_complex_part (op0, IMAG_P); case RETURN_EXPR: case LABEL_EXPR: @@ -13449,8 +13345,8 @@ const_vector_from_tree (tree exp) builder.quick_push (const_double_from_real_value (TREE_REAL_CST (elt), inner)); else if (TREE_CODE (elt) == FIXED_CST) - builder.quick_push (CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt), - inner)); + builder.quick_push (CONST_FIXED_FROM_FIXED_VALUE + (TREE_FIXED_CST (elt), inner)); else builder.quick_push (immed_wide_int_const (wi::to_poly_wide (elt), inner)); diff --git a/gcc/expr.h b/gcc/expr.h index 11bff531862..833ff16bd0d 100644 --- a/gcc/expr.h +++ b/gcc/expr.h @@ -261,9 +261,8 @@ extern rtx_insn *emit_move_insn_1 (rtx, rtx); extern rtx_insn *emit_move_complex_push 
(machine_mode, rtx, rtx); extern rtx_insn *emit_move_complex_parts (rtx, rtx); -extern rtx read_complex_part (rtx, bool); -extern void write_complex_part (rtx, rtx, bool, bool); -extern rtx read_complex_part (rtx, bool); +extern rtx read_complex_part (rtx, complex_part_t); +extern void write_complex_part (rtx, rtx, complex_part_t, bool); extern rtx emit_move_resolve_push (machine_mode, rtx); /* Push a block of length SIZE (perhaps variable) diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc index f9aaf66cf2a..8d3d4599256 100644 --- a/gcc/internal-fn.cc +++ b/gcc/internal-fn.cc @@ -917,9 +917,9 @@ expand_arith_set_overflow (tree lhs, rtx target) { if (TYPE_PRECISION (TREE_TYPE (TREE_TYPE (lhs))) == 1 && !TYPE_UNSIGNED (TREE_TYPE (TREE_TYPE (lhs)))) - write_complex_part (target, constm1_rtx, true, false); + write_complex_part (target, constm1_rtx, IMAG_P, false); else - write_complex_part (target, const1_rtx, true, false); + write_complex_part (target, const1_rtx, IMAG_P, false); } /* Helper for expand_*_overflow. Store RES into the __real__ part @@ -974,7 +974,7 @@ expand_arith_overflow_result_store (tree lhs, rtx target, expand_arith_set_overflow (lhs, target); emit_label (done_label); } - write_complex_part (target, lres, false, false); + write_complex_part (target, lres, REAL_P, false); } /* Helper for expand_*_overflow. Store RES into TARGET. 
*/ @@ -1019,7 +1019,7 @@ expand_addsub_overflow (location_t loc, tree_code code, tree lhs, { target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); if (!is_ubsan) - write_complex_part (target, const0_rtx, true, false); + write_complex_part (target, const0_rtx, IMAG_P, false); } /* We assume both operands and result have the same precision @@ -1464,7 +1464,7 @@ expand_neg_overflow (location_t loc, tree lhs, tree arg1, bool is_ubsan, { target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); if (!is_ubsan) - write_complex_part (target, const0_rtx, true, false); + write_complex_part (target, const0_rtx, IMAG_P, false); } enum insn_code icode = optab_handler (negv3_optab, mode); @@ -1589,7 +1589,7 @@ expand_mul_overflow (location_t loc, tree lhs, tree arg0, tree arg1, { target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); if (!is_ubsan) - write_complex_part (target, const0_rtx, true, false); + write_complex_part (target, const0_rtx, IMAG_P, false); } if (is_ubsan) @@ -2406,7 +2406,7 @@ expand_mul_overflow (location_t loc, tree lhs, tree arg0, tree arg1, do_compare_rtx_and_jump (op1, res, NE, true, mode, NULL_RTX, NULL, all_done_label, profile_probability::very_unlikely ()); emit_label (set_noovf); - write_complex_part (target, const0_rtx, true, false); + write_complex_part (target, const0_rtx, IMAG_P, false); emit_label (all_done_label); } @@ -2675,7 +2675,7 @@ expand_arith_overflow (enum tree_code code, gimple *stmt) { /* The infinity precision result will always fit into result. 
*/ rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE); - write_complex_part (target, const0_rtx, true, false); + write_complex_part (target, const0_rtx, IMAG_P, false); scalar_int_mode mode = SCALAR_INT_TYPE_MODE (type); struct separate_ops ops; ops.code = code; @@ -2840,8 +2840,8 @@ expand_UADDC (internal_fn ifn, gcall *stmt) create_input_operand (&ops[3], op2, mode); create_input_operand (&ops[4], op3, mode); expand_insn (icode, 5, ops); - write_complex_part (target, re, false, false); - write_complex_part (target, im, true, false); + write_complex_part (target, re, REAL_P, false); + write_complex_part (target, im, IMAG_P, false); } /* Expand USUBC STMT. */ diff --git a/gcc/target.def b/gcc/target.def index 7d684296c17..9798c0f58e4 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -3306,6 +3306,24 @@ a pointer to int.", bool, (ao_ref *ref), default_ref_may_alias_errno) +/* Returns the value corresponding to the specified part of a complex. */ +DEFHOOK +(read_complex_part, + "This hook should return the rtx representing the specified @var{part} of the complex given by @var{cplx}.\n\ + @var{part} can be the real part, the imaginary part, or both of them.", + rtx, + (rtx cplx, complex_part_t part), + default_read_complex_part) + +/* Moves a value to the specified part of a complex. */ +DEFHOOK +(write_complex_part, + "This hook should move the rtx value given by @var{val} to the specified @var{part} of the complex given by @var{cplx}.\n\ + @var{part} can be the real part, the imaginary part, or both of them.", + void, + (rtx cplx, rtx val, complex_part_t part, bool undefined_p), + default_write_complex_part) + /* Support for named address spaces.
*/ #undef HOOK_PREFIX #define HOOK_PREFIX "TARGET_ADDR_SPACE_" diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc index e190369f87a..d33fcbd9a13 100644 --- a/gcc/targhooks.cc +++ b/gcc/targhooks.cc @@ -1532,6 +1532,145 @@ default_preferred_simd_mode (scalar_mode) return word_mode; } +/* By default, extract one of the components of the complex value CPLX. Extract the + real part if part is REAL_P, and the imaginary part if it is IMAG_P. If part is + BOTH_P, return cplx directly*/ + +rtx +default_read_complex_part (rtx cplx, complex_part_t part) +{ + machine_mode cmode; + scalar_mode imode; + unsigned ibitsize; + + if (part == BOTH_P) + return cplx; + + if (GET_CODE (cplx) == CONCAT) + return XEXP (cplx, part); + + cmode = GET_MODE (cplx); + imode = GET_MODE_INNER (cmode); + ibitsize = GET_MODE_BITSIZE (imode); + + /* Special case reads from complex constants that got spilled to memory. */ + if (MEM_P (cplx) && GET_CODE (XEXP (cplx, 0)) == SYMBOL_REF) + { + tree decl = SYMBOL_REF_DECL (XEXP (cplx, 0)); + if (decl && TREE_CODE (decl) == COMPLEX_CST) + { + tree cplx_part = + (part == IMAG_P) ? TREE_IMAGPART (decl) : TREE_REALPART (decl); + if (CONSTANT_CLASS_P (cplx_part)) + return expand_expr (cplx_part, NULL_RTX, imode, EXPAND_NORMAL); + } + } + + /* For MEMs simplify_gen_subreg may generate an invalid new address + because, e.g., the original address is considered mode-dependent + by the target, which restricts simplify_subreg from invoking + adjust_address_nv. Instead of preparing fallback support for an + invalid address, we call adjust_address_nv directly. */ + if (MEM_P (cplx)) + return adjust_address_nv (cplx, imode, (part == IMAG_P) + ? GET_MODE_SIZE (imode) : 0); + + /* If the sub-object is at least word sized, then we know that subregging + will work. This special case is important, since extract_bit_field + wants to operate on integer modes, and there's rarely an OImode to + correspond to TCmode. 
*/ + if (ibitsize >= BITS_PER_WORD + /* For hard regs we have exact predicates. Assume we can split + the original object if it spans an even number of hard regs. + This special case is important for SCmode on 64-bit platforms + where the natural size of floating-point regs is 32-bit. */ + || (REG_P (cplx) + && REGNO (cplx) < FIRST_PSEUDO_REGISTER + && REG_NREGS (cplx) % 2 == 0)) + { + rtx ret = simplify_gen_subreg (imode, cplx, cmode, (part == IMAG_P) + ? GET_MODE_SIZE (imode) : 0); + if (ret) + return ret; + else + /* simplify_gen_subreg may fail for sub-word MEMs. */ + gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD); + } + + return extract_bit_field (cplx, ibitsize, (part == IMAG_P) ? ibitsize : 0, + true, NULL_RTX, imode, imode, false, NULL); +} + +/* By default, Write to one of the components of the complex value CPLX. Write VAL to + the real part if part is REAL_P, and the imaginary part if it is IMAG_P. If part is + BOTH_P, call recursively with REAL_P and IMAG_P */ + +void +default_write_complex_part (rtx cplx, rtx val, complex_part_t part, bool undefined_p) +{ + machine_mode cmode; + scalar_mode imode; + unsigned ibitsize; + + if (part == BOTH_P) + { + write_complex_part (cplx, read_complex_part (val, REAL_P), REAL_P, false); + write_complex_part (cplx, read_complex_part (val, IMAG_P), IMAG_P, false); + return; + } + + if (GET_CODE (cplx) == CONCAT) + { + emit_move_insn (XEXP (cplx, part == IMAG_P), val); + return; + } + + cmode = GET_MODE (cplx); + imode = GET_MODE_INNER (cmode); + ibitsize = GET_MODE_BITSIZE (imode); + + /* For MEMs simplify_gen_subreg may generate an invalid new address + because, e.g., the original address is considered mode-dependent + by the target, which restricts simplify_subreg from invoking + adjust_address_nv. Instead of preparing fallback support for an + invalid address, we call adjust_address_nv directly. */ + if (MEM_P (cplx)) + { + emit_move_insn (adjust_address_nv (cplx, imode, (part == IMAG_P) + ? 
GET_MODE_SIZE (imode) : 0), val); + return; + } + + /* If the sub-object is at least word sized, then we know that subregging + will work. This special case is important, since store_bit_field + wants to operate on integer modes, and there's rarely an OImode to + correspond to TCmode. */ + if (ibitsize >= BITS_PER_WORD + /* For hard regs we have exact predicates. Assume we can split + the original object if it spans an even number of hard regs. + This special case is important for SCmode on 64-bit platforms + where the natural size of floating-point regs is 32-bit. */ + || (REG_P (cplx) + && REGNO (cplx) < FIRST_PSEUDO_REGISTER + && REG_NREGS (cplx) % 2 == 0)) + { + rtx cplx_part = simplify_gen_subreg (imode, cplx, cmode, + (part == IMAG_P) ? + GET_MODE_SIZE (imode) : 0); + if (cplx_part) + { + emit_move_insn (cplx_part, val); + return; + } + else + /* simplify_gen_subreg may fail for sub-word MEMs. */ + gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD); + } + + store_bit_field (cplx, ibitsize, (part == IMAG_P) ? ibitsize : 0, 0, 0, + imode, val, false, undefined_p); +} + /* By default do not split reductions further. */ machine_mode diff --git a/gcc/targhooks.h b/gcc/targhooks.h index 1a0db8dddd5..805abd96938 100644 --- a/gcc/targhooks.h +++ b/gcc/targhooks.h @@ -124,6 +124,11 @@ extern opt_machine_mode default_get_mask_mode (machine_mode); extern bool default_empty_mask_is_expensive (unsigned); extern vector_costs *default_vectorize_create_costs (vec_info *, bool); +extern rtx default_read_complex_part (rtx cplx, complex_part_t part); +extern void default_write_complex_part (rtx cplx, rtx val, + complex_part_t part, + bool undefined_p); + /* OpenACC hooks. 
*/ extern bool default_goacc_validate_dims (tree, int [], int, unsigned); extern int default_goacc_dim_limit (int); From patchwork Mon Jul 17 09:02:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sylvain Noiry X-Patchwork-Id: 121139 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c923:0:b0:3e4:2afc:c1 with SMTP id j3csp990669vqt; Mon, 17 Jul 2023 02:06:36 -0700 (PDT) X-Google-Smtp-Source: APBJJlGS5409lo2XUez/PkOdMemQkEaPTzWgbT4RSRnTWVd+UvuMKDRssosk8G/BPgW8w5EuZCVJ X-Received: by 2002:a17:907:b0a:b0:993:6382:6e34 with SMTP id h10-20020a1709070b0a00b0099363826e34mr8650465ejl.72.1689584796234; Mon, 17 Jul 2023 02:06:36 -0700 (PDT) Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id d14-20020a170906c20e00b00991f1a1c99csi13060920ejz.360.2023.07.17.02.06.35 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Jul 2023 02:06:36 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=UIkkIWwv; arc=fail (signature failed); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8AF7F385C6F2 for ; Mon, 17 Jul 2023 09:05:50 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 8AF7F385C6F2 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1689584750; 
bh=RXRRZt+zxGgq5qbmC8s4W/M3vKfZsoT0FDVk4ovXtyY=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=UIkkIWwvgxeh+IczbV+RkhoPA7Nvrl+bHQDTsd9auzabn3DJTIcd8RzgOarwOa+ip HAk7YmqsZTDpEnj99HLUjpEg0M2iOKas2i7XYYNvFvN4qfuOuJVnOr1Ti9IMCiTuVN WaRHvuAHiJSaXoC10FRdmqr1GG5WMolYR1fR586s= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpout140.security-mail.net (smtpout140.security-mail.net [85.31.212.143]) by sourceware.org (Postfix) with ESMTPS id 6941D3858423 for ; Mon, 17 Jul 2023 09:03:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6941D3858423 Received: from localhost (localhost [127.0.0.1]) by fx403.security-mail.net (Postfix) with ESMTP id 6C04472F3AC for ; Mon, 17 Jul 2023 11:03:41 +0200 (CEST) Received: from fx403 (localhost [127.0.0.1]) by fx403.security-mail.net (Postfix) with ESMTP id 26B3772F901 for ; Mon, 17 Jul 2023 11:03:41 +0200 (CEST) Received: from FRA01-PR2-obe.outbound.protection.outlook.com (mail-pr2fra01lp0107.outbound.protection.outlook.com [104.47.24.107]) by fx403.security-mail.net (Postfix) with ESMTPS id 4D8F372F900 for ; Mon, 17 Jul 2023 11:03:40 +0200 (CEST) Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21) by PAZP264MB3040.FRAP264.PROD.OUTLOOK.COM (2603:10a6:102:1e7::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6588.32; Mon, 17 Jul 2023 09:03:39 +0000 Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9]) by MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9%4]) with mapi id 15.20.6588.031; Mon, 17 Jul 2023 09:03:39 +0000 X-Virus-Scanned: E-securemail Secumail-id: <85d6.64b503ec.4bc20.0> ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
b=U2pDqQJtFE2xyJr6Pubz8+gc/horfQo0uoES5NOV21+wOFURS/aiccmgjKRb96RocGVA8wHOHW8ZlU49mO8dcJOFQoGI4Tp3zUNSD1QmJUs7wBZbxU4fjrgYwiY4QMYmjZ2O9pV/txFAYzXpaczVLWeTwx7VR91IXlxxMAqb91m0fb5TkRb/7yCEhJiLjmFKhpIRuzqYsIuW5QDAkO3w6cf1BegM4HCK7iR9Xn9fs9dI43KSuTc4mF6XMkCq7yQgollNqtdtPmhVqtWwSbLhlKOdux/XnLK9E1XrRQRLD5EBSrdjXUKkfmdV+P6QSIOX+uyrmu7tXRfR+h/GB9M3xw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=RXRRZt+zxGgq5qbmC8s4W/M3vKfZsoT0FDVk4ovXtyY=; b=OXrWdhIcFgKQJ0tsQluo8n9g1FayLjr18mXpbfBtvTUTVKP9OGuD+z2jfhX3m35neDo7y9COYSEqlBmUGRTDY6tbywh7YTceviuRmxjCBhNPJIC/f0YIAHEB9IxrpplQe/A9qF1xNgRqixRhR4BwJkic06u6nRX56m2hjPoFh1Fr1eCf3Fj6HVGJfTznvTUB2fguTVF5pyZafpQO/Uh2TK3w4XBFHLSuehBDPOE9vYNnAHqHCbrhsA4FEJHU/4PCUaULREL4d9XcSoMNIm3yI0B7QrSGwYeXRkVHADyQxkxV5BAkXzRkyMBV2rCAujZSaSPHe3kgtqJW7RaVNQUOPQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=kalrayinc.com; dmarc=pass action=none header.from=kalrayinc.com; dkim=pass header.d=kalrayinc.com; arc=none To: gcc-patches@gcc.gnu.org Cc: Sylvain Noiry Subject: [PATCH 3/9] Native complex operations: Add gen_rtx_complex hook Date: Mon, 17 Jul 2023 11:02:44 +0200 Message-ID: <20230717090250.4645-4-snoiry@kalrayinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230717090250.4645-1-snoiry@kalrayinc.com> References: <20230717090250.4645-1-snoiry@kalrayinc.com> X-ClientProxiedBy: LO2P265CA0309.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a5::33) To MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: MR2P264MB0113:EE_|PAZP264MB3040:EE_ X-MS-Office365-Filtering-Correlation-Id: 8c28eac3-7b8e-4062-1af9-08db86a4b614 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; 
X-OriginatorOrg: kalrayinc.com X-MS-Exchange-CrossTenant-Network-Message-Id: 8c28eac3-7b8e-4062-1af9-08db86a4b614 X-MS-Exchange-CrossTenant-AuthSource: MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 17 Jul 2023 09:03:39.1392 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id:
8931925d-7620-4a64-b7fe-20afd86363d3 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: mC5Ok6DbOwg556h48aBVcnLCtO8OAAxko2vApPPej0bk9X7YiexLmzONGxukv7vYu5tyKtFRKLdXOnb1d18BdQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAZP264MB3040 X-ALTERMIMEV2_out: done X-Spam-Status: No, score=-11.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Sylvain Noiry via Gcc-patches From: Sylvain Noiry Reply-To: Sylvain Noiry Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771658067237529914 X-GMAIL-MSGID: 1771658067237529914

Add a new target hook, gen_rtx_complex, for creating complex elements during the expand pass. The default implementation calls gen_rtx_CONCAT, as before. Calls to gen_rtx_CONCAT that build complex values are then replaced by calls to targetm.gen_rtx_complex.
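The contract of the hook can be sketched outside of GCC as follows. This is only a toy model under stated assumptions, not code from the patch: Rtx, Mode, make_reg, inner_mode and native_gen_complex are hypothetical stand-ins invented for the illustration, and only the behaviour of the default (create missing parts as registers, then build a CONCAT) follows the description above. How a real backend would override the hook is an assumption.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins for machine_mode and rtx; not GCC types.
enum class Mode { SF, SC };  // scalar float, complex float

struct Rtx {
  enum Kind { REG, CONCAT };
  Kind kind;
  Mode mode;
  const Rtx *real;  // set only for CONCAT
  const Rtx *imag;  // set only for CONCAT
  Rtx(Kind k, Mode m, const Rtx *re = nullptr, const Rtx *im = nullptr)
      : kind(k), mode(m), real(re), imag(im) {}
};

// Owns every node created by the sketch.
std::vector<std::unique_ptr<Rtx>> pool;

// Toy equivalent of GET_MODE_INNER for the two modes modelled here.
Mode inner_mode(Mode m) { return m == Mode::SC ? Mode::SF : m; }

// Toy equivalent of gen_reg_rtx: a fresh pseudo register of mode M.
const Rtx *make_reg(Mode m) {
  pool.push_back(std::make_unique<Rtx>(Rtx::REG, m));
  return pool.back().get();
}

// Models the default hook: missing parts become fresh registers,
// supplied parts must have the inner mode, and the result is a CONCAT.
const Rtx *default_gen_complex(Mode mode, const Rtx *re, const Rtx *im) {
  Mode imode = inner_mode(mode);
  if (!re) re = make_reg(imode); else assert(re->mode == imode);
  if (!im) im = make_reg(imode); else assert(im->mode == imode);
  pool.push_back(std::make_unique<Rtx>(Rtx::CONCAT, mode, re, im));
  return pool.back().get();
}

// Assumption: a target with native complex registers might return a single
// register of the complex mode when no parts are given, and otherwise fall
// back to the default behaviour.
const Rtx *native_gen_complex(Mode mode, const Rtx *re, const Rtx *im) {
  if (!re && !im)
    return make_reg(mode);
  return default_gen_complex(mode, re, im);
}

// Models targetm.gen_rtx_complex: a function pointer a target can replace.
using GenComplexHook = const Rtx *(*)(Mode, const Rtx *, const Rtx *);
GenComplexHook gen_rtx_complex = default_gen_complex;
```

With the default in place, gen_rtx_complex (Mode::SC, nullptr, nullptr) yields a CONCAT of two fresh SF registers; after assigning native_gen_complex to the pointer, the same call yields one SC register, which is the kind of substitution the hook is meant to allow.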
gcc/ChangeLog: * target.def: Add gen_rtx_complex target hook * targhooks.cc (default_gen_rtx_complex): New: Default implementation for gen_rtx_complex * targhooks.h: Add default_gen_rtx_complex * doc/tm.texi: Document TARGET_GEN_RTX_COMPLEX * doc/tm.texi.in: Add TARGET_GEN_RTX_COMPLEX * emit-rtl.cc (gen_reg_rtx): Replace call to gen_rtx_CONCAT by call to gen_rtx_complex (init_emit_once): Likewise * expmed.cc (flip_storage_order): Likewise * optabs.cc (expand_doubleword_mod): Likewise --- gcc/doc/tm.texi | 6 ++++++ gcc/doc/tm.texi.in | 2 ++ gcc/emit-rtl.cc | 26 +++++++++----------------- gcc/expmed.cc | 2 +- gcc/optabs.cc | 12 +++++++----- gcc/target.def | 10 ++++++++++ gcc/targhooks.cc | 27 +++++++++++++++++++++++++++ gcc/targhooks.h | 2 ++ 8 files changed, 64 insertions(+), 23 deletions(-) diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 87997b76338..b73147aea9f 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -4605,6 +4605,12 @@ to return a nonzero value when it is required, the compiler will run out of spill registers and print a fatal error message. @end deftypefn +@deftypefn {Target Hook} rtx TARGET_GEN_RTX_COMPLEX (machine_mode @var{mode}, rtx @var{real_part}, rtx @var{imag_part}) +This hook should return an rtx representing a complex of mode @var{machine_mode} built from @var{real_part} and @var{imag_part}. + If both arguments are @code{NULL}, create them as registers. + The default is @code{gen_rtx_CONCAT}. +@end deftypefn + @deftypefn {Target Hook} rtx TARGET_READ_COMPLEX_PART (rtx @var{cplx}, complex_part_t @var{part}) This hook should return the rtx representing the specified @var{part} of the complex given by @var{cplx}. @var{part} can be the real part, the imaginary part, or both of them. diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index efbf972e6a7..dd39e450903 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -3390,6 +3390,8 @@ stack. 
@hook TARGET_SMALL_REGISTER_CLASSES_FOR_MODE_P +@hook TARGET_GEN_RTX_COMPLEX + @hook TARGET_READ_COMPLEX_PART @hook TARGET_WRITE_COMPLEX_PART diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc index f6276a2d0b6..22012bfea13 100644 --- a/gcc/emit-rtl.cc +++ b/gcc/emit-rtl.cc @@ -1190,19 +1190,7 @@ gen_reg_rtx (machine_mode mode) if (generating_concat_p && (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT || GET_MODE_CLASS (mode) == MODE_COMPLEX_INT)) - { - /* For complex modes, don't make a single pseudo. - Instead, make a CONCAT of two pseudos. - This allows noncontiguous allocation of the real and imaginary parts, - which makes much better code. Besides, allocating DCmode - pseudos overstrains reload on some machines like the 386. */ - rtx realpart, imagpart; - machine_mode partmode = GET_MODE_INNER (mode); - - realpart = gen_reg_rtx (partmode); - imagpart = gen_reg_rtx (partmode); - return gen_rtx_CONCAT (mode, realpart, imagpart); - } + return targetm.gen_rtx_complex (mode, NULL, NULL); /* Do not call gen_reg_rtx with uninitialized crtl. 
*/ gcc_assert (crtl->emit.regno_pointer_align_length); @@ -6274,14 +6262,18 @@ init_emit_once (void) FOR_EACH_MODE_IN_CLASS (mode, MODE_COMPLEX_INT) { - rtx inner = const_tiny_rtx[0][(int)GET_MODE_INNER (mode)]; - const_tiny_rtx[0][(int) mode] = gen_rtx_CONCAT (mode, inner, inner); + machine_mode imode = GET_MODE_INNER (mode); + rtx inner = const_tiny_rtx[0][(int) imode]; + const_tiny_rtx[0][(int) mode] = + targetm.gen_rtx_complex (mode, inner, inner); } FOR_EACH_MODE_IN_CLASS (mode, MODE_COMPLEX_FLOAT) { - rtx inner = const_tiny_rtx[0][(int)GET_MODE_INNER (mode)]; - const_tiny_rtx[0][(int) mode] = gen_rtx_CONCAT (mode, inner, inner); + machine_mode imode = GET_MODE_INNER (mode); + rtx inner = const_tiny_rtx[0][(int) imode]; + const_tiny_rtx[0][(int) mode] = + targetm.gen_rtx_complex (mode, inner, inner); } FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_BOOL) diff --git a/gcc/expmed.cc b/gcc/expmed.cc index 2f787cc28f9..8a18161827b 100644 --- a/gcc/expmed.cc +++ b/gcc/expmed.cc @@ -400,7 +400,7 @@ flip_storage_order (machine_mode mode, rtx x) real = flip_storage_order (GET_MODE_INNER (mode), real); imag = flip_storage_order (GET_MODE_INNER (mode), imag); - return gen_rtx_CONCAT (mode, real, imag); + return targetm.gen_rtx_complex (mode, real, imag); } if (UNLIKELY (reverse_storage_order_supported < 0)) diff --git a/gcc/optabs.cc b/gcc/optabs.cc index 4e9f58f8060..18900e8113e 100644 --- a/gcc/optabs.cc +++ b/gcc/optabs.cc @@ -1001,16 +1001,18 @@ expand_doubleword_mod (machine_mode mode, rtx op0, rtx op1, bool unsignedp) machine_mode cmode = TYPE_MODE (ctype); rtx op00 = operand_subword_force (op0, 0, mode); rtx op01 = operand_subword_force (op0, 1, mode); - rtx cres = gen_rtx_CONCAT (cmode, gen_reg_rtx (word_mode), - gen_reg_rtx (word_mode)); + rtx cres = targetm.gen_rtx_complex (cmode, gen_reg_rtx (word_mode), + gen_reg_rtx (word_mode)); tree lhs = make_tree (ctype, cres); tree arg0 = make_tree (wtype, op00); tree arg1 = make_tree (wtype, op01); expand_addsub_overflow 
(UNKNOWN_LOCATION, PLUS_EXPR, lhs, arg0, arg1, true, true, true, false, NULL); - sum = expand_simple_binop (word_mode, PLUS, XEXP (cres, 0), - XEXP (cres, 1), NULL_RTX, 1, - OPTAB_DIRECT); + sum = + expand_simple_binop (word_mode, PLUS, + read_complex_part (cres, REAL_P), + read_complex_part (cres, IMAG_P), NULL_RTX, + 1, OPTAB_DIRECT); if (sum == NULL_RTX) return NULL_RTX; } diff --git a/gcc/target.def b/gcc/target.def index 9798c0f58e4..ee1dfdc7565 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -3306,6 +3306,16 @@ a pointer to int.", bool, (ao_ref *ref), default_ref_may_alias_errno) +/* Return the rtx representation of a complex with a specified mode. */ +DEFHOOK +(gen_rtx_complex, + "This hook should return an rtx representing a complex of mode @var{machine_mode} built from @var{real_part} and @var{imag_part}.\n\ + If both arguments are @code{NULL}, create them as registers.\n\ + The default is @code{gen_rtx_CONCAT}.", + rtx, + (machine_mode mode, rtx real_part, rtx imag_part), + default_gen_rtx_complex) + /* Returns the value corresponding to the specified part of a complex. */ DEFHOOK (read_complex_part, diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc index d33fcbd9a13..4ea40c643a8 100644 --- a/gcc/targhooks.cc +++ b/gcc/targhooks.cc @@ -1532,6 +1532,33 @@ default_preferred_simd_mode (scalar_mode) return word_mode; } +/* By default, call gen_rtx_CONCAT. */ + +rtx +default_gen_rtx_complex (machine_mode mode, rtx real_part, rtx imag_part) +{ + /* For complex modes, don't make a single pseudo. + Instead, make a CONCAT of two pseudos. + This allows noncontiguous allocation of the real and imaginary parts, + which makes much better code. Besides, allocating DCmode + pseudos overstrains reload on some machines like the 386. 
*/ + machine_mode imode = GET_MODE_INNER (mode); + + if (real_part == NULL) + real_part = gen_reg_rtx (imode); + else + gcc_assert ((GET_MODE (real_part) == imode) + || (GET_MODE (real_part) == E_VOIDmode)); + + if (imag_part == NULL) + imag_part = gen_reg_rtx (imode); + else + gcc_assert ((GET_MODE (imag_part) == imode) + || (GET_MODE (imag_part) == E_VOIDmode)); + + return gen_rtx_CONCAT (mode, real_part, imag_part); +} + /* By default, extract one of the components of the complex value CPLX. Extract the real part if part is REAL_P, and the imaginary part if it is IMAG_P. If part is BOTH_P, return cplx directly*/ diff --git a/gcc/targhooks.h b/gcc/targhooks.h index 805abd96938..811cd6165de 100644 --- a/gcc/targhooks.h +++ b/gcc/targhooks.h @@ -124,6 +124,8 @@ extern opt_machine_mode default_get_mask_mode (machine_mode); extern bool default_empty_mask_is_expensive (unsigned); extern vector_costs *default_vectorize_create_costs (vec_info *, bool); +extern rtx default_gen_rtx_complex (machine_mode mode, rtx real_part, + rtx imag_part); extern rtx default_read_complex_part (rtx cplx, complex_part_t part); extern void default_write_complex_part (rtx cplx, rtx val, complex_part_t part, From patchwork Mon Jul 17 09:02:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sylvain Noiry X-Patchwork-Id: 121140 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c923:0:b0:3e4:2afc:c1 with SMTP id j3csp990708vqt; Mon, 17 Jul 2023 02:06:43 -0700 (PDT) X-Google-Smtp-Source: APBJJlFHE2wh/cjbpAbhMxNRVtG59AxY9fzhZC50b5tDtmQlTPXqLwDgMMxT7wbgE0fG4Ei1Qjvd X-Received: by 2002:a17:906:7a14:b0:993:d47f:3c84 with SMTP id d20-20020a1709067a1400b00993d47f3c84mr11370080ejo.7.1689584802802; Mon, 17 Jul 2023 02:06:42 -0700 (PDT) Received: from server2.sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id f26-20020a1709062c5a00b0098935e138basi12351090ejh.286.2023.07.17.02.06.42 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Jul 2023 02:06:42 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=MzfFw5WJ; arc=fail (signature failed); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 44D16385558A for ; Mon, 17 Jul 2023 09:05:52 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 44D16385558A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1689584752; bh=qNpw1f1ypf9TcjG1yXu7px/UBNxwH+AGw7S366mcoYo=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=MzfFw5WJwMR8gaDJUU2AHdEqxSnADdEPU15g6QGtedj8u2oCEERDzY8FI5kBkTVw0 EOuK/45YH1s/r/4sTrk8dh6bSAIuiprGLodOgyGZmyLbiXvg6YOMRiU+T6kZmQJudI 6cb2S7s6FJRCubKeMEvZfhpVUXnPtb5KEge2D+/E= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpout140.security-mail.net (smtpout140.security-mail.net [85.31.212.143]) by sourceware.org (Postfix) with ESMTPS id 12E3E38582BE for ; Mon, 17 Jul 2023 09:03:45 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 12E3E38582BE Received: from localhost (localhost [127.0.0.1]) by fx403.security-mail.net (Postfix) with ESMTP id 0CD1C72F53F for ; Mon, 17 Jul 2023 
11:03:44 +0200 (CEST) Received: from fx403 (localhost [127.0.0.1]) by fx403.security-mail.net (Postfix) with ESMTP id B31E272F343 for ; Mon, 17 Jul 2023 11:03:43 +0200 (CEST) Received: from FRA01-PR2-obe.outbound.protection.outlook.com (mail-pr2fra01lp0106.outbound.protection.outlook.com [104.47.24.106]) by fx403.security-mail.net (Postfix) with ESMTPS id B7E0672EEFC for ; Mon, 17 Jul 2023 11:03:42 +0200 (CEST) Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21) by PAZP264MB3040.FRAP264.PROD.OUTLOOK.COM (2603:10a6:102:1e7::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6588.32; Mon, 17 Jul 2023 09:03:41 +0000 Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9]) by MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9%4]) with mapi id 15.20.6588.031; Mon, 17 Jul 2023 09:03:41 +0000 X-Virus-Scanned: E-securemail Secumail-id: <83bd.64b503ee.b6c36.0> ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=FHA7CTFRaH9gIBrEMz2MLOBtiJIrpZajVKjHvwtDOlQm1E8vcf59ugmlfMrpA+yGmDoZKjmV0J96DzsDBojZbMkSssNmM1zGgG5cs6GN5azjD1gDsEOmk2r4/FTUbDuu0e+UttLOp58a9Nh9Ri01D0abqqHL0fLCz75op/4zPhs6bYtJnoerNtm+73f9EDTHNfLvG9S6MRIQ/IaAC5kWISvln4qihg0iv4/2UsLi1aa8XzdwTuPUUX0s+PVul4qwWPzflSVP5UDdrEtSBSugpS+8feBdmaRBvohcOxTON5hklmMBPWm0v1QDMl+YbR8Q2HscFyQTyf5jmQIbW9m92g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=qNpw1f1ypf9TcjG1yXu7px/UBNxwH+AGw7S366mcoYo=; 
b=fAXJl2N6U67uw5qOLGlYFcvOERb7UENdq8CFRwjl+pDdS2W17VSh7XX4jJusM2wki50SzBylphR4Zg2CzNSpHvJO7mZ59hh5TGF/dmWI9b9A/qJXvZHQTCDoQzDiciYBns2Dn2xFdLBQX73ofgpY3YFXHCNmTEHgEmPlVtS50T9trKlkC1zzgUNoXbOeNuLHYAxG33VhEjNqkt3mYVXELI9sjR3bgKzWHit+y7+q+KGABBXJ9xc2vAZiJV4xlRHuMQVJVH1AkCY09c3MuFK2Y9NX36n3kAblycsXN4Ieb2pj7pOhb0OsTpL91P07xW+TQcdYUz/RiVa7GV3JhKXrSA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=kalrayinc.com; dmarc=pass action=none header.from=kalrayinc.com; dkim=pass header.d=kalrayinc.com; arc=none To: gcc-patches@gcc.gnu.org Cc: Sylvain Noiry Subject: [PATCH 4/9] Native complex operations: Allow native complex regs and ops in rtl Date: Mon, 17 Jul 2023 11:02:45 +0200 Message-ID: <20230717090250.4645-5-snoiry@kalrayinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230717090250.4645-1-snoiry@kalrayinc.com> References: <20230717090250.4645-1-snoiry@kalrayinc.com> X-ClientProxiedBy: LO2P265CA0307.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a5::31) To MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: MR2P264MB0113:EE_|PAZP264MB3040:EE_ X-MS-Office365-Filtering-Correlation-Id: 477d6251-d07b-4bd7-8df6-08db86a4b791 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: dh7duCaNTNIXAWlFJRx2fNdgFDCIicUA/1rKIzmEKzxHmdiISgNQLsW+HlGhayyiJ8ksEptyo3RHYBQPJ5MjwAzCg1t4Akf4lEHwvdBOlolmwwQRnSeTkcmHpL+a/uAnsqbhow1grWy1yVJK3F04yEe4YwnLNmjbeFtP0QAWW8PU0Vrsr0HBuYQ+dGP0EUEk2sDf+khE0vZQWrffJmlRzZr80m1rG5k+2jzDyC5l7L57ae5EVGBfXfiM3u1MJgUXnMPdnaEYU65Alq4+m+d+o1sAG+Stdr3ExLjhi3ce8reDYO+IoiLcKIaitCpA3RibNUALiay9xycEuApHg00zCDYmrBUl7NngstG0klxjpVXj4JErG0+i6svgLS7CocOyCpAlJgazFBCKyw6bFj6R7dMsVrPAErQUSIf4gb/U5QMxDR/iM5dYbOmaHBy2sfwQBHKlrSbV/ILc/tLl6nJ33y9c8NWfr5LtGCrpUFPUvX1OZaz6YG2iC1dtyP5PRbgoKLfA0fFg1DfrcL9C/XocbpRSm1Al8aFpJvUhRdoNCX0eLtLdZSqf/zoD5UDcji7/ X-Forefront-Antispam-Report: CIP:255.255.255.255; 
CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM; PTR:; CAT:NONE; SFS:(13230028)(4636009)(396003)(366004)(346002)(136003)(39850400004)(376002)(451199021)(478600001)(6486002)(186003)(1076003)(6506007)(26005)(6512007)(107886003)(2906002)(41300700001)(316002)(6916009)(4326008)(66476007)(66946007)(66556008)(5660300002)(36756003)(8676002)(8936002)(38100700002)(86362001)(2616005)(83380400001); DIR:OUT; SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: zLkMHDnaXQD/WYlAYIUuw1tQyHGx1pOaswnlvFVgxKZk//Ret5r+ZfYRINDw0RuOGue5PTQqM6AGOIlbpAID+fRZ7fc6o15NFi71+3WUH356Ur2kUN3Mlh5j7OS9hYsCGaP9VVbYMod2WJg5lCPh1PnyUCTH9E5E9CmHV73wR+nzuEKQGgXHIqJ3TqptgIYQLp6bYbHAi3xPSRjze/uML9sywt/DKlhEIJ+xWEzKJbDjgMFdMDMsDrzyTa6+mm9haevtNHIyFXhZZgaIDCnR8gyUMvdBTYmakmmtz4eHipqAxh0I4VkfqI2D2esaPsepXZSZ3Vdu7pyk+QZGfKJlUhADKOJ9iaZKoKFsKNWmCC1z0j9gP1vrb9ZmBjkH0dxoFQ2rm54Id3NLfb9R2YqU2pg3ae+oM57BjbZLKq6FUDuz8yBzpSdRB9yaGETRglV3VLj4Bv71woPopAvEdNuLGsVGwf5oUmgeJ/B5t4XtR/2UPl8yojYek6xRmY5VAdmd6tQSNpE9lM4GUbLCktC6ToPgcyXmcj4VvOh3xxNCF6vLdhcPXA4SkgyUv1+GMjZR180CGdQ0bH6HHIThCQwjTnMerlmc02Lk5oih3h6+zd2VaexJ3EkzNmJiTPwuBnWCuz8CyWiGKNWG15W0dpVvoNzM0fMsb0/X+ZWJkD7Jm4iWvnjqT+lblehgiGJ9LG4NEB3g7EnaLiEyvR8u5AmX7UfKuPYXLRE39iv1S0xU3n5+k6s/3dFRKy5CXOWPI/a5VCQLapUP6Bt2iyv40rk/KVTfJ7l4kQWeFPQf+v4DkON5F2tlWEhr5uCh9rdL4g6pnCD30Ii+URkMP/m4qpAlNPsYDhlW2MGv4wZapvcTsM7Oce5HAd9UdM9Iv+kRe3j4lr7sCY5NXQ9cLo6LidtyFJMYnweOsqPmoCH1fTPETTD2I6KP7Lj5zdF7xFJrW0Eo 
rmxdXttQZjcKB4WK5OoSdj9qonhWWmaqUDXwxP9Y2iEtTyFb/Yjog2kJ3aCZOVfB8enegaTAr6zspUpQLxsbgF9ApduPgnbwq7T2FFhhqShuzT1Zeo2fLxjV3jAJiMtB7ResbmkYeoxUJYSY2Uvi5TmxXO4pNP4OOTLRGSDayC8UdofP16D0xuCzsQ/NOh+czj1Fd5SHwtA2Gg1Ks4aoK8EQCLlcqNgE0lFUQBwHZfooyo//ckgOAVPcyMci6kxvJOYzzuELnyzNBSiS0a21l4xqXlbjhXBRSrCo1BWGdQP58MnaSAYZcKQXiC0uUVYR7M89I3IqaYx7lfT13SLilwC/BUoUrnFV4Rqk3/cP3/CQDplzTLGoE1MMiQh2eujTcyhtPyvSLdghDmN3KwVClugrnSpF9LpG977H/sEmqjqWjnEL6Dir+rcjWigkz5mZNfmhQ3klBI1BQcZoKkhk4oHMOG5MGBsD+kN/8JQGzAUpJxBzsSz9jFIW4n3B3F1t2zpeOrOZSH2gnCrfJlkKD82A+9hcFcqivIZ7gP4Mhkvss6TgLmvXOzU8th4UZt//RQXiUWldWqBYuzAELYvembkWW8qDlNYHP9IB2KzzFuwtwmQQm7lLvENOH3gzWTsC X-OriginatorOrg: kalrayinc.com X-MS-Exchange-CrossTenant-Network-Message-Id: 477d6251-d07b-4bd7-8df6-08db86a4b791 X-MS-Exchange-CrossTenant-AuthSource: MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 17 Jul 2023 09:03:41.6579 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 8931925d-7620-4a64-b7fe-20afd86363d3 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: /CHw0c10N0VQH/yU+tdQ8RLAds09ZPOU2V9ubVAcvtMzd6U2s3ISIa0swxUaJUqJHVFJESPfj8B+VYYnhjcQ1g== X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAZP264MB3040 X-ALTERMIMEV2_out: done X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Sylvain Noiry via Gcc-patches From: Sylvain Noiry Reply-To: Sylvain Noiry Errors-To: 
gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
Sender: "Gcc-patches"
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1771658073701851966
X-GMAIL-MSGID: 1771658073701851966

Support registers of complex types in rtl. Also adapt the functions
called during the expand pass to support native complex operations.

gcc/ChangeLog:

        * explow.cc (trunc_int_for_mode): Allow complex int modes
        * expr.cc (emit_move_complex_parts): Move both parts at the same
        time if it is supported by the backend
        (emit_move_complex): Do not move via integer if no int mode
        corresponds. For complex floats, relax the constraint on the
        number of registers for targets with pairs of registers, and
        use native moves if it is supported by the backend.
        (expand_expr_real_2): Move both parts at the same time if it is
        supported by the backend
        (expand_expr_real_1): Update the expand of complex constants
        (const_vector_from_tree): Add the expand of both parts of a
        complex constant
        * real.h: update FLOAT_MODE_FORMAT
        * machmode.h: Add COMPLEX_INT_MODE_P and COMPLEX_FLOAT_MODE_P
        predicates
        * optabs-libfuncs.cc (gen_int_libfunc): Add support for complex
        modes
        (gen_intv_fp_libfunc): Likewise
        * recog.cc (general_operand): Likewise
---
 gcc/explow.cc          |  2 +-
 gcc/expr.cc            | 84 ++++++++++++++++++++++++++++++++++++------
 gcc/machmode.h         |  6 +++
 gcc/optabs-libfuncs.cc | 29 ++++++++++++---
 gcc/real.h             |  3 +-
 gcc/recog.cc           |  1 +
 6 files changed, 105 insertions(+), 20 deletions(-)

diff --git a/gcc/explow.cc b/gcc/explow.cc
index 6424c0802f0..48572a40eab 100644
--- a/gcc/explow.cc
+++ b/gcc/explow.cc
@@ -56,7 +56,7 @@ trunc_int_for_mode (HOST_WIDE_INT c, machine_mode mode)
   int width = GET_MODE_PRECISION (smode);

   /* You want to truncate to a _what_? */
-  gcc_assert (SCALAR_INT_MODE_P (mode));
+  gcc_assert (SCALAR_INT_MODE_P (mode) || COMPLEX_INT_MODE_P (mode));

   /* Canonicalize BImode to 0 and STORE_FLAG_VALUE. */
   if (smode == BImode)
diff --git a/gcc/expr.cc b/gcc/expr.cc
index e1a0892b4d9..e94de8a05b5 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -3847,8 +3847,14 @@ emit_move_complex_parts (rtx x, rtx y)
       && REG_P (x) && !reg_overlap_mentioned_p (x, y))
     emit_clobber (x);

-  write_complex_part (x, read_complex_part (y, REAL_P), REAL_P, true);
-  write_complex_part (x, read_complex_part (y, IMAG_P), IMAG_P, false);
+  machine_mode mode = GET_MODE (x);
+  if (optab_handler (mov_optab, mode) != CODE_FOR_nothing)
+    write_complex_part (x, read_complex_part (y, BOTH_P), BOTH_P, false);
+  else
+    {
+      write_complex_part (x, read_complex_part (y, REAL_P), REAL_P, true);
+      write_complex_part (x, read_complex_part (y, IMAG_P), IMAG_P, false);
+    }

   return get_last_insn ();
 }
@@ -3868,14 +3874,14 @@ emit_move_complex (machine_mode mode, rtx x, rtx y)

   /* See if we can coerce the target into moving both values at once, except
      for floating point where we favor moving as parts if this is easy. */
-  if (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT
+  scalar_int_mode imode;
+  if (!int_mode_for_mode (mode).exists (&imode))
+    try_int = false;
+  else if (GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT
       && optab_handler (mov_optab, GET_MODE_INNER (mode)) != CODE_FOR_nothing
-      && !(REG_P (x)
-           && HARD_REGISTER_P (x)
-           && REG_NREGS (x) == 1)
-      && !(REG_P (y)
-           && HARD_REGISTER_P (y)
-           && REG_NREGS (y) == 1))
+      && optab_handler (mov_optab, mode) != CODE_FOR_nothing
+      && !(REG_P (x) && HARD_REGISTER_P (x))
+      && !(REG_P (y) && HARD_REGISTER_P (y)))
     try_int = false;
   /* Not possible if the values are inherently not adjacent. */
   else if (GET_CODE (x) == CONCAT || GET_CODE (y) == CONCAT)
@@ -10246,9 +10252,14 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
           break;
         }

-      /* Move the real (op0) and imaginary (op1) parts to their location. */
-      write_complex_part (target, op0, REAL_P, true);
-      write_complex_part (target, op1, IMAG_P, false);
+      if ((op0 == op1) && (GET_CODE (op0) == CONST_VECTOR))
+        write_complex_part (target, op0, BOTH_P, false);
+      else
+        {
+          /* Move the real (op0) and imaginary (op1) parts to their location. */
+          write_complex_part (target, op0, REAL_P, true);
+          write_complex_part (target, op1, IMAG_P, false);
+        }

       return target;

@@ -11001,6 +11012,51 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,

           return original_target;
         }
+      else if (original_target && (GET_CODE (original_target) == REG)
+               &&
+               ((GET_MODE_CLASS (GET_MODE (original_target)) ==
+                 MODE_COMPLEX_INT)
+                || (GET_MODE_CLASS (GET_MODE (original_target)) ==
+                    MODE_COMPLEX_FLOAT)))
+        {
+          mode = TYPE_MODE (TREE_TYPE (exp));
+
+          /* Move both parts at the same time if possible */
+          if (TREE_COMPLEX_BOTH_PARTS (exp) != NULL)
+            {
+              op0 =
+                expand_expr (TREE_COMPLEX_BOTH_PARTS (exp), original_target,
+                             mode, EXPAND_NORMAL);
+              write_complex_part (original_target, op0, BOTH_P, false);
+            }
+          else
+            {
+              mode = TYPE_MODE (TREE_TYPE (TREE_TYPE (exp)));
+
+              rtx rtarg = gen_reg_rtx (mode);
+              rtx itarg = gen_reg_rtx (mode);
+              op0 =
+                expand_expr (TREE_REALPART (exp), rtarg, mode, EXPAND_NORMAL);
+              op1 =
+                expand_expr (TREE_IMAGPART (exp), itarg, mode, EXPAND_NORMAL);
+
+              write_complex_part (original_target, op0, REAL_P, false);
+              write_complex_part (original_target, op1, IMAG_P, false);
+
+              return original_target;
+            }
+        }
+      /* TODO use a finer grain approach than just size of 2 words */
+      else if ((TREE_COMPLEX_BOTH_PARTS (exp) != NULL)
+               && (known_le (GET_MODE_BITSIZE (mode), 2 * BITS_PER_WORD)))
+        {
+          op0 =
+            expand_expr (TREE_COMPLEX_BOTH_PARTS (exp), original_target, mode,
+                         EXPAND_NORMAL);
+          rtx tmp = gen_reg_rtx (mode);
+          write_complex_part (tmp, op0, BOTH_P, false);
+          return tmp;
+        }

       /* fall through */

@@ -13347,6 +13403,10 @@ const_vector_from_tree (tree exp)
       else if (TREE_CODE (elt) == FIXED_CST)
         builder.quick_push (CONST_FIXED_FROM_FIXED_VALUE (TREE_FIXED_CST (elt),
                                                           inner));
+      else if (TREE_CODE (elt) == COMPLEX_CST)
+        builder.quick_push (expand_expr
+                            (TREE_COMPLEX_BOTH_PARTS (elt), NULL_RTX, mode,
+                             EXPAND_NORMAL));
       else
         builder.quick_push (immed_wide_int_const (wi::to_poly_wide (elt),
                                                   inner));
diff --git a/gcc/machmode.h b/gcc/machmode.h
index a22df60dc20..b1937eafdc3 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -119,6 +119,12 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
    || GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT)

+#define COMPLEX_INT_MODE_P(MODE) \
+  (GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT)
+
+#define COMPLEX_FLOAT_MODE_P(MODE) \
+  (GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT)
+
 /* Nonzero if MODE is a complex mode. */
 #define COMPLEX_MODE_P(MODE) \
   (GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT \
diff --git a/gcc/optabs-libfuncs.cc b/gcc/optabs-libfuncs.cc
index f1abe6916d3..fe390a592eb 100644
--- a/gcc/optabs-libfuncs.cc
+++ b/gcc/optabs-libfuncs.cc
@@ -190,19 +190,34 @@ gen_int_libfunc (optab optable, const char *opname, char suffix,
   int maxsize = 2 * BITS_PER_WORD;
   int minsize = BITS_PER_WORD;
   scalar_int_mode int_mode;
+  complex_mode cplx_int_mode;
+  int bitsize;
+  bool cplx = false;

-  if (!is_int_mode (mode, &int_mode))
+  if (is_int_mode (mode, &int_mode))
+    bitsize = GET_MODE_BITSIZE (int_mode);
+  else if (is_complex_int_mode (mode, &cplx_int_mode))
+    {
+      cplx = true;
+      bitsize = GET_MODE_BITSIZE (cplx_int_mode);
+    }
+  else
     return;
+
   if (maxsize < LONG_LONG_TYPE_SIZE)
     maxsize = LONG_LONG_TYPE_SIZE;
   if (minsize > INT_TYPE_SIZE
       && (trapv_binoptab_p (optable)
           || trapv_unoptab_p (optable)))
     minsize = INT_TYPE_SIZE;
-  if (GET_MODE_BITSIZE (int_mode) < minsize
-      || GET_MODE_BITSIZE (int_mode) > maxsize)
+
+  if (bitsize < minsize || bitsize > maxsize)
     return;
-  gen_libfunc (optable, opname, suffix, int_mode);
+
+  if (GET_MODE_CLASS (mode) == MODE_INT)
+    gen_libfunc (optable, opname, suffix, int_mode);
+  else if (cplx)
+    gen_libfunc (optable, opname, suffix, cplx_int_mode);
 }

 /* Like gen_libfunc, but verify that FP and set decimal prefix if needed. */
@@ -280,9 +295,11 @@ void
 gen_intv_fp_libfunc (optab optable, const char *name, char suffix,
                      machine_mode mode)
 {
-  if (DECIMAL_FLOAT_MODE_P (mode) || GET_MODE_CLASS (mode) == MODE_FLOAT)
+  if (DECIMAL_FLOAT_MODE_P (mode) || GET_MODE_CLASS (mode) == MODE_FLOAT
+      || GET_MODE_CLASS (mode) == MODE_COMPLEX_FLOAT)
     gen_fp_libfunc (optable, name, suffix, mode);
-  if (GET_MODE_CLASS (mode) == MODE_INT)
+  if (GET_MODE_CLASS (mode) == MODE_INT
+      || GET_MODE_CLASS (mode) == MODE_COMPLEX_INT)
     {
       int len = strlen (name);
       char *v_name = XALLOCAVEC (char, len + 2);
diff --git a/gcc/real.h b/gcc/real.h
index 9ed6c372b14..53585418e68 100644
--- a/gcc/real.h
+++ b/gcc/real.h
@@ -189,7 +189,8 @@ extern const struct real_format *
                         : (gcc_unreachable (), 0)])

 #define FLOAT_MODE_FORMAT(MODE) \
-  (REAL_MODE_FORMAT (as_a <scalar_float_mode> (GET_MODE_INNER (MODE))))
+  (REAL_MODE_FORMAT (as_a <scalar_float_mode> \
+    (GET_MODE_INNER ((COMPLEX_FLOAT_MODE_P (MODE)) ? (GET_MODE_INNER (MODE)) : (MODE)))))

 /* The following macro determines whether the floating point format is
    composite, i.e. may contain non-consecutive mantissa bits, in which
diff --git a/gcc/recog.cc b/gcc/recog.cc
index 37432087812..687fe2b1b8a 100644
--- a/gcc/recog.cc
+++ b/gcc/recog.cc
@@ -1441,6 +1441,7 @@ general_operand (rtx op, machine_mode mode)
      if the caller wants something floating. */
   if (GET_MODE (op) == VOIDmode && mode != VOIDmode
       && GET_MODE_CLASS (mode) != MODE_INT
+      && GET_MODE_CLASS (mode) != MODE_COMPLEX_INT
       && GET_MODE_CLASS (mode) != MODE_PARTIAL_INT)
     return false;

From patchwork Mon Jul 17 09:02:46 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sylvain Noiry
X-Patchwork-Id: 121142
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:c923:0:b0:3e4:2afc:c1 with SMTP id j3csp991558vqt; Mon, 17 Jul 2023 02:08:45 -0700 (PDT)
X-Google-Smtp-Source: APBJJlEyQWhzOeBSAQKgC8snv85MBjoSW7Uwc1S9QW7LibgM6Af/7VaM2yDDu4SeRv7h3wmmUn45
X-Received: by 2002:a05:6512:3c8a:b0:4fb:829b:196e with SMTP id h10-20020a0565123c8a00b004fb829b196emr13630597lfv.2.1689584925684; Mon, 17 Jul 2023 02:08:45 -0700 (PDT)
Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id n14-20020aa7c44e000000b0051e052dbb30si12679323edr.608.2023.07.17.02.08.45 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Jul 2023 02:08:45 -0700 (PDT)
Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c;
Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=XDD3x91P; arc=fail (signature failed); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org
Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B17A1384C6A4 for ; Mon, 17 Jul 2023 09:07:07 +0000 (GMT)
DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B17A1384C6A4
DKIM-Signature: v=1; a=rsa-sha256;
c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1689584827; bh=LyCGd2ahE0MXZ/vPA6MNXoJgNDB5pXUp7FYF1rq8xHA=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=XDD3x91PhduV5+0swG4owyKTD2eRMJCE9kUf7QXVTlbOrj09QJ+31EDvbMznAiUxJ DLyhNFMtKi3t70/sY5LPFIFixAXxGzO007+fPmd9jOAinapRcDNAKnDpuk/MrV/drL 1nw7lz7LqTCYWbCWIJRbW3q5BEN6cqVI4qUq9ank= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpout140.security-mail.net (smtpout140.security-mail.net [85.31.212.148]) by sourceware.org (Postfix) with ESMTPS id A6FC43857C51 for ; Mon, 17 Jul 2023 09:03:46 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A6FC43857C51 Received: from localhost (fx408.security-mail.net [127.0.0.1]) by fx408.security-mail.net (Postfix) with ESMTP id 97D18322A27 for ; Mon, 17 Jul 2023 11:03:45 +0200 (CEST) Received: from fx408 (fx408.security-mail.net [127.0.0.1]) by fx408.security-mail.net (Postfix) with ESMTP id 76A52322A10 for ; Mon, 17 Jul 2023 11:03:45 +0200 (CEST) Received: from FRA01-PR2-obe.outbound.protection.outlook.com (mail-pr2fra01lp0103.outbound.protection.outlook.com [104.47.24.103]) by fx408.security-mail.net (Postfix) with ESMTPS id D75E6322A05 for ; Mon, 17 Jul 2023 11:03:44 +0200 (CEST) Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21) by PAZP264MB3040.FRAP264.PROD.OUTLOOK.COM (2603:10a6:102:1e7::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6588.32; Mon, 17 Jul 2023 09:03:44 +0000 Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9]) by MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9%4]) with mapi id 15.20.6588.031; Mon, 17 Jul 2023 09:03:44 +0000 X-Virus-Scanned: E-securemail Secumail-id: <5fd2.64b503f0.d6b9f.0> ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; 
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=kalrayinc.com; dmarc=pass action=none header.from=kalrayinc.com; dkim=pass header.d=kalrayinc.com; arc=none
To: gcc-patches@gcc.gnu.org
Cc: Sylvain Noiry
Subject: [PATCH 5/9] Native complex operations: Add the conjugate op in optabs
Date: Mon, 17 Jul 2023 11:02:46 +0200
Message-ID: <20230717090250.4645-6-snoiry@kalrayinc.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230717090250.4645-1-snoiry@kalrayinc.com>
References: <20230717090250.4645-1-snoiry@kalrayinc.com>
X-ClientProxiedBy: LO2P265CA0173.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:a::17) To MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21)
MIME-Version: 1.0
X-MS-PublicTrafficType: Email
X-MS-TrafficTypeDiagnostic: MR2P264MB0113:EE_|PAZP264MB3040:EE_
X-MS-Office365-Filtering-Correlation-Id: c6bbe664-3d40-4703-1570-08db86a4b903
X-MS-Exchange-SenderADCheck: 1
X-MS-Exchange-AntiSpam-Relay: 0
X-Microsoft-Antispam: BCL:0;
X-OriginatorOrg: kalrayinc.com
X-MS-Exchange-CrossTenant-Network-Message-Id: c6bbe664-3d40-4703-1570-08db86a4b903
X-MS-Exchange-CrossTenant-AuthSource: MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM
X-MS-Exchange-CrossTenant-AuthAs: Internal
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 17 Jul 2023 09:03:44.0770 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id:
8931925d-7620-4a64-b7fe-20afd86363d3 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: oR6kWcRvobEoKKxr/0QsUmisEmGuyYJqqWNbReQz/Q4r0cof3HAeeXpr2p9iHtkE/Nt379xNxAcRAzH4SWUosA== X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAZP264MB3040 X-ALTERMIMEV2_out: done X-Spam-Status: No, score=-12.2 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Sylvain Noiry via Gcc-patches From: Sylvain Noiry Reply-To: Sylvain Noiry Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771658202743825267 X-GMAIL-MSGID: 1771658202743825267 Add an optab and rtl operation for the conjugate, called conj, to expand CONJ_EXPR. 
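
As an illustration of the source-level operation that CONJ_EXPR represents, here is a minimal sketch in C. The helper names conj_by_parts and conj_whole are hypothetical and are not part of the patch; they only contrast a conjugate built from the two parts with a conjugate applied to the whole value, which is the form a backend conj pattern would be expected to handle.

```c
#include <complex.h>

/* Hypothetical helper: conjugate computed part by part.  Keep the real
   part and negate the imaginary part, roughly the shape of a lowering
   that works on the two halves separately.  */
static double complex
conj_by_parts (double complex z)
{
  return CMPLX (creal (z), -cimag (z));
}

/* Hypothetical helper: conjugate as one operation on the whole complex
   value, using the standard C99 conj function.  */
static double complex
conj_whole (double complex z)
{
  return conj (z);
}
```

Both helpers return the same value for finite inputs; the difference that matters for this series is only whether the compiler sees one operation or two.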

gcc/ChangeLog:

        * rtl.def: Add a conj operation in rtl
        * optabs.def: Add a conj optab
        * optabs-tree.cc (optab_for_tree_code): use the conj_optab to
        convert a CONJ_EXPR
        * expr.cc (expand_expr_real_2): Add a case to expand native
        CONJ_EXPR
        (expand_expr_real_1): Likewise
---
 gcc/expr.cc        | 17 ++++++++++++++++-
 gcc/optabs-tree.cc |  3 +++
 gcc/optabs.def     |  3 +++
 gcc/rtl.def        |  3 +++
 4 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/gcc/expr.cc b/gcc/expr.cc
index e94de8a05b5..be153be0b71 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -10498,6 +10498,18 @@ expand_expr_real_2 (sepops ops, rtx target, machine_mode tmode,
         return dst;
       }

+    case CONJ_EXPR:
+      op0 = expand_expr (treeop0, subtarget, VOIDmode, EXPAND_NORMAL);
+      if (modifier == EXPAND_STACK_PARM)
+        target = 0;
+      temp = expand_unop (mode,
+                          optab_for_tree_code (CONJ_EXPR, type,
+                                               optab_default),
+                          op0, target, 0);
+      gcc_assert (temp);
+      return REDUCE_BIT_FIELD (temp);
+
+
     default:
       gcc_unreachable ();
     }
@@ -12064,6 +12076,10 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       op0 = expand_normal (treeop0);
       return read_complex_part (op0, IMAG_P);

+    case CONJ_EXPR:
+      op0 = expand_normal (treeop0);
+      return op0;
+
     case RETURN_EXPR:
     case LABEL_EXPR:
     case GOTO_EXPR:
@@ -12087,7 +12103,6 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
     case VA_ARG_EXPR:
     case BIND_EXPR:
     case INIT_EXPR:
-    case CONJ_EXPR:
     case COMPOUND_EXPR:
     case PREINCREMENT_EXPR:
     case PREDECREMENT_EXPR:
diff --git a/gcc/optabs-tree.cc b/gcc/optabs-tree.cc
index e6ae15939d3..c646b3667d4 100644
--- a/gcc/optabs-tree.cc
+++ b/gcc/optabs-tree.cc
@@ -271,6 +271,9 @@ optab_for_tree_code (enum tree_code code, const_tree type,
         return TYPE_UNSIGNED (type) ? usneg_optab : ssneg_optab;
       return trapv ? negv_optab : neg_optab;

+    case CONJ_EXPR:
+      return conj_optab;
+
     case ABS_EXPR:
       return trapv ? absv_optab : abs_optab;

diff --git a/gcc/optabs.def b/gcc/optabs.def
index 3dae228fba6..31475c8afcc 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -160,6 +160,9 @@ OPTAB_NL(umax_optab, "umax$I$a3", UMAX, "umax", '3', gen_int_libfunc)
 OPTAB_NL(neg_optab, "neg$P$a2", NEG, "neg", '2', gen_int_fp_fixed_libfunc)
 OPTAB_NX(neg_optab, "neg$F$a2")
 OPTAB_NX(neg_optab, "neg$Q$a2")
+OPTAB_NL(conj_optab, "conj$P$a2", CONJ, "conj", '2', gen_int_fp_fixed_libfunc)
+OPTAB_NX(conj_optab, "conj$F$a2")
+OPTAB_NX(conj_optab, "conj$Q$a2")
 OPTAB_VL(negv_optab, "negv$I$a2", NEG, "neg", '2', gen_intv_fp_libfunc)
 OPTAB_VX(negv_optab, "neg$F$a2")
 OPTAB_NL(ssneg_optab, "ssneg$Q$a2", SS_NEG, "ssneg", '2', gen_signed_fixed_libfunc)
diff --git a/gcc/rtl.def b/gcc/rtl.def
index 88e2b198503..4280f727286 100644
--- a/gcc/rtl.def
+++ b/gcc/rtl.def
@@ -460,6 +460,9 @@ DEF_RTL_EXPR(MINUS, "minus", "ee", RTX_BIN_ARITH)
 /* Minus operand 0. */
 DEF_RTL_EXPR(NEG, "neg", "e", RTX_UNARY)

+/* Conj operand 0 */
+DEF_RTL_EXPR(CONJ, "conj", "e", RTX_UNARY)
+
 DEF_RTL_EXPR(MULT, "mult", "ee", RTX_COMM_ARITH)

 /* Multiplication with signed saturation */
From patchwork Mon Jul 17 09:02:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Sylvain Noiry
X-Patchwork-Id: 121143
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:c923:0:b0:3e4:2afc:c1 with SMTP id j3csp991564vqt; Mon, 17 Jul 2023 02:08:47 -0700 (PDT)
X-Google-Smtp-Source: APBJJlF1K1J+U5lx7hvIvWpRhmks2IGIHwVrM3T6s/mFsbOTic2shMmLpjycl544CHFBiZh3VzCf
X-Received: by 2002:a17:906:1009:b0:994:5340:22f4 with SMTP id 9-20020a170906100900b00994534022f4mr7524444ejm.6.1689584926807; Mon, 17 Jul 2023 02:08:46 -0700 (PDT)
Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org.
[8.43.85.97]) by mx.google.com with ESMTPS id gz19-20020a170906f2d300b009891f47b282si12992708ejb.1012.2023.07.17.02.08.46 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Jul 2023 02:08:46 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=FLfzYpKs; arc=fail (signature failed); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B81E2384C6A5 for ; Mon, 17 Jul 2023 09:07:07 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B81E2384C6A5 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1689584827; bh=w+sqq1+ODwq36K3PxqKtX8PncLcQzkmp76ouzHJHuMQ=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=FLfzYpKs0jnjO5SCrpI7AHP+ZXqioAX1pPyTawDxbV7Y2PMHCfqkhXy9VExCI9ia2 Caa9VHT2Uc1gnUG1B3N1eFtO3mgPobf/Ve631hmEFvCYzgOZDlFL8uJY62oc04P/rU p/e2Q+BRfVKW9ad55rCsxBS3IwNm6xQZF4Cu7pMM= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpout30.security-mail.net (smtpout30.security-mail.net [85.31.212.36]) by sourceware.org (Postfix) with ESMTPS id 93DDB385842E for ; Mon, 17 Jul 2023 09:03:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 93DDB385842E Received: from localhost (fx306.security-mail.net [127.0.0.1]) by fx306.security-mail.net (Postfix) with ESMTP id 8E71035CF20 for ; Mon, 17 Jul 2023 11:03:48 +0200 (CEST) Received: from fx306 
(fx306.security-mail.net [127.0.0.1]) by fx306.security-mail.net (Postfix) with ESMTP id 64DA635CF4C for ; Mon, 17 Jul 2023 11:03:48 +0200 (CEST) Received: from FRA01-PR2-obe.outbound.protection.outlook.com (mail-pr2fra01lp0108.outbound.protection.outlook.com [104.47.24.108]) by fx306.security-mail.net (Postfix) with ESMTPS id D2D9F35CF14 for ; Mon, 17 Jul 2023 11:03:47 +0200 (CEST) Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21) by PAZP264MB3040.FRAP264.PROD.OUTLOOK.COM (2603:10a6:102:1e7::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6588.32; Mon, 17 Jul 2023 09:03:46 +0000 Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9]) by MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9%4]) with mapi id 15.20.6588.031; Mon, 17 Jul 2023 09:03:46 +0000 X-Virus-Scanned: E-securemail Secumail-id: <12444.64b503f3.d1fcc.0> ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=OxijOG7Sv6kha6928zrt1vvv6x1wxV0yMpXnYCYpqQKrrLR625tmSa+2XyytV3NYPu3n2ydU1Lx4hB03eF8EwqAQfLNRcc+wbvyMXXCikR0hLj9vs9I8gcAetDo5pE9N1Cy6sSCG3qoieu/EZbZ0y6q33hhH9h5CKRQa0X525OCe0ErPDpM2ucA1d7SafhSHiZufUI/OzaP2PwWHMVYdNsL0mBsyC9M8u4cS2H2WENoqyzfw4H/ddErY5nMokzzuWNubs0r7pG8Vtf09FEkkqP34Wk1L9dX/SFJUdG8ir+L4M2P346zLlMAOmiv6jAer7uNSc707m5D7LI++DfcirQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=w+sqq1+ODwq36K3PxqKtX8PncLcQzkmp76ouzHJHuMQ=; 
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=kalrayinc.com; dmarc=pass action=none header.from=kalrayinc.com; dkim=pass header.d=kalrayinc.com; arc=none
To: gcc-patches@gcc.gnu.org
Cc: Sylvain Noiry
Subject: [PATCH 6/9] Native complex operations: Update how complex rotations are handled
Date: Mon, 17 Jul 2023 11:02:47 +0200
Message-ID: <20230717090250.4645-7-snoiry@kalrayinc.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230717090250.4645-1-snoiry@kalrayinc.com>
References: <20230717090250.4645-1-snoiry@kalrayinc.com>
X-ClientProxiedBy: LNXP265CA0020.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:5e::32) To MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21)
MIME-Version: 1.0
X-MS-PublicTrafficType: Email
X-MS-TrafficTypeDiagnostic: MR2P264MB0113:EE_|PAZP264MB3040:EE_
X-MS-Office365-Filtering-Correlation-Id: e1c474a4-402c-4cf4-27fb-08db86a4ba5f
X-MS-Exchange-SenderADCheck: 1
X-MS-Exchange-AntiSpam-Relay: 0
X-Microsoft-Antispam: BCL:0;
X-OriginatorOrg: kalrayinc.com
X-MS-Exchange-CrossTenant-Network-Message-Id: e1c474a4-402c-4cf4-27fb-08db86a4ba5f
X-MS-Exchange-CrossTenant-AuthSource: MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM
X-MS-Exchange-CrossTenant-AuthAs: Internal
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 17 Jul 2023 09:03:46.3701 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-CrossTenant-Id: 8931925d-7620-4a64-b7fe-20afd86363d3
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: IFXQjYjFOcw99dcj00i/EpCtaYQD58pBxQTkvmaa/19fY6QqVhFB5dJ2ShvY3a9666kFTt2tVGpxB2zRucJwVQ==
X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAZP264MB3040
X-ALTERMIMEV2_out: done
X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-Patchwork-Original-From: Sylvain Noiry via Gcc-patches
From: Sylvain Noiry
Reply-To: Sylvain Noiry
Errors-To:
gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771658204435937049 X-GMAIL-MSGID: 1771658204435937049 Catch complex rotations by 90° and 270° in fold-const.cc as before, but now convert them into the new COMPLEX_ROT90 and COMPLEX_ROT270 internal functions. Also add crot90 and crot270 optabs to expose these operations to the backends, and conditionally lower COMPLEX_ROT90/COMPLEX_ROT270 when the target provides no crot90/crot270 pattern. Finally, convert a + crot90/270(b) into cadd90/270(a, b), in a similar way to FMAs. gcc/ChangeLog: * internal-fn.def: Add COMPLEX_ROT90 and COMPLEX_ROT270 * fold-const.cc (fold_binary_loc): Update the folding of complex rotations to generate calls to COMPLEX_ROT90 and COMPLEX_ROT270 * optabs.def: Add crot90/crot270 optabs * tree-complex.cc (init_dont_simulate_again): Catch calls to COMPLEX_ROT90 and COMPLEX_ROT270 (expand_complex_rotation): Conditionally lower complex rotations if no pattern is present in the backend (expand_complex_operations_1): Likewise * tree-ssa-math-opts.cc (convert_crot_1): Catch complex rotations with additions in a similar way to FMAs. (convert_crot): Likewise (math_opts_dom_walker::after_dom_children): Call convert_crot if a COMPLEX_ROT90 or COMPLEX_ROT270 is identified --- gcc/fold-const.cc | 115 ++++++++++++++++++++++++++------- gcc/internal-fn.def | 2 + gcc/optabs.def | 2 + gcc/tree-complex.cc | 79 ++++++++++++++++++++++- gcc/tree-ssa-math-opts.cc | 129 ++++++++++++++++++++++++++++++++++++++ 5 files changed, 302 insertions(+), 25 deletions(-) diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index a02ede79fed..f1224b6a548 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -11609,30 +11609,6 @@ fold_binary_loc (location_t loc, enum tree_code code, tree type, } else { - /* Fold z * +-I to __complex__ (-+__imag z, +-__real z). - This is not the same for NaNs or if signed zeros are - involved. 
*/ - if (!HONOR_NANS (arg0) - && !HONOR_SIGNED_ZEROS (arg0) - && COMPLEX_FLOAT_TYPE_P (TREE_TYPE (arg0)) - && TREE_CODE (arg1) == COMPLEX_CST - && real_zerop (TREE_REALPART (arg1))) - { - tree rtype = TREE_TYPE (TREE_TYPE (arg0)); - if (real_onep (TREE_IMAGPART (arg1))) - return - fold_build2_loc (loc, COMPLEX_EXPR, type, - negate_expr (fold_build1_loc (loc, IMAGPART_EXPR, - rtype, arg0)), - fold_build1_loc (loc, REALPART_EXPR, rtype, arg0)); - else if (real_minus_onep (TREE_IMAGPART (arg1))) - return - fold_build2_loc (loc, COMPLEX_EXPR, type, - fold_build1_loc (loc, IMAGPART_EXPR, rtype, arg0), - negate_expr (fold_build1_loc (loc, REALPART_EXPR, - rtype, arg0))); - } - /* Optimize z * conj(z) for floating point complex numbers. Guarded by flag_unsafe_math_optimizations as non-finite imaginary components don't produce scalar results. */ @@ -11645,6 +11621,97 @@ fold_binary_loc (location_t loc, enum tree_code code, tree type, && operand_equal_p (arg0, TREE_OPERAND (arg1, 0), 0)) return fold_mult_zconjz (loc, type, arg0); } + + /* Fold z * +-I to __complex__ (-+__imag z, +-__real z). + This is not the same for NaNs or if signed zeros are + involved. 
*/ + if (!HONOR_NANS (arg0) + && !HONOR_SIGNED_ZEROS (arg0) + && TREE_CODE (arg1) == COMPLEX_CST + && (COMPLEX_FLOAT_TYPE_P (TREE_TYPE (arg0)) + && real_zerop (TREE_REALPART (arg1)))) + { + if (real_onep (TREE_IMAGPART (arg1))) + { + tree rtype = TREE_TYPE (TREE_TYPE (arg0)); + tree cplx_build = fold_build2_loc (loc, COMPLEX_EXPR, type, + negate_expr (fold_build1_loc (loc, IMAGPART_EXPR, + rtype, arg0)), + fold_build1_loc (loc, REALPART_EXPR, rtype, arg0)); + if (cplx_build && TREE_CODE (TREE_OPERAND (cplx_build, 0)) != NEGATE_EXPR) + return cplx_build; + + if ((TREE_CODE (arg0) == COMPLEX_EXPR) && real_zerop (TREE_OPERAND (arg0, 1))) + return fold_build2_loc (loc, COMPLEX_EXPR, type, + TREE_OPERAND (arg0, 1), TREE_OPERAND (arg0, 0)); + + if (TREE_CODE (arg0) == CALL_EXPR) + { + if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT90) + return negate_expr (CALL_EXPR_ARG (arg0, 0)); + else if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT270) + return CALL_EXPR_ARG (arg0, 0); + } + else if (TREE_CODE (arg0) == NEGATE_EXPR) + return build_call_expr_internal_loc(loc, IFN_COMPLEX_ROT270, TREE_TYPE (arg0), 1, TREE_OPERAND(arg0, 0)); + else + return build_call_expr_internal_loc(loc, IFN_COMPLEX_ROT90, TREE_TYPE (arg0), 1, arg0); + } + else if (real_minus_onep (TREE_IMAGPART (arg1))) + { + if (real_zerop (TREE_OPERAND (arg0, 1))) + return fold_build2_loc (loc, COMPLEX_EXPR, type, + TREE_OPERAND (arg0, 1), negate_expr (TREE_OPERAND (arg0, 0))); + + return build_call_expr_internal_loc(loc, IFN_COMPLEX_ROT270, TREE_TYPE (arg0), 1, fold (arg0)); + } + } + + /* Fold z * +-I to __complex__ (-+__imag z, +-__real z). + This is not the same for NaNs or if signed zeros are + involved. 
*/ + if (!HONOR_NANS (arg0) + && !HONOR_SIGNED_ZEROS (arg0) + && TREE_CODE (arg1) == COMPLEX_CST + && (COMPLEX_INTEGER_TYPE_P (TREE_TYPE (arg0)) + && integer_zerop (TREE_REALPART (arg1)))) + { + if (integer_onep (TREE_IMAGPART (arg1))) + { + tree rtype = TREE_TYPE (TREE_TYPE (arg0)); + tree cplx_build = fold_build2_loc (loc, COMPLEX_EXPR, type, + negate_expr (fold_build1_loc (loc, IMAGPART_EXPR, + rtype, arg0)), + fold_build1_loc (loc, REALPART_EXPR, rtype, arg0)); + if (cplx_build && TREE_CODE (TREE_OPERAND (cplx_build, 0)) != NEGATE_EXPR) + return cplx_build; + + if ((TREE_CODE (arg0) == COMPLEX_EXPR) && integer_zerop (TREE_OPERAND (arg0, 1))) + return fold_build2_loc (loc, COMPLEX_EXPR, type, + TREE_OPERAND (arg0, 1), TREE_OPERAND (arg0, 0)); + + if (TREE_CODE (arg0) == CALL_EXPR) + { + if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT90) + return negate_expr (CALL_EXPR_ARG (arg0, 0)); + else if (CALL_EXPR_IFN (arg0) == IFN_COMPLEX_ROT270) + return CALL_EXPR_ARG (arg0, 0); + } + else if (TREE_CODE (arg0) == NEGATE_EXPR) + return build_call_expr_internal_loc(loc, IFN_COMPLEX_ROT270, TREE_TYPE (arg0), 1, TREE_OPERAND(arg0, 0)); + else + return build_call_expr_internal_loc(loc, IFN_COMPLEX_ROT90, TREE_TYPE (arg0), 1, arg0); + } + else if (integer_minus_onep (TREE_IMAGPART (arg1))) + { + if (integer_zerop (TREE_OPERAND (arg0, 1))) + return fold_build2_loc (loc, COMPLEX_EXPR, type, + TREE_OPERAND (arg0, 1), negate_expr (TREE_OPERAND (arg0, 0))); + + return build_call_expr_internal_loc(loc, IFN_COMPLEX_ROT270, TREE_TYPE (arg0), 1, fold (arg0)); + } + } + goto associate; case BIT_IOR_EXPR: diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def index ea750a921ed..e3e32603dc1 100644 --- a/gcc/internal-fn.def +++ b/gcc/internal-fn.def @@ -385,6 +385,8 @@ DEF_INTERNAL_FLT_FN (SCALB, ECF_CONST, scalb, binary) DEF_INTERNAL_FLT_FLOATN_FN (FMIN, ECF_CONST, fmin, binary) DEF_INTERNAL_FLT_FLOATN_FN (FMAX, ECF_CONST, fmax, binary) DEF_INTERNAL_OPTAB_FN (XORSIGN, ECF_CONST, xorsign, 
binary) +DEF_INTERNAL_OPTAB_FN (COMPLEX_ROT90, ECF_CONST, crot90, unary) +DEF_INTERNAL_OPTAB_FN (COMPLEX_ROT270, ECF_CONST, crot270, unary) DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT90, ECF_CONST, cadd90, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary) DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary) diff --git a/gcc/optabs.def b/gcc/optabs.def index 31475c8afcc..afd15b1f30f 100644 --- a/gcc/optabs.def +++ b/gcc/optabs.def @@ -330,6 +330,8 @@ OPTAB_D (atan_optab, "atan$a2") OPTAB_D (atanh_optab, "atanh$a2") OPTAB_D (copysign_optab, "copysign$F$a3") OPTAB_D (xorsign_optab, "xorsign$F$a3") +OPTAB_D (crot90_optab, "crot90$a2") +OPTAB_D (crot270_optab, "crot270$a2") OPTAB_D (cadd90_optab, "cadd90$a3") OPTAB_D (cadd270_optab, "cadd270$a3") OPTAB_D (cmul_optab, "cmul$a3") diff --git a/gcc/tree-complex.cc b/gcc/tree-complex.cc index 63753e4acf4..b5aaa206319 100644 --- a/gcc/tree-complex.cc +++ b/gcc/tree-complex.cc @@ -241,7 +241,10 @@ init_dont_simulate_again (void) switch (gimple_code (stmt)) { case GIMPLE_CALL: - if (gimple_call_lhs (stmt)) + if (gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT90 + || gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT270) + saw_a_complex_op = true; + else if (gimple_call_lhs (stmt)) sim_again_p = is_complex_reg (gimple_call_lhs (stmt)); break; @@ -1727,6 +1730,67 @@ expand_complex_asm (gimple_stmt_iterator *gsi) } } +/* Expand complex rotations represented as internal functions. + * This function assumes that a lowered complex rotation is still better + * than a complex multiplication, otherwise the backend would have defined + * crot90 and crot270. */ + +static void +expand_complex_rotation (gimple_stmt_iterator *gsi) +{ + gimple *stmt = gsi_stmt (*gsi); + tree ac = gimple_call_arg (stmt, 0); + gimple_seq stmts = NULL; + location_t loc = gimple_location (gsi_stmt (*gsi)); + + tree lhs = gimple_get_lhs (stmt); + tree type = TREE_TYPE (ac); + tree inner_type = TREE_TYPE (type); + + + tree rr, ri, rb; + 
optab op = optab_for_tree_code (MULT_EXPR, inner_type, optab_default); + if (optab_handler (op, TYPE_MODE (type)) != CODE_FOR_nothing) + { + tree cst_i = build_complex (type, build_zero_cst (inner_type), build_one_cst (inner_type)); + rb = gimple_build (&stmts, loc, MULT_EXPR, type, ac, cst_i); + + gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT); + + gassign* new_assign = gimple_build_assign (lhs, rb); + gimple_set_lhs (new_assign, lhs); + gsi_replace (gsi, new_assign, true); + + update_complex_assignment (gsi, NULL, NULL, rb); + } + else + { + tree ar = extract_component (gsi, ac, REAL_P, true); + tree ai = extract_component (gsi, ac, IMAG_P, true); + + if (gimple_call_internal_fn (stmt) == IFN_COMPLEX_ROT90) + { + rr = gimple_build (&stmts, loc, NEGATE_EXPR, inner_type, ai); + ri = ar; + } + else if (gimple_call_internal_fn (stmt) == IFN_COMPLEX_ROT270) + { + rr = ai; + ri = gimple_build (&stmts, loc, NEGATE_EXPR, inner_type, ar); + } + else + gcc_unreachable (); + + gsi_insert_seq_before (gsi, stmts, GSI_SAME_STMT); + + gassign* new_assign = gimple_build_assign (gimple_get_lhs (stmt), COMPLEX_EXPR, rr, ri); + gimple_set_lhs (new_assign, gimple_get_lhs (stmt)); + gsi_replace (gsi, new_assign, true); + + update_complex_assignment (gsi, rr, ri); + } +} + /* Returns true if a complex component is a constant */ static bool @@ -1843,6 +1907,19 @@ expand_complex_operations_1 (gimple_stmt_iterator *gsi) if (gimple_code (stmt) == GIMPLE_COND) return; + if (is_gimple_call (stmt) + && (gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT90 + || gimple_call_combined_fn (stmt) == CFN_COMPLEX_ROT270)) + { + if (!direct_internal_fn_supported_p (gimple_call_internal_fn (stmt), type, + bb_optimization_type (gimple_bb (stmt)))) + expand_complex_rotation (gsi); + else + update_complex_components (gsi, stmt, NULL, NULL, gimple_call_lhs (stmt)); + + return; + } + if (TREE_CODE (type) == COMPLEX_TYPE) expand_complex_move (gsi, type); else if (is_gimple_assign (stmt) diff --git 
a/gcc/tree-ssa-math-opts.cc b/gcc/tree-ssa-math-opts.cc index 68fc518b1ab..c311e9ab29a 100644 --- a/gcc/tree-ssa-math-opts.cc +++ b/gcc/tree-ssa-math-opts.cc @@ -3286,6 +3286,119 @@ last_fma_candidate_feeds_initial_phi (fma_deferring_state *state, return false; } +/* Convert a complex rotation feeding an addition into an addition with + * one operand rotated, in a similar way to FMAs. */ + +static void +convert_crot_1 (tree crot_result, tree op1, internal_fn cadd_fn) +{ + gimple *use_stmt; + imm_use_iterator imm_iter; + gcall *cadd_stmt; + + FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter, crot_result) + { + gimple_stmt_iterator gsi = gsi_for_stmt (use_stmt); + tree add_op, result = crot_result; + + if (is_gimple_debug (use_stmt)) + continue; + + add_op = (gimple_assign_rhs1 (use_stmt) != result) + ? gimple_assign_rhs1 (use_stmt) : gimple_assign_rhs2 (use_stmt); + + + cadd_stmt = gimple_build_call_internal (cadd_fn, 2, add_op, op1); + gimple_set_lhs (cadd_stmt, gimple_get_lhs (use_stmt)); + gimple_call_set_nothrow (cadd_stmt, !stmt_can_throw_internal (cfun, + use_stmt)); + gsi_replace (&gsi, cadd_stmt, true); + + if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "Generated COMPLEX_ADD_ROT "); + print_gimple_stmt (dump_file, gsi_stmt (gsi), 0, TDF_NONE); + fprintf (dump_file, "\n"); + } + } +} + + +/* Convert a complex rotation feeding an addition into an addition with + * one operand rotated, in a similar way to FMAs. */ + +static bool +convert_crot (gimple *crot_stmt, tree op1, combined_fn crot_kind) +{ + internal_fn cadd_fn; + switch (crot_kind) + { + case CFN_COMPLEX_ROT90: + cadd_fn = IFN_COMPLEX_ADD_ROT90; + break; + case CFN_COMPLEX_ROT270: + cadd_fn = IFN_COMPLEX_ADD_ROT270; + break; + default: + gcc_unreachable (); + } + + + tree crot_result = gimple_get_lhs (crot_stmt); + /* If there isn't a LHS then this can't be a CADD. There can be no LHS + if the statement was left just for the side-effects. 
*/ + if (!crot_result) + return false; + tree type = TREE_TYPE (crot_result); + gimple *use_stmt; + use_operand_p use_p; + imm_use_iterator imm_iter; + + if (COMPLEX_FLOAT_TYPE_P (type) + && flag_fp_contract_mode == FP_CONTRACT_OFF) + return false; + + /* We don't want to do bitfield reduction ops. */ + if (INTEGRAL_TYPE_P (type) + && (!type_has_mode_precision_p (type) || TYPE_OVERFLOW_TRAPS (type))) + return false; + + /* If the target doesn't support it, don't generate it. */ + optimization_type opt_type = bb_optimization_type (gimple_bb (crot_stmt)); + if (!direct_internal_fn_supported_p (cadd_fn, type, opt_type)) + return false; + + /* If the crot has zero uses, it is kept around probably because + of -fnon-call-exceptions. Don't optimize it away in that case, + that is DCE's job. */ + if (has_zero_uses (crot_result)) + return false; + + /* Make sure that the crot statement becomes dead after + the transformation, i.e. that all uses are transformed to CADDs. + This means we assume that a CADD operation has the same cost + as an addition. */ + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, crot_result) + { + use_stmt = USE_STMT (use_p); + + if (is_gimple_debug (use_stmt)) + continue; + + if (gimple_bb (use_stmt) != gimple_bb (crot_stmt)) + return false; + + if (!is_gimple_assign (use_stmt)) + return false; + + if (gimple_assign_rhs_code (use_stmt) != PLUS_EXPR) + return false; + } + + convert_crot_1 (crot_result, op1, cadd_fn); + return true; +} + /* Combine the multiplication at MUL_STMT with operands MULOP1 and MULOP2 with uses in additions and subtractions to form fused multiply-add operations. Returns true if successful and MUL_STMT should be removed. 
@@ -5636,6 +5749,22 @@ math_opts_dom_walker::after_dom_children (basic_block bb) cancel_fma_deferring (&fma_state); break; + case CFN_COMPLEX_ROT90: + case CFN_COMPLEX_ROT270: + if (gimple_call_lhs (stmt) + && convert_crot (stmt, + gimple_call_arg (stmt, 0), + gimple_call_combined_fn (stmt))) + { + unlink_stmt_vdef (stmt); + if (gsi_remove (&gsi, true) + && gimple_purge_dead_eh_edges (bb)) + *m_cfg_changed_p = true; + release_defs (stmt); + continue; + } + break; + default: break; } From patchwork Mon Jul 17 09:02:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sylvain Noiry X-Patchwork-Id: 121144 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:c923:0:b0:3e4:2afc:c1 with SMTP id j3csp992376vqt; Mon, 17 Jul 2023 02:10:45 -0700 (PDT) X-Google-Smtp-Source: APBJJlEvVBxIOr1h7OghBwdxvjZmwk/uNvOYjFnkrukkIXx82rw97eAf4YviYbQnoJv9fU8bWht0 X-Received: by 2002:a17:906:109b:b0:993:d9bb:748b with SMTP id u27-20020a170906109b00b00993d9bb748bmr11222052eju.1.1689585044833; Mon, 17 Jul 2023 02:10:44 -0700 (PDT) Received: from server2.sourceware.org (server2.sourceware.org. 
[2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id gz15-20020a170906f2cf00b00992d0de8763si13508308ejb.910.2023.07.17.02.10.44 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 17 Jul 2023 02:10:44 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=lFuyp+tW; arc=fail (signature failed); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4A6F23870C3D for ; Mon, 17 Jul 2023 09:08:20 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4A6F23870C3D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1689584900; bh=ewu3lN9WU+SgGZVVPH/eRLh/EsGo7UJ2ONZRf1rCX+M=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=lFuyp+tWsfwGlXihRwuJKRx4yi1TpZS10M8+IuRKgi9q9a/tvO72UzNCf4lZLBDdI 5WdINjAzpLXPUrm+yN0m0QAOYvIYQcyFnAcIqnuj8aGmcKhS9jMZCJmVzqsHvO42rB mSz+5e/QaGT7g8sZKuy9PZCFMwFTRm5mal5iGYTA= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtpout30.security-mail.net (smtpout30.security-mail.net [85.31.212.36]) by sourceware.org (Postfix) with ESMTPS id D42EF3858C5E for ; Mon, 17 Jul 2023 09:03:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D42EF3858C5E Received: from localhost (fx306.security-mail.net [127.0.0.1]) by fx306.security-mail.net (Postfix) with ESMTP id D63F735CC54 for ; Mon, 
17 Jul 2023 11:03:51 +0200 (CEST) Received: from fx306 (fx306.security-mail.net [127.0.0.1]) by fx306.security-mail.net (Postfix) with ESMTP id A90D035CF79 for ; Mon, 17 Jul 2023 11:03:51 +0200 (CEST) Received: from FRA01-PR2-obe.outbound.protection.outlook.com (mail-pr2fra01on0104.outbound.protection.outlook.com [104.47.24.104]) by fx306.security-mail.net (Postfix) with ESMTPS id 28FC935CF7F for ; Mon, 17 Jul 2023 11:03:51 +0200 (CEST) Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21) by PAZP264MB3040.FRAP264.PROD.OUTLOOK.COM (2603:10a6:102:1e7::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6588.32; Mon, 17 Jul 2023 09:03:49 +0000 Received: from MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9]) by MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM ([fe80::a854:17f0:8f2a:f6d9%4]) with mapi id 15.20.6588.031; Mon, 17 Jul 2023 09:03:49 +0000 X-Virus-Scanned: E-securemail Secumail-id: <18058.64b503f7.2832d.0> ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=fr1MQR4wnOh2SXrpIszMpKpytlEhbxu8jqpKJKsr0kG6mWrAXiAh8NARR1kKKcoG9uGdkceIy3faCoDTbHxmbPgOppawLQ5cWi+NG24OquY85/Bsq0SlmondP+k9fT+AeX+vCLc2cl8ClS7GlkXzHrMlL29esy1x15wtorD0Pei+vR8vGnyjW8xMFDLEbpS9iCxpHI9J/YQqJr/wB7yJNLLQ0+VAvuP555TS2fQ2o6ioi9k75SpIl+ZbHmMz7N/Dsm1Ivn+1ipQCYDuyUJrUw4bV8xRkOVC818w2J1LYhkrRpRCxulVSmwD+6oUPtOpj29xZ8q06Rd5Pfb4dlPks0w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ewu3lN9WU+SgGZVVPH/eRLh/EsGo7UJ2ONZRf1rCX+M=; 
b=HZpcYtk2tDyFHEIH9SDND4LLNYEPa7wewdnADVrUD98/+kjj264KUh8oGyYjjJZF/bzEz7DEFQePCtVZpoO9pYggcl5ZFiGwSrfM82SKxvdvfhMikjMP8aoOfuqk7/lljxiku0/zrCVLCS9ALxC+Ler9EI4I1qZFVH9EbIJ0ZF8zCe9uJmq2V6rM+n9bGAsZBuq+jtMNqLd3IL7e9JBLtfIe6rQQF0Q48o6b6jwB+bQhqkufg0mn8gx/h4Q+SWb2d3pIeNlvqfjC+SjRyMXivYBDFrxz/RhXakJLOXrGNs3JuFaLK9PkERrwtXmaFOQElxHX8R1NC/xjXiBaxKnoeQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=kalrayinc.com; dmarc=pass action=none header.from=kalrayinc.com; dkim=pass header.d=kalrayinc.com; arc=none To: gcc-patches@gcc.gnu.org Cc: Sylvain Noiry Subject: [PATCH 7/9] Native complex operations: Vectorization of native complex operations Date: Mon, 17 Jul 2023 11:02:48 +0200 Message-ID: <20230717090250.4645-8-snoiry@kalrayinc.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230717090250.4645-1-snoiry@kalrayinc.com> References: <20230717090250.4645-1-snoiry@kalrayinc.com> X-ClientProxiedBy: LNXP265CA0002.GBRP265.PROD.OUTLOOK.COM (2603:10a6:600:5e::14) To MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM (2603:10a6:500:11::21) MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: MR2P264MB0113:EE_|PAZP264MB3040:EE_ X-MS-Office365-Filtering-Correlation-Id: 0c18261f-6782-428b-5bfb-08db86a4bc1f X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: BnwWmT3XLBZ0m7TgfIdHydAHXB2aB35/CGolZlahBQS2cZsSo/gDUc6vOhJzCImA+5L1em7LXTXJarPtFBRafq50SE4Aer8aL3uaEE1Hdhevfs+jKN3Qu6oeOCJY7A/n0wXVM78r2mZ7qt/Eg/JgAXUB+wkFUFizEt7XKvHfi12HfM0lrT/ye5CBOeDJI1ONOoBqV/4T+mx/0GfhQ1drG6yIGQjjf7pxufUtAiH8SjJrH7vTCjU3szj6xh4Zu6DMmLHrHQ5TW/QwyVGEpkDAuzTnIFz0sMtAn3s+Y8tZ2wFlhH766z1nBQCPgCPD0Vs/MpPC6JRQJewyja2W0s4RU+uiYKzhonR0QlOuJcETPXna5Xp00lJEXtFKzLnN4QzKfkAjPhnKWa33wzZ7pZOh8m9RU56bYdPOBTEk8MfOR7QokwErfZORp/v8OVCR5NLjzXjKWhnorr4FtLtlqMaEEntcHJFAjZVnXUgypg82O2E07CAdCnbmrScD/FlN9INpug4pPpoQaLX3Q62mVbWIqgEAGu6OOFLvR6s6oW4CryTPD3t4lhpmv/0brAVMjctD X-Forefront-Antispam-Report: CIP:255.255.255.255; 
X-OriginatorOrg: kalrayinc.com X-MS-Exchange-CrossTenant-Network-Message-Id: 0c18261f-6782-428b-5bfb-08db86a4bc1f X-MS-Exchange-CrossTenant-AuthSource: MR2P264MB0113.FRAP264.PROD.OUTLOOK.COM X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 17 Jul 2023 09:03:49.3083 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 8931925d-7620-4a64-b7fe-20afd86363d3 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: aQUC6eXtuC16HRxrioPa48xgPZNaTSbC8Y9NZ9wBoTvCx4bI/19UoB6htExBu4n3fVQJJ6yFoSE/q2pWLqmoFg== X-MS-Exchange-Transport-CrossTenantHeadersStamped: PAZP264MB3040 X-ALTERMIMEV2_out: done X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Sylvain Noiry via Gcc-patches From: Sylvain Noiry Reply-To: Sylvain Noiry Errors-To: 
gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771658327986196831 X-GMAIL-MSGID: 1771658327986196831 Add vectors of complex types so that native complex operations can be vectorized. Because the vectorizer was designed to work with scalar elements, several functions and target hooks have to be adapted or duplicated to support complex types. After that, the vectorization of native complex operations follows exactly the same flow as scalar operations. gcc/ChangeLog: * target.def: Add preferred_simd_mode_complex and related_mode_complex by duplicating their scalar counterparts * targhooks.h: Add default_preferred_simd_mode_complex and default_vectorize_related_mode_complex * targhooks.cc (default_preferred_simd_mode_complex): New: Default implementation of preferred_simd_mode_complex (default_vectorize_related_mode_complex): New: Default implementation of related_mode_complex * doc/tm.texi: Document TARGET_VECTORIZE_PREFERRED_SIMD_MODE_COMPLEX and TARGET_VECTORIZE_RELATED_MODE_COMPLEX * doc/tm.texi.in: Add TARGET_VECTORIZE_PREFERRED_SIMD_MODE_COMPLEX and TARGET_VECTORIZE_RELATED_MODE_COMPLEX * emit-rtl.cc (init_emit_once): Add the zero constant for vectors of complex modes * genmodes.cc (vector_class): Add case for vectors of complex (complete_mode): Likewise (make_complex_modes): Likewise * gensupport.cc (match_pattern): Likewise * machmode.h: Add vectors of complex in predicates and redefine mode_for_vector and related_vector_mode for complex types * mode-classes.def: Add MODE_VECTOR_COMPLEX_INT and MODE_VECTOR_COMPLEX_FLOAT classes * simplify-rtx.cc (simplify_context::simplify_binary_operation): FIXME: do not simplify binary operations with complex vector modes. 
* stor-layout.cc (mode_for_vector): Adapt for complex modes using sub-functions calling a common one (related_vector_mode): Implement the function for complex modes * tree-vect-generic.cc (type_for_widest_vector_mode): Add cases for complex modes * tree-vect-stmts.cc (get_related_vectype_for_scalar_type): Adapt for complex modes * tree.cc (build_vector_type_for_mode): Add cases for complex modes --- gcc/doc/tm.texi | 31 ++++++++++++++++++++++++ gcc/doc/tm.texi.in | 4 ++++ gcc/emit-rtl.cc | 10 ++++++++ gcc/genmodes.cc | 8 +++++++ gcc/gensupport.cc | 3 +++ gcc/machmode.h | 19 +++++++++++---- gcc/mode-classes.def | 2 ++ gcc/simplify-rtx.cc | 4 ++++ gcc/stor-layout.cc | 43 +++++++++++++++++++++++++++++---- gcc/target.def | 39 ++++++++++++++++++++++++++++++ gcc/targhooks.cc | 29 ++++++++++++++++++++++ gcc/targhooks.h | 4 ++++ gcc/tree-vect-generic.cc | 4 ++++ gcc/tree-vect-stmts.cc | 52 +++++++++++++++++++++++++++------------- gcc/tree.cc | 2 ++ 15 files changed, 230 insertions(+), 24 deletions(-) diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index b73147aea9f..955a1f983d0 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6229,6 +6229,13 @@ equal to @code{word_mode}, because the vectorizer can do some transformations even in absence of specialized @acronym{SIMD} hardware. @end deftypefn +@deftypefn {Target Hook} machine_mode TARGET_VECTORIZE_PREFERRED_SIMD_MODE_COMPLEX (complex_mode @var{mode}) +This hook should return the preferred mode for vectorizing complex +mode @var{mode}. The default is +equal to @code{word_mode}, because the vectorizer can do some +transformations even in absence of specialized @acronym{SIMD} hardware. +@end deftypefn + @deftypefn {Target Hook} machine_mode TARGET_VECTORIZE_SPLIT_REDUCTION (machine_mode) This hook should return the preferred mode to split the final reduction step on @var{mode} to. 
The reduction is then carried out reducing upper @@ -6291,6 +6298,30 @@ requested mode, returning a mode with the same size as @var{vector_mode} when @var{nunits} is zero. This is the correct behavior for most targets. @end deftypefn +@deftypefn {Target Hook} opt_machine_mode TARGET_VECTORIZE_RELATED_MODE_COMPLEX (machine_mode @var{vector_mode}, complex_mode @var{element_mode}, poly_uint64 @var{nunits}) +If a piece of code is using vector mode @var{vector_mode} and also wants +to operate on elements of mode @var{element_mode}, return the vector mode +it should use for those elements. If @var{nunits} is nonzero, ensure that +the mode has exactly @var{nunits} elements, otherwise pick whichever vector +size pairs the most naturally with @var{vector_mode}. Return an empty +@code{opt_machine_mode} if there is no supported vector mode with the +required properties. + +There is no prescribed way of handling the case in which @var{nunits} +is zero. One common choice is to pick a vector mode with the same size +as @var{vector_mode}; this is the natural choice if the target has a +fixed vector size. Another option is to choose a vector mode with the +same number of elements as @var{vector_mode}; this is the natural choice +if the target has a fixed number of elements. Alternatively, the hook +might choose a middle ground, such as trying to keep the number of +elements as similar as possible while applying maximum and minimum +vector sizes. + +The default implementation uses @code{mode_for_vector} to find the +requested mode, returning a mode with the same size as @var{vector_mode} +when @var{nunits} is zero. This is the correct behavior for most targets. +@end deftypefn + @deftypefn {Target Hook} opt_machine_mode TARGET_VECTORIZE_GET_MASK_MODE (machine_mode @var{mode}) Return the mode to use for a vector mask that holds one boolean result for each element of vector mode @var{mode}. 
The returned mask mode diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index dd39e450903..a8dc1155f13 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -4195,12 +4195,16 @@ address; but often a machine-dependent strategy can generate better code. @hook TARGET_VECTORIZE_PREFERRED_SIMD_MODE +@hook TARGET_VECTORIZE_PREFERRED_SIMD_MODE_COMPLEX + @hook TARGET_VECTORIZE_SPLIT_REDUCTION @hook TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES @hook TARGET_VECTORIZE_RELATED_MODE +@hook TARGET_VECTORIZE_RELATED_MODE_COMPLEX + @hook TARGET_VECTORIZE_GET_MASK_MODE @hook TARGET_VECTORIZE_EMPTY_MASK_IS_EXPENSIVE diff --git a/gcc/emit-rtl.cc b/gcc/emit-rtl.cc index 22012bfea13..e454f452d46 100644 --- a/gcc/emit-rtl.cc +++ b/gcc/emit-rtl.cc @@ -6276,6 +6276,16 @@ init_emit_once (void) targetm.gen_rtx_complex (mode, inner, inner); } + FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_COMPLEX_INT) + { + const_tiny_rtx[0][(int) mode] = gen_const_vector (mode, 0); + } + + FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_COMPLEX_FLOAT) + { + const_tiny_rtx[0][(int) mode] = gen_const_vector (mode, 0); + } + FOR_EACH_MODE_IN_CLASS (mode, MODE_VECTOR_BOOL) { const_tiny_rtx[0][(int) mode] = gen_const_vector (mode, 0); diff --git a/gcc/genmodes.cc b/gcc/genmodes.cc index 55ac2adb559..ab113720948 100644 --- a/gcc/genmodes.cc +++ b/gcc/genmodes.cc @@ -142,6 +142,8 @@ vector_class (enum mode_class cl) case MODE_UFRACT: return MODE_VECTOR_UFRACT; case MODE_ACCUM: return MODE_VECTOR_ACCUM; case MODE_UACCUM: return MODE_VECTOR_UACCUM; + case MODE_COMPLEX_INT: return MODE_VECTOR_COMPLEX_INT; + case MODE_COMPLEX_FLOAT: return MODE_VECTOR_COMPLEX_FLOAT; default: error ("no vector class for class %s", mode_class_names[cl]); return MODE_RANDOM; @@ -400,6 +402,8 @@ complete_mode (struct mode_data *m) case MODE_VECTOR_UFRACT: case MODE_VECTOR_ACCUM: case MODE_VECTOR_UACCUM: + case MODE_VECTOR_COMPLEX_INT: + case MODE_VECTOR_COMPLEX_FLOAT: /* Vector modes should have a component and a number of components. 
*/
       validate_mode (m, UNSET, UNSET, SET, SET, UNSET);
       if (m->component->precision != (unsigned int)-1)
@@ -462,6 +466,10 @@ make_complex_modes (enum mode_class cl,
       if (m->boolean)
         continue;

+      /* Skip already created mode */
+      if (m->complex)
+        continue;
+
       m_len = strlen (m->name);
       /* The leading "1 +" is in case we prepend a "C" below. */
       buf = (char *) xmalloc (1 + m_len + 1);
diff --git a/gcc/gensupport.cc b/gcc/gensupport.cc
index 9aa2ba69fcd..de798a70cbd 100644
--- a/gcc/gensupport.cc
+++ b/gcc/gensupport.cc
@@ -3747,16 +3747,19 @@ match_pattern (optab_pattern *p, const char *name, const char *pat)
           if (*p == 0
               && (! force_int || mode_class[i] == MODE_INT
                   || mode_class[i] == MODE_COMPLEX_INT
+                  || mode_class[i] == MODE_VECTOR_COMPLEX_INT
                   || mode_class[i] == MODE_VECTOR_INT)
               && (! force_partial_int
                   || mode_class[i] == MODE_INT
                   || mode_class[i] == MODE_COMPLEX_INT
+                  || mode_class[i] == MODE_VECTOR_COMPLEX_INT
                   || mode_class[i] == MODE_PARTIAL_INT
                   || mode_class[i] == MODE_VECTOR_INT)
               && (! force_float
                   || mode_class[i] == MODE_FLOAT
                   || mode_class[i] == MODE_DECIMAL_FLOAT
                   || mode_class[i] == MODE_COMPLEX_FLOAT
+                  || mode_class[i] == MODE_VECTOR_COMPLEX_FLOAT
                   || mode_class[i] == MODE_VECTOR_FLOAT)
               && (! force_fixed
                   || mode_class[i] == MODE_FRACT
diff --git a/gcc/machmode.h b/gcc/machmode.h
index b1937eafdc3..e7d67e2dce1 100644
--- a/gcc/machmode.h
+++ b/gcc/machmode.h
@@ -110,6 +110,7 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
    || GET_MODE_CLASS (MODE) == MODE_PARTIAL_INT \
    || GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_BOOL \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_INT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_INT)

 /* Nonzero if MODE is a floating-point mode.
*/
@@ -117,17 +118,22 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
   (GET_MODE_CLASS (MODE) == MODE_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_DECIMAL_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT)

-#define COMPLEX_INT_MODE_P(MODE) \
-  (GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT)
+#define COMPLEX_INT_MODE_P(MODE) \
+  (GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_INT \
+   || GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT)

-#define COMPLEX_FLOAT_MODE_P(MODE) \
-  (GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT)
+#define COMPLEX_FLOAT_MODE_P(MODE) \
+  (GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_FLOAT \
+   || GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT)

 /* Nonzero if MODE is a complex mode. */
 #define COMPLEX_MODE_P(MODE) \
   (GET_MODE_CLASS (MODE) == MODE_COMPLEX_INT \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_INT \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_COMPLEX_FLOAT)

 /* Nonzero if MODE is a vector mode. */
@@ -138,6 +144,8 @@ extern const unsigned char mode_class[NUM_MACHINE_MODES];
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_FRACT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_UFRACT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_ACCUM \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_INT \
+   || GET_MODE_CLASS (MODE) == MODE_VECTOR_COMPLEX_FLOAT \
    || GET_MODE_CLASS (MODE) == MODE_VECTOR_UACCUM)

 /* Nonzero if MODE is a scalar integral mode.
*/
@@ -927,6 +935,9 @@ extern opt_machine_mode bitwise_mode_for_mode (machine_mode);
 extern opt_machine_mode mode_for_vector (scalar_mode, poly_uint64);
 extern opt_machine_mode related_vector_mode (machine_mode, scalar_mode,
                                              poly_uint64 = 0);
+extern opt_machine_mode mode_for_vector (complex_mode, poly_uint64);
+extern opt_machine_mode related_vector_mode (machine_mode,
+                                             complex_mode, poly_uint64 = 0);
 extern opt_machine_mode related_int_vector_mode (machine_mode);

 /* A class for iterating through possible bitfield modes. */
diff --git a/gcc/mode-classes.def b/gcc/mode-classes.def
index de42d7ee6fb..cc6bcaeb026 100644
--- a/gcc/mode-classes.def
+++ b/gcc/mode-classes.def
@@ -32,9 +32,11 @@ along with GCC; see the file COPYING3. If not see
   DEF_MODE_CLASS (MODE_COMPLEX_FLOAT), \
   DEF_MODE_CLASS (MODE_VECTOR_BOOL), /* vectors of single bits */ \
   DEF_MODE_CLASS (MODE_VECTOR_INT), /* SIMD vectors */ \
+  DEF_MODE_CLASS (MODE_VECTOR_COMPLEX_INT), /* SIMD vectors */ \
   DEF_MODE_CLASS (MODE_VECTOR_FRACT), /* SIMD vectors */ \
   DEF_MODE_CLASS (MODE_VECTOR_UFRACT), /* SIMD vectors */ \
   DEF_MODE_CLASS (MODE_VECTOR_ACCUM), /* SIMD vectors */ \
   DEF_MODE_CLASS (MODE_VECTOR_UACCUM), /* SIMD vectors */ \
   DEF_MODE_CLASS (MODE_VECTOR_FLOAT), \
+  DEF_MODE_CLASS (MODE_VECTOR_COMPLEX_FLOAT), \
   DEF_MODE_CLASS (MODE_OPAQUE) /* opaque modes */
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index d7315d82aa3..0b988bf1484 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -2653,6 +2653,10 @@ simplify_context::simplify_binary_operation (rtx_code code, machine_mode mode,
   gcc_assert (GET_RTX_CLASS (code) != RTX_COMPARE);
   gcc_assert (GET_RTX_CLASS (code) != RTX_COMM_COMPARE);

+  /* FIXME */
+  if (VECTOR_MODE_P (mode) && COMPLEX_MODE_P (mode))
+    return NULL_RTX;
+
   /* Make sure the constant is second.
*/
   if (GET_RTX_CLASS (code) == RTX_COMM_ARITH
       && swap_commutative_operands_p (op0, op1))
diff --git a/gcc/stor-layout.cc b/gcc/stor-layout.cc
index a6deed4424b..5a7218999e8 100644
--- a/gcc/stor-layout.cc
+++ b/gcc/stor-layout.cc
@@ -480,8 +480,8 @@ bitwise_type_for_mode (machine_mode mode)
    elements of mode INNERMODE, if one exists. The returned mode
    can be either an integer mode or a vector mode. */

-opt_machine_mode
-mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
+static opt_machine_mode
+mode_for_vector (machine_mode innermode, poly_uint64 nunits)
 {
   machine_mode mode;

@@ -496,8 +496,14 @@ mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
     mode = MIN_MODE_VECTOR_ACCUM;
   else if (SCALAR_UACCUM_MODE_P (innermode))
     mode = MIN_MODE_VECTOR_UACCUM;
-  else
+  else if (SCALAR_INT_MODE_P (innermode))
     mode = MIN_MODE_VECTOR_INT;
+  else if (COMPLEX_FLOAT_MODE_P (innermode))
+    mode = MIN_MODE_VECTOR_COMPLEX_FLOAT;
+  else if (COMPLEX_INT_MODE_P (innermode))
+    mode = MIN_MODE_VECTOR_COMPLEX_INT;
+  else
+    gcc_unreachable ();

   /* Only check the broader vector_mode_supported_any_target_p here.
      We'll filter through target-specific availability and
@@ -511,7 +517,7 @@ mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
   /* For integers, try mapping it to a same-sized scalar mode. */
   if (GET_MODE_CLASS (innermode) == MODE_INT)
     {
-      poly_uint64 nbits = nunits * GET_MODE_BITSIZE (innermode);
+      poly_uint64 nbits = nunits * GET_MODE_BITSIZE (innermode).coeffs[0];
       if (int_mode_for_size (nbits, 0).exists (&mode)
           && have_regs_of_mode[mode])
         return mode;
@@ -520,6 +526,26 @@ mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
   return opt_machine_mode ();
 }

+/* Find a mode that is suitable for representing a vector with NUNITS
+   elements of scalar mode INNERMODE, if one exists. The returned mode
+   can be either an integer mode or a vector mode.
*/
+
+opt_machine_mode
+mode_for_vector (scalar_mode innermode, poly_uint64 nunits)
+{
+  return mode_for_vector (machine_mode (innermode), nunits);
+}
+
+/* Find a mode that is suitable for representing a vector with NUNITS
+   elements of complex mode INNERMODE, if one exists. The returned mode
+   can be either an integer mode or a vector mode. */
+
+opt_machine_mode
+mode_for_vector (complex_mode innermode, poly_uint64 nunits)
+{
+  return mode_for_vector (machine_mode (innermode), nunits);
+}
+
 /* If a piece of code is using vector mode VECTOR_MODE and also wants
    to operate on elements of mode ELEMENT_MODE, return the vector mode
    it should use for those elements. If NUNITS is nonzero, ensure that
@@ -540,6 +566,15 @@ related_vector_mode (machine_mode vector_mode, scalar_mode element_mode,
   return targetm.vectorize.related_mode (vector_mode, element_mode, nunits);
 }

+opt_machine_mode
+related_vector_mode (machine_mode vector_mode,
+                     complex_mode element_mode, poly_uint64 nunits)
+{
+  gcc_assert (VECTOR_MODE_P (vector_mode));
+  return targetm.vectorize.related_mode_complex (vector_mode, element_mode,
+                                                 nunits);
+}
+
 /* If a piece of code is using vector mode VECTOR_MODE and also wants
    to operate on integer vectors with the same element size and number
    of elements, return the vector mode it should use. Return an empty
diff --git a/gcc/target.def b/gcc/target.def
index ee1dfdc7565..246665bf90f 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1943,6 +1943,18 @@ transformations even in absence of specialized @acronym{SIMD} hardware.",
  (scalar_mode mode),
  default_preferred_simd_mode)

+/* Returns the preferred mode for SIMD operations for the specified
+   complex mode. */
+DEFHOOK
+(preferred_simd_mode_complex,
+ "This hook should return the preferred mode for vectorizing complex\n\
+mode @var{mode}.
The default is\n\
+equal to @code{word_mode}, because the vectorizer can do some\n\
+transformations even in absence of specialized @acronym{SIMD} hardware.",
+ machine_mode,
+ (complex_mode mode),
+ default_preferred_simd_mode_complex)
+
 /* Returns the preferred mode for splitting SIMD reductions to. */
 DEFHOOK
 (split_reduction,
@@ -2017,6 +2029,33 @@ when @var{nunits} is zero. This is the correct behavior for most targets.",
  (machine_mode vector_mode, scalar_mode element_mode, poly_uint64 nunits),
  default_vectorize_related_mode)

+DEFHOOK
+(related_mode_complex,
+ "If a piece of code is using vector mode @var{vector_mode} and also wants\n\
+to operate on elements of mode @var{element_mode}, return the vector mode\n\
+it should use for those elements. If @var{nunits} is nonzero, ensure that\n\
+the mode has exactly @var{nunits} elements, otherwise pick whichever vector\n\
+size pairs the most naturally with @var{vector_mode}. Return an empty\n\
+@code{opt_machine_mode} if there is no supported vector mode with the\n\
+required properties.\n\
+\n\
+There is no prescribed way of handling the case in which @var{nunits}\n\
+is zero. One common choice is to pick a vector mode with the same size\n\
+as @var{vector_mode}; this is the natural choice if the target has a\n\
+fixed vector size. Another option is to choose a vector mode with the\n\
+same number of elements as @var{vector_mode}; this is the natural choice\n\
+if the target has a fixed number of elements. Alternatively, the hook\n\
+might choose a middle ground, such as trying to keep the number of\n\
+elements as similar as possible while applying maximum and minimum\n\
+vector sizes.\n\
+\n\
+The default implementation uses @code{mode_for_vector} to find the\n\
+requested mode, returning a mode with the same size as @var{vector_mode}\n\
+when @var{nunits} is zero.
This is the correct behavior for most targets.",
+ opt_machine_mode,
+ (machine_mode vector_mode, complex_mode element_mode, poly_uint64 nunits),
+ default_vectorize_related_mode_complex)
+
 /* Function to get a target mode for a vector mask. */
 DEFHOOK
 (get_mask_mode,
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index 4ea40c643a8..be3d80a0773 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -1532,6 +1532,15 @@ default_preferred_simd_mode (scalar_mode)
   return word_mode;
 }

+/* By default, only attempt to parallelize bitwise operations, and
+   possibly adds/subtracts using bit-twiddling. */
+
+machine_mode
+default_preferred_simd_mode_complex (complex_mode)
+{
+  return word_mode;
+}
+
 /* By default, call gen_rtx_CONCAT. */

 rtx
@@ -1733,6 +1742,26 @@ default_vectorize_related_mode (machine_mode vector_mode,
   return opt_machine_mode ();
 }

+
+/* The default implementation of TARGET_VECTORIZE_RELATED_MODE_COMPLEX. */
+
+opt_machine_mode
+default_vectorize_related_mode_complex (machine_mode vector_mode,
+                                        complex_mode element_mode,
+                                        poly_uint64 nunits)
+{
+  machine_mode result_mode;
+  if ((maybe_ne (nunits, 0U)
+       || multiple_p (GET_MODE_SIZE (vector_mode),
+                      GET_MODE_SIZE (element_mode), &nunits))
+      && mode_for_vector (element_mode, nunits).exists (&result_mode)
+      && VECTOR_MODE_P (result_mode)
+      && targetm.vector_mode_supported_p (result_mode))
+    return result_mode;
+
+  return opt_machine_mode ();
+}
+
 /* By default a vector of integers is used as a mask.
*/

 opt_machine_mode
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 811cd6165de..2fff5ba4640 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -115,11 +115,15 @@ default_builtin_support_vector_misalignment (machine_mode mode,
                                              const_tree,
                                              int, bool);
 extern machine_mode default_preferred_simd_mode (scalar_mode mode);
+extern machine_mode default_preferred_simd_mode_complex (complex_mode mode);
 extern machine_mode default_split_reduction (machine_mode);
 extern unsigned int default_autovectorize_vector_modes (vector_modes *, bool);
 extern opt_machine_mode default_vectorize_related_mode (machine_mode,
                                                         scalar_mode,
                                                         poly_uint64);
+extern opt_machine_mode default_vectorize_related_mode_complex (machine_mode,
+                                                                complex_mode,
+                                                                poly_uint64);
 extern opt_machine_mode default_get_mask_mode (machine_mode);
 extern bool default_empty_mask_is_expensive (unsigned);
 extern vector_costs *default_vectorize_create_costs (vec_info *, bool);
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index a7e6cb87a5e..718b144ec23 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -1363,6 +1363,10 @@ type_for_widest_vector_mode (tree type, optab op)
     mode = MIN_MODE_VECTOR_ACCUM;
   else if (SCALAR_UACCUM_MODE_P (inner_mode))
     mode = MIN_MODE_VECTOR_UACCUM;
+  else if (COMPLEX_INT_MODE_P (inner_mode))
+    mode = MIN_MODE_VECTOR_COMPLEX_INT;
+  else if (COMPLEX_FLOAT_MODE_P (inner_mode))
+    mode = MIN_MODE_VECTOR_COMPLEX_FLOAT;
   else if (inner_mode == BImode)
     mode = MIN_MODE_VECTOR_BOOL;
   else
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 10e71178ce7..2852832b7db 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -12272,18 +12272,27 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
                                      tree scalar_type, poly_uint64 nunits)
 {
   tree orig_scalar_type = scalar_type;
-  scalar_mode inner_mode;
+  scalar_mode scal_mode;
+  complex_mode cplx_mode;
+  machine_mode inner_mode;
   machine_mode simd_mode;
   tree vectype;
+  bool cplx
= false;

-  if ((!INTEGRAL_TYPE_P (scalar_type)
+  if (is_complex_int_mode (TYPE_MODE (scalar_type), &cplx_mode)
+      || is_complex_float_mode (TYPE_MODE (scalar_type), &cplx_mode))
+    cplx = true;
+
+  if ((!cplx && !INTEGRAL_TYPE_P (scalar_type)
        && !POINTER_TYPE_P (scalar_type)
        && !SCALAR_FLOAT_TYPE_P (scalar_type))
-      || (!is_int_mode (TYPE_MODE (scalar_type), &inner_mode)
-          && !is_float_mode (TYPE_MODE (scalar_type), &inner_mode)))
+      || (!cplx && !is_int_mode (TYPE_MODE (scalar_type), &scal_mode)
+          && !is_float_mode (TYPE_MODE (scalar_type), &scal_mode)))
     return NULL_TREE;

-  unsigned int nbytes = GET_MODE_SIZE (inner_mode);
+  unsigned int nbytes =
+    (cplx) ? GET_MODE_SIZE (cplx_mode) : GET_MODE_SIZE (scal_mode);
+  inner_mode = (cplx) ? machine_mode (cplx_mode) : machine_mode (scal_mode);

   /* Interoperability between modes requires one to be a constant multiple
      of the other, so that the number of vectors required for each operation
@@ -12301,19 +12310,20 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
      they support the proper result truncation/extension.
      We also make sure to build vector types with INTEGER_TYPE
      component type only. */
-  if (INTEGRAL_TYPE_P (scalar_type)
-      && (GET_MODE_BITSIZE (inner_mode) != TYPE_PRECISION (scalar_type)
+  if (!cplx && INTEGRAL_TYPE_P (scalar_type)
+      && (GET_MODE_BITSIZE (scal_mode) != TYPE_PRECISION (scalar_type)
           || TREE_CODE (scalar_type) != INTEGER_TYPE))
-    scalar_type = build_nonstandard_integer_type (GET_MODE_BITSIZE (inner_mode),
-                                                  TYPE_UNSIGNED (scalar_type));
+    scalar_type =
+      build_nonstandard_integer_type (GET_MODE_BITSIZE (scal_mode),
+                                      TYPE_UNSIGNED (scalar_type));

   /* We shouldn't end up building VECTOR_TYPEs of non-scalar components.
      When the component mode passes the above test simply use a type
      corresponding to that mode. The theory is that any use that
      would cause problems with this will disable vectorization anyway.
*/
-  else if (!SCALAR_FLOAT_TYPE_P (scalar_type)
+  else if (!cplx && !SCALAR_FLOAT_TYPE_P (scalar_type)
            && !INTEGRAL_TYPE_P (scalar_type))
-    scalar_type = lang_hooks.types.type_for_mode (inner_mode, 1);
+    scalar_type = lang_hooks.types.type_for_mode (scal_mode, 1);

   /* We can't build a vector type of elements with alignment bigger than
      their size. */
@@ -12331,7 +12341,10 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
   if (prevailing_mode == VOIDmode)
     {
       gcc_assert (known_eq (nunits, 0U));
-      simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
+
+      simd_mode = (cplx)
+        ? targetm.vectorize.preferred_simd_mode_complex (cplx_mode)
+        : targetm.vectorize.preferred_simd_mode (scal_mode);
       if (SCALAR_INT_MODE_P (simd_mode))
         {
           /* Traditional behavior is not to take the integer mode
@@ -12342,13 +12355,19 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
              Note that nunits == 1 is allowed in order to support single
              element vector types. */
           if (!multiple_p (GET_MODE_SIZE (simd_mode), nbytes, &nunits)
-              || !mode_for_vector (inner_mode, nunits).exists (&simd_mode))
+              || !((cplx)
+                   ? mode_for_vector (cplx_mode, nunits).exists (&simd_mode)
+                   : mode_for_vector (scal_mode, nunits).exists (&simd_mode)))
             return NULL_TREE;
         }
     }
   else if (SCALAR_INT_MODE_P (prevailing_mode)
-           || !related_vector_mode (prevailing_mode,
-                                    inner_mode, nunits).exists (&simd_mode))
+           || !((cplx) ? related_vector_mode (prevailing_mode,
+                                              cplx_mode, nunits)
+                           .exists (&simd_mode)
+                       : related_vector_mode (prevailing_mode,
+                                              scal_mode, nunits)
+                           .exists (&simd_mode)))
     {
       /* Fall back to using mode_for_vector, mostly in the hope of being
          able to use an integer mode. */
@@ -12356,7 +12375,8 @@ get_related_vectype_for_scalar_type (machine_mode prevailing_mode,
           && !multiple_p (GET_MODE_SIZE (prevailing_mode), nbytes, &nunits))
         return NULL_TREE;

-      if (!mode_for_vector (inner_mode, nunits).exists (&simd_mode))
+      if (!((cplx) ?
mode_for_vector (cplx_mode, nunits).exists (&simd_mode)
+            : mode_for_vector (scal_mode, nunits).exists (&simd_mode)))
         return NULL_TREE;
     }

diff --git a/gcc/tree.cc b/gcc/tree.cc
index 2bc1b0d1e3f..91d49016e5b 100644
--- a/gcc/tree.cc
+++ b/gcc/tree.cc
@@ -10115,6 +10115,8 @@ build_vector_type_for_mode (tree innertype, machine_mode mode)
     case MODE_VECTOR_UFRACT:
     case MODE_VECTOR_ACCUM:
     case MODE_VECTOR_UACCUM:
+    case MODE_VECTOR_COMPLEX_INT:
+    case MODE_VECTOR_COMPLEX_FLOAT:
       nunits = GET_MODE_NUNITS (mode);
       break;


From patchwork Mon Jul 17 09:02:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sylvain Noiry
X-Patchwork-Id: 121141
To: gcc-patches@gcc.gnu.org
Cc: Sylvain Noiry
Subject: [PATCH 8/9] Native complex operations: Add explicit vector of complex
Date: Mon, 17 Jul 2023 11:02:49 +0200
Message-ID: <20230717090250.4645-9-snoiry@kalrayinc.com>
In-Reply-To: <20230717090250.4645-1-snoiry@kalrayinc.com>
References: <20230717090250.4645-1-snoiry@kalrayinc.com>
From: Sylvain Noiry
Reply-To: Sylvain Noiry

Allow the creation and use of built-in vectors of complex in C, using
__attribute__ ((vector_size ())).

gcc/c-family/ChangeLog:

	* c-attribs.cc (vector_mode_valid_p): Add cases for vectors of
	complex.
	(handle_mode_attribute): Likewise.
	(type_valid_for_vector_size): Likewise.
	* c-common.cc (c_common_type_for_mode): Likewise.
	(vector_types_compatible_elements_p): Likewise.

gcc/ChangeLog:

	* fold-const.cc (fold_binary_loc): Likewise.

gcc/c/ChangeLog:

	* c-typeck.cc (build_unary_op): Likewise.
---
 gcc/c-family/c-attribs.cc | 12 ++++++++++--
 gcc/c-family/c-common.cc  | 20 +++++++++++++++++++-
 gcc/c/c-typeck.cc         |  8 ++++++--
 gcc/fold-const.cc         |  1 +
 4 files changed, 36 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index e2792ca6898..d4de85160c1 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -2019,6 +2019,8 @@ vector_mode_valid_p (machine_mode mode)
   /* Doh! What's going on?
*/
   if (mclass != MODE_VECTOR_INT
       && mclass != MODE_VECTOR_FLOAT
+      && mclass != MODE_VECTOR_COMPLEX_INT
+      && mclass != MODE_VECTOR_COMPLEX_FLOAT
       && mclass != MODE_VECTOR_FRACT
       && mclass != MODE_VECTOR_UFRACT
       && mclass != MODE_VECTOR_ACCUM
@@ -2125,6 +2127,8 @@ handle_mode_attribute (tree *node, tree name, tree args,

         case MODE_VECTOR_INT:
         case MODE_VECTOR_FLOAT:
+        case MODE_VECTOR_COMPLEX_INT:
+        case MODE_VECTOR_COMPLEX_FLOAT:
         case MODE_VECTOR_FRACT:
         case MODE_VECTOR_UFRACT:
         case MODE_VECTOR_ACCUM:
@@ -4361,9 +4365,13 @@ type_valid_for_vector_size (tree type, tree atname, tree args,

   if ((!INTEGRAL_TYPE_P (type)
        && !SCALAR_FLOAT_TYPE_P (type)
+       && !COMPLEX_INTEGER_TYPE_P (type)
+       && !COMPLEX_FLOAT_TYPE_P (type)
        && !FIXED_POINT_TYPE_P (type))
-      || (!SCALAR_FLOAT_MODE_P (orig_mode)
-          && GET_MODE_CLASS (orig_mode) != MODE_INT
+      || ((!SCALAR_FLOAT_MODE_P (orig_mode)
+           && GET_MODE_CLASS (orig_mode) != MODE_INT)
+          && (!COMPLEX_FLOAT_MODE_P (orig_mode)
+              && GET_MODE_CLASS (orig_mode) != MODE_COMPLEX_INT)
           && !ALL_SCALAR_FIXED_POINT_MODE_P (orig_mode))
       || !tree_fits_uhwi_p (TYPE_SIZE_UNIT (type))
       || TREE_CODE (type) == BOOLEAN_TYPE)
diff --git a/gcc/c-family/c-common.cc b/gcc/c-family/c-common.cc
index 6ab63dae997..9574c074d26 100644
--- a/gcc/c-family/c-common.cc
+++ b/gcc/c-family/c-common.cc
@@ -2430,7 +2430,23 @@ c_common_type_for_mode (machine_mode mode, int unsignedp)
               : make_signed_type (precision));
     }

-  if (COMPLEX_MODE_P (mode))
+  if (GET_MODE_CLASS (mode) == MODE_VECTOR_BOOL
+      && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
+    {
+      unsigned int elem_bits = vector_element_size (GET_MODE_BITSIZE (mode),
+                                                    GET_MODE_NUNITS (mode));
+      tree bool_type = build_nonstandard_boolean_type (elem_bits);
+      return build_vector_type_for_mode (bool_type, mode);
+    }
+  else if (VECTOR_MODE_P (mode)
+           && valid_vector_subparts_p (GET_MODE_NUNITS (mode)))
+    {
+      machine_mode inner_mode = GET_MODE_INNER (mode);
+      tree inner_type = c_common_type_for_mode (inner_mode, unsignedp);
+      if
(inner_type != NULL_TREE)
+        return build_vector_type_for_mode (inner_type, mode);
+    }
+  else if (COMPLEX_MODE_P (mode))
     {
       machine_mode inner_mode;
       tree inner_type;
@@ -8104,9 +8120,11 @@ vector_types_compatible_elements_p (tree t1, tree t2)

   gcc_assert ((INTEGRAL_TYPE_P (t1)
                || c1 == REAL_TYPE
+               || c1 == COMPLEX_TYPE
                || c1 == FIXED_POINT_TYPE)
               && (INTEGRAL_TYPE_P (t2)
                   || c2 == REAL_TYPE
+                  || c2 == COMPLEX_TYPE
                   || c2 == FIXED_POINT_TYPE));

   t1 = c_common_signed_type (t1);
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index 7cf411155c6..68a9646cf5b 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -4584,7 +4584,9 @@ build_unary_op (location_t location, enum tree_code code, tree xarg,
       /* ~ works on integer types and non float vectors. */
       if (typecode == INTEGER_TYPE
           || (gnu_vector_type_p (TREE_TYPE (arg))
-              && !VECTOR_FLOAT_TYPE_P (TREE_TYPE (arg))))
+              && !VECTOR_FLOAT_TYPE_P (TREE_TYPE (arg))
+              && !COMPLEX_INTEGER_TYPE_P (TREE_TYPE (TREE_TYPE (arg)))
+              && !COMPLEX_FLOAT_TYPE_P (TREE_TYPE (TREE_TYPE (arg)))))
         {
           tree e = arg;

@@ -4607,7 +4609,9 @@ build_unary_op (location_t location, enum tree_code code, tree xarg,
           if (!noconvert)
             arg = default_conversion (arg);
         }
-      else if (typecode == COMPLEX_TYPE)
+      else if (typecode == COMPLEX_TYPE
+               || COMPLEX_INTEGER_TYPE_P (TREE_TYPE (TREE_TYPE (arg)))
+               || COMPLEX_FLOAT_TYPE_P (TREE_TYPE (TREE_TYPE (arg))))
         {
           code = CONJ_EXPR;
           pedwarn (location, OPT_Wpedantic,
diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index f1224b6a548..9e9f711e82d 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -11109,6 +11109,7 @@ fold_binary_loc (location_t loc, enum tree_code code, tree type,
          to __complex__ ( x, y ). This is not the same for SNaNs or
          if signed zeros are involved.
*/ if (!HONOR_SNANS (arg0) + && !(VECTOR_TYPE_P (TREE_TYPE (arg0))) && !HONOR_SIGNED_ZEROS (arg0) && COMPLEX_FLOAT_TYPE_P (TREE_TYPE (arg0))) {

From patchwork Mon Jul 17 09:02:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sylvain Noiry
X-Patchwork-Id: 121145
To: gcc-patches@gcc.gnu.org
Cc: Sylvain Noiry
Subject: [PATCH 9/9] Native complex operation: Experimental support in x86 backend
Date: Mon, 17 Jul 2023 11:02:50 +0200
Message-ID: <20230717090250.4645-10-snoiry@kalrayinc.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20230717090250.4645-1-snoiry@kalrayinc.com>
References: <20230717090250.4645-1-snoiry@kalrayinc.com>
List-Id: Gcc-patches mailing list
From: Sylvain Noiry
Reply-To: Sylvain Noiry

Add experimental support for native complex operation handling in
the x86 backend. For now it only supports add, sub, mul, conj, neg,
and mov in SCmode (complex float). Performance gains are still marginal
on this target because there are no dedicated instructions to speed up
complex operations, apart from some SIMD tricks.

gcc/ChangeLog:

	* config/i386/i386.cc (classify_argument): Align complex element to
	the whole size, not size of the parts
	(ix86_return_in_memory): Handle complex modes like a scalar with the
	same size
	(ix86_class_max_nregs): Likewise
	(ix86_hard_regno_nregs): Likewise
	(function_value_ms_64): Add case for SCmode
	(ix86_build_const_vector): Likewise
	(ix86_build_signbit_mask): Likewise
	(x86_gen_rtx_complex): New: Implement the gen_rtx_complex hook, use
	registers of complex modes to represent complex elements in rtl
	(x86_read_complex_part): New: Implement the read_complex_part hook,
	handle registers of complex modes
	(x86_write_complex_part): New: Implement the write_complex_part hook,
	handle registers of complex modes
	* config/i386/i386.h: Add SCmode in several predicates
	* config/i386/sse.md: Add pattern for some complex operations in
	SCmode. This includes movsc, addsc3, subsc3, negsc2, mulsc3, and
	conjsc2
---
 gcc/config/i386/i386.cc | 296 +++++++++++++++++++++++++++++++++++++++-
 gcc/config/i386/i386.h  |  11 +-
 gcc/config/i386/sse.md  | 144 +++++++++++++++++++
 3 files changed, 440 insertions(+), 11 deletions(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index f0d6167e667..a65ac92a4a9 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -2339,8 +2339,8 @@ classify_argument (machine_mode mode, const_tree type,
mode_alignment = 128; else if (mode == XCmode) mode_alignment = 256; - if (COMPLEX_MODE_P (mode)) - mode_alignment /= 2; + /*if (COMPLEX_MODE_P (mode)) + mode_alignment /= 2;*/ /* Misaligned fields are always returned in memory.
*/ if (bit_offset % mode_alignment) return 0; @@ -3007,6 +3007,7 @@ pass_in_reg: case E_V4BFmode: case E_V2SImode: case E_V2SFmode: + case E_SCmode: case E_V1TImode: case E_V1DImode: if (!type || !AGGREGATE_TYPE_P (type)) @@ -3257,6 +3258,7 @@ pass_in_reg: case E_V4BFmode: case E_V2SImode: case E_V2SFmode: + case E_SCmode: case E_V1TImode: case E_V1DImode: if (!type || !AGGREGATE_TYPE_P (type)) @@ -4158,8 +4160,8 @@ function_value_ms_64 (machine_mode orig_mode, machine_mode mode, && !INTEGRAL_TYPE_P (valtype) && !VECTOR_FLOAT_TYPE_P (valtype)) break; - if ((SCALAR_INT_MODE_P (mode) || VECTOR_MODE_P (mode)) - && !COMPLEX_MODE_P (mode)) + if ((SCALAR_INT_MODE_P (mode) || VECTOR_MODE_P (mode))) + // && !COMPLEX_MODE_P (mode)) regno = FIRST_SSE_REG; break; case 8: @@ -4266,7 +4268,7 @@ ix86_return_in_memory (const_tree type, const_tree fntype ATTRIBUTE_UNUSED) || INTEGRAL_TYPE_P (type) || VECTOR_FLOAT_TYPE_P (type)) && (SCALAR_INT_MODE_P (mode) || VECTOR_MODE_P (mode)) - && !COMPLEX_MODE_P (mode) + //&& !COMPLEX_MODE_P (mode) && (GET_MODE_SIZE (mode) == 16 || size == 16)) return false; @@ -15722,6 +15724,7 @@ ix86_build_const_vector (machine_mode mode, bool vect, rtx value) case E_V8SFmode: case E_V4SFmode: case E_V2SFmode: + case E_SCmode: case E_V8DFmode: case E_V4DFmode: case E_V2DFmode: @@ -15770,6 +15773,7 @@ ix86_build_signbit_mask (machine_mode mode, bool vect, bool invert) case E_V8SFmode: case E_V4SFmode: case E_V2SFmode: + case E_SCmode: case E_V2SImode: vec_mode = mode; imode = SImode; @@ -19821,7 +19825,8 @@ ix86_class_max_nregs (reg_class_t rclass, machine_mode mode) else { if (COMPLEX_MODE_P (mode)) - return 2; + return CEIL (GET_MODE_SIZE (mode), UNITS_PER_WORD); + //return 2; else return 1; } @@ -20157,7 +20162,8 @@ ix86_hard_regno_nregs (unsigned int regno, machine_mode mode) return CEIL (GET_MODE_SIZE (mode), UNITS_PER_WORD); } if (COMPLEX_MODE_P (mode)) - return 2; + return 1; + //return 2; /* Register pair for mask registers. 
*/ if (mode == P2QImode || mode == P2HImode) return 2; @@ -23613,6 +23619,273 @@ ix86_preferred_simd_mode (scalar_mode mode) } } +static rtx +x86_gen_rtx_complex (machine_mode mode, rtx real_part, rtx imag_part) +{ + machine_mode imode = GET_MODE_INNER (mode); + + if ((real_part == imag_part) && (real_part == CONST0_RTX (imode))) + { + if (CONST_DOUBLE_P (real_part)) + return const_double_from_real_value (dconst0, mode); + else if (CONST_INT_P (real_part)) + return GEN_INT (0); + else + gcc_unreachable (); + } + + bool saved_generating_concat_p = generating_concat_p; + generating_concat_p = false; + rtx complex_reg = gen_reg_rtx (mode); + generating_concat_p = saved_generating_concat_p; + + if (real_part) + { + gcc_assert (imode == GET_MODE (real_part)); + write_complex_part (complex_reg, real_part, REAL_P, false); + } + + if (imag_part) + { + gcc_assert (imode == GET_MODE (imag_part)); + write_complex_part (complex_reg, imag_part, IMAG_P, false); + } + + return complex_reg; +} + +static rtx +x86_read_complex_part (rtx cplx, complex_part_t part) +{ + machine_mode cmode; + scalar_mode imode; + unsigned ibitsize; + + if (GET_CODE (cplx) == CONCAT) + return XEXP (cplx, part); + + cmode = GET_MODE (cplx); + imode = GET_MODE_INNER (cmode); + ibitsize = GET_MODE_BITSIZE (imode); + + if (COMPLEX_MODE_P (cmode) && (part == BOTH_P)) + return cplx; + + /* For constants under 32 bits, vector constants are folded during expand, + * so we need to compensate for it as cplx is an integer constant. + * In this case cmode and imode are equal. */ + if (cmode == imode) + ibitsize /= 2; + + if (cmode == E_VOIDmode) + return cplx; /* FIXME case used when initialising mock in a complex register */ + + if ((cmode == E_DCmode) && (GET_CODE (cplx) == CONST_DOUBLE)) /* FIXME stop generation of DC const_double, because there are no patterns for it and it is weird */ + return CONST0_RTX (E_DFmode); + /* verify SC const_double as well */ + + /* Special case reads from complex constants that got spilled to memory.
*/ + if (MEM_P (cplx) && GET_CODE (XEXP (cplx, 0)) == SYMBOL_REF) + { + tree decl = SYMBOL_REF_DECL (XEXP (cplx, 0)); + if (decl && TREE_CODE (decl) == COMPLEX_CST) + { + tree cplx_part = (part == IMAG_P) ? TREE_IMAGPART (decl) + : (part == REAL_P) ? TREE_REALPART (decl) + : TREE_COMPLEX_BOTH_PARTS (decl); + if (CONSTANT_CLASS_P (cplx_part)) + return expand_expr (cplx_part, NULL_RTX, imode, EXPAND_NORMAL); + } + } + + /* For MEMs simplify_gen_subreg may generate an invalid new address + because, e.g., the original address is considered mode-dependent + by the target, which restricts simplify_subreg from invoking + adjust_address_nv. Instead of preparing fallback support for an + invalid address, we call adjust_address_nv directly. */ + if (MEM_P (cplx)) + { + if (part == BOTH_P) + return adjust_address_nv (cplx, cmode, 0); + else + return adjust_address_nv (cplx, imode, (part == IMAG_P) + ? GET_MODE_SIZE (imode) : 0); + } + + /* If the sub-object is at least word sized, then we know that subregging + will work. This special case is important, since extract_bit_field + wants to operate on integer modes, and there's rarely an OImode to + correspond to TCmode. */ + if (ibitsize >= BITS_PER_WORD + /* For hard regs we have exact predicates. Assume we can split + the original object if it spans an even number of hard regs. + This special case is important for SCmode on 64-bit platforms + where the natural size of floating-point regs is 32-bit. */ + || (REG_P (cplx) + && REGNO (cplx) < FIRST_PSEUDO_REGISTER + && REG_NREGS (cplx) % 2 == 0)) + { + rtx ret = simplify_gen_subreg (imode, cplx, cmode, (part == IMAG_P) + ? GET_MODE_SIZE (imode) : 0); + if (ret) + return ret; + else + /* simplify_gen_subreg may fail for sub-word MEMs. 
*/ + gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD); + } + + if (part == BOTH_P) + return extract_bit_field (cplx, 2 * ibitsize, 0, true, NULL_RTX, cmode, + cmode, false, NULL); + else + return extract_bit_field (cplx, ibitsize, (part == IMAG_P) ? ibitsize : 0, + true, NULL_RTX, imode, imode, false, NULL); +} + +static void +x86_write_complex_part (rtx cplx, rtx val, complex_part_t part, bool undefined_p) +{ + machine_mode cmode; + scalar_mode imode; + unsigned ibitsize; + + cmode = GET_MODE (cplx); + imode = GET_MODE_INNER (cmode); + ibitsize = GET_MODE_BITSIZE (imode); + + /* Special case for constants. */ + if (GET_CODE (val) == CONST_VECTOR) + { + if (part == BOTH_P) + { + machine_mode temp_mode = E_BLKmode; + switch (cmode) + { + case E_CQImode: + temp_mode = E_HImode; + break; + case E_CHImode: + temp_mode = E_SImode; + break; + case E_CSImode: + temp_mode = E_DImode; + break; + case E_SCmode: + temp_mode = E_DFmode; + break; + case E_CDImode: + temp_mode = E_TImode; + break; + case E_DCmode: + default: + break; + } + + if (temp_mode != E_BLKmode) + { + rtx temp_reg = gen_reg_rtx (temp_mode); + store_bit_field (temp_reg, GET_MODE_BITSIZE (temp_mode), 0, 0, + 0, GET_MODE (val), val, false, undefined_p); + emit_move_insn (cplx, + simplify_gen_subreg (cmode, temp_reg, temp_mode, + 0)); + } + else + { + /* Write the real and imaginary parts separately. */ + gcc_assert (GET_CODE (val) == CONST_VECTOR); + write_complex_part (cplx, const_vector_elt (val, 0), REAL_P, false); + write_complex_part (cplx, const_vector_elt (val, 1), IMAG_P, false); + } + } + else + write_complex_part (cplx, + const_vector_elt (val, + ((part == REAL_P) ? 
0 : 1)), + part, false); + return; + } + + if ((part == BOTH_P) && !MEM_P (cplx) + /*&& (optab_handler (mov_optab, cmode) != CODE_FOR_nothing)*/) + { + write_complex_part (cplx, read_complex_part (val, REAL_P), REAL_P, undefined_p); + write_complex_part (cplx, read_complex_part (val, IMAG_P), IMAG_P, undefined_p); + //emit_move_insn (cplx, val); + return; + } + + if ((GET_CODE (val) == CONST_DOUBLE) || (GET_CODE (val) == CONST_INT)) + { + if (part == REAL_P) + { + emit_move_insn (gen_lowpart (imode, cplx), val); + return; + } + else if (part == IMAG_P) + { + /* cannot set highpart of a pseudo register */ + if (REGNO (cplx) < FIRST_PSEUDO_REGISTER) + { + emit_move_insn (gen_highpart (imode, cplx), val); + return; + } + } + else + gcc_unreachable (); + } + + if (GET_CODE (cplx) == CONCAT) + { + emit_move_insn (XEXP (cplx, part), val); + return; + } + + /* For MEMs simplify_gen_subreg may generate an invalid new address + because, e.g., the original address is considered mode-dependent + by the target, which restricts simplify_subreg from invoking + adjust_address_nv. Instead of preparing fallback support for an + invalid address, we call adjust_address_nv directly. */ + if (MEM_P (cplx)) + { + if (part == BOTH_P) + emit_move_insn (adjust_address_nv (cplx, cmode, 0), val); + else + emit_move_insn (adjust_address_nv (cplx, imode, (part == IMAG_P) + ? GET_MODE_SIZE (imode) : 0), val); + return; + } + + /* If the sub-object is at least word sized, then we know that subregging + will work. This special case is important, since store_bit_field + wants to operate on integer modes, and there's rarely an OImode to + correspond to TCmode. */ + if (ibitsize >= BITS_PER_WORD + /* For hard regs we have exact predicates. Assume we can split + the original object if it spans an even number of hard regs. + This special case is important for SCmode on 64-bit platforms + where the natural size of floating-point regs is 32-bit. 
*/ + || (REG_P (cplx) + && REGNO (cplx) < FIRST_PSEUDO_REGISTER + && REG_NREGS (cplx) % 2 == 0)) + { + rtx cplx_part = simplify_gen_subreg (imode, cplx, cmode, + (part == IMAG_P) + ? GET_MODE_SIZE (imode) : 0); + if (cplx_part) + { + emit_move_insn (cplx_part, val); + return; + } + else + /* simplify_gen_subreg may fail for sub-word MEMs. */ + gcc_assert (MEM_P (cplx) && ibitsize < BITS_PER_WORD); + } + + store_bit_field (cplx, ibitsize, (part == IMAG_P) ? ibitsize : 0, 0, 0, + imode, val, false, undefined_p); +} + /* If AVX is enabled then try vectorizing with both 256bit and 128bit vectors. If AVX512F is enabled then try vectorizing with 512bit, 256bit and 128bit vectors. */ @@ -25621,6 +25894,15 @@ ix86_libgcc_floating_mode_supported_p #undef TARGET_IFUNC_REF_LOCAL_OK #define TARGET_IFUNC_REF_LOCAL_OK ix86_ifunc_ref_local_ok +#undef TARGET_GEN_RTX_COMPLEX +#define TARGET_GEN_RTX_COMPLEX x86_gen_rtx_complex + +#undef TARGET_READ_COMPLEX_PART +#define TARGET_READ_COMPLEX_PART x86_read_complex_part + +#undef TARGET_WRITE_COMPLEX_PART +#define TARGET_WRITE_COMPLEX_PART x86_write_complex_part + #if !TARGET_MACHO && !TARGET_DLLIMPORT_DECL_ATTRIBUTES # undef TARGET_ASM_RELOC_RW_MASK # define TARGET_ASM_RELOC_RW_MASK ix86_reloc_rw_mask diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h index aea3209d5a3..86157b97b25 100644 --- a/gcc/config/i386/i386.h +++ b/gcc/config/i386/i386.h @@ -1054,7 +1054,8 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); || (MODE) == V4QImode || (MODE) == V2HImode || (MODE) == V1SImode \ || (MODE) == V2DImode || (MODE) == V2QImode \ || (MODE) == DFmode || (MODE) == DImode \ - || (MODE) == HFmode || (MODE) == BFmode) + || (MODE) == HFmode || (MODE) == BFmode \ + || (MODE) == SCmode) #define VALID_SSE_REG_MODE(MODE) \ ((MODE) == V1TImode || (MODE) == TImode \ @@ -1063,7 +1064,7 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); || (MODE) == TFmode || (MODE) == TDmode) #define 
VALID_MMX_REG_MODE_3DNOW(MODE) \ - ((MODE) == V2SFmode || (MODE) == SFmode) + ((MODE) == V2SFmode || (MODE) == SFmode || (MODE) == SCmode) /* To match ia32 psABI, V4HFmode should be added here. */ #define VALID_MMX_REG_MODE(MODE) \ @@ -1106,13 +1107,15 @@ extern const char *host_detect_local_cpu (int argc, const char **argv); || (MODE) == V16SImode || (MODE) == V32HImode || (MODE) == V8DFmode \ || (MODE) == V16SFmode \ || (MODE) == V32HFmode || (MODE) == V16HFmode || (MODE) == V8HFmode \ - || (MODE) == V32BFmode || (MODE) == V16BFmode || (MODE) == V8BFmode) + || (MODE) == V32BFmode || (MODE) == V16BFmode || (MODE) == V8BFmode \ + || (MODE) == SCmode) #define X87_FLOAT_MODE_P(MODE) \ (TARGET_80387 && ((MODE) == SFmode || (MODE) == DFmode || (MODE) == XFmode)) #define SSE_FLOAT_MODE_P(MODE) \ - ((TARGET_SSE && (MODE) == SFmode) || (TARGET_SSE2 && (MODE) == DFmode)) + ((TARGET_SSE && (MODE) == SFmode) || (TARGET_SSE2 && (MODE) == DFmode) \ + || (TARGET_SSE2 && (MODE) == SCmode)) #define SSE_FLOAT_MODE_SSEMATH_OR_HF_P(MODE) \ ((SSE_FLOAT_MODE_P (MODE) && TARGET_SSE_MATH) \ diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 6bf9c99a2c1..b2b354c439e 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -30209,3 +30209,147 @@ "vcvtneo2ps\t{%1, %0|%0, %1}" [(set_attr "prefix" "vex") (set_attr "mode" "")]) + +(define_expand "movsc" + [(match_operand:SC 0 "nonimmediate_operand" "") + (match_operand:SC 1 "nonimmediate_operand" "")] + "" + { + emit_insn (gen_movv2sf (simplify_gen_subreg (V2SFmode, operands[0], SCmode, 0), + simplify_gen_subreg (V2SFmode, operands[1], SCmode, 0))); + DONE; + } +) + +(define_expand "addsc3" + [(match_operand:SC 0 "register_operand" "=r") + (match_operand:SC 1 "register_operand" "r") + (match_operand:SC 2 "register_operand" "r")] + "" + { + emit_insn (gen_addv2sf3 (simplify_gen_subreg (V2SFmode, operands[0], SCmode, 0), + simplify_gen_subreg (V2SFmode, operands[1], SCmode, 0), + simplify_gen_subreg (V2SFmode, 
operands[2], SCmode, 0))); + DONE; + } +) + +(define_expand "subsc3" + [(match_operand:SC 0 "register_operand" "=r") + (match_operand:SC 1 "register_operand" "r") + (match_operand:SC 2 "register_operand" "r")] + "" + { + emit_insn (gen_subv2sf3 (simplify_gen_subreg (V2SFmode, operands[0], SCmode, 0), + simplify_gen_subreg (V2SFmode, operands[1], SCmode, 0), + simplify_gen_subreg (V2SFmode, operands[2], SCmode, 0))); + DONE; + } +) + +(define_expand "negsc2" + [(match_operand:SC 0 "register_operand" "=r") + (match_operand:SC 1 "register_operand" "r")] + "" + { + emit_insn (gen_negv2sf2 (simplify_gen_subreg (V2SFmode, operands[0], SCmode, 0), + simplify_gen_subreg (V2SFmode, operands[1], SCmode, 0))); + DONE; + } +) + +(define_expand "sse_shufsc" + [(match_operand:V4SF 0 "register_operand") + (match_operand:SC 1 "register_operand") + (match_operand:SC 2 "vector_operand") + (match_operand:SI 3 "const_int_operand")] + "TARGET_SSE" +{ + int mask = INTVAL (operands[3]); + emit_insn (gen_sse_shufsc_sc (operands[0], + operands[1], + operands[2], + GEN_INT ((mask >> 0) & 3), + GEN_INT ((mask >> 2) & 3), + GEN_INT (((mask >> 4) & 3) + 4), + GEN_INT (((mask >> 6) & 3) + 4))); + DONE; +}) + +(define_insn "sse_shufsc_sc" + [(set (match_operand:V4SF 0 "register_operand" "=x,v") + (vec_select:V4SF + (vec_concat:V4SF + (match_operand:V2SF 1 "register_operand" "0,v") + (match_operand:V2SF 2 "vector_operand" "xBm,vm")) + (parallel [(match_operand 3 "const_0_to_3_operand") + (match_operand 4 "const_0_to_3_operand") + (match_operand 5 "const_4_to_7_operand") + (match_operand 6 "const_4_to_7_operand")])))] + "TARGET_SSE" +{ + int mask = 0; + mask |= INTVAL (operands[3]) << 0; + mask |= INTVAL (operands[4]) << 2; + mask |= (INTVAL (operands[5]) - 4) << 4; + mask |= (INTVAL (operands[6]) - 4) << 6; + operands[3] = GEN_INT (mask); + + switch (which_alternative) + { + case 0: + return "shufps\t{%3, %2, %0|%0, %2, %3}"; + case 1: + return "vshufps\t{%3, %2, %1, %0|%0, %1, %2, %3}"; + 
default: + gcc_unreachable (); + } +} + [(set_attr "isa" "noavx,avx") + (set_attr "type" "sseshuf") + (set_attr "length_immediate" "1") + (set_attr "prefix" "orig,maybe_evex") + (set_attr "mode" "V4SF")]) + +(define_expand "mulsc3" + [(match_operand:SC 0 "register_operand" "=r") + (match_operand:SC 1 "register_operand" "r") + (match_operand:SC 2 "register_operand" "r")] + "TARGET_SSE3" + { + rtx a = gen_reg_rtx (V4SFmode); + rtx b = gen_reg_rtx (V4SFmode); + emit_insn (gen_sse_shufsc (a, + simplify_gen_subreg (V2SFmode, operands[1], SCmode, 0), + simplify_gen_subreg (V2SFmode, operands[1], SCmode, 0), + GEN_INT (0b01000100))); + emit_insn (gen_sse_shufsc (b, + simplify_gen_subreg (V2SFmode, operands[2], SCmode, 0), + simplify_gen_subreg (V2SFmode, operands[2], SCmode, 0), + GEN_INT (0b00010100))); + emit_insn (gen_mulv4sf3 (a, a, b)); + emit_insn (gen_sse_shufps (b, + a, + a, + GEN_INT (0b00001101))); + emit_insn (gen_sse_shufps (a, + a, + a, + GEN_INT (0b00001000))); + emit_insn (gen_vec_addsubv2sf3 (simplify_gen_subreg (V2SFmode, operands[0], SCmode, 0), + simplify_gen_subreg (V2SFmode, a, V4SFmode, 0), + simplify_gen_subreg (V2SFmode, b, V4SFmode, 0))); + DONE; + } +) + +(define_expand "conjsc2" + [(match_operand:SC 0 "register_operand" "=r") + (match_operand:SC 1 "register_operand" "r")] + "" + { + emit_insn (gen_negdf2 (simplify_gen_subreg (DFmode, operands[0], SCmode, 0), + simplify_gen_subreg (DFmode, operands[1], SCmode, 0))); + DONE; + } +)
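
To make the data flow of the mulsc3 and conjsc2 expanders easier to follow, here is a small scalar model in C. It is only a sketch under stated assumptions, not code from the patch: model_shufps, model_mulsc3 and model_conjsc2 are hypothetical helper names, the shuffle immediates are the ones passed to gen_sse_shufsc and gen_sse_shufps in mulsc3 (written in hex), and the conjugate model assumes a little-endian host with IEEE 754 binary32 floats, as on x86.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of SHUFPS: the two low result lanes are picked from SRC1
   and the two high lanes from SRC2, using two immediate bits per lane.
   A temporary is used so DST may alias either source.  */
static void
model_shufps (float dst[4], const float src1[4], const float src2[4],
              unsigned imm)
{
  float tmp[4];
  tmp[0] = src1[imm & 3];
  tmp[1] = src1[(imm >> 2) & 3];
  tmp[2] = src2[(imm >> 4) & 3];
  tmp[3] = src2[(imm >> 6) & 3];
  memcpy (dst, tmp, sizeof tmp);
}

/* Model of the mulsc3 sequence.  X and Y hold { real, imag }.  */
static void
model_mulsc3 (float out[2], const float x[2], const float y[2])
{
  /* Assumption: the upper lanes of the V2SF views are don't-care values;
     they are zeroed here only to keep the model deterministic.  */
  float xv[4] = { x[0], x[1], 0.0f, 0.0f };
  float yv[4] = { y[0], y[1], 0.0f, 0.0f };
  float a[4], b[4];

  model_shufps (a, xv, xv, 0x44);   /* a = { xr, xi, xr, xi } */
  model_shufps (b, yv, yv, 0x14);   /* b = { yr, yi, yi, yr } */
  for (int i = 0; i < 4; i++)
    a[i] *= b[i];                   /* a = { xr*yr, xi*yi, xr*yi, xi*yr } */
  model_shufps (b, a, a, 0x0d);     /* b = { xi*yi, xi*yr, ... } */
  model_shufps (a, a, a, 0x08);     /* a = { xr*yr, xr*yi, ... } */

  /* ADDSUBPS on the low pair: subtract in lane 0, add in lane 1.  */
  out[0] = a[0] - b[0];             /* xr*yr - xi*yi */
  out[1] = a[1] + b[1];             /* xr*yi + xi*yr */
}

/* Model of conjsc2: negating the 64-bit (DFmode) view of the pair flips
   bit 63, which on a little-endian host is the sign bit of the second
   float, i.e. the imaginary part.  */
static void
model_conjsc2 (float out[2], const float x[2])
{
  uint64_t bits;
  assert (sizeof bits == 2 * sizeof (float));
  memcpy (&bits, x, sizeof bits);
  bits ^= UINT64_C (1) << 63;
  memcpy (out, &bits, sizeof bits);
}
```

For example, calling model_mulsc3 with x = {1, 2} and y = {3, 4} leaves {-5, 10} in out, matching (1 + 2i)(3 + 4i) = -5 + 10i, and model_conjsc2 maps {1, 2} to {1, -2}.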