From patchwork Thu Sep 21 07:19:55 2023
X-Patchwork-Submitter: "Hu, Lin1"
X-Patchwork-Id: 14307
From: "Hu, Lin1"
To: gcc-patches@gcc.gnu.org
Cc: hongtao.liu@intel.com, ubizjak@gmail.com, haochen.jiang@intel.com
Subject: [PATCH 00/18] Support -mevex512 for AVX512
Date: Thu, 21 Sep 2023 15:19:55 +0800
Message-Id: <20230921072013.2124750-1-lin1.hu@intel.com>

Hi all,

After previous discussion, instead of supporting option -mavx10.1, we
will first introduce option -m[no-]evex512, which will enable/disable
512 bit registers and 64 bit mask registers.

It will not change the current option behavior: if AVX512F is enabled
with no evex512 option specified, 512 bit registers and 64 bit mask
registers are enabled automatically.

The patches are organized as follows:

Patch 1 added initial support for option -mevex512.

Patch 2-6 refined the current intrin files to push the evex512 target
for all 512 bit intrins. The scalar intrins remained untouched.

Patch 7-11 added OPTION_MASK_ISA2_EVEX512 for all related builtins.

Patch 12 disabled zmm registers and 512 bit libmvec calls for
no-evex512, and also requested evex512 for vectorization when using
512 bit registers.

Patch 13-17 supported evex512 in related patterns.

Patch 18 added testcases for -mno-evex512 and allowed its usage.

The patches currently cause scan-asm failures for pr89229-{5,6,7}b.c,
since we now emit scalar vmovss there. When trying to use x/ymm 16+ w/o
avx512vl but with avx512f+evex512, I suppose we could emit either
scalar or zmm instructions. It is quite a rare case on HW, since there
is no HW with avx512f but w/o avx512vl, so I prefer not to add
maintenance effort here for a slight perf improvement. But it could be
changed back to the former behavior.

Discussion is welcome on all the patches.

Thx,
Haochen

Haochen Jiang (18):
  Initial support for -mevex512
  Push evex512 target for 512 bit intrins
  Push evex512 target for 512 bit intrins
  Push evex512 target for 512 bit intrins
  Push evex512 target for 512 bit intrins
  Push evex512 target for 512 bit intrins
  Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins
  Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins
  Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins
  Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins
  Add OPTION_MASK_ISA2_EVEX512 for 512 bit builtins
  Disable zmm register and 512 bit libmvec call when !TARGET_EVEX512
  Support -mevex512 for AVX512F intrins
  Support -mevex512 for AVX512DQ intrins
  Support -mevex512 for AVX512BW intrins
  Support -mevex512 for AVX512{IFMA,VBMI,VNNI,BF16,VPOPCNTDQ,VBMI2,BITALG,VP2INTERSECT},VAES,GFNI,VPCLMULQDQ intrins
  Support -mevex512 for AVX512FP16 intrins
  Allow -mno-evex512 usage

 gcc/common/config/i386/i386-common.cc       |    15 +
 gcc/config.gcc                              |    19 +-
 gcc/config/i386/avx5124fmapsintrin.h        |     2 +-
 gcc/config/i386/avx5124vnniwintrin.h        |     2 +-
 gcc/config/i386/avx512bf16intrin.h          |    31 +-
 gcc/config/i386/avx512bitalgintrin.h        |   155 +-
 gcc/config/i386/avx512bitalgvlintrin.h      |   180 +
 gcc/config/i386/avx512bwintrin.h            |   291 +-
 gcc/config/i386/avx512dqintrin.h            |  1840 +-
 gcc/config/i386/avx512erintrin.h            |     2 +-
 gcc/config/i386/avx512fintrin.h             | 19663 +++++++++---------
 gcc/config/i386/avx512fp16intrin.h          |  8925 ++++----
 gcc/config/i386/avx512ifmaintrin.h          |     4 +-
 gcc/config/i386/avx512pfintrin.h            |     2 +-
 gcc/config/i386/avx512vbmi2intrin.h         |     4 +-
 gcc/config/i386/avx512vbmiintrin.h          |     4 +-
 gcc/config/i386/avx512vnniintrin.h          |     4 +-
 gcc/config/i386/avx512vp2intersectintrin.h  |     4 +-
 gcc/config/i386/avx512vpopcntdqintrin.h     |     4 +-
 gcc/config/i386/gfniintrin.h                |    76 +-
 gcc/config/i386/i386-builtin.def            |  1312 +-
 gcc/config/i386/i386-builtins.cc            |    96 +-
 gcc/config/i386/i386-c.cc                   |     2 +
 gcc/config/i386/i386-expand.cc              |    18 +-
 gcc/config/i386/i386-options.cc             |    33 +-
 gcc/config/i386/i386.cc                     |   168 +-
 gcc/config/i386/i386.h                      |     7 +-
 gcc/config/i386/i386.md                     |   127 +-
 gcc/config/i386/i386.opt                    |     4 +
 gcc/config/i386/immintrin.h                 |     2 +
 gcc/config/i386/predicates.md               |     3 +-
 gcc/config/i386/sse.md                      |   854 +-
 gcc/config/i386/vaesintrin.h                |     4 +-
 gcc/config/i386/vpclmulqdqintrin.h          |     4 +-
 gcc/testsuite/gcc.target/i386/noevex512-1.c |    13 +
 gcc/testsuite/gcc.target/i386/noevex512-2.c |    13 +
 gcc/testsuite/gcc.target/i386/noevex512-3.c |    13 +
 gcc/testsuite/gcc.target/i386/pr89229-5b.c  |     2 +-
 gcc/testsuite/gcc.target/i386/pr89229-6b.c  |     2 +-
 gcc/testsuite/gcc.target/i386/pr89229-7b.c  |     2 +-
 gcc/testsuite/gcc.target/i386/pr90096.c     |     2 +-
 41 files changed, 17170 insertions(+), 16738 deletions(-)
 create mode 100644 gcc/config/i386/avx512bitalgvlintrin.h
 create mode 100644 gcc/testsuite/gcc.target/i386/noevex512-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/noevex512-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/noevex512-3.c
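
For illustration only, here is a rough sketch of how user code might
guard a 512 bit path once -m[no-]evex512 exists. It assumes the series
defines a __EVEX512__ predefined macro alongside __AVX512F__; the cover
letter above does not state the macro name, so treat it as an
assumption. sum_floats and USE_ZMM are hypothetical names used only for
this example.

```c
#include <stddef.h>

/* Assumption: __EVEX512__ is defined when evex512 is enabled.  When it
   is not defined (e.g. -mno-evex512, or a compiler without this
   series), the code below falls back to the scalar loop.  */
#if defined(__AVX512F__) && defined(__EVEX512__)
#include <immintrin.h>
#define USE_ZMM 1
#else
#define USE_ZMM 0
#endif

/* Sum N floats starting at P.  The zmm path is compiled in only when
   both macros above are defined; the scalar tail loop always runs for
   whatever elements remain.  */
float
sum_floats (const float *p, size_t n)
{
  float total = 0.0f;
  size_t i = 0;

#if USE_ZMM
  /* Accumulate 16 floats per iteration in one 512 bit register.  */
  __m512 acc = _mm512_setzero_ps ();
  for (; i + 16 <= n; i += 16)
    acc = _mm512_add_ps (acc, _mm512_loadu_ps (p + i));
  total = _mm512_reduce_add_ps (acc);
#endif

  /* Scalar path: everything when USE_ZMM is 0, the remainder otherwise.  */
  for (; i < n; i++)
    total += p[i];

  return total;
}
```

Under that assumption, building with -mavx512f alone would take the zmm
path (evex512 is implied by default), while adding -mno-evex512 would
compile only the scalar loop.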