From patchwork Thu Sep 29 15:55:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Jelinek X-Patchwork-Id: 1551 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp47179wrs; Thu, 29 Sep 2022 08:57:10 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6CiwG6PHTLKfC99wlUW4fjPEXz0OQtYL+e0ZuqWTKKq4D7g3f0ZbT/mhiF2k4QZFTLiXbV X-Received: by 2002:a17:907:2d9e:b0:782:69f2:a0ec with SMTP id gt30-20020a1709072d9e00b0078269f2a0ecmr3085824ejc.680.1664467030476; Thu, 29 Sep 2022 08:57:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664467030; cv=none; d=google.com; s=arc-20160816; b=Ni8CE8LabdPL8ewNrjLqsAB8Ut7gxrXj9LKOlkfjV5I6Q1qNrWaHxabnmNtpAvh8Iv XsQYobv+z2Rgv6MBFJEIlM4e59sFucCtbRXZblI70fIc+RHG+mSrjBVK3xc5WOEn0NME ukMnh6WaSMgi8btdQfnUYoGq0UCgBjBf0N2H1McrzUvinKjxYXqfrmHYSnmHy4x9OUwv I4Sp29DoHU40uRmffDMVNwiWX95nP8QN6FJGCghYraa13YrPb4AWPXu/u1TYsg2YdrMp v6+riz9d0IsgF7zi/qGUUAvjMz/zFJ6FiyIHyVyJKjZNcmScrvbXyp1o0U+s1C59/hnc JiXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :content-disposition:mime-version:message-id:subject:to:date :dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=0tZY4UvMqFpc9gpOO8T5mQii0ndq+aVmMhxcJIjCDjk=; b=XyefGTLAkdpciEpub1jUjXSrDOXouASx+mSTgWR1iNt5+WrWBv+5253Lv0I9OJL50H 0nn+0n673G6diepms0wvG5/UDMIvoNK+JY2dYycuX0CV1fHu5KZoRnKfi7Ic8CNKBkhj ozS+AplC/BmLMEhslhzF9gV+HkY9qQOrOxl9UNJLfvHjiwJT1yPH7ZxNFMl+8ZH6lPoN nL+Y6u7ywSIT4K36G9D7hV04J9zf9HhtunlG2BNfZoYjYCQB9Rw+QYuqePpHJ2ck/Hn6 g6IIuONux4dvgzTeC2VXQOVjI+Tme2lGsok1kghI/G6s8bg/zfyObfo4JZ3TaHogP1Hk x48g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=vgIGUevR; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id g18-20020a50d5d2000000b0044ea2b1c817si7338816edj.244.2022.09.29.08.57.10 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Sep 2022 08:57:10 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=vgIGUevR; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4648638582A0 for ; Thu, 29 Sep 2022 15:57:09 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 4648638582A0 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1664467029; bh=0tZY4UvMqFpc9gpOO8T5mQii0ndq+aVmMhxcJIjCDjk=; h=Date:To:Subject:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=vgIGUevR4A6meuWziXIElUC7gpeeGWX6U4njhTaMrawS2xR5gSTJ0jsKW0yMW6BFv XFXEDebEqSqoAJp8D3PqmWCvz4luIJerawkKYpdywCvNm5KWZ57bJ7UXWRYNPdPt9J tN0sOF7fuMawvjnWTAKXR9vBkgJPyfAYHY0bzRDk= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 75A983857BBF for ; Thu, 29 Sep 2022 15:56:02 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 75A983857BBF Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-591-Ms9BQxNVPt-o4WOGB1JNQg-1; Thu, 29 Sep 2022 11:55:57 -0400 X-MC-Unique: Ms9BQxNVPt-o4WOGB1JNQg-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 6BB2E3806658; Thu, 29 Sep 2022 15:55:56 +0000 (UTC) Received: from tucnak.zalov.cz (unknown [10.39.192.194]) by smtp.corp.redhat.com (Postfix) with ESMTPS id B3D98C15BA4; Thu, 29 Sep 2022 15:55:55 +0000 (UTC) Received: from tucnak.zalov.cz (localhost [127.0.0.1]) by tucnak.zalov.cz (8.17.1/8.17.1) with ESMTPS id 28TFtqlq3938028 (version=TLSv1.3 cipher=TLS_AES_256_GCM_SHA384 bits=256 verify=NOT); Thu, 29 Sep 2022 17:55:52 +0200 Received: (from jakub@localhost) by tucnak.zalov.cz (8.17.1/8.17.1/Submit) id 28TFtocF3938027; Thu, 29 Sep 2022 17:55:50 +0200 Date: Thu, 29 Sep 2022 17:55:50 +0200 To: Jason Merrill , "Joseph S. Myers" , Hongtao Liu , hjl.tools@gmail.com, Richard Earnshaw , Kyrylo Tkachov , richard.sandiford@arm.com Subject: [RFC PATCH] c++, i386, arm, aarch64, libgcc: std::bfloat16_t and __bf16 arithmetic support Message-ID: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Disposition: inline X-Spam-Status: No, score=-25.9 required=5.0 tests=BAYES_00, DKIM_INVALID, DKIM_SIGNED, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_MANYTO, KAM_SHORT, RCVD_IN_DNSWL_LOW, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Jakub Jelinek via Gcc-patches From: Jakub Jelinek Reply-To: Jakub Jelinek Cc: gcc-patches@gcc.gnu.org Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1745320180652338316?= X-GMAIL-MSGID: =?utf-8?q?1745320180652338316?= Hi! Here is more complete patch to add std::bfloat16_t support on x86, AArch64 and (only partially) on ARM 32-bit. No BFmode optabs are added by the patch, so for binops/unops it extends to SFmode first and then truncates back to BFmode. For {HF,SF,DF,XF,TF}mode -> BFmode conversions libgcc has implementations of all those conversions so that we avoid double rounding, for BFmode -> {DF,XF,TF}mode conversions to avoid growing libgcc too much it emits BFmode -> SFmode conversion first and then converts to the even wider mode, neither step should be imprecise. For BFmode -> HFmode, it first emits a precise BFmode -> SFmode conversion and then SFmode -> HFmode, because neither format is subset or superset of the other, while SFmode is superset of both. expr.cc then contains a -ffast-math optimization of the BF -> SF and SF -> BF conversions if we don't optimize for space (and for the latter if -frounding-math isn't enabled either). For x86, perhaps truncsfbf2 optab could be defined for TARGET_AVX512BF16 but IMNSHO should FAIL if !flag_finite_math || flag_rounding_math || !flag_unsafe_math_optimizations, because I think the insn doesn't raise on sNaNs, hardcodes round to nearest and flushes denormals to zero. In C by default (unless x86 -fexcess-precision=16) we use float excess precision for BFmode, so truncate only on explicit casts and assignments. In C++ unfortunately (but that is the case of also _Float16) we don't support excess precision yet which means that for __bf16 (__bf16 a, __bf16 b, __bf16 c, __bf16 d) { return a * b + c * d; } we do a lot of conversions. The aarch64 part is untested but has a chance of working (IMHO), though I'd appreciate if ARM maintainers could decide whether it is acceptable for them that __bf16 changes mangling and will allow arithmetics and conversions. The arm part is partial, libgcc side is missing as the target doesn't really seem to use soft-fp right now. Perhaps the config/arm/ changes can be left out from the patch (thus keep ARM 32-bit __bf16 as before) and support for it can be done at some later time. Thoughts on this? 2022-09-29 Jakub Jelinek gcc/ * tree-core.h (enum tree_index): Add TI_BFLOAT16_TYPE. * tree.h (bfloat16_type_node): Define. * tree.cc (excess_precision_type): Promote bfloat16_type_mode like float16_type_mode. * expmed.h (maybe_expand_shift): Declare. * expmed.cc (maybe_expand_shift): No longer static. * expr.cc (convert_mode_scalar): Don't ICE on BF -> HF or HF -> BF conversions. If there is no optab, handle BF -> {DF,XF,TF,HF} conversions as separate BF -> SF -> {DF,XF,TF,HF} conversions, add -ffast-math generic implementation for BF -> SF and SF -> BF conversions. * config/arm/arm.h (arm_bf16_type_node): Remove. (arm_bf16_ptr_type_node): Adjust comment. * config/arm/arm.cc (TARGET_INVALID_UNARY_OP, TARGET_INVALID_BINARY_OP): Don't redefine. (arm_mangle_type): Mangle BFmode as DFb16_. (arm_invalid_conversion): Only reject BF <-> HF conversions if HFmode is non-IEEE format. (arm_invalid_unary_op, arm_invalid_binary_op): Remove. * config/arm/arm-builtins.cc (arm_bf16_type_node): Remove. (arm_simd_builtin_std_type): Use bfloat16_type_node rather than arm_bf16_type_node. (arm_init_simd_builtin_types): Likewise. (arm_init_simd_builtin_scalar_types): Likewise. (arm_init_bf16_types): Likewise. * config/i386/i386.cc (ix86_mangle_type): Mangle BFmode as DFb16_. (ix86_invalid_conversion, ix86_invalid_unary_op, ix86_invalid_binary_op): Remove. (TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP, TARGET_INVALID_BINARY_OP): Don't redefine. * config/i386/i386-builtins.cc (ix86_bf16_type_node): Remove. (ix86_register_bf16_builtin_type): Use bfloat16_type_node rather than ix86_bf16_type_node. * config/i386/i386-builtin-types.def (BFLOAT16): Likewise. * config/aarch64/aarch64.h (aarch64_bf16_type_node): Remove. (aarch64_bf16_ptr_type_node): Adjust comment. * config/aarch64/aarch64.cc (aarch64_gimplify_va_arg_expr): Use bfloat16_type_node rather than aarch64_bf16_type_node. (aarch64_mangle_type): Mangle BFmode as DFb16_. (aarch64_invalid_conversion, aarch64_invalid_unary_op): Remove. aarch64_invalid_binary_op): Remove BFmode related rejections. (TARGET_INVALID_CONVERSION, TARGET_INVALID_UNARY_OP): Don't redefine. * config/aarch64/aarch64-builtins.cc (aarch64_bf16_type_node): Remove. (aarch64_int_or_fp_type): Use bfloat16_type_node rather than aarch64_bf16_type_node. (aarch64_init_simd_builtin_types, aarch64_init_bf16_types): Likewise. * config/aarch64/aarch64-sve-builtins.def (svbfloat16_t): Likewise. gcc/c-family/ * c-cppbuiltin.cc (c_cpp_builtins): If bfloat16_type_node, predefine for C++ __BFLT16_*__ macros and for C++23 also __STDCPP_BFLOAT16_T__. * c-lex.cc (interpret_float): Handle CPP_N_BFLOAT16 for C++. gcc/cp/ * cp-tree.h (extended_float_type_p): Return true for bfloat16_type_node. * typeck.cc (cp_compare_floating_point_conversion_ranks): Set extended{1,2} if mv{1,2} is bfloat16_type_node. Adjust comment. libcpp/ * include/cpplib.h (CPP_N_BFLOAT16): Define. * expr.cc (interpret_float_suffix): Handle bf16 and BF16 suffixes for C++. libgcc/ * config/arm/sfp-machine.h (_FP_NANFRAC_B): Define. * config/aarch64/t-softfp (softfp_extensions): Add bfsf. (softfp_truncations): Add tfbf dfbf sfbf hfbf. * config/aarch64/libgcc-softfp.ver (GCC_13.0.0): Export __extendbfsf2 and __trunc{s,d,t,h}fbf2. * config/aarch64/sfp-machine.h (_FP_NANFRAC_B): Define. * config/i386/t-softfp (softfp_extensions): Add bfsf. (softfp_truncations): Add tfbf xfbf dfbf sfbf hfbf. * config/i386/libgcc-glibc.ver (GCC_13.0.0): Export __extendbfsf2 and __trunc{s,d,x,t,h}fbf2. * config/i386/sfp-machine.h (_FP_NANSIGN_B): Define. * config/i386/64/sfp-machine.h (_FP_NANFRAC_B): Define. * config/i386/32/sfp-machine.h (_FP_NANFRAC_B): Define. * soft-fp/brain.h: New file. * soft-fp/truncsfbf2.c: New file. * soft-fp/truncdfbf2.c: New file. * soft-fp/truncxfbf2.c: New file. * soft-fp/trunctfbf2.c: New file. * soft-fp/trunchfbf2.c: New file. * soft-fp/truncbfhf2.c: New file. * soft-fp/extendbfsf2.c: New file. libiberty/ * cp-demangle.h (D_BUILTIN_TYPE_COUNT): Increment. * cp-demangle.c (cplus_demangle_builtin_types): Add std::bfloat16_t entry. (cplus_demangle_type): Demangle DFb16_. * testsuite/demangle-expected (_Z3xxxDFb16_): New test. Jakub --- gcc/tree-core.h.jj 2022-09-29 09:13:25.717718458 +0200 +++ gcc/tree-core.h 2022-09-29 12:40:17.417778754 +0200 @@ -665,6 +665,9 @@ enum tree_index { TI_DOUBLE_TYPE, TI_LONG_DOUBLE_TYPE, + /* __bf16 type if supported (used in C++ as std::bfloat16_t). */ + TI_BFLOAT16_TYPE, + /* The _FloatN and _FloatNx types must be consecutive, and in the same sequence as the corresponding complex types, which must also be consecutive; _FloatN must come before _FloatNx; the order must --- gcc/tree.h.jj 2022-09-29 09:13:25.720718416 +0200 +++ gcc/tree.h 2022-09-29 12:40:17.416778768 +0200 @@ -4285,6 +4285,7 @@ tree_strip_any_location_wrapper (tree ex #define float_type_node global_trees[TI_FLOAT_TYPE] #define double_type_node global_trees[TI_DOUBLE_TYPE] #define long_double_type_node global_trees[TI_LONG_DOUBLE_TYPE] +#define bfloat16_type_node global_trees[TI_BFLOAT16_TYPE] /* Nodes for particular _FloatN and _FloatNx types in sequence. */ #define FLOATN_TYPE_NODE(IDX) global_trees[TI_FLOATN_TYPE_FIRST + (IDX)] --- gcc/tree.cc.jj 2022-09-29 09:13:31.328641080 +0200 +++ gcc/tree.cc 2022-09-29 12:40:17.400778985 +0200 @@ -7711,7 +7711,7 @@ excess_precision_type (tree type) = (flag_excess_precision == EXCESS_PRECISION_FAST ? EXCESS_PRECISION_TYPE_FAST : (flag_excess_precision == EXCESS_PRECISION_FLOAT16 - ? EXCESS_PRECISION_TYPE_FLOAT16 :EXCESS_PRECISION_TYPE_STANDARD)); + ? EXCESS_PRECISION_TYPE_FLOAT16 : EXCESS_PRECISION_TYPE_STANDARD)); enum flt_eval_method target_flt_eval_method = targetm.c.excess_precision (requested_type); @@ -7736,6 +7736,9 @@ excess_precision_type (tree type) machine_mode float16_type_mode = (float16_type_node ? TYPE_MODE (float16_type_node) : VOIDmode); + machine_mode bfloat16_type_mode = (bfloat16_type_node + ? TYPE_MODE (bfloat16_type_node) + : VOIDmode); machine_mode float_type_mode = TYPE_MODE (float_type_node); machine_mode double_type_mode = TYPE_MODE (double_type_node); @@ -7747,16 +7750,19 @@ excess_precision_type (tree type) switch (target_flt_eval_method) { case FLT_EVAL_METHOD_PROMOTE_TO_FLOAT: - if (type_mode == float16_type_mode) + if (type_mode == float16_type_mode + || type_mode == bfloat16_type_mode) return float_type_node; break; case FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE: if (type_mode == float16_type_mode + || type_mode == bfloat16_type_mode || type_mode == float_type_mode) return double_type_node; break; case FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE: if (type_mode == float16_type_mode + || type_mode == bfloat16_type_mode || type_mode == float_type_mode || type_mode == double_type_mode) return long_double_type_node; @@ -7774,16 +7780,19 @@ excess_precision_type (tree type) switch (target_flt_eval_method) { case FLT_EVAL_METHOD_PROMOTE_TO_FLOAT: - if (type_mode == float16_type_mode) + if (type_mode == float16_type_mode + || type_mode == bfloat16_type_mode) return complex_float_type_node; break; case FLT_EVAL_METHOD_PROMOTE_TO_DOUBLE: if (type_mode == float16_type_mode + || type_mode == bfloat16_type_mode || type_mode == float_type_mode) return complex_double_type_node; break; case FLT_EVAL_METHOD_PROMOTE_TO_LONG_DOUBLE: if (type_mode == float16_type_mode + || type_mode == bfloat16_type_mode || type_mode == float_type_mode || type_mode == double_type_mode) return complex_long_double_type_node; --- gcc/expmed.h.jj 2022-07-26 10:32:23.681271790 +0200 +++ gcc/expmed.h 2022-09-29 15:18:46.457023535 +0200 @@ -707,6 +707,8 @@ extern rtx expand_variable_shift (enum t rtx, tree, rtx, int); extern rtx expand_shift (enum tree_code, machine_mode, rtx, poly_int64, rtx, int); +extern rtx maybe_expand_shift (enum tree_code, machine_mode, rtx, int, rtx, + int); #ifdef GCC_OPTABS_H extern rtx expand_divmod (int, enum tree_code, machine_mode, rtx, rtx, rtx, int, enum optab_methods = OPTAB_LIB_WIDEN); --- gcc/expmed.cc.jj 2022-08-31 10:20:20.000000000 +0200 +++ gcc/expmed.cc 2022-09-29 15:17:52.224769673 +0200 @@ -2705,7 +2705,7 @@ expand_shift (enum tree_code code, machi /* Likewise, but return 0 if that cannot be done. */ -static rtx +rtx maybe_expand_shift (enum tree_code code, machine_mode mode, rtx shifted, int amount, rtx target, int unsignedp) { --- gcc/expr.cc.jj 2022-09-09 09:50:35.228575531 +0200 +++ gcc/expr.cc 2022-09-29 17:09:46.716352938 +0200 @@ -344,7 +344,11 @@ convert_mode_scalar (rtx to, rtx from, i gcc_assert ((GET_MODE_PRECISION (from_mode) != GET_MODE_PRECISION (to_mode)) || (DECIMAL_FLOAT_MODE_P (from_mode) - != DECIMAL_FLOAT_MODE_P (to_mode))); + != DECIMAL_FLOAT_MODE_P (to_mode)) + || (REAL_MODE_FORMAT (from_mode) == &arm_bfloat_half_format + && REAL_MODE_FORMAT (to_mode) == &ieee_half_format) + || (REAL_MODE_FORMAT (to_mode) == &arm_bfloat_half_format + && REAL_MODE_FORMAT (from_mode) == &ieee_half_format)); if (GET_MODE_PRECISION (from_mode) == GET_MODE_PRECISION (to_mode)) /* Conversion between decimal float and binary float, same size. */ @@ -364,6 +368,150 @@ convert_mode_scalar (rtx to, rtx from, i return; } +#ifdef HAVE_SFmode + if (REAL_MODE_FORMAT (from_mode) == &arm_bfloat_half_format + && REAL_MODE_FORMAT (SFmode) == &ieee_single_format) + { + if (GET_MODE_PRECISION (to_mode) > GET_MODE_PRECISION (SFmode)) + { + /* To cut down on libgcc size, implement + BFmode -> {DF,XF,TF}mode conversions by + BFmode -> SFmode -> {DF,XF,TF}mode conversions. */ + rtx temp = gen_reg_rtx (SFmode); + convert_mode_scalar (temp, from, unsignedp); + convert_mode_scalar (to, temp, unsignedp); + return; + } + if (REAL_MODE_FORMAT (to_mode) == &ieee_half_format) + { + /* Similarly, implement BFmode -> HFmode as + BFmode -> SFmode -> HFmode conversion where SFmode + has superset of BFmode values. We don't need + to handle sNaNs by raising exception and turning + into into qNaN though, as that can be done in the + SFmode -> HFmode conversion too. */ + rtx temp = gen_reg_rtx (SFmode); + int save_flag_finite_math_only = flag_finite_math_only; + flag_finite_math_only = true; + convert_mode_scalar (temp, from, unsignedp); + flag_finite_math_only = save_flag_finite_math_only; + convert_mode_scalar (to, temp, unsignedp); + return; + } + if (to_mode == SFmode + && !HONOR_NANS (from_mode) + && !HONOR_NANS (to_mode) + && optimize_insn_for_speed_p ()) + { + /* If we don't expect sNaNs, for BFmode -> SFmode we can just + shift the bits up. */ + machine_mode fromi_mode, toi_mode; + if (int_mode_for_size (GET_MODE_BITSIZE (from_mode), + 0).exists (&fromi_mode) + && int_mode_for_size (GET_MODE_BITSIZE (to_mode), + 0).exists (&toi_mode)) + { + start_sequence (); + rtx fromi = lowpart_subreg (fromi_mode, from, from_mode); + rtx tof = NULL_RTX; + if (fromi) + { + rtx toi = gen_reg_rtx (toi_mode); + convert_mode_scalar (toi, fromi, 1); + toi + = maybe_expand_shift (LSHIFT_EXPR, toi_mode, toi, + GET_MODE_PRECISION (to_mode) + - GET_MODE_PRECISION (from_mode), + NULL_RTX, 1); + if (toi) + { + tof = lowpart_subreg (to_mode, toi, toi_mode); + if (tof) + emit_move_insn (to, tof); + } + } + insns = get_insns (); + end_sequence (); + if (tof) + { + emit_insn (insns); + return; + } + } + } + } + if (REAL_MODE_FORMAT (from_mode) == &ieee_single_format + && REAL_MODE_FORMAT (to_mode) == &arm_bfloat_half_format + && !HONOR_NANS (from_mode) + && !HONOR_NANS (to_mode) + && !flag_rounding_math + && optimize_insn_for_speed_p ()) + { + /* If we don't expect qNaNs nor sNaNs and can assume rounding + to nearest, we can expand the conversion inline as + (fromi + 0x7fff + ((fromi >> 16) & 1)) >> 16. */ + machine_mode fromi_mode, toi_mode; + if (int_mode_for_size (GET_MODE_BITSIZE (from_mode), + 0).exists (&fromi_mode) + && int_mode_for_size (GET_MODE_BITSIZE (to_mode), + 0).exists (&toi_mode)) + { + start_sequence (); + rtx fromi = lowpart_subreg (fromi_mode, from, from_mode); + rtx tof = NULL_RTX; + do + { + if (!fromi) + break; + int shift = (GET_MODE_PRECISION (from_mode) + - GET_MODE_PRECISION (to_mode)); + rtx temp1 + = maybe_expand_shift (RSHIFT_EXPR, fromi_mode, fromi, + shift, NULL_RTX, 1); + if (!temp1) + break; + rtx temp2 + = expand_binop (fromi_mode, and_optab, temp1, const1_rtx, + NULL_RTX, 1, OPTAB_DIRECT); + if (!temp2) + break; + rtx temp3 + = expand_binop (fromi_mode, add_optab, fromi, + gen_int_mode ((HOST_WIDE_INT_1U + << (shift - 1)) - 1, + fromi_mode), NULL_RTX, + 1, OPTAB_DIRECT); + if (!temp3) + break; + rtx temp4 + = expand_binop (fromi_mode, add_optab, temp3, temp2, + NULL_RTX, 1, OPTAB_DIRECT); + if (!temp4) + break; + rtx temp5 = maybe_expand_shift (RSHIFT_EXPR, fromi_mode, + temp4, shift, NULL_RTX, 1); + if (!temp5) + break; + rtx temp6 = lowpart_subreg (toi_mode, temp5, fromi_mode); + if (!temp6) + break; + tof = lowpart_subreg (to_mode, force_reg (toi_mode, temp6), + toi_mode); + if (tof) + emit_move_insn (to, tof); + } + while (0); + insns = get_insns (); + end_sequence (); + if (tof) + { + emit_insn (insns); + return; + } + } + } +#endif + /* Otherwise use a libcall. */ libcall = convert_optab_libfunc (tab, to_mode, from_mode); --- gcc/config/arm/arm.h.jj 2022-09-29 09:13:25.709718568 +0200 +++ gcc/config/arm/arm.h 2022-09-29 12:40:17.401778971 +0200 @@ -78,9 +78,8 @@ extern void (*arm_lang_output_object_att the backend. Defined in arm-builtins.cc. */ extern tree arm_fp16_type_node; -/* This type is the user-visible __bf16. We need it in a few places in - the backend. Defined in arm-builtins.cc. */ -extern tree arm_bf16_type_node; +/* The user-visible __bf16 uses bfloat16_type_node, but for pointer to that + use backend specific tree. Defined in arm-builtins.cc. */ extern tree arm_bf16_ptr_type_node; --- gcc/config/arm/arm.cc.jj 2022-09-29 09:13:25.709718568 +0200 +++ gcc/config/arm/arm.cc 2022-09-29 15:33:07.997170885 +0200 @@ -688,12 +688,6 @@ static const struct attribute_spec arm_a #undef TARGET_INVALID_CONVERSION #define TARGET_INVALID_CONVERSION arm_invalid_conversion -#undef TARGET_INVALID_UNARY_OP -#define TARGET_INVALID_UNARY_OP arm_invalid_unary_op - -#undef TARGET_INVALID_BINARY_OP -#define TARGET_INVALID_BINARY_OP arm_invalid_binary_op - #undef TARGET_ATOMIC_ASSIGN_EXPAND_FENV #define TARGET_ATOMIC_ASSIGN_EXPAND_FENV arm_atomic_assign_expand_fenv @@ -30360,7 +30354,7 @@ arm_mangle_type (const_tree type) if (TREE_CODE (type) == REAL_TYPE && TYPE_PRECISION (type) == 16) { if (TYPE_MODE (type) == BFmode) - return "u6__bf16"; + return "DFb16_"; else return "Dh"; } @@ -33996,47 +33990,22 @@ arm_invalid_conversion (const_tree fromt { if (element_mode (fromtype) != element_mode (totype)) { - /* Do no allow conversions to/from BFmode scalar types. */ - if (TYPE_MODE (fromtype) == BFmode) - return N_("invalid conversion from type %"); - if (TYPE_MODE (totype) == BFmode) - return N_("invalid conversion to type %"); + /* Do no allow conversions from BFmode to non-ieee HFmode + scalar types or vice versa. */ + if (TYPE_MODE (fromtype) == BFmode + && TYPE_MODE (totype) == HFmode + && arm_fp16_format == ARM_FP16_FORMAT_ALTERNATIVE) + return N_("invalid conversion from type % to %<__fp16%>"); + if (TYPE_MODE (totype) == BFmode + && TYPE_MODE (fromtype) == HFmode + && arm_fp16_format == ARM_FP16_FORMAT_ALTERNATIVE) + return N_("invalid conversion to type % from %<__fp16%>"); } /* Conversion allowed. */ return NULL; } -/* Return the diagnostic message string if the unary operation OP is - not permitted on TYPE, NULL otherwise. */ - -static const char * -arm_invalid_unary_op (int op, const_tree type) -{ - /* Reject all single-operand operations on BFmode except for &. */ - if (element_mode (type) == BFmode && op != ADDR_EXPR) - return N_("operation not permitted on type %"); - - /* Operation allowed. */ - return NULL; -} - -/* Return the diagnostic message string if the binary operation OP is - not permitted on TYPE1 and TYPE2, NULL otherwise. */ - -static const char * -arm_invalid_binary_op (int op ATTRIBUTE_UNUSED, const_tree type1, - const_tree type2) -{ - /* Reject all 2-operand operations on BFmode. */ - if (element_mode (type1) == BFmode - || element_mode (type2) == BFmode) - return N_("operation not permitted on type %"); - - /* Operation allowed. */ - return NULL; -} - /* Implement TARGET_CAN_CHANGE_MODE_CLASS. In VFPv1, VFP registers could only be accessed in the mode they were --- gcc/config/arm/arm-builtins.cc.jj 2022-09-29 09:13:25.681718954 +0200 +++ gcc/config/arm/arm-builtins.cc 2022-09-29 12:40:17.405778917 +0200 @@ -1370,7 +1370,6 @@ struct arm_simd_type_info arm_simd_types tree arm_fp16_type_node = NULL_TREE; /* Back-end node type for brain float (bfloat) types. */ -tree arm_bf16_type_node = NULL_TREE; tree arm_bf16_ptr_type_node = NULL_TREE; static tree arm_simd_intOI_type_node = NULL_TREE; @@ -1459,7 +1458,7 @@ arm_simd_builtin_std_type (machine_mode case E_DFmode: return double_type_node; case E_BFmode: - return arm_bf16_type_node; + return bfloat16_type_node; default: gcc_unreachable (); } @@ -1570,9 +1569,9 @@ arm_init_simd_builtin_types (void) arm_simd_types[Float32x4_t].eltype = float_type_node; /* Init Bfloat vector types with underlying __bf16 scalar type. */ - arm_simd_types[Bfloat16x2_t].eltype = arm_bf16_type_node; - arm_simd_types[Bfloat16x4_t].eltype = arm_bf16_type_node; - arm_simd_types[Bfloat16x8_t].eltype = arm_bf16_type_node; + arm_simd_types[Bfloat16x2_t].eltype = bfloat16_type_node; + arm_simd_types[Bfloat16x4_t].eltype = bfloat16_type_node; + arm_simd_types[Bfloat16x8_t].eltype = bfloat16_type_node; for (i = 0; i < nelts; i++) { @@ -1658,7 +1657,7 @@ arm_init_simd_builtin_scalar_types (void "__builtin_neon_df"); (*lang_hooks.types.register_builtin_type) (intTI_type_node, "__builtin_neon_ti"); - (*lang_hooks.types.register_builtin_type) (arm_bf16_type_node, + (*lang_hooks.types.register_builtin_type) (bfloat16_type_node, "__builtin_neon_bf"); /* Unsigned integer types for various mode sizes. */ (*lang_hooks.types.register_builtin_type) (unsigned_intQI_type_node, @@ -1797,13 +1796,13 @@ arm_init_builtin (unsigned int fcode, ar static void arm_init_bf16_types (void) { - arm_bf16_type_node = make_node (REAL_TYPE); - TYPE_PRECISION (arm_bf16_type_node) = 16; - SET_TYPE_MODE (arm_bf16_type_node, BFmode); - layout_type (arm_bf16_type_node); + bfloat16_type_node = make_node (REAL_TYPE); + TYPE_PRECISION (bfloat16_type_node) = 16; + SET_TYPE_MODE (bfloat16_type_node, BFmode); + layout_type (bfloat16_type_node); - lang_hooks.types.register_builtin_type (arm_bf16_type_node, "__bf16"); - arm_bf16_ptr_type_node = build_pointer_type (arm_bf16_type_node); + lang_hooks.types.register_builtin_type (bfloat16_type_node, "__bf16"); + arm_bf16_ptr_type_node = build_pointer_type (bfloat16_type_node); } /* Set up ACLE builtins, even builtins for instructions that are not --- gcc/config/i386/i386.cc.jj 2022-09-29 12:03:12.073350093 +0200 +++ gcc/config/i386/i386.cc 2022-09-29 12:40:17.409778863 +0200 @@ -22728,7 +22728,7 @@ ix86_mangle_type (const_tree type) switch (TYPE_MODE (type)) { case E_BFmode: - return "u6__bf16"; + return "DFb16_"; case E_HFmode: /* _Float16 is "DF16_". Align with clang's decision in https://reviews.llvm.org/D33719. */ @@ -22747,55 +22747,6 @@ ix86_mangle_type (const_tree type) } } -/* Return the diagnostic message string if conversion from FROMTYPE to - TOTYPE is not allowed, NULL otherwise. */ - -static const char * -ix86_invalid_conversion (const_tree fromtype, const_tree totype) -{ - if (element_mode (fromtype) != element_mode (totype)) - { - /* Do no allow conversions to/from BFmode scalar types. */ - if (TYPE_MODE (fromtype) == BFmode) - return N_("invalid conversion from type %<__bf16%>"); - if (TYPE_MODE (totype) == BFmode) - return N_("invalid conversion to type %<__bf16%>"); - } - - /* Conversion allowed. */ - return NULL; -} - -/* Return the diagnostic message string if the unary operation OP is - not permitted on TYPE, NULL otherwise. */ - -static const char * -ix86_invalid_unary_op (int op, const_tree type) -{ - /* Reject all single-operand operations on BFmode except for &. */ - if (element_mode (type) == BFmode && op != ADDR_EXPR) - return N_("operation not permitted on type %<__bf16%>"); - - /* Operation allowed. */ - return NULL; -} - -/* Return the diagnostic message string if the binary operation OP is - not permitted on TYPE1 and TYPE2, NULL otherwise. */ - -static const char * -ix86_invalid_binary_op (int op ATTRIBUTE_UNUSED, const_tree type1, - const_tree type2) -{ - /* Reject all 2-operand operations on BFmode. */ - if (element_mode (type1) == BFmode - || element_mode (type2) == BFmode) - return N_("operation not permitted on type %<__bf16%>"); - - /* Operation allowed. */ - return NULL; -} - static GTY(()) tree ix86_tls_stack_chk_guard_decl; static tree @@ -24853,15 +24804,6 @@ ix86_libgcc_floating_mode_supported_p #undef TARGET_MANGLE_TYPE #define TARGET_MANGLE_TYPE ix86_mangle_type -#undef TARGET_INVALID_CONVERSION -#define TARGET_INVALID_CONVERSION ix86_invalid_conversion - -#undef TARGET_INVALID_UNARY_OP -#define TARGET_INVALID_UNARY_OP ix86_invalid_unary_op - -#undef TARGET_INVALID_BINARY_OP -#define TARGET_INVALID_BINARY_OP ix86_invalid_binary_op - #undef TARGET_STACK_PROTECT_GUARD #define TARGET_STACK_PROTECT_GUARD ix86_stack_protect_guard --- gcc/config/i386/i386-builtins.cc.jj 2022-09-29 09:13:25.710718554 +0200 +++ gcc/config/i386/i386-builtins.cc 2022-09-29 12:40:17.406778903 +0200 @@ -126,7 +126,6 @@ BDESC_VERIFYS (IX86_BUILTIN_MAX, static GTY(()) tree ix86_builtin_type_tab[(int) IX86_BT_LAST_CPTR + 1]; tree ix86_float16_type_node = NULL_TREE; -tree ix86_bf16_type_node = NULL_TREE; tree ix86_bf16_ptr_type_node = NULL_TREE; /* Retrieve an element from the above table, building some of @@ -1372,16 +1371,15 @@ ix86_register_float16_builtin_type (void static void ix86_register_bf16_builtin_type (void) { - ix86_bf16_type_node = make_node (REAL_TYPE); - TYPE_PRECISION (ix86_bf16_type_node) = 16; - SET_TYPE_MODE (ix86_bf16_type_node, BFmode); - layout_type (ix86_bf16_type_node); + bfloat16_type_node = make_node (REAL_TYPE); + TYPE_PRECISION (bfloat16_type_node) = 16; + SET_TYPE_MODE (bfloat16_type_node, BFmode); + layout_type (bfloat16_type_node); if (!maybe_get_identifier ("__bf16") && TARGET_SSE2) { - lang_hooks.types.register_builtin_type (ix86_bf16_type_node, - "__bf16"); - ix86_bf16_ptr_type_node = build_pointer_type (ix86_bf16_type_node); + lang_hooks.types.register_builtin_type (bfloat16_type_node, "__bf16"); + ix86_bf16_ptr_type_node = build_pointer_type (bfloat16_type_node); } } --- gcc/config/i386/i386-builtin-types.def.jj 2022-09-29 09:13:25.709718568 +0200 +++ gcc/config/i386/i386-builtin-types.def 2022-09-29 12:40:17.406778903 +0200 @@ -69,7 +69,7 @@ DEF_PRIMITIVE_TYPE (UINT16, short_unsign DEF_PRIMITIVE_TYPE (INT64, long_long_integer_type_node) DEF_PRIMITIVE_TYPE (UINT64, long_long_unsigned_type_node) DEF_PRIMITIVE_TYPE (FLOAT16, ix86_float16_type_node) -DEF_PRIMITIVE_TYPE (BFLOAT16, ix86_bf16_type_node) +DEF_PRIMITIVE_TYPE (BFLOAT16, bfloat16_type_node) DEF_PRIMITIVE_TYPE (FLOAT, float_type_node) DEF_PRIMITIVE_TYPE (DOUBLE, double_type_node) DEF_PRIMITIVE_TYPE (FLOAT80, float80_type_node) --- gcc/config/aarch64/aarch64.h.jj 2022-09-29 09:13:25.680718968 +0200 +++ gcc/config/aarch64/aarch64.h 2022-09-29 12:40:17.409778863 +0200 @@ -1337,9 +1337,8 @@ extern const char *aarch64_rewrite_mcpu extern GTY(()) tree aarch64_fp16_type_node; extern GTY(()) tree aarch64_fp16_ptr_type_node; -/* This type is the user-visible __bf16, and a pointer to that type. Defined - in aarch64-builtins.cc. */ -extern GTY(()) tree aarch64_bf16_type_node; +/* Pointer to the user-visible __bf16 type. __bf16 itself is generic + bfloat16_type_node. Defined in aarch64-builtins.cc. */ extern GTY(()) tree aarch64_bf16_ptr_type_node; /* The generic unwind code in libgcc does not initialize the frame pointer. --- gcc/config/aarch64/aarch64.cc.jj 2022-09-29 09:13:25.680718968 +0200 +++ gcc/config/aarch64/aarch64.cc 2022-09-29 12:40:17.413778808 +0200 @@ -19741,7 +19741,7 @@ aarch64_gimplify_va_arg_expr (tree valis field_ptr_t = aarch64_fp16_ptr_type_node; break; case E_BFmode: - field_t = aarch64_bf16_type_node; + field_t = bfloat16_type_node; field_ptr_t = aarch64_bf16_ptr_type_node; break; case E_V2SImode: @@ -20645,7 +20645,7 @@ aarch64_mangle_type (const_tree type) if (TREE_CODE (type) == REAL_TYPE && TYPE_PRECISION (type) == 16) { if (TYPE_MODE (type) == BFmode) - return "u6__bf16"; + return "DFb16_"; else return "Dh"; } @@ -26820,39 +26820,6 @@ aarch64_stack_protect_guard (void) return NULL_TREE; } -/* Return the diagnostic message string if conversion from FROMTYPE to - TOTYPE is not allowed, NULL otherwise. */ - -static const char * -aarch64_invalid_conversion (const_tree fromtype, const_tree totype) -{ - if (element_mode (fromtype) != element_mode (totype)) - { - /* Do no allow conversions to/from BFmode scalar types. */ - if (TYPE_MODE (fromtype) == BFmode) - return N_("invalid conversion from type %"); - if (TYPE_MODE (totype) == BFmode) - return N_("invalid conversion to type %"); - } - - /* Conversion allowed. */ - return NULL; -} - -/* Return the diagnostic message string if the unary operation OP is - not permitted on TYPE, NULL otherwise. */ - -static const char * -aarch64_invalid_unary_op (int op, const_tree type) -{ - /* Reject all single-operand operations on BFmode except for &. */ - if (element_mode (type) == BFmode && op != ADDR_EXPR) - return N_("operation not permitted on type %"); - - /* Operation allowed. */ - return NULL; -} - /* Return the diagnostic message string if the binary operation OP is not permitted on TYPE1 and TYPE2, NULL otherwise. */ @@ -26860,11 +26827,6 @@ static const char * aarch64_invalid_binary_op (int op ATTRIBUTE_UNUSED, const_tree type1, const_tree type2) { - /* Reject all 2-operand operations on BFmode. */ - if (element_mode (type1) == BFmode - || element_mode (type2) == BFmode) - return N_("operation not permitted on type %"); - if (VECTOR_TYPE_P (type1) && VECTOR_TYPE_P (type2) && !TYPE_INDIVISIBLE_P (type1) @@ -27461,12 +27423,6 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_MANGLE_TYPE #define TARGET_MANGLE_TYPE aarch64_mangle_type -#undef TARGET_INVALID_CONVERSION -#define TARGET_INVALID_CONVERSION aarch64_invalid_conversion - -#undef TARGET_INVALID_UNARY_OP -#define TARGET_INVALID_UNARY_OP aarch64_invalid_unary_op - #undef TARGET_INVALID_BINARY_OP #define TARGET_INVALID_BINARY_OP aarch64_invalid_binary_op --- gcc/config/aarch64/aarch64-builtins.cc.jj 2022-09-29 09:13:25.676719023 +0200 +++ gcc/config/aarch64/aarch64-builtins.cc 2022-09-29 12:40:17.410778849 +0200 @@ -918,7 +918,6 @@ tree aarch64_fp16_type_node = NULL_TREE; tree aarch64_fp16_ptr_type_node = NULL_TREE; /* Back-end node type for brain float (bfloat) types. */ -tree aarch64_bf16_type_node = NULL_TREE; tree aarch64_bf16_ptr_type_node = NULL_TREE; /* Wrapper around add_builtin_function. NAME is the name of the built-in @@ -1010,7 +1009,7 @@ aarch64_int_or_fp_type (machine_mode mod case E_DFmode: return double_type_node; case E_BFmode: - return aarch64_bf16_type_node; + return bfloat16_type_node; default: gcc_unreachable (); } @@ -1124,8 +1123,8 @@ aarch64_init_simd_builtin_types (void) aarch64_simd_types[Float64x2_t].eltype = double_type_node; /* Init Bfloat vector types with underlying __bf16 type. */ - aarch64_simd_types[Bfloat16x4_t].eltype = aarch64_bf16_type_node; - aarch64_simd_types[Bfloat16x8_t].eltype = aarch64_bf16_type_node; + aarch64_simd_types[Bfloat16x4_t].eltype = bfloat16_type_node; + aarch64_simd_types[Bfloat16x8_t].eltype = bfloat16_type_node; for (i = 0; i < nelts; i++) { @@ -1197,7 +1196,7 @@ aarch64_init_simd_builtin_scalar_types ( "__builtin_aarch64_simd_poly128"); (*lang_hooks.types.register_builtin_type) (intTI_type_node, "__builtin_aarch64_simd_ti"); - (*lang_hooks.types.register_builtin_type) (aarch64_bf16_type_node, + (*lang_hooks.types.register_builtin_type) (bfloat16_type_node, "__builtin_aarch64_simd_bf"); /* Unsigned integer types for various mode sizes. */ (*lang_hooks.types.register_builtin_type) (unsigned_intQI_type_node, @@ -1682,13 +1681,13 @@ aarch64_init_fp16_types (void) static void aarch64_init_bf16_types (void) { - aarch64_bf16_type_node = make_node (REAL_TYPE); - TYPE_PRECISION (aarch64_bf16_type_node) = 16; - SET_TYPE_MODE (aarch64_bf16_type_node, BFmode); - layout_type (aarch64_bf16_type_node); + bfloat16_type_node = make_node (REAL_TYPE); + TYPE_PRECISION (bfloat16_type_node) = 16; + SET_TYPE_MODE (bfloat16_type_node, BFmode); + layout_type (bfloat16_type_node); - lang_hooks.types.register_builtin_type (aarch64_bf16_type_node, "__bf16"); - aarch64_bf16_ptr_type_node = build_pointer_type (aarch64_bf16_type_node); + lang_hooks.types.register_builtin_type (bfloat16_type_node, "__bf16"); + aarch64_bf16_ptr_type_node = build_pointer_type (bfloat16_type_node); } /* Pointer authentication builtins that will become NOP on legacy platform. --- gcc/config/aarch64/aarch64-sve-builtins.def.jj 2022-09-29 09:13:25.676719023 +0200 +++ gcc/config/aarch64/aarch64-sve-builtins.def 2022-09-29 12:40:17.413778808 +0200 @@ -61,7 +61,7 @@ DEF_SVE_MODE (u64offset, none, svuint64_ DEF_SVE_MODE (vnum, none, none, vectors) DEF_SVE_TYPE (svbool_t, 10, __SVBool_t, boolean_type_node) -DEF_SVE_TYPE (svbfloat16_t, 14, __SVBfloat16_t, aarch64_bf16_type_node) +DEF_SVE_TYPE (svbfloat16_t, 14, __SVBfloat16_t, bfloat16_type_node) DEF_SVE_TYPE (svfloat16_t, 13, __SVFloat16_t, aarch64_fp16_type_node) DEF_SVE_TYPE (svfloat32_t, 13, __SVFloat32_t, float_type_node) DEF_SVE_TYPE (svfloat64_t, 13, __SVFloat64_t, double_type_node) --- gcc/c-family/c-cppbuiltin.cc.jj 2022-09-29 09:13:25.675719037 +0200 +++ gcc/c-family/c-cppbuiltin.cc 2022-09-29 12:40:17.416778768 +0200 @@ -1264,6 +1264,13 @@ c_cpp_builtins (cpp_reader *pfile) builtin_define_float_constants (prefix, ggc_strdup (csuffix), "%s", csuffix, FLOATN_NX_TYPE_NODE (i)); } + if (bfloat16_type_node && c_dialect_cxx ()) + { + if (cxx_dialect > cxx20) + cpp_define (pfile, "__STDCPP_BFLOAT16_T__=1"); + builtin_define_float_constants ("BFLT16", "BF16", "%s", + "BF16", bfloat16_type_node); + } /* For float.h. */ if (targetm.decimal_float_supported_p ()) --- gcc/c-family/c-lex.cc.jj 2022-09-29 09:13:25.675719037 +0200 +++ gcc/c-family/c-lex.cc 2022-09-29 12:40:17.416778768 +0200 @@ -995,6 +995,19 @@ interpret_float (const cpp_token *token, pedwarn (input_location, OPT_Wpedantic, "non-standard suffix on floating constant"); } + else if ((flags & CPP_N_BFLOAT16) != 0 && c_dialect_cxx ()) + { + type = bfloat16_type_node; + if (type == NULL_TREE) + { + error ("unsupported non-standard suffix on floating constant"); + return error_mark_node; + } + if (cxx_dialect < cxx23) + pedwarn (input_location, OPT_Wpedantic, + "% or % suffix on floating constant only " + "available with %<-std=c++2b%> or %<-std=gnu++2b%>"); + } else if ((flags & CPP_N_WIDTH) == CPP_N_LARGE) type = long_double_type_node; else if ((flags & CPP_N_WIDTH) == CPP_N_SMALL --- gcc/cp/cp-tree.h.jj 2022-09-29 09:13:31.164643341 +0200 +++ gcc/cp/cp-tree.h 2022-09-29 12:40:17.414778795 +0200 @@ -8714,6 +8714,8 @@ extended_float_type_p (tree type) for (int i = 0; i < NUM_FLOATN_NX_TYPES; ++i) if (type == FLOATN_TYPE_NODE (i)) return true; + if (type == bfloat16_type_node) + return true; return false; } --- gcc/cp/typeck.cc.jj 2022-09-29 09:13:25.716718472 +0200 +++ gcc/cp/typeck.cc 2022-09-29 12:40:17.415778781 +0200 @@ -293,6 +293,10 @@ cp_compare_floating_point_conversion_ran if (mv2 == FLOATN_NX_TYPE_NODE (i)) extended2 = i + 1; } + if (mv1 == bfloat16_type_node) + extended1 = true; + if (mv2 == bfloat16_type_node) + extended2 = true; if (extended2 && !extended1) { int ret = cp_compare_floating_point_conversion_ranks (t2, t1); @@ -390,7 +394,9 @@ cp_compare_floating_point_conversion_ran if (cnt > 1 && mv2 == long_double_type_node) return -2; /* Otherwise, they have equal rank, but extended types - (other than std::bfloat16_t) have higher subrank. */ + (other than std::bfloat16_t) have higher subrank. + std::bfloat16_t shouldn't have equal rank to any standard + floating point type. */ return 1; } --- libcpp/include/cpplib.h.jj 2022-09-08 13:01:19.853771383 +0200 +++ libcpp/include/cpplib.h 2022-09-28 19:06:59.615380690 +0200 @@ -1275,6 +1275,7 @@ struct cpp_num #define CPP_N_USERDEF 0x1000000 /* C++11 user-defined literal. */ #define CPP_N_SIZE_T 0x2000000 /* C++23 size_t literal. */ +#define CPP_N_BFLOAT16 0x4000000 /* std::bfloat16_t type. */ #define CPP_N_WIDTH_FLOATN_NX 0xF0000000 /* _FloatN / _FloatNx value of N, divided by 16. */ --- libcpp/expr.cc.jj 2022-09-27 08:03:27.119982735 +0200 +++ libcpp/expr.cc 2022-09-28 17:55:36.667177540 +0200 @@ -91,10 +91,10 @@ interpret_float_suffix (cpp_reader *pfil size_t orig_len = len; const uchar *orig_s = s; size_t flags; - size_t f, d, l, w, q, i, fn, fnx, fn_bits; + size_t f, d, l, w, q, i, fn, fnx, fn_bits, bf16; flags = 0; - f = d = l = w = q = i = fn = fnx = fn_bits = 0; + f = d = l = w = q = i = fn = fnx = fn_bits = bf16 = 0; /* The following decimal float suffixes, from TR 24732:2009, TS 18661-2:2015 and C2X, are supported: @@ -131,7 +131,8 @@ interpret_float_suffix (cpp_reader *pfil w, W - machine-specific type such as __float80 (GNU extension). q, Q - machine-specific type such as __float128 (GNU extension). fN, FN - _FloatN (TS 18661-3:2015). - fNx, FNx - _FloatNx (TS 18661-3:2015). */ + fNx, FNx - _FloatNx (TS 18661-3:2015). + bf16, BF16 - std::bfloat16_t (ISO C++23). */ /* Process decimal float suffixes, which are two letters starting with d or D. Order and case are significant. */ @@ -239,6 +240,20 @@ interpret_float_suffix (cpp_reader *pfil fn++; } break; + case 'b': case 'B': + if (len > 2 + /* Except for bf16 / BF16 where case is significant. */ + && s[1] == (s[0] == 'b' ? 'f' : 'F') + && s[2] == '1' + && s[3] == '6' + && CPP_OPTION (pfile, cplusplus)) + { + bf16++; + len -= 3; + s += 3; + break; + } + return 0; case 'd': case 'D': d++; break; case 'l': case 'L': l++; break; case 'w': case 'W': w++; break; @@ -257,7 +272,7 @@ interpret_float_suffix (cpp_reader *pfil of N larger than can be represented in the return value. The caller is responsible for rejecting _FloatN suffixes where _FloatN is not supported on the chosen target. */ - if (f + d + l + w + q + fn + fnx > 1 || i > 1) + if (f + d + l + w + q + fn + fnx + bf16 > 1 || i > 1) return 0; if (fn_bits > CPP_FLOATN_MAX) return 0; @@ -295,6 +310,7 @@ interpret_float_suffix (cpp_reader *pfil q ? CPP_N_MD_Q : fn ? CPP_N_FLOATN | (fn_bits << CPP_FLOATN_SHIFT) : fnx ? CPP_N_FLOATNX | (fn_bits << CPP_FLOATN_SHIFT) : + bf16 ? CPP_N_BFLOAT16 : CPP_N_DEFAULT)); } --- libgcc/config/arm/sfp-machine.h.jj 2020-01-12 11:54:38.615380187 +0100 +++ libgcc/config/arm/sfp-machine.h 2022-09-28 19:02:51.922710542 +0200 @@ -22,6 +22,7 @@ typedef int __gcc_CMPtype __attribute__ /* According to RTABI, QNAN is only with the most significant bit of the significand set, and all other significand bits zero. */ #define _FP_NANFRAC_H _FP_QNANBIT_H +#define _FP_NANFRAC_B _FP_QNANBIT_B #define _FP_NANFRAC_S _FP_QNANBIT_S #define _FP_NANFRAC_D _FP_QNANBIT_D, 0 #define _FP_NANFRAC_Q _FP_QNANBIT_Q, 0, 0, 0 --- libgcc/config/aarch64/t-softfp.jj 2020-09-29 11:32:02.988602194 +0200 +++ libgcc/config/aarch64/t-softfp 2022-09-28 18:59:43.381246466 +0200 @@ -1,7 +1,7 @@ softfp_float_modes := tf softfp_int_modes := si di ti -softfp_extensions := sftf dftf hftf -softfp_truncations := tfsf tfdf tfhf +softfp_extensions := sftf dftf hftf bfsf +softfp_truncations := tfsf tfdf tfhf tfbf dfbf sfbf hfbf softfp_exclude_libgcc2 := n softfp_extras := fixhfti fixunshfti floattihf floatuntihf --- libgcc/config/aarch64/libgcc-softfp.ver.jj 2022-01-11 23:11:23.691271871 +0100 +++ libgcc/config/aarch64/libgcc-softfp.ver 2022-09-28 19:00:36.050537146 +0200 @@ -26,3 +26,12 @@ GCC_11.0 { __mulhc3 __trunctfhf2 } + +%inherit GCC_13.0.0 GCC_11.0.0 +GCC_13.0.0 { + __extendbfsf2 + __truncdfbf2 + __truncsfbf2 + __trunctfbf2 + __trunchfbf2 +} --- libgcc/config/aarch64/sfp-machine.h.jj 2022-01-11 23:11:23.691271871 +0100 +++ libgcc/config/aarch64/sfp-machine.h 2022-09-28 19:02:10.303270053 +0200 @@ -43,6 +43,7 @@ typedef int __gcc_CMPtype __attribute__ #define _FP_DIV_MEAT_Q(R,X,Y) _FP_DIV_MEAT_2_udiv(Q,R,X,Y) #define _FP_NANFRAC_H ((_FP_QNANBIT_H << 1) - 1) +#define _FP_NANFRAC_B ((_FP_QNANBIT_B << 1) - 1) #define _FP_NANFRAC_S ((_FP_QNANBIT_S << 1) - 1) #define _FP_NANFRAC_D ((_FP_QNANBIT_D << 1) - 1) #define _FP_NANFRAC_Q ((_FP_QNANBIT_Q << 1) - 1), -1 --- libgcc/config/i386/t-softfp.jj 2022-09-23 09:02:31.759659479 +0200 +++ libgcc/config/i386/t-softfp 2022-09-28 18:58:09.114520943 +0200 @@ -6,8 +6,9 @@ LIB2FUNCS_EXCLUDE += $(libgcc2-hf-functi libgcc2-hf-extras = $(addsuffix .c, $(libgcc2-hf-functions)) LIB2ADD += $(addprefix $(srcdir)/config/i386/, $(libgcc2-hf-extras)) -softfp_extensions := hfsf hfdf hftf hfxf sfdf sftf dftf xftf -softfp_truncations := tfhf xfhf dfhf sfhf tfsf dfsf tfdf tfxf +softfp_extensions := hfsf hfdf hftf hfxf sfdf sftf dftf xftf bfsf +softfp_truncations := tfhf xfhf dfhf sfhf tfsf dfsf tfdf tfxf \ + tfbf xfbf dfbf sfbf hfbf softfp_extras += eqhf2 @@ -20,6 +21,7 @@ CFLAGS-truncsfhf2.c += -msse2 CFLAGS-truncdfhf2.c += -msse2 CFLAGS-truncxfhf2.c += -msse2 CFLAGS-trunctfhf2.c += -msse2 +CFLAGS-trunchfbf2.c += -msse2 CFLAGS-eqhf2.c += -msse2 CFLAGS-_divhc3.c += -msse2 --- libgcc/config/i386/libgcc-glibc.ver.jj 2022-09-23 09:02:31.746659658 +0200 +++ libgcc/config/i386/libgcc-glibc.ver 2022-09-28 18:58:09.114520943 +0200 @@ -214,3 +214,13 @@ GCC_12.0.0 { __trunctfhf2 __truncxfhf2 } + +%inherit GCC_13.0.0 GCC_12.0.0 +GCC_13.0.0 { + __extendbfsf2 + __truncdfbf2 + __truncsfbf2 + __trunctfbf2 + __truncxfbf2 + __trunchfbf2 +} --- libgcc/config/i386/sfp-machine.h.jj 2022-09-23 09:02:31.747659644 +0200 +++ libgcc/config/i386/sfp-machine.h 2022-09-28 18:58:09.114520943 +0200 @@ -18,6 +18,7 @@ typedef int __gcc_CMPtype __attribute__ #define _FP_QNANNEGATEDP 0 #define _FP_NANSIGN_H 1 +#define _FP_NANSIGN_B 1 #define _FP_NANSIGN_S 1 #define _FP_NANSIGN_D 1 #define _FP_NANSIGN_E 1 --- libgcc/config/i386/64/sfp-machine.h.jj 2022-09-23 09:02:31.700660291 +0200 +++ libgcc/config/i386/64/sfp-machine.h 2022-09-28 18:58:09.114520943 +0200 @@ -14,6 +14,7 @@ typedef unsigned int UTItype __attribute #define _FP_DIV_MEAT_Q(R,X,Y) _FP_DIV_MEAT_2_udiv(Q,R,X,Y) #define _FP_NANFRAC_H _FP_QNANBIT_H +#define _FP_NANFRAC_B _FP_QNANBIT_B #define _FP_NANFRAC_S _FP_QNANBIT_S #define _FP_NANFRAC_D _FP_QNANBIT_D #define _FP_NANFRAC_E _FP_QNANBIT_E, 0 --- libgcc/config/i386/32/sfp-machine.h.jj 2022-09-23 09:02:31.683660526 +0200 +++ libgcc/config/i386/32/sfp-machine.h 2022-09-28 18:58:09.115520929 +0200 @@ -87,6 +87,7 @@ #define _FP_DIV_MEAT_Q(R,X,Y) _FP_DIV_MEAT_4_udiv(Q,R,X,Y) #define _FP_NANFRAC_H _FP_QNANBIT_H +#define _FP_NANFRAC_B _FP_QNANBIT_B #define _FP_NANFRAC_S _FP_QNANBIT_S #define _FP_NANFRAC_D _FP_QNANBIT_D, 0 /* Even if XFmode is 12byte, we have to pad it to --- libgcc/soft-fp/brain.h.jj 2022-09-28 18:58:09.113520956 +0200 +++ libgcc/soft-fp/brain.h 2022-09-28 18:58:09.113520956 +0200 @@ -0,0 +1,172 @@ +/* Software floating-point emulation. + Definitions for Brain Floating Point format (bfloat16). + Copyright (C) 1997-2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#ifndef SOFT_FP_BRAIN_H +#define SOFT_FP_BRAIN_H 1 + +#if _FP_W_TYPE_SIZE < 32 +# error "Here's a nickel kid. Go buy yourself a real computer." +#endif + +#define _FP_FRACTBITS_B (_FP_W_TYPE_SIZE) + +#define _FP_FRACTBITS_DW_B (_FP_W_TYPE_SIZE) + +#define _FP_FRACBITS_B 8 +#define _FP_FRACXBITS_B (_FP_FRACTBITS_B - _FP_FRACBITS_B) +#define _FP_WFRACBITS_B (_FP_WORKBITS + _FP_FRACBITS_B) +#define _FP_WFRACXBITS_B (_FP_FRACTBITS_B - _FP_WFRACBITS_B) +#define _FP_EXPBITS_B 8 +#define _FP_EXPBIAS_B 127 +#define _FP_EXPMAX_B 255 + +#define _FP_QNANBIT_B ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-2)) +#define _FP_QNANBIT_SH_B ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-2+_FP_WORKBITS)) +#define _FP_IMPLBIT_B ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-1)) +#define _FP_IMPLBIT_SH_B ((_FP_W_TYPE) 1 << (_FP_FRACBITS_B-1+_FP_WORKBITS)) +#define _FP_OVERFLOW_B ((_FP_W_TYPE) 1 << (_FP_WFRACBITS_B)) + +#define _FP_WFRACBITS_DW_B (2 * _FP_WFRACBITS_B) +#define _FP_WFRACXBITS_DW_B (_FP_FRACTBITS_DW_B - _FP_WFRACBITS_DW_B) +#define _FP_HIGHBIT_DW_B \ + ((_FP_W_TYPE) 1 << (_FP_WFRACBITS_DW_B - 1) % _FP_W_TYPE_SIZE) + +/* The implementation of _FP_MUL_MEAT_B and _FP_DIV_MEAT_B should be + chosen by the target machine. */ + +typedef float BFtype __attribute__ ((mode (BF))); + +union _FP_UNION_B +{ + BFtype flt; + struct _FP_STRUCT_LAYOUT + { +#if __BYTE_ORDER == __BIG_ENDIAN + unsigned sign : 1; + unsigned exp : _FP_EXPBITS_B; + unsigned frac : _FP_FRACBITS_B - (_FP_IMPLBIT_B != 0); +#else + unsigned frac : _FP_FRACBITS_B - (_FP_IMPLBIT_B != 0); + unsigned exp : _FP_EXPBITS_B; + unsigned sign : 1; +#endif + } bits; +}; + +#define FP_DECL_B(X) _FP_DECL (1, X) +#define FP_UNPACK_RAW_B(X, val) _FP_UNPACK_RAW_1 (B, X, (val)) +#define FP_UNPACK_RAW_BP(X, val) _FP_UNPACK_RAW_1_P (B, X, (val)) +#define FP_PACK_RAW_B(val, X) _FP_PACK_RAW_1 (B, (val), X) +#define FP_PACK_RAW_BP(val, X) \ + do \ + { \ + if (!FP_INHIBIT_RESULTS) \ + _FP_PACK_RAW_1_P (B, (val), X); \ + } \ + while (0) + +#define FP_UNPACK_B(X, val) \ + do \ + { \ + _FP_UNPACK_RAW_1 (B, X, (val)); \ + _FP_UNPACK_CANONICAL (B, 1, X); \ + } \ + while (0) + +#define FP_UNPACK_BP(X, val) \ + do \ + { \ + _FP_UNPACK_RAW_1_P (B, X, (val)); \ + _FP_UNPACK_CANONICAL (B, 1, X); \ + } \ + while (0) + +#define FP_UNPACK_SEMIRAW_B(X, val) \ + do \ + { \ + _FP_UNPACK_RAW_1 (B, X, (val)); \ + _FP_UNPACK_SEMIRAW (B, 1, X); \ + } \ + while (0) + +#define FP_UNPACK_SEMIRAW_BP(X, val) \ + do \ + { \ + _FP_UNPACK_RAW_1_P (B, X, (val)); \ + _FP_UNPACK_SEMIRAW (B, 1, X); \ + } \ + while (0) + +#define FP_PACK_B(val, X) \ + do \ + { \ + _FP_PACK_CANONICAL (B, 1, X); \ + _FP_PACK_RAW_1 (B, (val), X); \ + } \ + while (0) + +#define FP_PACK_BP(val, X) \ + do \ + { \ + _FP_PACK_CANONICAL (B, 1, X); \ + if (!FP_INHIBIT_RESULTS) \ + _FP_PACK_RAW_1_P (B, (val), X); \ + } \ + while (0) + +#define FP_PACK_SEMIRAW_B(val, X) \ + do \ + { \ + _FP_PACK_SEMIRAW (B, 1, X); \ + _FP_PACK_RAW_1 (B, (val), X); \ + } \ + while (0) + +#define FP_PACK_SEMIRAW_BP(val, X) \ + do \ + { \ + _FP_PACK_SEMIRAW (B, 1, X); \ + if (!FP_INHIBIT_RESULTS) \ + _FP_PACK_RAW_1_P (B, (val), X); \ + } \ + while (0) + +#define FP_TO_INT_B(r, X, rsz, rsg) _FP_TO_INT (B, 1, (r), X, (rsz), (rsg)) +#define FP_TO_INT_ROUND_B(r, X, rsz, rsg) \ + _FP_TO_INT_ROUND (B, 1, (r), X, (rsz), (rsg)) +#define FP_FROM_INT_B(X, r, rs, rt) _FP_FROM_INT (B, 1, X, (r), (rs), rt) + +/* BFmode arithmetic is not implemented. */ + +#define _FP_FRAC_HIGH_B(X) _FP_FRAC_HIGH_1 (X) +#define _FP_FRAC_HIGH_RAW_B(X) _FP_FRAC_HIGH_1 (X) +#define _FP_FRAC_HIGH_DW_B(X) _FP_FRAC_HIGH_1 (X) + +#define FP_CMP_EQ_B(r, X, Y, ex) _FP_CMP_EQ (B, 1, (r), X, Y, (ex)) + +#endif /* !SOFT_FP_BRAIN_H */ --- libgcc/soft-fp/truncsfbf2.c.jj 2022-09-28 18:58:09.113520956 +0200 +++ libgcc/soft-fp/truncsfbf2.c 2022-09-28 18:58:09.113520956 +0200 @@ -0,0 +1,48 @@ +/* Software floating-point emulation. + Truncate IEEE single into bfloat16. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include "soft-fp.h" +#include "brain.h" +#include "single.h" + +BFtype +__truncsfbf2 (SFtype a) +{ + FP_DECL_EX; + FP_DECL_S (A); + FP_DECL_B (R); + BFtype r; + + FP_INIT_ROUNDMODE; + FP_UNPACK_SEMIRAW_S (A, a); + FP_TRUNC (B, S, 1, 1, R, A); + FP_PACK_SEMIRAW_B (r, R); + FP_HANDLE_EXCEPTIONS; + + return r; +} --- libgcc/soft-fp/truncdfbf2.c.jj 2022-09-28 18:58:09.114520943 +0200 +++ libgcc/soft-fp/truncdfbf2.c 2022-09-28 18:58:09.114520943 +0200 @@ -0,0 +1,52 @@ +/* Software floating-point emulation. + Truncate IEEE double into bfloat16. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include "soft-fp.h" +#include "brain.h" +#include "double.h" + +BFtype +__truncdfbf2 (DFtype a) +{ + FP_DECL_EX; + FP_DECL_D (A); + FP_DECL_B (R); + BFtype r; + + FP_INIT_ROUNDMODE; + FP_UNPACK_SEMIRAW_D (A, a); +#if _FP_W_TYPE_SIZE < _FP_FRACBITS_D + FP_TRUNC (B, D, 1, 2, R, A); +#else + FP_TRUNC (B, D, 1, 1, R, A); +#endif + FP_PACK_SEMIRAW_B (r, R); + FP_HANDLE_EXCEPTIONS; + + return r; +} --- libgcc/soft-fp/truncxfbf2.c.jj 2022-09-28 18:58:09.113520956 +0200 +++ libgcc/soft-fp/truncxfbf2.c 2022-09-28 18:58:09.113520956 +0200 @@ -0,0 +1,52 @@ +/* Software floating-point emulation. + Truncate IEEE extended into bfloat16. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include "soft-fp.h" +#include "brain.h" +#include "extended.h" + +BFtype +__truncxfbf2 (XFtype a) +{ + FP_DECL_EX; + FP_DECL_E (A); + FP_DECL_B (R); + BFtype r; + + FP_INIT_ROUNDMODE; + FP_UNPACK_SEMIRAW_E (A, a); +#if _FP_W_TYPE_SIZE < 64 + FP_TRUNC (B, E, 1, 4, R, A); +#else + FP_TRUNC (B, E, 1, 2, R, A); +#endif + FP_PACK_SEMIRAW_B (r, R); + FP_HANDLE_EXCEPTIONS; + + return r; +} --- libgcc/soft-fp/trunctfbf2.c.jj 2022-09-28 18:58:09.114520943 +0200 +++ libgcc/soft-fp/trunctfbf2.c 2022-09-28 18:58:09.114520943 +0200 @@ -0,0 +1,52 @@ +/* Software floating-point emulation. + Truncate IEEE quad into bfloat16. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include "soft-fp.h" +#include "brain.h" +#include "quad.h" + +BFtype +__trunctfbf2 (TFtype a) +{ + FP_DECL_EX; + FP_DECL_Q (A); + FP_DECL_B (R); + BFtype r; + + FP_INIT_ROUNDMODE; + FP_UNPACK_SEMIRAW_Q (A, a); +#if _FP_W_TYPE_SIZE < 64 + FP_TRUNC (B, Q, 1, 4, R, A); +#else + FP_TRUNC (B, Q, 1, 2, R, A); +#endif + FP_PACK_SEMIRAW_B (r, R); + FP_HANDLE_EXCEPTIONS; + + return r; +} --- libgcc/soft-fp/trunchfbf2.c.jj 2022-09-28 18:58:09.114520943 +0200 +++ libgcc/soft-fp/trunchfbf2.c 2022-09-28 18:58:09.114520943 +0200 @@ -0,0 +1,58 @@ +/* Software floating-point emulation. + Truncate IEEE half into bfloat16. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include "soft-fp.h" +#include "brain.h" +#include "half.h" +#include "single.h" + +/* BFtype and HFtype are unordered, neither is a superset or subset + of each other. Convert HFtype to SFtype (lossless) and then + truncate to BFtype. */ + +BFtype +__trunchfbf2 (HFtype a) +{ + FP_DECL_EX; + FP_DECL_H (A); + FP_DECL_S (B); + FP_DECL_B (R); + SFtype b; + BFtype r; + + FP_INIT_ROUNDMODE; + FP_UNPACK_RAW_H (A, a); + FP_EXTEND (S, H, 1, 1, B, A); + FP_PACK_RAW_S (b, B); + FP_UNPACK_SEMIRAW_S (B, b); + FP_TRUNC (B, S, 1, 1, R, B); + FP_PACK_SEMIRAW_B (r, R); + FP_HANDLE_EXCEPTIONS; + + return r; +} --- libgcc/soft-fp/truncbfhf2.c.jj 2022-09-28 18:58:09.113520956 +0200 +++ libgcc/soft-fp/truncbfhf2.c 2022-09-28 18:58:09.113520956 +0200 @@ -0,0 +1,75 @@ +/* Software floating-point emulation. + Truncate bfloat16 into IEEE half. + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#include "soft-fp.h" +#include "half.h" +#include "brain.h" +#include "single.h" + +/* BFtype and HFtype are unordered, neither is a superset or subset + of each other. Convert BFtype to SFtype (lossless) and then + truncate to HFtype. */ + +HFtype +__truncbfhf2 (BFtype a) +{ + FP_DECL_EX; + FP_DECL_H (A); + FP_DECL_S (B); + FP_DECL_B (R); + SFtype b; + HFtype r; + + FP_INIT_ROUNDMODE; + /* Optimize BFtype to SFtype conversion to simple left shift + by 16 if possible, we don't need to raise exceptions on sNaN + here as the SFtype to HFtype truncation should do that too. */ + if (sizeof (BFtype) == 2 + && sizeof (unsigned short) == 2 + && sizeof (SFtype) == 4 + && sizeof (unsigned int) == 4) + { + union { BFtype a; unsigned short b; } u1; + union { SFtype a; unsigned int b; } u2; + u1.a = a; + u2.b = (u1.b << 8) << 8; + b = u2.a; + } + else + { + FP_UNPACK_RAW_B (A, a); + FP_EXTEND (S, B, 1, 1, B, A); + FP_PACK_RAW_S (b, B); + } + FP_UNPACK_SEMIRAW_S (B, b); + FP_TRUNC (H, S, 1, 1, R, B); + FP_PACK_SEMIRAW_H (r, R); + FP_HANDLE_EXCEPTIONS; + + return r; +} --- libgcc/soft-fp/extendbfsf2.c.jj 2022-09-28 18:58:09.114520943 +0200 +++ libgcc/soft-fp/extendbfsf2.c 2022-09-28 18:58:09.114520943 +0200 @@ -0,0 +1,49 @@ +/* Software floating-point emulation. + Return an bfloat16 converted to IEEE single + Copyright (C) 2022 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + In addition to the permissions in the GNU Lesser General Public + License, the Free Software Foundation gives you unlimited + permission to link the compiled version of this file into + combinations with other programs, and to distribute those + combinations without any restriction coming from the use of this + file. (The Lesser General Public License restrictions do apply in + other respects; for example, they cover modification of the file, + and distribution when not linked into a combine executable.) + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define FP_NO_EXACT_UNDERFLOW +#include "soft-fp.h" +#include "brain.h" +#include "single.h" + +SFtype +__extendbfsf2 (BFtype a) +{ + FP_DECL_EX; + FP_DECL_B (A); + FP_DECL_S (R); + SFtype r; + + FP_INIT_EXCEPTIONS; + FP_UNPACK_RAW_B (A, a); + FP_EXTEND (S, B, 1, 1, R, A); + FP_PACK_RAW_S (r, R); + FP_HANDLE_EXCEPTIONS; + + return r; +} --- libiberty/cp-demangle.h.jj 2022-09-27 08:03:27.142982423 +0200 +++ libiberty/cp-demangle.h 2022-09-29 12:42:47.291727886 +0200 @@ -180,7 +180,7 @@ d_advance (struct d_info *di, int i) extern const struct demangle_operator_info cplus_demangle_operators[]; #endif -#define D_BUILTIN_TYPE_COUNT (35) +#define D_BUILTIN_TYPE_COUNT (36) CP_STATIC_IF_GLIBCPP_V3 const struct demangle_builtin_type_info --- libiberty/cp-demangle.c.jj 2022-09-27 08:03:27.141982437 +0200 +++ libiberty/cp-demangle.c 2022-09-29 13:04:57.083526204 +0200 @@ -2489,6 +2489,7 @@ cplus_demangle_builtin_types[D_BUILTIN_T /* 33 */ { NL ("decltype(nullptr)"), NL ("decltype(nullptr)"), D_PRINT_DEFAULT }, /* 34 */ { NL ("_Float"), NL ("_Float"), D_PRINT_FLOAT }, + /* 35 */ { NL ("std::bfloat16_t"), NL ("std::bfloat16_t"), D_PRINT_FLOAT }, }; CP_STATIC_IF_GLIBCPP_V3 @@ -2753,8 +2754,20 @@ cplus_demangle_type (struct d_info *di) case 'F': /* DF_ - _Float. - DFx - _Floatx. */ + DFx - _Floatx + DFb16_ - std::bfloat16_t. */ { + if (d_peek_char (di) == 'b') + { + d_advance (di, 1); + if (d_number (di) != 16 || d_peek_char (di) != '_') + return NULL; + d_advance (di, 1); + ret = d_make_builtin_type (di, + &cplus_demangle_builtin_types[35]); + di->expansion += ret->u.s_builtin.type->len; + break; + } int arg = d_number (di); char buf[12]; char suffix = 0; --- libiberty/testsuite/demangle-expected.jj 2022-09-27 08:03:27.168982071 +0200 +++ libiberty/testsuite/demangle-expected 2022-09-29 12:49:02.181597532 +0200 @@ -1249,6 +1249,10 @@ xxx _Z3xxxDF32xDF64xDF128xCDF32xVb xxx(_Float32x, _Float64x, _Float128x, _Float32x _Complex, bool volatile) xxx +--format=auto --no-params +_Z3xxxDFb16_ +xxx(std::bfloat16_t) +xxx # https://sourceware.org/bugzilla/show_bug.cgi?id=16817 --format=auto --no-params _QueueNotification_QueueController__$4PPPPPPPM_A_INotice___Z