From patchwork Sat Nov 19 08:52:54 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jakub Jelinek
X-Patchwork-Id: 23237
Date: Sat, 19 Nov 2022 09:52:54 +0100
From: Jakub Jelinek
Reply-To: Jakub Jelinek
To: Uros Bizjak
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] i386: Outline fast BF -> SF conversion and fix up sNaN
 handling in it [PR107628]
List-Id: Gcc-patches mailing list

On Fri, Oct 21, 2022 at 10:23:14AM +0200, Uros Bizjak wrote:
> OK, but now we have two more copies of a function that effectively
> extends BF to SF. Can you please split this utility function out and
> use it here and in cbranchbf4/cstorebf4? I'm talking about this part:
>
> +      op = gen_lowpart (HImode, op1);
> +      if (CONST_INT_P (op))
> +        op = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
> +                                             op1, BFmode);
> +      else
> +        {
> +          rtx t1 = gen_reg_rtx (SImode);
> +          emit_insn (gen_zero_extendhisi2 (t1, op));
> +          emit_insn (gen_ashlsi3 (t1, t1, GEN_INT (16)));
> +          op = gen_lowpart (SFmode, t1);
> +        }
>
> Taking this a bit further, it looks like a generic function to extend
> BF to SF, when extendbfsf2 named function is not defined.
>
> The above could be a follow-up patch, the proposed patch is OK.

Sorry for the delay, I only got to this now.
I'm also fixing the sNaN handling in it.

If the argument is a BFmode sNaN constant, what we want in this case is
just a SFmode sNaN constant, but simplify_const_unary_operation
(FLOAT_EXTEND, ...) returns NULL in that case (as normally a conversion
of a sNaN to some other float type should raise an exception).
Here we want to bypass that, as we know the sNaN will be used
immediately in the SFmode comparison a few instructions later.
The patch fixes it by just simplifying the lowpart to HImode and its
zero extension to SImode, then forcing that into a pseudo and doing the
left shift and the subreg to SFmode on the pseudo.
CSE or combine can handle it later.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-11-19  Jakub Jelinek

        PR target/107628
        * config/i386/i386-protos.h (ix86_expand_fast_convert_bf_to_sf):
        Declare.
        * config/i386/i386-expand.cc (ix86_expand_fast_convert_bf_to_sf): New
        function.
        * config/i386/i386.md (cbranchbf4, cstorebf4): Use it.

        * gcc.target/i386/pr107628.c: New test.

        Jakub

--- gcc/config/i386/i386-protos.h.jj 2022-10-10 09:31:57.234987578 +0200
+++ gcc/config/i386/i386-protos.h 2022-11-18 12:21:26.975706528 +0100
@@ -227,6 +227,7 @@ extern void ix86_expand_atomic_fetch_op_
                                                 bool, bool);
 extern void ix86_expand_cmpxchg_loop (rtx *, rtx, rtx, rtx, rtx, rtx,
                                       bool, rtx_code_label *);
+extern rtx ix86_expand_fast_convert_bf_to_sf (rtx);
 
 #ifdef TREE_CODE
 extern void init_cumulative_args (CUMULATIVE_ARGS *, tree, rtx, tree, int);
--- gcc/config/i386/i386-expand.cc.jj 2022-11-11 08:15:45.452186618 +0100
+++ gcc/config/i386/i386-expand.cc 2022-11-18 12:35:16.646193028 +0100
@@ -24138,4 +24138,30 @@ ix86_expand_cmpxchg_loop (rtx *ptarget_b
   *ptarget_bool = target_bool;
 }
 
+/* Convert a BFmode VAL to SFmode without signaling sNaNs.
+   This is done by returning SF SUBREG of ((HI SUBREG) (VAL)) << 16.  */
+
+rtx
+ix86_expand_fast_convert_bf_to_sf (rtx val)
+{
+  rtx op = gen_lowpart (HImode, val), ret;
+  if (CONST_INT_P (op))
+    {
+      ret = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
+                                            val, BFmode);
+      if (ret)
+        return ret;
+      /* FLOAT_EXTEND simplification will fail if VAL is a sNaN.  */
+      ret = gen_reg_rtx (SImode);
+      emit_move_insn (ret, GEN_INT (INTVAL (op) & 0xffff));
+    }
+  else
+    {
+      ret = gen_reg_rtx (SImode);
+      emit_insn (gen_zero_extendhisi2 (ret, op));
+    }
+  emit_insn (gen_ashlsi3 (ret, ret, GEN_INT (16)));
+  return gen_lowpart (SFmode, ret);
+}
+
 #include "gt-i386-expand.h"
--- gcc/config/i386/i386.md.jj 2022-11-07 10:30:42.727630162 +0100
+++ gcc/config/i386/i386.md 2022-11-18 12:22:25.172898912 +0100
@@ -1668,28 +1668,8 @@ (define_expand "cbranchbf4"
               (pc)))]
   ""
 {
-  rtx op1 = gen_lowpart (HImode, operands[1]);
-  if (CONST_INT_P (op1))
-    op1 = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
-                                          operands[1], BFmode);
-  else
-    {
-      rtx t1 = gen_reg_rtx (SImode);
-      emit_insn (gen_zero_extendhisi2 (t1, op1));
-      emit_insn (gen_ashlsi3 (t1, t1, GEN_INT (16)));
-      op1 = gen_lowpart (SFmode, t1);
-    }
-  rtx op2 = gen_lowpart (HImode, operands[2]);
-  if (CONST_INT_P (op2))
-    op2 = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
-                                          operands[2], BFmode);
-  else
-    {
-      rtx t2 = gen_reg_rtx (SImode);
-      emit_insn (gen_zero_extendhisi2 (t2, op2));
-      emit_insn (gen_ashlsi3 (t2, t2, GEN_INT (16)));
-      op2 = gen_lowpart (SFmode, t2);
-    }
+  rtx op1 = ix86_expand_fast_convert_bf_to_sf (operands[1]);
+  rtx op2 = ix86_expand_fast_convert_bf_to_sf (operands[2]);
   do_compare_rtx_and_jump (op1, op2, GET_CODE (operands[0]), 0,
                            SFmode, NULL_RTX, NULL,
                            as_a <rtx_code_label *> (operands[3]),
@@ -1723,28 +1703,8 @@ (define_expand "cstorebf4"
            (const_int 0)]))]
   ""
 {
-  rtx op1 = gen_lowpart (HImode, operands[2]);
-  if (CONST_INT_P (op1))
-    op1 = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
-                                          operands[2], BFmode);
-  else
-    {
-      rtx t1 = gen_reg_rtx (SImode);
-      emit_insn (gen_zero_extendhisi2 (t1, op1));
-      emit_insn (gen_ashlsi3 (t1, t1, GEN_INT (16)));
-      op1 = gen_lowpart (SFmode, t1);
-    }
-  rtx op2 = gen_lowpart (HImode, operands[3]);
-  if (CONST_INT_P (op2))
-    op2 = simplify_const_unary_operation (FLOAT_EXTEND, SFmode,
-                                          operands[3], BFmode);
-  else
-    {
-      rtx t2 = gen_reg_rtx (SImode);
-      emit_insn (gen_zero_extendhisi2 (t2, op2));
-      emit_insn (gen_ashlsi3 (t2, t2, GEN_INT (16)));
-      op2 = gen_lowpart (SFmode, t2);
-    }
+  rtx op1 = ix86_expand_fast_convert_bf_to_sf (operands[2]);
+  rtx op2 = ix86_expand_fast_convert_bf_to_sf (operands[3]);
   rtx res = emit_store_flag_force (operands[0], GET_CODE (operands[1]),
                                    op1, op2, SFmode, 0, 1);
   if (!rtx_equal_p (res, operands[0]))
--- gcc/testsuite/gcc.target/i386/pr107628.c.jj 2022-11-18 13:15:06.859061627 +0100
+++ gcc/testsuite/gcc.target/i386/pr107628.c 2022-11-18 13:14:51.797270220 +0100
@@ -0,0 +1,11 @@
+/* PR target/107628 */
+/* { dg-do compile } */
+/* { dg-options "-fsignaling-nans -msse2" } */
+
+typedef __bf16 __attribute__((__vector_size__ (2))) V;
+
+void
+foo (V v)
+{
+  v < (V) (short) 65436;
+}
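
For readers less familiar with the trick, here is a minimal sketch in plain
C of the arithmetic the new helper emits as RTL: zero-extend the 16-bit
BFmode pattern, shift it left by 16, and reinterpret the result as a 32-bit
float pattern.  This is an illustration only, not GCC code; the function
names (bf16_to_sf_bits, sf_bits_to_float, sf_bits_is_snan,
bf16_self_check) are hypothetical.  It assumes IEEE 754 binary32 for float
and works on raw bit patterns for the NaN check, so that no floating-point
move can quiet the value.  The constant 65436 (0xff9c) is the one used in
pr107628.c.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of GCC: widen a raw bfloat16 bit pattern
   to the bit pattern of the equal-valued binary32 number.  bfloat16 is
   the upper half of binary32, so a zero extension plus a left shift by
   16 is all that is needed, and no FP instruction is involved.  */
uint32_t
bf16_to_sf_bits (uint16_t bits)
{
  return (uint32_t) bits << 16;
}

/* Reinterpret a binary32 bit pattern as a float (assumes float is
   IEEE 754 binary32).  */
float
sf_bits_to_float (uint32_t bits)
{
  float f;
  memcpy (&f, &bits, sizeof f);
  return f;
}

/* Return nonzero if BITS encodes a binary32 signaling NaN: exponent all
   ones, mantissa nonzero, quiet bit (bit 22) clear.  */
int
sf_bits_is_snan (uint32_t bits)
{
  return (bits & 0x7f800000u) == 0x7f800000u
         && (bits & 0x007fffffu) != 0
         && (bits & 0x00400000u) == 0;
}

/* Small self-check covering ordinary values and the sNaN from the
   testcase.  */
void
bf16_self_check (void)
{
  /* 0x3f80 is 1.0 and 0xc000 is -2.0 in bfloat16.  */
  assert (sf_bits_to_float (bf16_to_sf_bits (0x3f80)) == 1.0f);
  assert (sf_bits_to_float (bf16_to_sf_bits (0xc000)) == -2.0f);
  /* 65436 == 0xff9c: exponent all ones, quiet bit clear, so a sNaN.
     The shift keeps the payload and leaves the quiet bit clear.  */
  assert (bf16_to_sf_bits ((uint16_t) 65436) == 0xff9c0000u);
  assert (sf_bits_is_snan (bf16_to_sf_bits ((uint16_t) 65436)));
}
```

Calling bf16_self_check () from a test driver exercises all of the cases.
The point of doing the widening on integers is the same as in the patch:
a real FLOAT_EXTEND of a sNaN is supposed to raise an exception, whereas
the shift just carries the sNaN over to SFmode so that the following
SFmode comparison is the operation that sees it.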