From patchwork Sun Jul 30 20:12:53 2023
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 128274
From: Uros Bizjak
Date: Sun, 30 Jul 2023 22:12:53 +0200
Subject: [RFC PATCH] i386: Do not sanitize upper part of V2SFmode reg with -fno-trapping-math [PR110832]
To: "gcc-patches@gcc.gnu.org"
Cc: Richard Biener, Jan Hubicka, Hongtao Liu

Also introduce the -m[no-]mmxfp-with-sse option to disable the trapping V2SF
named patterns in order to avoid generating partial-vector V4SFmode
trapping instructions.

The new option is enabled by default because, even with sanitization,
a small but consistent speedup of 2 to 3% on the Polyhedron capacita
benchmark can be achieved vs. scalar code.

Using -fno-trapping-math improves Polyhedron capacita runtime by 8 to 9%
vs. scalar code.  This is what clang does by default, as it defaults
to -fno-trapping-math.

        PR target/110832

gcc/ChangeLog:

        * config/i386/i386.h (TARGET_MMXFP_WITH_SSE): New macro.
        * config/i386/i386.opt (mmmxfp-with-sse): New option.
        * config/i386/mmx.md (movq__to_sse): Do not sanitize
        upper part of V2SFmode register with -fno-trapping-math.
        (v2sf3): Enable for TARGET_MMXFP_WITH_SSE.
        (divv2sf3): Ditto.
        (v2sf3): Ditto.
        (sqrtv2sf2): Ditto.
        (*mmx_haddv2sf3_low): Ditto.
        (*mmx_hsubv2sf3_low): Ditto.
        (vec_addsubv2sf3): Ditto.
        (vec_cmpv2sfv2si): Ditto.
        (vcondv2sf): Ditto.
        (fmav2sf4): Ditto.
        (fmsv2sf4): Ditto.
        (fnmav2sf4): Ditto.
        (fnmsv2sf4): Ditto.
        (fix_truncv2sfv2si2): Ditto.
        (fixuns_truncv2sfv2si2): Ditto.
        (floatv2siv2sf2): Ditto.
        (floatunsv2siv2sf2): Ditto.
        (nearbyintv2sf2): Ditto.
        (rintv2sf2): Ditto.
        (lrintv2sfv2si2): Ditto.
        (ceilv2sf2): Ditto.
        (lceilv2sfv2si2): Ditto.
        (floorv2sf2): Ditto.
        (lfloorv2sfv2si2): Ditto.
        (btruncv2sf2): Ditto.
        (roundv2sf2): Ditto.
        (lroundv2sfv2si2): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index ef342fcee9b..af72b6c48a9 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -50,6 +50,7 @@ see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 #define TARGET_16BIT_P(x) TARGET_CODE16_P(x)
 
 #define TARGET_MMX_WITH_SSE (TARGET_64BIT && TARGET_SSE2)
+#define TARGET_MMXFP_WITH_SSE (TARGET_MMX_WITH_SSE && ix86_mmxfp_with_sse)
 
 #include "config/vxworks-dummy.h"
 
diff --git a/gcc/config/i386/i386.opt b/gcc/config/i386/i386.opt
index 1cc8563477a..1b65fed5daf 100644
--- a/gcc/config/i386/i386.opt
+++ b/gcc/config/i386/i386.opt
@@ -670,6 +670,10 @@ m3dnowa
 Target Mask(ISA_3DNOW_A) Var(ix86_isa_flags) Save
 Support Athlon 3Dnow! built-in functions.
 
+mmmxfp-with-sse
+Target Var(ix86_mmxfp_with_sse) Init(1)
+Enable MMX floating point vectors in SSE registers
+
 msse
 Target Mask(ISA_SSE) Var(ix86_isa_flags) Save
 Support MMX and SSE built-in functions and code generation.
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 896af76a33f..0555da9022b 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -597,7 +597,18 @@ (define_expand "movq__to_sse"
           (match_operand:V2FI 1 "nonimmediate_operand")
           (match_dup 2)))]
   "TARGET_SSE2"
-  "operands[2] = CONST0_RTX (mode);")
+{
+  if (mode == V2SFmode
+      && !flag_trapping_math)
+    {
+      rtx op1 = force_reg (mode, operands[1]);
+      emit_move_insn (operands[0], lowpart_subreg (mode,
+                                                   op1, mode));
+      DONE;
+    }
+
+  operands[2] = CONST0_RTX (mode);
+})
 
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
@@ -650,7 +661,7 @@ (define_expand "v2sf3"
         (plusminusmult:V2SF
           (match_operand:V2SF 1 "nonimmediate_operand")
           (match_operand:V2SF 2 "nonimmediate_operand")))]
-  "TARGET_MMX_WITH_SSE"
+  "TARGET_MMXFP_WITH_SSE"
 {
   rtx op2 = gen_reg_rtx (V4SFmode);
   rtx op1 = gen_reg_rtx (V4SFmode);
@@ -728,7 +739,7 @@ (define_expand "divv2sf3"
   [(set (match_operand:V2SF 0 "register_operand")
         (div:V2SF (match_operand:V2SF 1 "register_operand")
                   (match_operand:V2SF 2 "register_operand")))]
-  "TARGET_MMX_WITH_SSE"
+  "TARGET_MMXFP_WITH_SSE"
 {
   rtx op2 = gen_reg_rtx (V4SFmode);
   rtx op1 = gen_reg_rtx (V4SFmode);
@@ -750,7 +761,7 @@ (define_expand "v2sf3"
         (smaxmin:V2SF
           (match_operand:V2SF 1 "register_operand")
           (match_operand:V2SF 2 "register_operand")))]
-  "TARGET_MMX_WITH_SSE"
+  "TARGET_MMXFP_WITH_SSE"
 {
   rtx op2 = gen_reg_rtx (V4SFmode);
   rtx op1 = gen_reg_rtx (V4SFmode);
@@ -852,7 +863,7 @@ (define_insn "mmx_rcpit2v2sf3"
 (define_expand "sqrtv2sf2"
   [(set (match_operand:V2SF 0 "register_operand")
         (sqrt:V2SF (match_operand:V2SF 1 "nonimmediate_operand")))]
-  "TARGET_MMX_WITH_SSE"
+  "TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SFmode);
@@ -933,7 +944,7 @@ (define_insn_and_split "*mmx_haddv2sf3_low"
           (vec_select:SF
             (match_dup 1)
             (parallel [(match_operand:SI 3 "const_0_to_1_operand")]))))]
-  "TARGET_SSE3 && TARGET_MMX_WITH_SSE
+  "TARGET_SSE3 && TARGET_MMXFP_WITH_SSE
    && INTVAL (operands[2]) != INTVAL (operands[3])
    && ix86_pre_reload_split ()"
   "#"
@@ -979,7 +990,7 @@ (define_insn_and_split "*mmx_hsubv2sf3_low"
           (vec_select:SF
             (match_dup 1)
             (parallel [(const_int 1)]))))]
-  "TARGET_SSE3 && TARGET_MMX_WITH_SSE
+  "TARGET_SSE3 && TARGET_MMXFP_WITH_SSE
    && ix86_pre_reload_split ()"
   "#"
   "&& 1"
@@ -1041,7 +1052,7 @@ (define_expand "vec_addsubv2sf3"
             (match_operand:V2SF 2 "nonimmediate_operand"))
           (plus:V2SF (match_dup 1) (match_dup 2))
           (const_int 1)))]
-  "TARGET_SSE3 && TARGET_MMX_WITH_SSE"
+  "TARGET_SSE3 && TARGET_MMXFP_WITH_SSE"
 {
   rtx op2 = gen_reg_rtx (V4SFmode);
   rtx op1 = gen_reg_rtx (V4SFmode);
@@ -1104,7 +1115,7 @@ (define_expand "vec_cmpv2sfv2si"
         (match_operator:V2SI 1 ""
           [(match_operand:V2SF 2 "nonimmediate_operand")
            (match_operand:V2SF 3 "nonimmediate_operand")]))]
-  "TARGET_MMX_WITH_SSE"
+  "TARGET_MMXFP_WITH_SSE"
 {
   rtx ops[4];
   ops[3] = gen_reg_rtx (V4SFmode);
@@ -1130,7 +1141,7 @@ (define_expand "vcondv2sf"
              (match_operand:V2SF 5 "nonimmediate_operand")])
           (match_operand:V2FI 1 "general_operand")
           (match_operand:V2FI 2 "general_operand")))]
-  "TARGET_MMX_WITH_SSE"
+  "TARGET_MMXFP_WITH_SSE"
 {
   rtx ops[6];
   ops[5] = gen_reg_rtx (V4SFmode);
@@ -1320,7 +1331,7 @@ (define_expand "fmav2sf4"
           (match_operand:V2SF 2 "nonimmediate_operand")
           (match_operand:V2SF 3 "nonimmediate_operand")))]
   "(TARGET_FMA || TARGET_FMA4 || TARGET_AVX512VL)
-   && TARGET_MMX_WITH_SSE"
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op3 = gen_reg_rtx (V4SFmode);
   rtx op2 = gen_reg_rtx (V4SFmode);
@@ -1345,7 +1356,7 @@ (define_expand "fmsv2sf4"
           (neg:V2SF
             (match_operand:V2SF 3 "nonimmediate_operand"))))]
   "(TARGET_FMA || TARGET_FMA4 || TARGET_AVX512VL)
-   && TARGET_MMX_WITH_SSE"
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op3 = gen_reg_rtx (V4SFmode);
   rtx op2 = gen_reg_rtx (V4SFmode);
@@ -1370,7 +1381,7 @@ (define_expand "fnmav2sf4"
           (match_operand:V2SF 2 "nonimmediate_operand")
           (match_operand:V2SF 3 "nonimmediate_operand")))]
   "(TARGET_FMA || TARGET_FMA4 || TARGET_AVX512VL)
-   && TARGET_MMX_WITH_SSE"
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op3 = gen_reg_rtx (V4SFmode);
   rtx op2 = gen_reg_rtx (V4SFmode);
@@ -1396,7 +1407,7 @@ (define_expand "fnmsv2sf4"
           (neg:V2SF
             (match_operand:V2SF 3 "nonimmediate_operand"))))]
   "(TARGET_FMA || TARGET_FMA4 || TARGET_AVX512VL)
-   && TARGET_MMX_WITH_SSE"
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op3 = gen_reg_rtx (V4SFmode);
   rtx op2 = gen_reg_rtx (V4SFmode);
@@ -1422,7 +1433,7 @@ (define_expand "fnmsv2sf4"
 (define_expand "fix_truncv2sfv2si2"
   [(set (match_operand:V2SI 0 "register_operand")
         (fix:V2SI (match_operand:V2SF 1 "nonimmediate_operand")))]
-  "TARGET_MMX_WITH_SSE"
+  "TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SImode);
@@ -1438,7 +1449,7 @@ (define_expand "fix_truncv2sfv2si2"
 (define_expand "fixuns_truncv2sfv2si2"
   [(set (match_operand:V2SI 0 "register_operand")
         (unsigned_fix:V2SI (match_operand:V2SF 1 "nonimmediate_operand")))]
-  "TARGET_AVX512VL && TARGET_MMX_WITH_SSE"
+  "TARGET_AVX512VL && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SImode);
@@ -1463,7 +1474,7 @@ (define_insn "mmx_fix_truncv2sfv2si2"
 (define_expand "floatv2siv2sf2"
   [(set (match_operand:V2SF 0 "register_operand")
         (float:V2SF (match_operand:V2SI 1 "nonimmediate_operand")))]
-  "TARGET_MMX_WITH_SSE"
+  "TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SImode);
   rtx op0 = gen_reg_rtx (V4SFmode);
@@ -1479,7 +1490,7 @@ (define_expand "floatv2siv2sf2"
 (define_expand "floatunsv2siv2sf2"
   [(set (match_operand:V2SF 0 "register_operand")
         (unsigned_float:V2SF (match_operand:V2SI 1 "nonimmediate_operand")))]
-  "TARGET_AVX512VL && TARGET_MMX_WITH_SSE"
+  "TARGET_AVX512VL && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SImode);
   rtx op0 = gen_reg_rtx (V4SFmode);
@@ -1756,7 +1767,7 @@ (define_expand "vec_initv2sfsf"
 (define_expand "nearbyintv2sf2"
   [(match_operand:V2SF 0 "register_operand")
    (match_operand:V2SF 1 "nonimmediate_operand")]
-  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
+  "TARGET_SSE4_1 && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SFmode);
@@ -1772,7 +1783,7 @@ (define_expand "nearbyintv2sf2"
 (define_expand "rintv2sf2"
   [(match_operand:V2SF 0 "register_operand")
    (match_operand:V2SF 1 "nonimmediate_operand")]
-  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
+  "TARGET_SSE4_1 && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SFmode);
@@ -1788,8 +1799,8 @@ (define_expand "rintv2sf2"
 (define_expand "lrintv2sfv2si2"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:V2SF 1 "nonimmediate_operand")]
- "TARGET_SSE4_1 && !flag_trapping_math
-  && TARGET_MMX_WITH_SSE"
+  "TARGET_SSE4_1 && !flag_trapping_math
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SImode);
@@ -1806,7 +1817,7 @@ (define_expand "ceilv2sf2"
   [(match_operand:V2SF 0 "register_operand")
    (match_operand:V2SF 1 "nonimmediate_operand")]
   "TARGET_SSE4_1 && !flag_trapping_math
-   && TARGET_MMX_WITH_SSE"
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SFmode);
@@ -1822,8 +1833,8 @@ (define_expand "ceilv2sf2"
 (define_expand "lceilv2sfv2si2"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:V2SF 1 "nonimmediate_operand")]
- "TARGET_SSE4_1 && !flag_trapping_math
-  && TARGET_MMX_WITH_SSE"
+  "TARGET_SSE4_1 && !flag_trapping_math
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SImode);
@@ -1840,7 +1851,7 @@ (define_expand "floorv2sf2"
   [(match_operand:V2SF 0 "register_operand")
    (match_operand:V2SF 1 "nonimmediate_operand")]
   "TARGET_SSE4_1 && !flag_trapping_math
-   && TARGET_MMX_WITH_SSE"
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SFmode);
@@ -1856,8 +1867,8 @@ (define_expand "floorv2sf2"
 (define_expand "lfloorv2sfv2si2"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:V2SF 1 "nonimmediate_operand")]
- "TARGET_SSE4_1 && !flag_trapping_math
-  && TARGET_MMX_WITH_SSE"
+  "TARGET_SSE4_1 && !flag_trapping_math
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SImode);
@@ -1874,7 +1885,7 @@ (define_expand "btruncv2sf2"
   [(match_operand:V2SF 0 "register_operand")
    (match_operand:V2SF 1 "nonimmediate_operand")]
   "TARGET_SSE4_1 && !flag_trapping_math
-   && TARGET_MMX_WITH_SSE"
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SFmode);
@@ -1891,7 +1902,7 @@ (define_expand "roundv2sf2"
   [(match_operand:V2SF 0 "register_operand")
    (match_operand:V2SF 1 "nonimmediate_operand")]
   "TARGET_SSE4_1 && !flag_trapping_math
-   && TARGET_MMX_WITH_SSE"
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SFmode);
@@ -1907,8 +1918,8 @@ (define_expand "roundv2sf2"
 (define_expand "lroundv2sfv2si2"
   [(match_operand:V2SI 0 "register_operand")
    (match_operand:V2SF 1 "nonimmediate_operand")]
- "TARGET_SSE4_1 && !flag_trapping_math
-  && TARGET_MMX_WITH_SSE"
+  "TARGET_SSE4_1 && !flag_trapping_math
+   && TARGET_MMXFP_WITH_SSE"
 {
   rtx op1 = gen_reg_rtx (V4SFmode);
   rtx op0 = gen_reg_rtx (V4SImode);