[committed] amdgcn: add fmin/fmax patterns
Commit Message
This patch adds patterns for the fmin and fmax operators, covering scalars,
vectors, and vector reductions.
The compiler already uses smin and smax for most floating-point
optimizations, but not where the user calls fmin/fmax explicitly. On amdgcn
the hardware min/max instructions are already IEEE compliant with respect to
unordered values, so there's no need for separate implementations: the new
patterns simply expand to the existing ones.
Andrew
amdgcn: add fmin/fmax patterns
Add fmin/fmax for scalar, vector, and reductions. The smin/smax patterns are
already using the IEEE compliant hardware instructions anyway, so we can just
expand to use those insns.
gcc/ChangeLog:
* config/gcn/gcn-valu.md (fminmaxop): New iterator.
(<fexpander><mode>3): New define_expand.
(<fexpander><mode>3<exec>): Likewise.
(reduc_<fexpander>_scal_<mode>): Likewise.
* config/gcn/gcn.md (fexpander): New attribute.
@@ -2466,6 +2466,23 @@ (define_insn "<expander><mode>3"
[(set_attr "type" "vop2")
(set_attr "length" "8,8")])
+(define_code_iterator fminmaxop [smin smax])
+(define_expand "<fexpander><mode>3"
+ [(set (match_operand:FP 0 "gcn_valu_dst_operand")
+ (fminmaxop:FP
+ (match_operand:FP 1 "gcn_valu_src0_operand")
+ (match_operand:FP 2 "gcn_valu_src1_operand")))]
+ ""
+ {})
+
+(define_expand "<fexpander><mode>3<exec>"
+ [(set (match_operand:V_FP 0 "gcn_valu_dst_operand")
+ (fminmaxop:V_FP
+ (match_operand:V_FP 1 "gcn_valu_src0_operand")
+ (match_operand:V_FP 2 "gcn_valu_src1_operand")))]
+ ""
+ {})
+
;; }}}
;; {{{ FP unops
@@ -3522,6 +3539,17 @@ (define_expand "reduc_<reduc_op>_scal_<mode>"
DONE;
})
+(define_expand "reduc_<fexpander>_scal_<mode>"
+ [(match_operand:<SCALAR_MODE> 0 "register_operand")
+ (fminmaxop:V_FP
+ (match_operand:V_FP 1 "register_operand"))]
+ ""
+ {
+ /* fmin/fmax are identical to smin/smax. */
+ emit_insn (gen_reduc_<expander>_scal_<mode> (operands[0], operands[1]));
+ DONE;
+ })
+
;; Warning: This "-ffast-math" implementation converts in-order reductions
;; into associative reductions. It's also used where OpenMP or
;; OpenACC paralellization has already broken the in-order semantics.
@@ -372,6 +372,10 @@ (define_code_attr expander
(sign_extend "extend")
(zero_extend "zero_extend")])
+(define_code_attr fexpander
+ [(smin "fmin")
+ (smax "fmax")])
+
;; }}}
;; {{{ Miscellaneous instructions