From patchwork Thu Nov 3 06:26:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang, Haochen" X-Patchwork-Id: 14674 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp350531wru; Wed, 2 Nov 2022 23:29:54 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5E5PT8D1GKJRb+dBsHgbXNjwSx0DISPCCMqPWOImVitZQoyOquk56cHBKoxLQFbVbT1+UC X-Received: by 2002:aa7:d710:0:b0:463:bd7b:2b44 with SMTP id t16-20020aa7d710000000b00463bd7b2b44mr11757511edq.385.1667456994231; Wed, 02 Nov 2022 23:29:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1667456994; cv=none; d=google.com; s=arc-20160816; b=dVGODGz9QgAtc+fDnosDXs8kfiUbl/g80j8xR/JEiueTnWB3EAa08bB/S+uo03Iai+ fnvqkw8qm79ZjSzx6Am4Z25krY3BRDUMHFsVk10RkBrpbm2BdVzJHImYvx3C+C0slmnU 3Z4IEZtpRCJBFeomN1CS6STdKVyzP4faIzyaMFKeMcPBmJneGvlX4n9F4Finm91b6jUE Pgnk4OplEUuQ49zE6DZwTw7+vA3aLr6vj4kpvruscyy5KDzD4Iw8fg5BMToxoyQhXD+4 2vGonGcMt5IYxKXPl6Lfa37VQSUbj6JjGgeL2AIT9l2nKyYdFD0qHD/hpxzkgLD24yUM uplA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :references:in-reply-to:message-id:date:subject:to:dmarc-filter :delivered-to:dkim-signature:dkim-filter; bh=vkBe4y0BBYY4P/Velx3ELZIHM37MUVNXOE+ZTXy3T+8=; b=Ay+WJBZ69RPqvHARA20MFm4+wfBEEqCbgToxrcyecN0yh5OOQVB+eDvZRwqvRt/1tf 4I2/yJhY8TlwcNefEGnVSsFZ4K3zRQ0cqPaLiVui8Tvpx+Gs59oAf27wFk4qIBpmh+S6 RfT7Gt75819ThkQSbF1zY0ajeeZM6HVL04VGa1KPUcPoE2FwzOkYMtiEEiroqhASqR75 oCAHkBup9CQrfZt2xF23bIzVhNd+KPDAO1mos3/BE0DAlFwdXZJMqvVd52/edRbtVOHH BpgurTP97YmzrMm+wV2uS/k4xxEu6rDuPsjrdvfW91QS4mbF4NJLTiB2xJKVF+FSAk0x Co1Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=pEnH1qrO; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id js3-20020a17090797c300b007a5cdd9550esi311190ejc.201.2022.11.02.23.29.53 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 02 Nov 2022 23:29:54 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=pEnH1qrO; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DF8EF3858407 for ; Thu, 3 Nov 2022 06:29:52 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DF8EF3858407 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1667456992; bh=vkBe4y0BBYY4P/Velx3ELZIHM37MUVNXOE+ZTXy3T+8=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=pEnH1qrOnSOYQIzB1gfX43qPqXv2gZDGYho2quAQoM/mJMeWquqqT5RPBP0rLv3fd CLQrnhd4Mgsvhr2a/SlbvF7ajBh/Ut89hR5FcJ/Vm1xIndi5KXuHDXbyqz4FlSEnUQ 3Fckyl2U5SZIs8qaldst55gd+bewOq/SOODWBIY8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by sourceware.org (Postfix) with ESMTPS id 9CA113858C62 for ; Thu, 3 Nov 2022 06:29:05 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 9CA113858C62 X-IronPort-AV: E=McAfee;i="6500,9779,10519"; a="292919026" X-IronPort-AV: E=Sophos;i="5.95,235,1661842800"; d="scan'208";a="292919026" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Nov 2022 23:29:03 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10519"; a="723825378" X-IronPort-AV: E=Sophos;i="5.95,235,1661842800"; d="scan'208";a="723825378" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by FMSMGA003.fm.intel.com with ESMTP; 02 Nov 2022 23:28:59 -0700 Received: from shliclel320.sh.intel.com (shliclel320.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 92C8B10056FA; Thu, 3 Nov 2022 14:28:57 +0800 (CST) To: gcc-patches@gcc.gnu.org Subject: [PATCH] Support Intel CMPccXADD Date: Thu, 3 Nov 2022 14:26:57 +0800 Message-Id: <20221103062657.58427-1-haochen.jiang@intel.com> X-Mailer: git-send-email 2.18.1 In-Reply-To: References: X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Haochen Jiang via Gcc-patches From: "Jiang, Haochen" Reply-To: Haochen Jiang Cc: hongtao.liu@intel.com Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1746649053892050491?= X-GMAIL-MSGID: =?utf-8?q?1748455384899160287?= Hi all, I just revised the patch according to review. The changes comparing to previous version is mentioned below. Ok for trunk? BRs, Haochen gcc/ChangeLog: * common/config/i386/cpuinfo.h (get_available_features): Detect cmpccxadd. * common/config/i386/i386-common.cc (OPTION_MASK_ISA2_CMPCCXADD_SET, OPTION_MASK_ISA2_CMPCCXADD_UNSET): New. (ix86_handle_option): Handle -mcmpccxadd. * common/config/i386/i386-cpuinfo.h (enum processor_features): Add FEATURE_CMPCCXADD. * common/config/i386/i386-isas.h: Add ISA_NAME_TABLE_ENTRY for cmpccxadd. * config.gcc: Add cmpccxaddintrin.h. * config/i386/cpuid.h (bit_CMPCCXADD): New. * config/i386/i386-builtin-types.def: Add DEF_FUNCTION_TYPE(INT, PINT, INT, INT, INT) and DEF_FUNCTION_TYPE(LONGLONG, PLONGLONG, LONGLONG, LONGLONG, INT). * config/i386/i386-builtin.def (BDESC): Add new builtins. * config/i386/i386-c.cc (ix86_target_macros_internal): Define __CMPCCXADD__. * config/i386/i386-expand.cc (ix86_expand_special_args_builtin): Add new parameter to indicate constant position. Handle INT_FTYPE_PINT_INT_INT_INT and LONGLONG_FTYPE_PLONGLONG_LONGLONG_LONGLONG_INT. * config/i386/i386-isa.def (CMPCCXADD): Add DEF_PTA(CMPCCXADD). * config/i386/i386-options.cc (isa2_opts): Add -mcmpccxadd. (ix86_valid_target_attribute_inner_p): Handle cmpccxadd. * config/i386/i386.opt: Add option -mcmpccxadd. * config/i386/sync.md (cmpccxadd_): New define insn. * config/i386/x86gprintrin.h: Include cmpccxaddintrin.h. * doc/extend.texi: Document cmpccxadd. * doc/invoke.texi: Document -mcmpccxadd. * doc/sourcebuild.texi: Document target cmpccxadd. * config/i386/cmpccxaddintrin.h: New file. gcc/testsuite/ChangeLog: * g++.dg/other/i386-2.C: Add -mcmpccxadd. * g++.dg/other/i386-3.C: Ditto. * gcc.target/i386/avx-1.c: Ditto. * gcc.target/i386/funcspec-56.inc: Add new target attribute. * gcc.target/i386/sse-13.c: Add -mcmpccxadd. * gcc.target/i386/sse-23.c: Ditto. * gcc.target/i386/x86gprintrin-1.c: Ditto. * gcc.target/i386/x86gprintrin-2.c: Ditto. * gcc.target/i386/x86gprintrin-3.c: Ditto. * gcc.target/i386/x86gprintrin-4.c: Ditto. * gcc.target/i386/x86gprintrin-5.c: Ditto. * gcc.target/i386/cmpccxadd-1.c: New test. * gcc.target/i386/cmpccxadd-2.c: Ditto. --- gcc/common/config/i386/cpuinfo.h | 2 + gcc/common/config/i386/i386-common.cc | 15 ++ gcc/common/config/i386/i386-cpuinfo.h | 1 + gcc/common/config/i386/i386-isas.h | 1 + gcc/config.gcc | 3 +- gcc/config/i386/cmpccxaddintrin.h | 89 +++++++++++ gcc/config/i386/cpuid.h | 1 + gcc/config/i386/i386-builtin-types.def | 4 + gcc/config/i386/i386-builtin.def | 4 + gcc/config/i386/i386-c.cc | 2 + gcc/config/i386/i386-expand.cc | 22 ++- gcc/config/i386/i386-isa.def | 1 + gcc/config/i386/i386-options.cc | 4 +- gcc/config/i386/i386.opt | 5 + gcc/config/i386/sync.md | 29 ++++ gcc/config/i386/x86gprintrin.h | 2 + gcc/doc/extend.texi | 5 + gcc/doc/invoke.texi | 10 +- gcc/doc/sourcebuild.texi | 3 + gcc/testsuite/g++.dg/other/i386-2.C | 2 +- gcc/testsuite/g++.dg/other/i386-3.C | 2 +- gcc/testsuite/gcc.target/i386/avx-1.c | 4 + gcc/testsuite/gcc.target/i386/cmpccxadd-1.c | 61 ++++++++ gcc/testsuite/gcc.target/i386/cmpccxadd-2.c | 138 ++++++++++++++++++ gcc/testsuite/gcc.target/i386/funcspec-56.inc | 2 + gcc/testsuite/gcc.target/i386/sse-13.c | 6 +- gcc/testsuite/gcc.target/i386/sse-23.c | 6 +- .../gcc.target/i386/x86gprintrin-1.c | 2 +- .../gcc.target/i386/x86gprintrin-2.c | 6 +- .../gcc.target/i386/x86gprintrin-3.c | 2 +- .../gcc.target/i386/x86gprintrin-4.c | 2 +- .../gcc.target/i386/x86gprintrin-5.c | 6 +- gcc/testsuite/lib/target-supports.exp | 10 ++ 33 files changed, 437 insertions(+), 15 deletions(-) create mode 100644 gcc/config/i386/cmpccxaddintrin.h create mode 100644 gcc/testsuite/gcc.target/i386/cmpccxadd-1.c create mode 100644 gcc/testsuite/gcc.target/i386/cmpccxadd-2.c diff --git a/gcc/config/i386/cmpccxaddintrin.h b/gcc/config/i386/cmpccxaddintrin.h --- /dev/null +++ b/gcc/config/i386/cmpccxaddintrin.h +#define __cmpccxadd_epi64(A,B,C,D) \ + __builtin_ia32_cmpccxadd64 ((long long *) (A), (long long) (B), \ + (long long) (C), (_CMPCCX_ENUM) (D)) +#endif Fixed a type issue here. Change both test files to align with the order of opcode. diff --git a/gcc/config/i386/sync.md b/gcc/config/i386/sync.md --- a/gcc/config/i386/sync.md +++ b/gcc/config/i386/sync.md @@ -1061,3 +1064,29 @@ (any_logic:SWI (match_dup 0) (match_dup 1)))] "" "lock{%;} %K2{}\t{%1, %0|%0, %1}") + +;; CMPCCXADD + +(define_insn "cmpccxadd_" + [(set (match_operand:SWI48x 0 "register_operand" "=r") + (unspec_volatile:SWI48x + [(match_operand:SWI48x 1 "memory_operand" "+m") + (match_operand:SWI48x 2 "register_operand" "0") + (match_operand:SWI48x 3 "register_operand" "r") + (match_operand:SI 4 "const_0_to_15_operand" "n")] + UNSPECV_CMPCCXADD)) + (set (match_dup 1) + (unspec_volatile:SWI48x [(const_int 0)] UNSPECV_CMPCCXADD)) + (set (reg:CC FLAGS_REG) + (unspec_volatile:CC [(const_int 0)] UNSPECV_CMPCCXADD))] + "TARGET_CMPCCXADD && TARGET_64BIT" +{ + char buf[128]; + const char *ops = "cmp%sxadd\t{%%3, %%0, %%1|%%1, %%0, %%3}"; + char const *cc[16] = {"o" ,"no", "b", "nb", "z", "nz", "be", "nbe", + "s", "ns", "p", "np", "l", "nl", "le", "nle"}; + + snprintf (buf, sizeof (buf), ops, cc[INTVAL (operands[4])]); + output_asm_insn (buf, operands); + return ""; +}) Changed the whole pattern like how cmpxchg did. Also adjust the cc array order to align with opcode. diff --git a/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c b/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/cmpccxadd-1.c @@ -0,0 +1,61 @@ +/* { dg-do compile { target { ! ia32 } } } */ +/* { dg-options "-O2 -mcmpccxadd" } */ +/* { dg-final { scan-assembler-times "cmpoxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpnoxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpbxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpnbxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpzxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpnzxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpbexadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpnbexadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpsxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpnsxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmppxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpnpxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmplxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpnlxadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmplexadd\[ \\t\]" 2 } } */ +/* { dg-final { scan-assembler-times "cmpnlexadd\[ \\t\]" 2 } } */ +#include + +int *a; +int b, c; +long long *d; +long long e, f; + +void extern +cmpccxadd_test(void) +{ + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_O); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_O); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NO); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NO); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_B); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_B); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NB); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NB); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_Z); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_Z); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NZ); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NZ); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_BE); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_BE); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NBE); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NBE); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_S); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_S); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NS); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NS); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_P); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_P); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NP); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NP); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_L); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_L); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NL); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NL); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_LE); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_LE); + b = __cmpccxadd_epi32 (a, b, c, _CMPCCX_NLE); + e = __cmpccxadd_epi64 (d, e, f, _CMPCCX_NLE); +} diff --git a/gcc/testsuite/gcc.target/i386/cmpccxadd-2.c b/gcc/testsuite/gcc.target/i386/cmpccxadd-2.c --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/cmpccxadd-2.c @@ -0,0 +1,138 @@ +/* { dg-do run { target { ! ia32 } } } */ +/* { dg-options "-O2 -mcmpccxadd" } */ +/* { dg-require-effective-target cmpccxadd } */ + +#include +#include + +int +main() +{ + if (!__builtin_cpu_supports("cmpccxadd")) + return 0; + + int srcdest1[16] = { -2147483648,1,1,1,1,2,1,2,1,2,4,2,1,1,1,2 }; + int srcdest2[16] = { 1,1,2,1,1,1,1,1,2,1,1,1,2,1,1,1 }; + int src3[16] = { 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 }; + int _srcdest1[16], _srcdest2[16], res[16], cond[16]; + long long srcdest1_64[16] = { -9223372036854775807LL-1,1,1,1,1,2,1,2,1,2,4,2,1,1,1,2 }; + long long srcdest2_64[16] = { 1,1,2,1,1,1,1,1,2,1,1,1,2,1,1,1 }; + long long src3_64[16] = { 1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1 }; + long long _srcdest1_64[16], _srcdest2_64[16], res_64[16], cond_64[16]; + + int tmp2[16]; + long long tmp2_64[16]; + + int cf[16] = { 0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0 }; + int of[16] = { 1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 }; + int sf[16] = { 0,0,0,0,0,0,0,0,1,0,0,0,1,0,0,0 }; + int zf[16] = { 0,0,0,0,1,0,1,0,0,0,0,0,0,0,1,0 }; + int af[16] = { 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 }; + int pf[16] = { 0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0 }; + + for (int i = 0; i < 16; i++) + { + tmp2[i] = srcdest1[i] + src3[i]; + tmp2_64[i] = srcdest1_64[i] + src3_64[i]; + } + + cond[0] = of[0] == 1 ? 1 : 0; + cond[1] = of[1] == 0 ? 1 : 0; + cond[2] = cf[2] == 1 ? 1 : 0; + cond[3] = cf[3] == 0 ? 1 : 0; + cond[4] = zf[4] == 1 ? 1 : 0; + cond[5] = zf[5] == 0 ? 1 : 0; + cond[6] = (cf[6] || zf[6]) == 1 ? 1 : 0; + cond[7] = (cf[7] || zf[7]) == 0 ? 1 : 0; + cond[8] = sf[8] == 1 ? 1 : 0; + cond[9] = sf[9] == 0 ? 1 : 0; + cond[10] = pf[10] == 1 ? 1 : 0; + cond[11] = pf[11] == 0 ? 1 : 0; + cond[12] = ((sf[12] && !of[12]) || (!sf[12] && of[12])) == 1 ? 1 : 0; + cond[13] = ((sf[13] && !of[13]) || (!sf[13] && of[13])) == 0 ? 1 : 0; + cond[14] = (((sf[14] && !of[14]) || (!sf[14] && of[14])) || zf[14]) == 1 ? 1 : 0; + cond[15] = (((sf[15] && !of[15]) || (!sf[15] && of[15])) || zf[15]) == 0 ? 1 : 0; + + cond_64[0] = of[0] == 1 ? 1 : 0; + cond_64[1] = of[1] == 0 ? 1 : 0; + cond_64[2] = cf[2] == 1 ? 1 : 0; + cond_64[3] = cf[3] == 0 ? 1 : 0; + cond_64[4] = zf[4] == 1 ? 1 : 0; + cond_64[5] = zf[5] == 0 ? 1 : 0; + cond_64[6] = (cf[6] || zf[6]) == 1 ? 1 : 0; + cond_64[7] = (cf[7] || zf[7]) == 0 ? 1 : 0; + cond_64[8] = sf[8] == 1 ? 1 : 0; + cond_64[9] = sf[9] == 0 ? 1 : 0; + cond_64[10] = pf[10] == 1 ? 1 : 0; + cond_64[11] = pf[11] == 0 ? 1 : 0; + cond_64[12] = ((sf[12] && !of[12]) || (!sf[12] && of[12])) == 1 ? 1 : 0; + cond_64[13] = ((sf[13] && !of[13]) || (!sf[13] && of[13])) == 0 ? 1 : 0; + cond_64[14] = (((sf[14] && !of[14]) || (!sf[14] && of[14])) || zf[14]) == 1 ? 1 : 0; + cond_64[15] = (((sf[15] && !of[15]) || (!sf[15] && of[15])) || zf[15]) == 0 ? 1 : 0; + + for (int i = 0; i < 16; i++) + { + if (cond[i] == 1) + { + _srcdest1[i] = tmp2[i]; + } + else + { + _srcdest1[i] = srcdest1[i]; + } + if (cond_64[i] == 1) + { + _srcdest1_64[i] = tmp2_64[i]; + } + else + { + _srcdest1_64[i] = srcdest1_64[i]; + } + _srcdest2[i] = srcdest1[i]; + _srcdest2_64[i] = srcdest1_64[i]; + } + + res[0] = __cmpccxadd_epi32 (&srcdest1[0], srcdest2[0], src3[0], _CMPCCX_O); + res[1] = __cmpccxadd_epi32 (&srcdest1[1], srcdest2[1], src3[1], _CMPCCX_NO); + res[2] = __cmpccxadd_epi32 (&srcdest1[2], srcdest2[2], src3[2], _CMPCCX_B); + res[3] = __cmpccxadd_epi32 (&srcdest1[3], srcdest2[3], src3[3], _CMPCCX_NB); + res[4] = __cmpccxadd_epi32 (&srcdest1[4], srcdest2[4], src3[4], _CMPCCX_Z); + res[5] = __cmpccxadd_epi32 (&srcdest1[5], srcdest2[5], src3[5], _CMPCCX_NZ); + res[6] = __cmpccxadd_epi32 (&srcdest1[6], srcdest2[6], src3[6], _CMPCCX_BE); + res[7] = __cmpccxadd_epi32 (&srcdest1[7], srcdest2[7], src3[7], _CMPCCX_NBE); + res[8] = __cmpccxadd_epi32 (&srcdest1[8], srcdest2[8], src3[8], _CMPCCX_S); + res[9] = __cmpccxadd_epi32 (&srcdest1[9], srcdest2[9], src3[9], _CMPCCX_NS); + res[10] = __cmpccxadd_epi32 (&srcdest1[10], srcdest2[10], src3[10], _CMPCCX_P); + res[11] = __cmpccxadd_epi32 (&srcdest1[11], srcdest2[11], src3[11], _CMPCCX_NP); + res[12] = __cmpccxadd_epi32 (&srcdest1[12], srcdest2[12], src3[12], _CMPCCX_L); + res[13] = __cmpccxadd_epi32 (&srcdest1[13], srcdest2[13], src3[13], _CMPCCX_NL); + res[14] = __cmpccxadd_epi32 (&srcdest1[14], srcdest2[14], src3[14], _CMPCCX_LE); + res[15] = __cmpccxadd_epi32 (&srcdest1[15], srcdest2[15], src3[15], _CMPCCX_NLE); + + res_64[0] = __cmpccxadd_epi64 (&srcdest1_64[0], srcdest2_64[0], src3_64[0], _CMPCCX_O); + res_64[1] = __cmpccxadd_epi64 (&srcdest1_64[1], srcdest2_64[1], src3_64[1], _CMPCCX_NO); + res_64[2] = __cmpccxadd_epi64 (&srcdest1_64[2], srcdest2_64[2], src3_64[2], _CMPCCX_B); + res_64[3] = __cmpccxadd_epi64 (&srcdest1_64[3], srcdest2_64[3], src3_64[3], _CMPCCX_NB); + res_64[4] = __cmpccxadd_epi64 (&srcdest1_64[4], srcdest2_64[4], src3_64[4], _CMPCCX_Z); + res_64[5] = __cmpccxadd_epi64 (&srcdest1_64[5], srcdest2_64[5], src3_64[5], _CMPCCX_NZ); + res_64[6] = __cmpccxadd_epi64 (&srcdest1_64[6], srcdest2_64[6], src3_64[6], _CMPCCX_BE); + res_64[7] = __cmpccxadd_epi64 (&srcdest1_64[7], srcdest2_64[7], src3_64[7], _CMPCCX_NBE); + res_64[8] = __cmpccxadd_epi64 (&srcdest1_64[8], srcdest2_64[8], src3_64[8], _CMPCCX_S); + res_64[9] = __cmpccxadd_epi64 (&srcdest1_64[9], srcdest2_64[9], src3_64[9], _CMPCCX_NS); + res_64[10] = __cmpccxadd_epi64 (&srcdest1_64[10], srcdest2_64[10], src3_64[10], _CMPCCX_P); + res_64[11] = __cmpccxadd_epi64 (&srcdest1_64[11], srcdest2_64[11], src3_64[11], _CMPCCX_NP); + res_64[12] = __cmpccxadd_epi64 (&srcdest1_64[12], srcdest2_64[12], src3_64[12], _CMPCCX_L); + res_64[13] = __cmpccxadd_epi64 (&srcdest1_64[13], srcdest2_64[13], src3_64[13], _CMPCCX_NL); + res_64[14] = __cmpccxadd_epi64 (&srcdest1_64[14], srcdest2_64[14], src3_64[14], _CMPCCX_LE); + res_64[15] = __cmpccxadd_epi64 (&srcdest1_64[15], srcdest2_64[15], src3_64[15], _CMPCCX_NLE); + + for (int i = 0; i < 16; i++) + { + if ((srcdest1[i] != _srcdest1[i]) || (res[i] != _srcdest2[i])) + abort(); + if ((srcdest1_64[i] != _srcdest1_64[i]) || (res_64[i] != _srcdest2_64[i])) + abort(); + } + + return 0; +}