From patchwork Thu Mar 2 16:01:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xi Ruoyao
X-Patchwork-Id: 63471
To: gcc-patches@gcc.gnu.org
Cc: WANG Xuerui, Lulu Cheng, Chenghua Xu, Xi Ruoyao
Subject: [PATCH] LoongArch: Stop -mfpu from silently breaking ABI
Date: Fri, 3 Mar 2023 00:01:22 +0800
Message-Id: <20230302160122.47573-1-xry111@xry111.site>
X-Mailer: git-send-email 2.39.2
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Xi Ruoyao via Gcc-patches
From: Xi Ruoyao
Reply-To: Xi Ruoyao

In the toolchain convention, we describe -mfpu= as:

"Selects the allowed set of basic floating-point instructions and
registers.  This option should not change the FP calling convention
unless it's necessary."

Though not explicitly stated, the rationale of this rule is to allow
combinations like "-mabi=lp64s -mfpu=64".
This is useful for running applications with the LP64S/F ABI on
double-float-capable LoongArch hardware and using a math library built
for the LP64S/F ABI but with native double-float hardware instructions,
for better performance.

A case in the Linux kernel has now proven the usefulness of this kind
of combination again.  The AMDGPU DCN kernel driver needs to perform
some floating-point operations, but the entire kernel uses the LP64S
ABI.  So the translation units of the AMDGPU DCN driver need to be
compiled with -mfpu=64 (the kernel lacks the soft-FP routines from
libgcc), but also with -mabi=lp64s (or they can't be linked with the
rest of the kernel).

Unfortunately, GCC currently uses TARGET_{HARD,SOFT,DOUBLE}_FLOAT to
determine the floating-point calling convention.  This makes "-mfpu=64"
silently allow using $fa* to pass parameters and return values EVEN IF
-mabi=lp64s is used.  To make things worse, the generated object file
has SOFT-FLOAT set in the eflags field, so the linker will happily link
it with other LP64S ABI object files, but obviously this will lead to
bad results at runtime.

The fix is simple: use TARGET_*_FLOAT_ABI instead.  But then it causes
"-mabi=lp64s -march=loongarch64" to generate code like:

	movgr2fr.d	$fa0, $a0
	frecip.d	$fa0, $fa0
	movfr2gr.d	$a0, $fa0

The problem here is that "loongarch64" is never strictly defined.  So
we consider "loongarch64" a "64-bit LoongArch CPU with the simplest FPU
needed by the ABI", and if -march=loongarch64 is used but -mfpu is not
given explicitly, we set -mfpu to that simplest FPU.

I consider this a bug fix: the behavior difference from the toolchain
convention doc is a bug, and generating object files with the
SOFT-FLOAT flag but parameters/return values passed through FPRs is
definitely a bug.

Bootstrapped and regtested on loongarch64-linux-gnu.  Ok for trunk?
I'm not sure if it's a good idea to backport this into gcc-12 though.

gcc/ChangeLog:

	* config/loongarch/loongarch.h (FP_RETURN): Use
	TARGET_*_FLOAT_ABI instead of TARGET_*_FLOAT.
	(UNITS_PER_FP_ARG): Likewise.
	* config/loongarch/loongarch-opts.cc (loongarch_config_target):
	If -march=loongarch64 and -mfpu not explicitly used, guess FPU
	capability from ABI.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/flt-abi-isa-1.c: New test.
	* gcc.target/loongarch/flt-abi-isa-2.c: New test.
	* gcc.target/loongarch/flt-abi-isa-3.c: New test.
	* gcc.target/loongarch/flt-abi-isa-4.c: New test.
	* gcc.target/loongarch/flt-abi-isa-5.c: New test.
	* gcc.target/loongarch/flt-abi-isa-6.c: New test.
	* gcc.target/loongarch/flt-abi-isa-7.c: New test.
	* gcc.target/loongarch/flt-abi-isa-8.c: New test.
	* gcc.target/loongarch/flt-abi-isa-9.c: New test.
	* gcc.target/loongarch/flt-abi-isa-10.c: New test.
---
 gcc/config/loongarch/loongarch-opts.cc        | 18 ++++++++++++++++++
 gcc/config/loongarch/loongarch.h              |  4 ++--
 .../gcc.target/loongarch/flt-abi-isa-1.c      | 12 ++++++++++++
 .../gcc.target/loongarch/flt-abi-isa-10.c     |  7 +++++++
 .../gcc.target/loongarch/flt-abi-isa-2.c      | 11 +++++++++++
 .../gcc.target/loongarch/flt-abi-isa-3.c      | 11 +++++++++++
 .../gcc.target/loongarch/flt-abi-isa-4.c      | 12 ++++++++++++
 .../gcc.target/loongarch/flt-abi-isa-5.c      |  7 +++++++
 .../gcc.target/loongarch/flt-abi-isa-6.c      | 11 +++++++++++
 .../gcc.target/loongarch/flt-abi-isa-7.c      |  5 +++++
 .../gcc.target/loongarch/flt-abi-isa-8.c      |  7 +++++++
 .../gcc.target/loongarch/flt-abi-isa-9.c      |  7 +++++++
 12 files changed, 110 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-10.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-5.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-6.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-7.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-8.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/flt-abi-isa-9.c

diff --git a/gcc/config/loongarch/loongarch-opts.cc b/gcc/config/loongarch/loongarch-opts.cc
index a52e25236ea..bea77da93e9 100644
--- a/gcc/config/loongarch/loongarch-opts.cc
+++ b/gcc/config/loongarch/loongarch-opts.cc
@@ -251,6 +251,24 @@ config_target_isa:
     ((t.cpu_arch == CPU_NATIVE && constrained.arch) ?
      t.isa.fpu : DEFAULT_ISA_EXT_FPU);
 
+  /* "loongarch64" is not really strictly defined: which FPU does it have?
+     So if -march=loongarch64 and -mfpu not explicitly provided, use the
+     minimal -mfpu setting suitable for the ABI.  */
+  if (t.cpu_arch == CPU_LOONGARCH64 && !constrained.fpu)
+    switch (t.abi.base)
+      {
+      case ABI_BASE_LP64D:
+	t.isa.fpu = ISA_EXT_FPU64;
+	break;
+      case ABI_BASE_LP64F:
+	t.isa.fpu = ISA_EXT_FPU32;
+	break;
+      case ABI_BASE_LP64S:
+	t.isa.fpu = ISA_EXT_NOFPU;
+	break;
+      default:
+	gcc_unreachable ();
+      }
 
   /* 4.  ABI-ISA compatibility */
   /* Note:
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index f4e903d46bb..f8167875646 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -676,7 +676,7 @@ enum reg_class
    point values.  */
 
 #define GP_RETURN (GP_REG_FIRST + 4)
-#define FP_RETURN ((TARGET_SOFT_FLOAT) ? GP_RETURN : (FP_REG_FIRST + 0))
+#define FP_RETURN ((TARGET_SOFT_FLOAT_ABI) ? GP_RETURN : (FP_REG_FIRST + 0))
 
 #define MAX_ARGS_IN_REGISTERS 8
 
@@ -1154,6 +1154,6 @@ struct GTY (()) machine_function
 /* The largest type that can be passed in floating-point registers.  */
 /* TODO: according to mabi.  */
 #define UNITS_PER_FP_ARG \
-  (TARGET_HARD_FLOAT ? (TARGET_DOUBLE_FLOAT ? 8 : 4) : 0)
+  (TARGET_HARD_FLOAT_ABI ? (TARGET_DOUBLE_FLOAT_ABI ? 8 : 4) : 0)
 
 #define FUNCTION_VALUE_REGNO_P(N) ((N) == GP_RETURN || (N) == FP_RETURN)
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c
new file mode 100644
index 00000000000..ab1c357d98c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64s -march=loongarch64 -O2" } */
+/* { dg-final { scan-assembler-not "frecip\\.d" } } */
+
+/* With the "default" -march=loongarch64, -mabi=lp64s implies -mfpu=0 so
+   we won't puzzle people.  */
+
+double
+t (double x)
+{
+  return 1.0 / x;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-10.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-10.c
new file mode 100644
index 00000000000..49d2f4ec267
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-10.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64s -march=la464 -O2" } */
+/* { dg-final { scan-assembler "frecip\\.s" } } */
+/* { dg-final { scan-assembler "movgr2fr\\.w" } } */
+/* { dg-final { scan-assembler "movfr2gr\\.s" } } */
+
+#include "flt-abi-isa-6.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c
new file mode 100644
index 00000000000..d248cc546f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-2.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64s -march=loongarch64 -O2 -mfpu=64" } */
+/* { dg-final { scan-assembler "frecip\\.d" } } */
+/* { dg-final { scan-assembler "movgr2fr\\.d" } } */
+/* { dg-final { scan-assembler "movfr2gr\\.d" } } */
+
+/* With -mabi=lp64s and -mfpu=64, we can use the FPU to calculate the
+   answer but we need to move the argument from a0 to a FPR, then move
+   the answer from a FPR back to a0.  */
+
+#include "flt-abi-isa-1.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c
new file mode 100644
index 00000000000..e31a1d1fbc4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-3.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64s -march=la464 -O2" } */
+/* { dg-final { scan-assembler "frecip\\.d" } } */
+/* { dg-final { scan-assembler "movgr2fr\\.d" } } */
+/* { dg-final { scan-assembler "movfr2gr\\.d" } } */
+
+/* We know LA464 has a 64-bit FPU, so we can use it to calculate the
+   answer but we need to move the argument from a0 to a FPR, then move
+   the answer from a FPR back to a0.  */
+
+#include "flt-abi-isa-1.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c
new file mode 100644
index 00000000000..398e6b56ab5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-4.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64f -march=loongarch64 -O2 -mfpu=64" } */
+/* { dg-final { scan-assembler "frecip\\.d" } } */
+/* { dg-final { scan-assembler "movgr2fr\\.d" } } */
+/* { dg-final { scan-assembler "movfr2gr\\.d" } } */
+
+/* With -mabi=lp64f and -mfpu=64, we can use the FPU to calculate the
+   answer but we need to move the argument from a0 to a FPR, then move
+   the answer from a FPR back to a0 as the LP64F ABI mandates passing
+   double values via GPR (like LP64S).  */
+
+#include "flt-abi-isa-1.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-5.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-5.c
new file mode 100644
index 00000000000..d7db62de344
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-5.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64s -march=la464 -O2 -mfpu=none" } */
+/* { dg-final { scan-assembler-not "frecip\\.d" } } */
+
+/* Explicitly disable FPU on LA464.  */
+
+#include "flt-abi-isa-1.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-6.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-6.c
new file mode 100644
index 00000000000..9e204c6cd7c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-6.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64f -march=loongarch64 -O2" } */
+/* { dg-final { scan-assembler "frecip\\.s" } } */
+/* { dg-final { scan-assembler-not "movgr2fr" } } */
+/* { dg-final { scan-assembler-not "movfr2gr" } } */
+
+float
+t (float x)
+{
+  return 1.0 / x;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-7.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-7.c
new file mode 100644
index 00000000000..dc9f2a322bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-7.c
@@ -0,0 +1,5 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64s -march=loongarch64 -O2" } */
+/* { dg-final { scan-assembler-not "frecip\\.s" } } */
+
+#include "flt-abi-isa-6.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-8.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-8.c
new file mode 100644
index 00000000000..001b034be9a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-8.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64s -march=loongarch64 -O2 -mfpu=32" } */
+/* { dg-final { scan-assembler "frecip\\.s" } } */
+/* { dg-final { scan-assembler "movgr2fr\\.w" } } */
+/* { dg-final { scan-assembler "movfr2gr\\.s" } } */
+
+#include "flt-abi-isa-6.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-9.c b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-9.c
new file mode 100644
index 00000000000..5762294207e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/flt-abi-isa-9.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-mabi=lp64s -march=loongarch64 -O2 -mfpu=64" } */
+/* { dg-final { scan-assembler "frecip\\.s" } } */
+/* { dg-final { scan-assembler "movgr2fr\\.w" } } */
+/* { dg-final { scan-assembler "movfr2gr\\.s" } } */
+
+#include "flt-abi-isa-6.c"
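
The two rules this patch relies on can be modeled outside GCC. The
following is only a minimal sketch, not part of the patch: the enum
names and the functions minimal_fpu_for_abi, units_per_fp_arg and
self_check are hypothetical stand-ins for GCC's ABI_BASE_* and
ISA_EXT_* values and for the logic in loongarch_config_target and
UNITS_PER_FP_ARG.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC's ABI_BASE_* and ISA_EXT_* values.  */
enum abi_base { LP64S, LP64F, LP64D };
enum fpu_isa { FPU_NONE, FPU_32, FPU_64 };

/* With -march=loongarch64 and no explicit -mfpu, choose the smallest
   FPU that the ABI needs.  */
enum fpu_isa
minimal_fpu_for_abi (enum abi_base abi)
{
  switch (abi)
    {
    case LP64D:
      return FPU_64;
    case LP64F:
      return FPU_32;
    default:
      return FPU_NONE;
    }
}

/* Size in bytes of the largest value passed in an FPR.  The FPU
   argument is ignored on purpose: only the ABI decides.  */
int
units_per_fp_arg (enum abi_base abi, enum fpu_isa fpu)
{
  (void) fpu;
  return abi == LP64D ? 8 : (abi == LP64F ? 4 : 0);
}

/* A few of the combinations discussed in the description above.  */
void
self_check (void)
{
  assert (minimal_fpu_for_abi (LP64S) == FPU_NONE);
  assert (units_per_fp_arg (LP64S, FPU_64) == 0);
  assert (units_per_fp_arg (LP64F, FPU_64) == 4);
}
```

In this model, "-mabi=lp64s -mfpu=64" corresponds to
units_per_fp_arg (LP64S, FPU_64), which yields 0: FP instructions may
still be used, but no value is passed in an FPR.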