Message ID | f911f018-ce39-470a-9c56-fc7e596db8f2@baylibre.com |
---|---|
State | Accepted |
Headers |
Message-ID: <f911f018-ce39-470a-9c56-fc7e596db8f2@baylibre.com> Date: Mon, 29 Jan 2024 11:34:05 +0100 To: gcc-patches <gcc-patches@gcc.gnu.org>, Andrew Stubbs <ams@baylibre.com> From: Tobias Burnus <tburnus@baylibre.com> Subject: [patch] gcn/gcn-valu.md: Disable fold_left_plus for TARGET_RDNA2_PLUS [PR113615] Cc: Richard Biener <rguenther@suse.de> List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org> List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> |
Series |
gcn/gcn-valu.md: Disable fold_left_plus for TARGET_RDNA2_PLUS [PR113615]
|
|
Checks
Context | Check | Description |
---|---|---|
snail/gcc-patch-check | success | Github commit url |
Commit Message
Tobias Burnus
Jan. 29, 2024, 10:34 a.m. UTC
Andrew wrote off list:
  "Vector reductions don't work on RDNA, as is, but they're
   supposed to be disabled by the insn condition"

This patch disables "fold_left_plus_<mode>", which is about
vectorization and in the code path shown in the backtrace.
I can also confirm manually that it fixes the ICE I saw and
also the ICE for the testfile that Richard's PR shows at the
end of his backtrace. (-O3 is needed to trigger the ICE.)

OK for mainline?

Tobias

* * *

PS: We could add testcase(s) that is/are explicitly compiled with
gfx1100 and/or gfx1030 + '-O3' to ensure that this gets tested
with AMDGPU enabled, but I am not sure whether it is really worthwhile.

PPS: Running the testsuite, I see the following fails with
gfx1100 offloading:

FAIL: libgomp.c/../libgomp.c-c++-common/for-5.c (test for excess errors)
Excess errors:
/tmp/ccrsHfVQ.mkoffload.2.s:788736:27: error: value out of range
.amdhsa_next_free_vgpr 516
                       ^~~
[Obviously, likewise for libgomp.c++/../libgomp.c-c++-common/for-5.c]

FAIL: libgomp.c/pr104783-2.c execution test
FAIL: libgomp.c/pr104783.c execution test
(The .log unfortunately does not show more details)

FAIL: libgomp.fortran/optional-map.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
FAIL: libgomp.fortran/optional-map.f90 -O3 -g (test for excess errors)
FAIL: libgomp.fortran/target1.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
FAIL: libgomp.fortran/target1.f90 -O3 -g (test for excess errors)
Same 'out of range' as above.

* * *

Manual testing shows for the two execution fails:
Memory access fault by GPU node-1 (Agent handle: 0x8d1aa0) on address (nil). Reason: Page not present or supervisor privilege.

Interestingly, it only fails with -O1 or higher, for -O0 it works.

Tobias
Comments
On 29/01/2024 10:34, Tobias Burnus wrote:
> Andrew wrote off list:
> "Vector reductions don't work on RDNA, as is, but they're
> supposed to be disabled by the insn condition"
>
> This patch disables "fold_left_plus_<mode>", which is about
> vectorization and in the code path shown in the backtrace.
> I can also confirm manually that it fixes the ICE I saw and
> also the ICE for the testfile that Richard's PR shows at the
> end of his backtrace. (-O3 is needed to trigger the ICE.)
>
> OK for mainline?

OK.

> Tobias
>
> * * *
>
> PS: We could add testcase(s) that is/are explicitly compiled with
> gfx1100 and/or gfx1030 + '-O3' to ensure that this gets tested
> with AMDGPU enabled, but I am not sure whether it is really worthwhile.
>
>
> PPS: Running the testsuite, I see the following fails with
> gfx1100 offloading:
>
> FAIL: libgomp.c/../libgomp.c-c++-common/for-5.c (test for excess errors)
> Excess errors:
> /tmp/ccrsHfVQ.mkoffload.2.s:788736:27: error: value out of range
> .amdhsa_next_free_vgpr 516
> ^~~ [Obviously, likewise
> for libgomp.c++/../libgomp.c-c++-common/for-5.c]
> FAIL: libgomp.c/pr104783-2.c execution test FAIL: libgomp.c/pr104783.c
> execution test (The .log unfortunately does not show more details)
> FAIL: libgomp.fortran/optional-map.f90 -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions (test for
> excess errors) FAIL: libgomp.fortran/optional-map.f90 -O3 -g (test for
> excess errors) FAIL: libgomp.fortran/target1.f90 -O3
> -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
> -finline-functions (test for excess errors) FAIL:
> libgomp.fortran/target1.f90 -O3 -g (test for excess errors) Same 'out
> of range' as above. * * * Manual testing shows for the two execution
> fails: Memory access fault by GPU node-1 (Agent handle: 0x8d1aa0) on
> address (nil). Reason: Page not present or supervisor privilege.
> Interestingly, it only fails with -O1 or higher, for -O0 it works.
> Tobias

Hmm, supposedly there are 768 registers allocated in groups of 12, on
gfx1100 (8 on other devices), which number you have to double on
wavefrontsize64 because that field actually counts the number of
32-lane registers. The ISA can only actually reference 256 registers,
so the limit here should be 512. (The remaining registers are intended
for other wavefronts to use.)

But 256 is not divisible by 12, and it looks like we've rounded up. I
guess we need to set the limit at 252 (504), for gfx1100.

for-5.c is a register allocation nightmare!

Andrew
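The arithmetic in the reply above can be checked with a small sketch. It assumes, as the message suggests, that the `.amdhsa_next_free_vgpr` value counts 32-lane registers (so it is doubled for wavefrontsize64) and that it gets rounded to the allocation granule. The helper names are hypothetical and this is not the code GCC uses.

```python
# Illustrative only: helper names are hypothetical, not taken from GCC.

ISA_VGPRS = 256        # registers the ISA can reference (per the message)
WAVE64_FACTOR = 2      # the field counts 32-lane registers, so wave64 doubles it
FIELD_LIMIT = ISA_VGPRS * WAVE64_FACTOR


def round_up(n, granule):
    """Round n up to the next multiple of granule."""
    return -(-n // granule) * granule


def round_down(n, granule):
    """Round n down to the previous multiple of granule."""
    return (n // granule) * granule


# gfx1100 allocates in groups of 12: rounding the limit up overshoots it,
# which is consistent with the rejected value in the assembler error.
rounded_up_gfx1100 = round_up(FIELD_LIMIT, 12)

# Rounding down instead yields the limit proposed in the message.
safe_limit_gfx1100 = round_down(FIELD_LIMIT, 12)

# With a granule of 8 (other devices, per the message) the limit is already aligned.
rounded_up_other = round_up(FIELD_LIMIT, 8)

print(rounded_up_gfx1100, safe_limit_gfx1100,
      safe_limit_gfx1100 // WAVE64_FACTOR, rounded_up_other)
```

Under these assumptions, rounding up reproduces the 516 from the error message, while rounding down gives 504, i.e. 252 wavefrontsize64 registers.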
Andrew Stubbs wrote:
>> /tmp/ccrsHfVQ.mkoffload.2.s:788736:27: error: value out of range
>> .amdhsa_next_free_vgpr 516
>> ^~~ [Obviously, likewise
>> for libgomp.c++/..
> Hmm, supposedly there are 768 registers allocated in groups of 12, on
> gfx1100 (8 on other devices), which number you have to double on
> wavefrontsize64 because that field actually counts the number of
> 32-lane registers. The ISA can only actually reference 256 registers,
> so the limit here should be 512. (The remaining registers are intended
> for other wavefronts to use.)
>
> But 256 is not divisible by 12, and it looks like we've rounded up. I
> guess we need to set the limit at 252 (504), for gfx1100.

BTW: The LLVM source code has,
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp#L1066

unsigned getTotalNumVGPRs(const MCSubtargetInfo *STI) {
  if (STI->getFeatureBits().test(FeatureGFX90AInsts))
    return 512;
  if (!isGFX10Plus(*STI))
    return 256;
  bool IsWave32 = STI->getFeatureBits().test(FeatureWavefrontSize32);
  if (STI->getFeatureBits().test(FeatureGFX11FullVGPRs))
    return IsWave32 ? 1536 : 768;
  return IsWave32 ? 1024 : 512;
}

Tobias
On 29/01/2024 12:50, Tobias Burnus wrote:
> Andrew Stubbs wrote:
>>> /tmp/ccrsHfVQ.mkoffload.2.s:788736:27: error: value out of range
>>> .amdhsa_next_free_vgpr 516
>>> ^~~ [Obviously, likewise
>>> for libgomp.c++/..
>> Hmm, supposedly there are 768 registers allocated in groups of 12, on
>> gfx1100 (8 on other devices), which number you have to double on
>> wavefrontsize64 because that field actually counts the number of
>> 32-lane registers. The ISA can only actually reference 256 registers,
>> so the limit here should be 512. (The remaining registers are intended
>> for other wavefronts to use.)
>>
>> But 256 is not divisible by 12, and it looks like we've rounded up. I
>> guess we need to set the limit at 252 (504), for gfx1100.
>
> BTW: The LLVM source code has,
> https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.cpp#L1066
>
> unsigned getTotalNumVGPRs(const MCSubtargetInfo *STI) {
>   if (STI->getFeatureBits().test(FeatureGFX90AInsts))
>     return 512;
>   if (!isGFX10Plus(*STI))
>     return 256;
>   bool IsWave32 = STI->getFeatureBits().test(FeatureWavefrontSize32);
>   if (STI->getFeatureBits().test(FeatureGFX11FullVGPRs))
>     return IsWave32 ? 1536 : 768;
>   return IsWave32 ? 1024 : 512;
> }

That matches what we have in libgomp.

LLVM must have another configuration somewhere for how many registers it
can actually use in code (the ISA can encode 256, but that doesn't mean
it should always do so). This may be a moot point because allowing too
many registers limits how many threads can run in parallel, so they may
have chosen to impose an artificial limit at all times.

In GCC, non-kernel functions are limited to 24 registers (for maximum
occupancy -- we could probably increase that 50% on "GFX11Full"
devices), but the kernel entry point is permitted to go crazy.

Andrew
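The occupancy trade-off mentioned in this reply can be illustrated with a deliberately simplified model. It assumes that the register totals are the non-wave32 values returned by the quoted `getTotalNumVGPRs()` and that the 24-register limit is counted in the same units; it ignores allocation granules and any hardware cap on resident wavefronts. The function and variable names are hypothetical.

```python
# Illustrative only: a simplified occupancy model with hypothetical names.
# It ignores allocation granules and hardware limits on resident wavefronts.

def waves_that_fit(total_vgprs, vgprs_per_wave):
    """Number of wavefronts whose register sets fit in the register file."""
    return total_vgprs // vgprs_per_wave


# Non-wave32 totals as returned by the quoted LLVM getTotalNumVGPRs().
TOTAL_GFX10_PLUS = 512
TOTAL_GFX11_FULL = 768

current_limit = 24                       # limit for non-kernel functions (per the message)
raised_limit = current_limit * 3 // 2    # the suggested 50% increase

baseline = waves_that_fit(TOTAL_GFX10_PLUS, current_limit)
full_same_limit = waves_that_fit(TOTAL_GFX11_FULL, current_limit)
full_raised_limit = waves_that_fit(TOTAL_GFX11_FULL, raised_limit)

# A kernel that uses every register the ISA can encode leaves room for few wavefronts.
greedy_kernel = waves_that_fit(TOTAL_GFX10_PLUS, 256)

print(baseline, full_same_limit, full_raised_limit, greedy_kernel)
```

In this model, raising the limit by 50% on a device with 50% more registers leaves the number of wavefronts that fit unchanged, which is the reasoning behind the parenthetical remark; real occupancy depends on further hardware details not modelled here.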
gcn/gcn-valu.md: Disable fold_left_plus for TARGET_RDNA2_PLUS [PR113615]

gcc/ChangeLog:

	PR target/113615
	* config/gcn/gcn-valu.md (fold_left_plus_<mode>): Only
	define for !TARGET_RDNA2_PLUS.

Signed-off-by: Tobias Burnus <tburnus@baylibre.com>

 gcc/config/gcn/gcn-valu.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index cd027f8b369..23b441f8e8b 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -4274,7 +4274,8 @@ (define_expand "fold_left_plus_<mode>"
  [(match_operand:<SCALAR_MODE> 0 "register_operand")
   (match_operand:<SCALAR_MODE> 1 "gcn_alu_operand")
   (match_operand:V_FP 2 "gcn_alu_operand")]
-  "can_create_pseudo_p ()
+  "!TARGET_RDNA2_PLUS
+   && can_create_pseudo_p ()
    && (flag_openacc || flag_openmp || flag_associative_math)"
   {
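
For readers unfamiliar with the pattern being disabled: as far as I understand the GCC internals documentation, `fold_left_plus_<mode>` takes a scalar start value and adds the elements of a vector to it one after another, in order. The sketch below only illustrates why the order can matter for floating point and why the pattern's condition mentions `flag_associative_math`. It is plain Python with hypothetical function names, not a model of the GCN expander.

```python
# Illustrative only: shows in-order versus reassociated floating-point sums.
from functools import reduce


def fold_left_plus(start, vector):
    """Strictly in-order sum: ((start + v[0]) + v[1]) + ..."""
    return reduce(lambda acc, x: acc + x, vector, start)


def tree_sum(vector):
    """Pairwise (tree-shaped) sum, one way a reassociated reduction may proceed."""
    if len(vector) == 1:
        return vector[0]
    mid = len(vector) // 2
    return tree_sum(vector[:mid]) + tree_sum(vector[mid:])


def reassociated_plus(start, vector):
    """Reduce the vector as a tree first, then add the start value."""
    return start + tree_sum(vector)


values = [1e16, 1.0, -1e16, 1.0]

in_order = fold_left_plus(0.0, values)
reordered = reassociated_plus(0.0, values)

print(in_order, reordered)
```

With these inputs the two orders give different results, because adding 1.0 to a value of magnitude 1e16 is lost to rounding in double precision. A target that implements the in-order pattern with a reassociating reduction therefore needs the user to have opted in to reordering.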