Message ID | 20230320212012.12704-1-ubizjak@gmail.com |
---|---|
State | New |
Headers |
From: Uros Bizjak <ubizjak@gmail.com> To: x86@kernel.org, linux-acpi@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Uros Bizjak <ubizjak@gmail.com>, "Rafael J. Wysocki" <rafael@kernel.org>, Len Brown <lenb@kernel.org>, Thomas Gleixner <tglx@linutronix.de>, Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>, Dave Hansen <dave.hansen@linux.intel.com>, "H. Peter Anvin" <hpa@zytor.com> Subject: [PATCH v2] x86/ACPI/boot: Improve __acpi_acquire_global_lock Date: Mon, 20 Mar 2023 22:20:12 +0100 Message-Id: <20230320212012.12704-1-ubizjak@gmail.com> X-Mailer: git-send-email 2.39.2 List-ID: <linux-kernel.vger.kernel.org> |
Series |
[v2] x86/ACPI/boot: Improve __acpi_acquire_global_lock
|
|
Commit Message
Uros Bizjak
March 20, 2023, 9:20 p.m. UTC
Improve __acpi_acquire_global_lock by using a temporary variable.
This enables the compiler to perform if-conversion and improves the
generated code from:
...
72a: d1 ea shr %edx
72c: 83 e1 fc and $0xfffffffc,%ecx
72f: 83 e2 01 and $0x1,%edx
732: 09 ca or %ecx,%edx
734: 83 c2 02 add $0x2,%edx
737: f0 0f b1 17 lock cmpxchg %edx,(%rdi)
73b: 75 e9 jne 726 <__acpi_acquire_global_lock+0x6>
73d: 83 e2 03 and $0x3,%edx
740: 31 c0 xor %eax,%eax
742: 83 fa 03 cmp $0x3,%edx
745: 0f 95 c0 setne %al
748: f7 d8 neg %eax
to:
...
72a: d1 e9 shr %ecx
72c: 83 e2 fc and $0xfffffffc,%edx
72f: 83 e1 01 and $0x1,%ecx
732: 09 ca or %ecx,%edx
734: 83 c2 02 add $0x2,%edx
737: f0 0f b1 17 lock cmpxchg %edx,(%rdi)
73b: 75 e9 jne 726 <__acpi_acquire_global_lock+0x6>
73d: 8d 41 ff lea -0x1(%rcx),%eax
BTW: the compiler could generate:
lea 0x2(%rcx,%rdx,1),%edx
instead of:
or %ecx,%edx
add $0x2,%edx
but the unwanted conversion from add to or when bits are known to be zero
prevents this improvement. This is GCC PR108477.
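The add-versus-or remark can be illustrated with a small userspace sketch. It assumes only the arithmetic shown in the patch; the helper names (next_or_add, next_single_sum, check_or_equals_add) are hypothetical and exist purely for illustration. Because the masked word has its two low bits cleared and the shifted bit is 0 or 1, the two operands share no set bits, so or and add give the same result, and a single three-operand sum (which is what the lea form would compute) is equivalent:

```c
#include <assert.h>

/* Hypothetical helper: the "or, then add 2" shape from the listing. */
static unsigned int next_or_add(unsigned int old)
{
	unsigned int hi = old & ~0x3u;		/* low two bits cleared */
	unsigned int bit = (old >> 1) & 0x1;	/* always 0 or 1 */

	return (hi | bit) + 2;
}

/* Hypothetical helper: one sum of both operands plus the constant 2,
 * which is what a single lea could compute. */
static unsigned int next_single_sum(unsigned int old)
{
	unsigned int hi = old & ~0x3u;
	unsigned int bit = (old >> 1) & 0x1;

	return hi + bit + 2;
}

/* The operands never overlap, so both forms agree for every input
 * tried here (all low-bit patterns with a range of upper bits). */
static void check_or_equals_add(void)
{
	for (unsigned int old = 0; old < 1024; old++)
		assert(next_or_add(old) == next_single_sum(old));
}
```

Calling check_or_equals_add() from a test harness should complete without tripping an assertion.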
No functional change intended.
Signed-off-by: Uros Bizjak <ubizjak@gmail.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Len Brown <lenb@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
---
v2: Expand return statement.
---
arch/x86/kernel/acpi/boot.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
Comments
On Mon, Mar 20, 2023 at 10:20 PM Uros Bizjak <ubizjak@gmail.com> wrote:
>
> Improve __acpi_acquire_global_lock by using a temporary variable.
[...]
> ---
> v2: Expand return statement.

Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

or please let me know if you want me to pick this up (in which case it
will require an ACK from one of the x86 maintainers).
On 3/22/23 11:24, Rafael J. Wysocki wrote:
> Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> or please let me know if you want me to pick this up (in which case it
> will require an ACK from one of the x86 maintainers).

I'll pull it into x86/acpi. I'm kinda shocked the compiler is so
clueless, but this makes the C code more readable anyway. Win/win, I guess.
On Wed, Mar 22, 2023 at 7:34 PM Dave Hansen <dave.hansen@intel.com> wrote:
>
> On 3/22/23 11:24, Rafael J. Wysocki wrote:
> > Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > or please let me know if you want me to pick this up (in which case it
> > will require an ACK from one of the x86 maintainers).
>
> I'll pull it into x86/acpi. I'm kinda shocked the compiler is so
> clueless, but this makes the C code more readable anyway. Win/win, I guess.

Please note that the return value from __acpi_{acquire,release}_global_lock
is actually used as a bool:

acenv.h:

int __acpi_acquire_global_lock(unsigned int *lock);
int __acpi_release_global_lock(unsigned int *lock);

#define ACPI_ACQUIRE_GLOBAL_LOCK(facs, Acq) \
        ((Acq) = __acpi_acquire_global_lock(&facs->global_lock))

#define ACPI_RELEASE_GLOBAL_LOCK(facs, Acq) \
        ((Acq) = __acpi_release_global_lock(&facs->global_lock))

evglock.c:

acpi_status acpi_ev_acquire_global_lock(u16 timeout)
{
...
        u8 acquired = FALSE;
...
        ACPI_ACQUIRE_GLOBAL_LOCK(acpi_gbl_FACS, acquired);
        if (acquired)
...
}

acpi_status acpi_ev_release_global_lock(void)
{
        u8 pending = FALSE;
...
        ACPI_RELEASE_GLOBAL_LOCK(acpi_gbl_FACS, pending);
        if (pending)
...
}

These functions are also defined for ia64, so I didn't want to change
the return value. But ia64 is going to be retired, and this opens up
an optimization opportunity.

Uros.
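To make the "used as bool" point concrete, here is a minimal userspace sketch of what happens when the int return value is stored into a u8 and then tested. The typedef and the helper names (treated_as_acquired, check_truthiness) are assumptions for illustration, not kernel code. Converting -1 to an 8-bit unsigned type yields 0xff, so the caller only ever distinguishes zero from non-zero:

```c
#include <assert.h>

typedef unsigned char u8;	/* stand-in for the kernel's u8 */

/* Hypothetical stand-in for the macro's assignment followed by the
 * caller's "if (acquired)" test. */
static int treated_as_acquired(int ret)
{
	u8 acquired = (u8)ret;	/* -1 becomes 0xff, 0 stays 0 */

	return acquired != 0;
}

/* -1 (lock acquired) reads as true, 0 (lock pending) reads as false. */
static void check_truthiness(void)
{
	assert(treated_as_acquired(-1));
	assert(!treated_as_acquired(0));
}
```

Only the truth value survives the assignment, which is consistent with the remark that the return type could become a bool once the ia64 copy is gone.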
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 1c38174b5f01..a08a4a7a03f8 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -1853,13 +1853,18 @@ early_param("acpi_sci", setup_acpi_sci);
 
 int __acpi_acquire_global_lock(unsigned int *lock)
 {
-	unsigned int old, new;
+	unsigned int old, new, val;
 
 	old = READ_ONCE(*lock);
 	do {
-		new = (((old & ~0x3) + 2) + ((old >> 1) & 0x1));
+		val = (old >> 1) & 0x1;
+		new = (old & ~0x3) + 2 + val;
 	} while (!try_cmpxchg(lock, &old, new));
-	return ((new & 0x3) < 3) ? -1 : 0;
+
+	if (val)
+		return 0;
+
+	return -1;
 }
 
 int __acpi_release_global_lock(unsigned int *lock)
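
The "no functional change" claim can be checked outside the kernel with a rough sketch. It assumes C11 atomics as a stand-in for READ_ONCE() and try_cmpxchg() (atomic_compare_exchange_weak also refreshes the expected value on failure); the function names acquire_sketch, ret_before_patch and check_return_forms_agree are hypothetical. The old return expression looks at the low two bits of the new word, the patched one looks at bit 1 of the old word, and the two agree because the new low bits are always 2 plus that bit:

```c
#include <assert.h>
#include <stdatomic.h>

/* Userspace stand-in for the patched function; not the kernel code. */
static int acquire_sketch(_Atomic unsigned int *lock)
{
	unsigned int old, next, val;

	old = atomic_load(lock);
	do {
		val = (old >> 1) & 0x1;		/* bit 1 of the old word */
		next = (old & ~0x3u) + 2 + val;	/* low bits become 2 or 3 */
	} while (!atomic_compare_exchange_weak(lock, &old, next));

	if (val)
		return 0;

	return -1;
}

/* Return expression used before the patch, for a given old word. */
static int ret_before_patch(unsigned int old)
{
	unsigned int next = (((old & ~0x3u) + 2) + ((old >> 1) & 0x1));

	return ((next & 0x3) < 3) ? -1 : 0;
}

/* Compare both return forms over all low-bit patterns and a range of
 * upper bits; the patched form is recomputed inline. */
static void check_return_forms_agree(void)
{
	for (unsigned int old = 0; old < 1024; old++) {
		unsigned int val = (old >> 1) & 0x1;

		assert(ret_before_patch(old) == (val ? 0 : -1));
	}
}
```

Starting from a zeroed lock word, a first call to acquire_sketch() returns -1 and leaves the word at 2; a second call returns 0 and leaves it at 3.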