From patchwork Mon Oct 16 11:09:11 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: tip-bot2 for Thomas Gleixner
X-Patchwork-Id: 153333
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a05:612c:2908:b0:403:3b70:6f57 with SMTP id ib8csp3379964vqb;
 Mon, 16 Oct 2023 04:09:47 -0700 (PDT)
X-Google-Smtp-Source: AGHT+IFh1WMZ8XlSS4+KNkavmwf30k9V6mA3b00VZH3SYwWu5/LL8rxt4NipRpSfgMIRWdDr6S9M
X-Received: by 2002:a92:cb41:0:b0:357:6783:73ce with SMTP id f1-20020a92cb41000000b00357678373cemr7162743ilq.0.1697454587725;
 Mon, 16 Oct 2023 04:09:47 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1697454587; cv=none;
 d=google.com; s=arc-20160816;
 b=FrZfeDwO/mv6dAH5P9BGj0zki5uWUyLcu2QVu5uHXSaMxPUKCNYYoz+5wK1aMJaNmr
 YhjUANgseJ1dTEafIsTYYuaO4QnPU0B1pptQeZCOVpB3userdoPz3iDzqrFxTV710b3k
 j3GJsfg8pD2WNVp22Q2qNkxC0mLZgI6Grclvyd35KzpMEjfaczRM66NNFCBX6nywPnbs
 kI3iqAwnCdE+YInG38sCWxso3DjzFcEXqD6mIHYKf2fTeFh3pkOjb+vkhX+GTb90qMbp
 UzeyQHpBmBZIsxKrNltUJEjw+H9AvpLsUzic6i3Pdo3gARsetTPBKlf7Nvc3iqMEJiXQ
 f6kQ==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
 h=list-id:precedence:content-transfer-encoding:robot-unsubscribe
 :robot-id:message-id:mime-version:references:in-reply-to:cc:subject
 :to:reply-to:sender:from:dkim-signature:dkim-signature:date;
 bh=DO/jszgKO0c4D8xdawhnlJ187SlWK3qG10hYY5gd9cs=;
 fh=bfBql1T9OpVEtRBKdZLOBsgHTAnzE7xhzIZCSh8LRKA=;
 b=CndqSaOGVZcwLOibPMD8irmMjkzJp2CTu85Ug3mK1oIFjEEUgBXbkUqNOrlbaHuwE1
 dpRMLkKAfhMbP5X5z+GpZTTkqoB6/DI0tSLkiydPhXa0nYuw4rKPb08WG935TSSiLWf4
 1hPhmdUBD1z2l78XIniZvpxlhvlOHT5RUyXT9rbofEry6ihKWVtXcytk+1a39cIWxnnA
 95IMBfjnwUS5tYFUjdpAFyRILiLdWJepFe5nRdRZzkavuTqf6XIM7znriztIe8MxnwPI
 Lqv2HO0goky4gfa/zG+b0x0t/wE0cmYbMFLxyGLA1txiZVBKapNHTEzxxR5yd2ix5mH1
 LzUw==
ARC-Authentication-Results: i=1; mx.google.com;
 dkim=pass header.i=@linutronix.de header.s=2020 header.b=jiTuDnk9;
 dkim=neutral (no key) header.i=@linutronix.de;
 spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org;
 dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de
Received: from howler.vger.email (howler.vger.email. [2620:137:e000::3:4])
 by mx.google.com with ESMTPS id q125-20020a634383000000b00584a495d8efsi9908777pga.582.2023.10.16.04.09.47
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Mon, 16 Oct 2023 04:09:47 -0700 (PDT)
Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) client-ip=2620:137:e000::3:4;
Authentication-Results: mx.google.com;
 dkim=pass header.i=@linutronix.de header.s=2020 header.b=jiTuDnk9;
 dkim=neutral (no key) header.i=@linutronix.de;
 spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:4 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org;
 dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de
Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0])
 by howler.vger.email (Postfix) with ESMTP id 74C3280A856B;
 Mon, 16 Oct 2023 04:09:45 -0700 (PDT)
X-Virus-Status: Clean
X-Virus-Scanned: clamav-milter 0.103.10 at howler.vger.email
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S232978AbjJPLJV (ORCPT + 18 others);
 Mon, 16 Oct 2023 07:09:21 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43988 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S232547AbjJPLJQ (ORCPT );
 Mon, 16 Oct 2023 07:09:16 -0400
Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E6808AB;
 Mon, 16 Oct 2023 04:09:14 -0700 (PDT)
Date: Mon, 16 Oct 2023 11:09:11 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1697454552;
 h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date:
 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
 content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=DO/jszgKO0c4D8xdawhnlJ187SlWK3qG10hYY5gd9cs=;
 b=jiTuDnk9H4/WmuCMeVUU90NUskOH7wpr5GXxJJEJa4BsIgJekRbA4pL+objqZqxvWBZb9P
 wVWPbjF26R3y4YLSkjPKH0erJDATill+yelOB+HMIAsz350Cl2C5X2QCNTLX8k92V+4U72
 iDYDd3GpT38IpykWrqncSt8l5uLy7fg0S3CEBbLGq3FZJL7funzFqukT3kmGpq4wC8Hx9F
 4jA8HDBN+4jspH3QoTzbSr9MLxsDsGU9LJklvMg0voyd6mxg86SNHfiM78HwbZz8b7QOED
 gT2qpd074xe2Ycss9OCDYbOPJMS4CGSEh2Pt/OqH3ZA+wB5DNz29Yae+QvfhFg==
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1697454552;
 h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date:
 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
 content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=DO/jszgKO0c4D8xdawhnlJ187SlWK3qG10hYY5gd9cs=;
 b=WnuHjy4ERMKQjDBKbenxwbepRvn67xU94M5bdXTK5C+4pwzsTvL1SylVdmHAmsWvMcVHH/
 xeMB5h/QmwOfBJCQ==
From: "tip-bot2 for Uros Bizjak"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: x86/percpu] x86/percpu: Use C for arch_raw_cpu_ptr(), to improve code generation
Cc: Nadav Amit, Uros Bizjak, Ingo Molnar, Andy Lutomirski, Brian Gerst, Denys Vlasenko, "H.
 Peter Anvin", Linus Torvalds, Josh Poimboeuf, Sean Christopherson,
 x86@kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20231015202523.189168-2-ubizjak@gmail.com>
References: <20231015202523.189168-2-ubizjak@gmail.com>
MIME-Version: 1.0
Message-ID: <169745455193.3135.696745948211732755.tip-bot2@tip-bot2>
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIM_SIGNED,DKIM_VALID,
 DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,
 SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no
 version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on howler.vger.email
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4
 (howler.vger.email [0.0.0.0]); Mon, 16 Oct 2023 04:09:45 -0700 (PDT)
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1779910141761658777
X-GMAIL-MSGID: 1779910141761658777

The following commit has been merged into the x86/percpu branch of tip:

Commit-ID:     1d10f3aec2bb734b4b594afe8c1bd0aa656a7e4d
Gitweb:        https://git.kernel.org/tip/1d10f3aec2bb734b4b594afe8c1bd0aa656a7e4d
Author:        Uros Bizjak
AuthorDate:    Sun, 15 Oct 2023 22:24:40 +02:00
Committer:     Ingo Molnar
CommitterDate: Mon, 16 Oct 2023 12:52:02 +02:00

x86/percpu: Use C for arch_raw_cpu_ptr(), to improve code generation

Implement arch_raw_cpu_ptr() in C to allow the compiler to perform
better optimizations, such as setting an appropriate base to compute
the address. The compiler is free to choose either MOV or ADD from
this_cpu_off address to construct the optimal final address.

There are some other issues when memory access to the percpu area is
implemented with an asm. Compilers can not eliminate asm common
subexpressions over basic block boundaries, but are extremely good
at optimizing memory access.
By implementing arch_raw_cpu_ptr() in C, the compiler can eliminate
additional redundant loads from this_cpu_off, further reducing the
number of percpu offset reads from 1646 to 1631 on a test build,
a -0.9% reduction.

Co-developed-by: Nadav Amit
Signed-off-by: Nadav Amit
Signed-off-by: Uros Bizjak
Signed-off-by: Ingo Molnar
Cc: Andy Lutomirski
Cc: Brian Gerst
Cc: Denys Vlasenko
Cc: H. Peter Anvin
Cc: Linus Torvalds
Cc: Josh Poimboeuf
Cc: Uros Bizjak
Cc: Sean Christopherson
Link: https://lore.kernel.org/r/20231015202523.189168-2-ubizjak@gmail.com
---
 arch/x86/include/asm/percpu.h | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/arch/x86/include/asm/percpu.h b/arch/x86/include/asm/percpu.h
index 915675f..5474690 100644
--- a/arch/x86/include/asm/percpu.h
+++ b/arch/x86/include/asm/percpu.h
@@ -49,6 +49,21 @@
 #define __force_percpu_prefix	"%%"__stringify(__percpu_seg)":"
 #define __my_cpu_offset		this_cpu_read(this_cpu_off)
 
+#ifdef CONFIG_USE_X86_SEG_SUPPORT
+/*
+ * Efficient implementation for cases in which the compiler supports
+ * named address spaces. Allows the compiler to perform additional
+ * optimizations that can save more instructions.
+ */
+#define arch_raw_cpu_ptr(ptr)					\
+({								\
+	unsigned long tcp_ptr__;				\
+	tcp_ptr__ = __raw_cpu_read(, this_cpu_off);		\
+								\
+	tcp_ptr__ += (unsigned long)(ptr);			\
+	(typeof(*(ptr)) __kernel __force *)tcp_ptr__;		\
+})
+#else /* CONFIG_USE_X86_SEG_SUPPORT */
 /*
  * Compared to the generic __my_cpu_offset version, the following
  * saves one instruction and avoids clobbering a temp register.
@@ -63,6 +78,8 @@
 	tcp_ptr__ += (unsigned long)(ptr);			\
 	(typeof(*(ptr)) __kernel __force *)tcp_ptr__;		\
 })
+#endif /* CONFIG_USE_X86_SEG_SUPPORT */
+
 #else /* CONFIG_SMP */
 #define __percpu_seg_override
 #define __percpu_prefix		""
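
For readers who want to see the shape of the C variant outside the kernel,
here is a minimal user-space sketch of the same three steps: load the
per-CPU offset with an ordinary C read, add the pointer, cast back. It does
not use %gs or named address spaces, so it models only the address
arithmetic, not the code generation the patch is about. All names in it
(sketch_raw_cpu_ptr, read_this_cpu_off, percpu_template, percpu_areas,
cpu_off, bump_counter_on, and the NR_CPUS value) are hypothetical stand-ins
and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 4	/* hypothetical CPU count for this sketch */

/* Hypothetical per-CPU payload; the kernel uses a linker section instead. */
struct percpu_data {
	int counter;
	long ticks;
};

/* Template copy: its member addresses play the role of percpu pointers. */
static struct percpu_data percpu_template;
/* One private copy of the data per simulated CPU. */
static struct percpu_data percpu_areas[NR_CPUS];
/* Distance from the template to each CPU's copy (models this_cpu_off). */
static uintptr_t cpu_off[NR_CPUS];
/* The CPU we pretend to be running on. */
static int current_cpu;

/* Compute every simulated CPU's offset; call once before any lookup. */
static void setup_offsets(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		cpu_off[cpu] = (uintptr_t)&percpu_areas[cpu] -
			       (uintptr_t)&percpu_template;
}

/* Plain C load, standing in for __raw_cpu_read(, this_cpu_off). */
static inline uintptr_t read_this_cpu_off(void)
{
	return cpu_off[current_cpu];
}

/*
 * Same shape as the C arch_raw_cpu_ptr() in the patch: load the offset,
 * add the template address, cast back to the pointed-to type.
 * Relies on GNU C statement expressions and __typeof__ (GCC, Clang).
 */
#define sketch_raw_cpu_ptr(ptr)					\
({								\
	uintptr_t tcp_ptr__ = read_this_cpu_off();		\
								\
	tcp_ptr__ += (uintptr_t)(ptr);				\
	(__typeof__(*(ptr)) *)tcp_ptr__;			\
})

/* Switch to the given simulated CPU, bump its counter, return the new value. */
static int bump_counter_on(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	current_cpu = cpu;
	return ++*sketch_raw_cpu_ptr(&percpu_template.counter);
}
```

Call setup_offsets() once, then bump_counter_on(cpu) touches only that
CPU's copy. Because the offset load is ordinary C rather than an asm
statement, the compiler may merge repeated loads of it, which is the
effect the commit message describes.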