Message ID | 20221013200134.1487-4-xin3.li@intel.com |
---|---|
State | New |
Headers |
From: Xin Li <xin3.li@intel.com> To: linux-kernel@vger.kernel.org, x86@kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, peterz@infradead.org, brgerst@gmail.com, chang.seok.bae@intel.com Subject: [PATCH v3 3/6] x86/gsseg: make asm_load_gs_index() take an u16 Date: Thu, 13 Oct 2022 13:01:31 -0700 Message-Id: <20221013200134.1487-4-xin3.li@intel.com> In-Reply-To: <20221013200134.1487-1-xin3.li@intel.com> References: <20221013200134.1487-1-xin3.li@intel.com> |
Series |
Enable LKGS instruction
|
|
Commit Message
Li, Xin3
Oct. 13, 2022, 8:01 p.m. UTC
From: "H. Peter Anvin (Intel)" <hpa@zytor.com>

Let gcc know that only the low 16 bits of load_gs_index() argument
actually matter. It might allow it to create slightly better
code. However, do not propagate this into the prototypes of functions
that end up being paravirtualized, to avoid unnecessary changes.

Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Signed-off-by: Xin Li <xin3.li@intel.com>
---
 arch/x86/entry/entry_64.S            | 2 +-
 arch/x86/include/asm/special_insns.h | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
Comments
From: Xin Li
> Sent: 13 October 2022 21:02
>
> From: "H. Peter Anvin (Intel)" <hpa@zytor.com>
>
> Let gcc know that only the low 16 bits of load_gs_index() argument
> actually matter. It might allow it to create slightly better
> code. However, do not propagate this into the prototypes of functions
> that end up being paravirtualized, to avoid unnecessary changes.

Using u16 will almost always make the code worse.
At some point the value has to be masked and/or extended
to ensure an out of range value doesn't appear in
a register.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
> > From: "H. Peter Anvin (Intel)" <hpa@zytor.com>
> >
> > Let gcc know that only the low 16 bits of load_gs_index() argument
> > actually matter. It might allow it to create slightly better code.
> > However, do not propagate this into the prototypes of functions that
> > end up being paravirtualized, to avoid unnecessary changes.
>
> Using u16 will almost always make the code worse.
> At some point the value has to be masked and/or extended to ensure an out of
> range value doesn't appear in a register.

Any potential issue with this patch set?

> 	David
On October 14, 2022 5:28:25 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
>From: Xin Li
>> Sent: 13 October 2022 21:02
>>
>> From: "H. Peter Anvin (Intel)" <hpa@zytor.com>
>>
>> Let gcc know that only the low 16 bits of load_gs_index() argument
>> actually matter. It might allow it to create slightly better
>> code. However, do not propagate this into the prototypes of functions
>> that end up being paravirtualized, to avoid unnecessary changes.
>
>Using u16 will almost always make the code worse.
>At some point the value has to be masked and/or extended
>to ensure an out of range value doesn't appear in
>a register.
>
>	David

Is that a general statement or are you actually invoking it in this case?
This is about it being a narrowing input, *removing* such constraints.
From: H. Peter Anvin
> Sent: 15 October 2022 03:41
>
> On October 14, 2022 5:28:25 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
> >From: Xin Li
> >> Sent: 13 October 2022 21:02
> >>
> >> From: "H. Peter Anvin (Intel)" <hpa@zytor.com>
> >>
> >> Let gcc know that only the low 16 bits of load_gs_index() argument
> >> actually matter. It might allow it to create slightly better
> >> code. However, do not propagate this into the prototypes of functions
> >> that end up being paravirtualized, to avoid unnecessary changes.
> >
> >Using u16 will almost always make the code worse.
> >At some point the value has to be masked and/or extended
> >to ensure an out of range value doesn't appear in
> >a register.
> >
> >	David
>
> Is that a general statement or are you actually invoking it in this case?
> This is about it being a narrowing input, *removing* such constraints.

It is a general statement.
You suggested you might get better code.
In fact you'll probably get worse code.
It might not matter here, but ...

Most modern calling conventions use CPU registers to pass arguments
and results.
So the compiler is required to ensure that u16 values are in range
in either the caller or the called code (or both).
Just because the domain of a value is small doesn't mean that
the best type isn't 'int' or 'unsigned int'.

Additionally (except on x86) any arithmetic on sub-32bit values
requires additional instructions to mask the result.

Even on x86-64, if you index an array with an 'int' the compiler
has to generate code to sign extend the value to 64 bits.
You get better code for 'signed long' or unsigned types.
This is probably true for all 64bit architectures.

Since most CPUs have both sign-extending and zero-extending
loads from memory, it can make sense to use u8 and u16 to
reduce the size of structures.
But for function arguments and function locals it almost
always makes the code worse.

	David
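The general points in this reply can be illustrated with a small user-space sketch. It is not taken from the patch: the helper names (`add_u16`, `add_uint`, `pick_int`, `pick_ulong`, `check_examples`) are hypothetical, and the comments describe what the C language requires, not what any particular compiler emits for the kernel code under discussion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 16-bit arithmetic. The addition itself is done in
 * 'int' after integer promotion; converting the result back to uint16_t
 * keeps only the low 16 bits, which is the narrowing step being described. */
static uint16_t add_u16(uint16_t a, uint16_t b)
{
    return (uint16_t)(a + b);
}

/* Same operation with a full-width type: no narrowing on return. */
static unsigned int add_uint(unsigned int a, unsigned int b)
{
    return a + b;
}

/* Indexing with a signed 'int': the index may be negative, so on a 64-bit
 * target it has to be widened with its sign before the address is formed. */
static int pick_int(const int *table, int i)
{
    return table[i];
}

/* Indexing with 'unsigned long': already pointer-sized on LP64 targets. */
static int pick_ulong(const int *table, unsigned long i)
{
    return table[i];
}

/* Small self-check of the behaviour described in the comments above. */
static void check_examples(void)
{
    int table[3] = { 10, 20, 30 };

    assert(add_u16(0xFFFFu, 1u) == 0);          /* wraps at 16 bits */
    assert(add_uint(0xFFFFu, 1u) == 0x10000u);  /* no wrap at 32 bits */
    assert(pick_int(table + 1, -1) == 10);      /* negative index is valid */
    assert(pick_ulong(table, 2UL) == 30);
}
```

Calling `check_examples()` from a test program exercises all four helpers. Whether the narrowing actually costs an instruction depends on the target and on how the value is consumed, which is the point disputed in the rest of the thread.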
On October 17, 2022 12:49:41 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
>From: H. Peter Anvin
>> Sent: 15 October 2022 03:41
>>
>> On October 14, 2022 5:28:25 AM PDT, David Laight <David.Laight@ACULAB.COM> wrote:
>> >From: Xin Li
>> >> Sent: 13 October 2022 21:02
>> >>
>> >> From: "H. Peter Anvin (Intel)" <hpa@zytor.com>
>> >>
>> >> Let gcc know that only the low 16 bits of load_gs_index() argument
>> >> actually matter. It might allow it to create slightly better
>> >> code. However, do not propagate this into the prototypes of functions
>> >> that end up being paravirtualized, to avoid unnecessary changes.
>> >
>> >Using u16 will almost always make the code worse.
>> >At some point the value has to be masked and/or extended
>> >to ensure an out of range value doesn't appear in
>> >a register.
>> >
>> >	David
>>
>> Is that a general statement or are you actually invoking it in this case?
>> This is about it being a narrowing input, *removing* such constraints.
>
>It is a general statement.
>You suggested you might get better code.
>In fact you'll probably get worse code.
>It might not matter here, but ...
>
>Most modern calling conventions use CPU registers to pass arguments
>and results.
>So the compiler is required to ensure that u16 values are in range
>in either the caller or the called code (or both).
>Just because the domain of a value is small doesn't mean that
>the best type isn't 'int' or 'unsigned int'.
>
>Additionally (except on x86) any arithmetic on sub-32bit values
>requires additional instructions to mask the result.
>
>Even on x86-64, if you index an array with an 'int' the compiler
>has to generate code to sign extend the value to 64 bits.
>You get better code for 'signed long' or unsigned types.
>This is probably true for all 64bit architectures.
>
>Since most CPUs have both sign-extending and zero-extending
>loads from memory, it can make sense to use u8 and u16 to
>reduce the size of structures.
>But for function arguments and function locals it almost
>always makes the code worse.
>
>	David

Ok. You are plain incorrect in this case for two reasons:

1. The x86-64 calling convention makes it up to the receiver (callee for arguments, caller for returns) to do such masking of values.

2. The consumer of the values here does not need any masking or extensions.

So this is simply telling the compiler what the programmer knows.
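The second reason above can be sketched in portable C, under stated assumptions. The variable `fake_gs` and the functions `load_gs_wide` and `load_gs_narrow` are hypothetical stand-ins invented for illustration; the real `asm_load_gs_index()` is assembly in entry_64.S and is not modelled here. The sketch only shows that when the consumer is 16 bits wide, declaring the parameter as a 16-bit type does not change what is consumed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 16-bit selector register. */
static uint16_t fake_gs;

/* Wide prototype: the consumer still only takes the low 16 bits, so any
 * upper bits the caller passed are dropped at the point of use. */
static void load_gs_wide(unsigned int selector)
{
    fake_gs = (uint16_t)selector;
}

/* Narrow prototype: the type itself says that only 16 bits are meaningful,
 * and the 16-bit consumer needs no extra masking or extension. */
static void load_gs_narrow(uint16_t selector)
{
    fake_gs = selector;
}

/* Both variants leave the same 16-bit value in the stand-in register. */
static void check_selectors(void)
{
    unsigned int raw = 0x1002bu;   /* upper bits deliberately set */

    load_gs_wide(raw);
    assert(fake_gs == 0x002b);

    load_gs_narrow((uint16_t)raw);
    assert(fake_gs == 0x002b);
}
```

Calling `check_selectors()` confirms the two prototypes behave identically for a 16-bit consumer. Any difference in generated code at the call sites depends on the compiler and ABI, and this sketch does not measure it.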
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index 9953d966d124..e0c48998d2fb 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -779,7 +779,7 @@ _ASM_NOKPROBE(common_interrupt_return)
 
 /*
  * Reload gs selector with exception handling
- * edi: new selector
+ * di: new selector
  *
  * Is in entry.text as it shouldn't be instrumented.
  */
diff --git a/arch/x86/include/asm/special_insns.h b/arch/x86/include/asm/special_insns.h
index 35f709f619fb..a71d0e8d4684 100644
--- a/arch/x86/include/asm/special_insns.h
+++ b/arch/x86/include/asm/special_insns.h
@@ -120,7 +120,7 @@ static inline void native_wbinvd(void)
 	asm volatile("wbinvd": : :"memory");
 }
 
-extern asmlinkage void asm_load_gs_index(unsigned int selector);
+extern asmlinkage void asm_load_gs_index(u16 selector);
 
 static inline void native_load_gs_index(unsigned int selector)
 {