Message ID | 20230706052231.2183-1-xin3.li@intel.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f45:0:b0:3ea:f831:8777 with SMTP id v5csp2337662vqx; Wed, 5 Jul 2023 22:55:23 -0700 (PDT) X-Google-Smtp-Source: APBJJlFnw76l6vFue1EP4Jt0adpr419qd2XcV+PlheL+snHL309SzLhNxRxdGACqi86o9D7iDsKD X-Received: by 2002:a9d:7dcc:0:b0:6b8:8596:934e with SMTP id k12-20020a9d7dcc000000b006b88596934emr1060244otn.28.1688622923744; Wed, 05 Jul 2023 22:55:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1688622923; cv=none; d=google.com; s=arc-20160816; b=IDHTSveWSN4p7FyxZuJIxSwL58gY+rmXFYUo7PYmULKwXHSB8wjImgwE4/EcoRQWB9 JqJFZxbi2UgqWaYrHne8z5Uo/T8uSkbeJ9RbJgbiRG4vNlChfsN7ppNu+YFmFKNSC7ko 9WKtIj3epRlsoe4kwAIRYSr57O0rTMCplUX+C+UVX9KfdK+vq+3l31kgS43DWfGSKA5l sVt0QF2nzzl0CWJR/P08ihqVthwBkVDQ6NAx8SlDD1MoSOWPgYreKnx8htkS5VdTldja olVVZTYRz0BE2BdVu3nTONzyDbJRS3tF+Ip5pLvu5uVE7pl8vp2fkk9IaSV9yI+XVevv y8YA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=KH2dl3p4B/X5nq/Lbej9E4PQSu+3bB4m4fegZl5DZwg=; fh=+kgAyz5lppE9gIha2P18L5ueKqwl0IgFkt2/9QrptSA=; b=QcEVOlY+fyhkjaQJFA3DeLLYktQBrQIJ4HUERlj3tm/CTjD4lww4qLCy7+uuudfcKx p9bxFuQyHR+Q9lVge8VfTYPaMcItl3fqCZXp0zTzz8SYPByT8358E4WpEPjH5/OIZQLV McAu9IxniqQlW+pcTbA/DwD0NvOIPEDhMJBMtdI98ZULrqv0099G81XP2gvSmQD84yPP sbc1Xb2DxGSCM3tUdz5+qRQfkqGhnZzp8/vzTvXm15yrcjIDhR+Vrr87ttpRBB+w8NfQ pWy5B4oxwhByqd+jFM/7oWS+58nmKbUrxnKRttAWxMwY4TjCdXIXcY8ubZ75ZL4PAwy0 Oiag== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=LSS5NONW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id r11-20020a635d0b000000b0054f93b05b51si822420pgb.96.2023.07.05.22.55.11; Wed, 05 Jul 2023 22:55:23 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=LSS5NONW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233087AbjGFFuP (ORCPT <rfc822;tebrre53rla2o@gmail.com> + 99 others); Thu, 6 Jul 2023 01:50:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33116 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229527AbjGFFuO (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 6 Jul 2023 01:50:14 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 69C36110 for <linux-kernel@vger.kernel.org>; Wed, 5 Jul 2023 22:50:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1688622613; x=1720158613; h=from:to:cc:subject:date:message-id:mime-version: content-transfer-encoding; bh=w6/BBpGC9o6TMtXs4FPV+zb5Oys3dj7edttjfM+T3Mw=; b=LSS5NONWXJKLNGkMKCuGFNZcNd8BvO9C8u7pZi8qEcS8szSFGY2lZPfl hHeJHFdikCrM++Ny+D0PhSLxoduSSzmXhqAl1NTXqFJJg9KbKekoIu+wW GO3oA7P2euTsUpj8n7dngl8HAO3UkhPTFPF0EQjNBt494mojrXzMxdr4B inY7Jdk50usDBVNtN74ZilDkHcbWqFwtqVpiZfgh2OOJzy/9AT8EXmUlW C8ku/Qj3PwP9FqWuDgiBxFVCAoNqE1Igz9VoLrHXbXsRra/vT8uIzY5Tj 3ETeUwTDh3FRq0bMQyHpadmXFbUIresocPijXhp9AXtc4J8kRYo1ivhv+ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10762"; a="362383031" X-IronPort-AV: E=Sophos;i="6.01,185,1684825200"; d="scan'208";a="362383031" Received: from orsmga002.jf.intel.com ([10.7.209.21]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Jul 2023 22:50:13 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10762"; a="719479867" X-IronPort-AV: E=Sophos;i="6.01,185,1684825200"; d="scan'208";a="719479867" Received: from unknown (HELO fred..) ([172.25.112.68]) by orsmga002.jf.intel.com with ESMTP; 05 Jul 2023 22:50:12 -0700 From: Xin Li <xin3.li@intel.com> To: linux-kernel@vger.kernel.org, x86@kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, brgerst@gmail.com, ebiederm@xmission.com, xin3.li@intel.com Subject: [PATCH] x86/ia32: Do not modify the DPL bits for a null selector Date: Wed, 5 Jul 2023 22:22:31 -0700 Message-Id: <20230706052231.2183-1-xin3.li@intel.com> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1770649470419627466?= X-GMAIL-MSGID: =?utf-8?q?1770649470419627466?= |
Series |
x86/ia32: Do not modify the DPL bits for a null selector
|
|
Commit Message
Li, Xin3
July 6, 2023, 5:22 a.m. UTC
When a null selector is to be loaded into a segment register,
reload_segments() sets its DPL bits to 3. Later when an IRET
instruction loads it, it zeros the segment register. The two
operations offset each other to actually effect a nop.
Fix it by not modifying the DPL bits for a null selector.
Signed-off-by: Xin Li <xin3.li@intel.com>
---
arch/x86/kernel/signal_32.c | 21 +++++++++++++--------
1 file changed, 13 insertions(+), 8 deletions(-)
Comments
Xin Li <xin3.li@intel.com> writes: > When a null selector is to be loaded into a segment register, > reload_segments() sets its DPL bits to 3. Later when an IRET > instruction loads it, it zeros the segment register. The two > operations offset each other to actually effect a nop. > > Fix it by not modifying the DPL bits for a null selector. Maybe this is the right thing but this needs some serious comments about what is going on. In particular how does sel <= 3 equate to a null selector? Is that defined somewhere? At a minimum you should have static asserts to make certain no one redefines the first 4 segment selectors as anything else, if you want to refer to them by number instead of testing for specific properties. As written this looks like it requires an enormous amount of knowledge about how other parts of the code works, to be comprehensible or to change safely. That level of non-local knowledge should be unnecessary. Eric > Signed-off-by: Xin Li <xin3.li@intel.com> > --- > arch/x86/kernel/signal_32.c | 21 +++++++++++++-------- > 1 file changed, 13 insertions(+), 8 deletions(-) > > diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c > index 9027fc088f97..7796cf84fca2 100644 > --- a/arch/x86/kernel/signal_32.c > +++ b/arch/x86/kernel/signal_32.c > @@ -36,22 +36,27 @@ > #ifdef CONFIG_IA32_EMULATION > #include <asm/ia32_unistd.h> > > +static inline u16 usrseg(u16 sel) > +{ > + return sel <= 3 ? sel : sel | 3; > +} > + > static inline void reload_segments(struct sigcontext_32 *sc) > { > unsigned int cur; > > savesegment(gs, cur); > - if ((sc->gs | 0x03) != cur) > - load_gs_index(sc->gs | 0x03); > + if (usrseg(sc->gs) != cur) > + load_gs_index(usrseg(sc->gs)); > savesegment(fs, cur); > - if ((sc->fs | 0x03) != cur) > - loadsegment(fs, sc->fs | 0x03); > + if (usrseg(sc->fs) != cur) > + loadsegment(fs, usrseg(sc->fs)); > savesegment(ds, cur); > - if ((sc->ds | 0x03) != cur) > - loadsegment(ds, sc->ds | 0x03); > + if (usrseg(sc->ds) != cur) > + loadsegment(ds, usrseg(sc->ds)); > savesegment(es, cur); > - if ((sc->es | 0x03) != cur) > - loadsegment(es, sc->es | 0x03); > + if (usrseg(sc->es) != cur) > + loadsegment(es, usrseg(sc->es)); > } > > #define sigset32_t compat_sigset_t
On 7/5/23 22:22, Xin Li wrote: > When a null selector is to be loaded into a segment register, > reload_segments() sets its DPL bits to 3. Later when an IRET > instruction loads it, it zeros the segment register. The two > operations offset each other to actually effect a nop. > > Fix it by not modifying the DPL bits for a null selector. So in the end, this is an optimization attempt? Is there any other benefit?
> > When a null selector is to be loaded into a segment register, > > reload_segments() sets its DPL bits to 3. Later when an IRET > > instruction loads it, it zeros the segment register. The two > > operations offset each other to actually effect a nop. > > > > Fix it by not modifying the DPL bits for a null selector. > > So in the end, this is an optimization attempt? Is there any other benefit? You asked but I think you have the answer 😉, but want me to give the details in the commit message, which probably I should have done. Unlike IRET, ERETU, introduced with FRED to return to ring 3 from ring 0, does not make any of DS, ES, FS, or GS null if it is found to have DPL < 3. It is expected that a FRED-enabled operating system will return to ring 3 (in compatibility mode) only when those non-null selectors all have DPL = 3. Thus when FRED is enabled, because reload_segments() sets the DPL bits of null selector to 3, we end up with having 3 in a segment register even when it is initially set to 0. As a result, the sigreturn selftest fails as it sets DS to 0 and then checks if it's still 0 after sigreturn. Thanks! Xin
> > When a null selector is to be loaded into a segment register, > > reload_segments() sets its DPL bits to 3. Later when an IRET > > instruction loads it, it zeros the segment register. The two > > operations offset each other to actually effect a nop. > > > > Fix it by not modifying the DPL bits for a null selector. > > Maybe this is the right thing but this needs some serious comments about what is > going on. > > In particular how does sel <= 3 equate to a null selector? Is that defined > somewhere? In protected mode, a NULL selector (values 0000 through 0003) can be loaded into DS, ES, FS, or GS registers without causing a protection exception. This can be found at the description of instruction LDS/LES/LFS/LGS/LSS, in section 3.3 "instructions (A-L)" of Intel SDM Volume 2, Instruction Set Reference. >At a minimum you should have static asserts to make certain no > one redefines the first 4 segment selectors as anything else, if you want to refer to > them by number instead of testing for specific properties. Bits 0 and 1 of a selector are NOT used to index the GDT or IDT, thus selector values 0 ~ 3 all point to the first entry of the GDT, i.e., the null selector. Thus 3 as a selector is the same as 0, and it doesn't matter to change it or not. But when IRET sees an invalid segment register in ES, FS, GS, and DS, it sets it to 0, making 0 a preferred null selector value. The sigreturn selftest sets DS of its signal context to 0 and then checks if it's still 0 after sigreturn. And it passes. However if it sets DS to any other null selector value, e.g., 1, it fails. For FRED, if we don't modify the DPL bits, the test passes, because ERETU doesn't change segment registers. > As written this looks like it requires an enormous amount of knowledge about > how other parts of the code works, to be comprehensible or to change safely. > That level of non-local knowledge should be unnecessary.
> Thus 3 as a selector is the same as 0, and it doesn't matter to change it or not. But > when IRET sees an invalid segment register in ES, FS, GS, and DS, it sets it to 0, > making 0 a preferred null selector value. To clarify, an invalid segment register value includes NULL selector values.
"Li, Xin3" <xin3.li@intel.com> writes: >> Thus 3 as a selector is the same as 0, and it doesn't matter to change it or not. But >> when IRET sees an invalid segment register in ES, FS, GS, and DS, it sets it to 0, >> making 0 a preferred null selector value. > > To clarify, an invalid segment register value includes NULL selector values. Perhaps something like patch below to make it clear that we are normalizing the segment values and forcing that normalization. I am a bit confused why this code is not the same for ia32 and ia32_emulation. I would think the normalization at least should apply to the 32bit case as well. Eric diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c index 9027fc088f97..e5f3978388fd 100644 --- a/arch/x86/kernel/signal_32.c +++ b/arch/x86/kernel/signal_32.c @@ -36,22 +36,47 @@ #ifdef CONFIG_IA32_EMULATION #include <asm/ia32_unistd.h> +static inline unsigned int normalize_seg_index(unsigned int index) +{ + /* + * Convert the segment index into normalized form. + * + * For the indexes 0,1,2,3 always use the value of 0, as IRET + * forces this form for the nul segment. + * + * Otherwise set both DPL bits to force it to be a userspace + * ring 3 segment index. + */ + return (index < 3) ? 0 : index | 3; +} + static inline void reload_segments(struct sigcontext_32 *sc) { - unsigned int cur; + unsigned int new, cur; + new = normalize_seg_index(sc->gs); savesegment(gs, cur); - if ((sc->gs | 0x03) != cur) - load_gs_index(sc->gs | 0x03); + cur = normalize_seg_index(cur); + if (new != cur) + load_gs_index(new); + + new = normalize_seg_index(sc->fs); savesegment(fs, cur); - if ((sc->fs | 0x03) != cur) - loadsegment(fs, sc->fs | 0x03); + cur = normalize_seg_index(cur); + if (new != cur) + loadsegment(fs, new); + + new = normalize_seg_index(sc->ds); savesegment(ds, cur); - if ((sc->ds | 0x03) != cur) - loadsegment(ds, sc->ds | 0x03); + cur = normalize_seg_index(cur); + if (new != cur) + loadsegment(ds, new); + + new = normalize_seg_index(sc->es); savesegment(es, cur); - if ((sc->es | 0x03) != cur) - loadsegment(es, sc->es | 0x03); + cur = normalize_seg_index(cur); + if (new != cur) + loadsegment(es, new); } #define sigset32_t compat_sigset_t
> Perhaps something like patch below to make it clear that we are > normalizing the segment values and forcing that normalization. "Normalizing" sounds a good term. > I am a bit confused why this code is not the same for ia32 and > ia32_emulation. I would think the normalization at least should apply > to the 32bit case as well. > > Eric > > diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c > index 9027fc088f97..e5f3978388fd 100644 > --- a/arch/x86/kernel/signal_32.c > +++ b/arch/x86/kernel/signal_32.c > @@ -36,22 +36,47 @@ > #ifdef CONFIG_IA32_EMULATION > #include <asm/ia32_unistd.h> > > +static inline unsigned int normalize_seg_index(unsigned int index) normalize_usrseg_index? > +{ > + /* > + * Convert the segment index into normalized form. > + * > + * For the indexes 0,1,2,3 always use the value of 0, as IRET > + * forces this form for the nul segment. > + * > + * Otherwise set both DPL bits to force it to be a userspace > + * ring 3 segment index. > + */ > + return (index < 3) ? 0 : index | 3; s/</<= no? With this patch it looks that 1,2,3 are no longer valid selector values in Linux, which seems the right thing to do but I don't know if there is any side effect. > +} > + > static inline void reload_segments(struct sigcontext_32 *sc) > { > - unsigned int cur; > + unsigned int new, cur; > > + new = normalize_seg_index(sc->gs); > savesegment(gs, cur); > - if ((sc->gs | 0x03) != cur) > - load_gs_index(sc->gs | 0x03); > + cur = normalize_seg_index(cur); > + if (new != cur) > + load_gs_index(new); > + > + new = normalize_seg_index(sc->fs); > savesegment(fs, cur); > - if ((sc->fs | 0x03) != cur) > - loadsegment(fs, sc->fs | 0x03); > + cur = normalize_seg_index(cur); > + if (new != cur) > + loadsegment(fs, new); > + > + new = normalize_seg_index(sc->ds); > savesegment(ds, cur); > - if ((sc->ds | 0x03) != cur) > - loadsegment(ds, sc->ds | 0x03); > + cur = normalize_seg_index(cur); > + if (new != cur) > + loadsegment(ds, new); > + > + new = normalize_seg_index(sc->es); > savesegment(es, cur); > - if ((sc->es | 0x03) != cur) > - loadsegment(es, sc->es | 0x03); > + cur = normalize_seg_index(cur); > + if (new != cur) > + loadsegment(es, new); > } > > #define sigset32_t compat_sigset_t
"Li, Xin3" <xin3.li@intel.com> writes: >> Perhaps something like patch below to make it clear that we are >> normalizing the segment values and forcing that normalization. > > "Normalizing" sounds a good term. > >> I am a bit confused why this code is not the same for ia32 and >> ia32_emulation. I would think the normalization at least should apply >> to the 32bit case as well. >> >> Eric >> >> diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c >> index 9027fc088f97..e5f3978388fd 100644 >> --- a/arch/x86/kernel/signal_32.c >> +++ b/arch/x86/kernel/signal_32.c >> @@ -36,22 +36,47 @@ >> #ifdef CONFIG_IA32_EMULATION >> #include <asm/ia32_unistd.h> >> >> +static inline unsigned int normalize_seg_index(unsigned int index) > > normalize_usrseg_index? Perhaps useg? I think that is the abbreviation I have seen used elsewhere. I was trying to get things as close as I could to the x86 documentation so people could figure out what is going on by reading the documentation. >> +{ >> + /* >> + * Convert the segment index into normalized form. >> + * >> + * For the indexes 0,1,2,3 always use the value of 0, as IRET >> + * forces this form for the nul segment. >> + * >> + * Otherwise set both DPL bits to force it to be a userspace >> + * ring 3 segment index. >> + */ >> + return (index < 3) ? 0 : index | 3; > > s/</<= > > no? Yes. My typo. The patch was very much a suggestion on how we can perhaps write the code to make it clearer what is happening. Normalizing the segment index values seems like what we have been intending to do all along. In fact it might make sense for clarity to simply use the existing "index | 3" logic in what I called normal_seg_index, and then just update the normalization to add the null pointer case. I was just spending a little time trying to make it so that the code is readable. > With this patch it looks that 1,2,3 are no longer valid selector values > in Linux, which seems the right thing to do but I don't know if there is > any side effect. You have said that IRET always changes 1,2,3 to 0. So I don't expect the selector values of 1,2,3 will last very long. If that is not the case I misunderstood, and we should consider doing something else. It bothers me that the existing code, and your code as well only handles the normalization on x86_64 for ia32 mode. Shouldn't the same normalization logic apply in a 32bit kernel as well? Scope creep I know but the fact the code does not match seems concerning. Eric >> +} >> + >> static inline void reload_segments(struct sigcontext_32 *sc) >> { >> - unsigned int cur; >> + unsigned int new, cur; >> >> + new = normalize_seg_index(sc->gs); >> savesegment(gs, cur); >> - if ((sc->gs | 0x03) != cur) >> - load_gs_index(sc->gs | 0x03); >> + cur = normalize_seg_index(cur); >> + if (new != cur) >> + load_gs_index(new); >> + >> + new = normalize_seg_index(sc->fs); >> savesegment(fs, cur); >> - if ((sc->fs | 0x03) != cur) >> - loadsegment(fs, sc->fs | 0x03); >> + cur = normalize_seg_index(cur); >> + if (new != cur) >> + loadsegment(fs, new); >> + >> + new = normalize_seg_index(sc->ds); >> savesegment(ds, cur); >> - if ((sc->ds | 0x03) != cur) >> - loadsegment(ds, sc->ds | 0x03); >> + cur = normalize_seg_index(cur); >> + if (new != cur) >> + loadsegment(ds, new); >> + >> + new = normalize_seg_index(sc->es); >> savesegment(es, cur); >> - if ((sc->es | 0x03) != cur) >> - loadsegment(es, sc->es | 0x03); >> + cur = normalize_seg_index(cur); >> + if (new != cur) >> + loadsegment(es, new); >> } >> >> #define sigset32_t compat_sigset_t
> > normalize_usrseg_index? > > Perhaps useg? I think that is the abbreviation I have seen used > elsewhere. I was trying to get things as close as I could to the > x86 documentation so people could figure out what is going on by > reading the documentation. useg seems the right term afaict. > >> + return (index < 3) ? 0 : index | 3; > > > > s/</<= > > > > no? > > Yes. My typo. > > The patch was very much a suggestion on how we can perhaps write the > code to make it clearer what is happening. Normalizing the segment > index values seems like what we have been intending to do all along. > > In fact it might make sense for clarity to simply use the existing > "index | 3" logic in what I called normal_seg_index, and then just > update the normalization to add the null pointer case. > > I was just spending a little time trying to make it so that the code > is readable. > > > With this patch it looks that 1,2,3 are no longer valid selector values > > in Linux, which seems the right thing to do but I don't know if there is > > any side effect. > > You have said that IRET always changes 1,2,3 to 0. So I don't expect > the selector values of 1,2,3 will last very long. > > If that is not the case I misunderstood, and we should consider doing > something else. Your understanding is correct. And I am NOT opposed to your change, but want to see if there is any concern from other people. On the other side, I got to mention that when FRED is enabled, ERETU doesn't forcibly set non-zero null selector values to 0. Then for the sigreturn self-test, it can set DS to any of the null selector values, and expect DS is set to the value after sigreturn. With your proposal, 1, 2 and 3 are no longer valid selector values in Linux, and Linux would fociably set non-zero null selector values to 0, just like what IRET has been doing. Then the sigreturn self-test should be changed to check any non-zero null selector value will be set to 0 after sigreturn. And this behavior is consistent w/ and w/o FRED. I think we should do it. > It bothers me that the existing code, and your code as well only > handles the normalization on x86_64 for ia32 mode. Shouldn't > the same normalization logic apply in a 32bit kernel as well? > Scope creep I know but the fact the code does not match > seems concerning. Agreed! We *should* fix it in the same way. I guess people are just conservative with 32-bit kernel specific changes. How many OSVs still release 32-bit kernel? What about the testing?
> > It bothers me that the existing code, and your code as well only > > handles the normalization on x86_64 for ia32 mode. Shouldn't the same > > normalization logic apply in a 32bit kernel as well? > > Scope creep I know but the fact the code does not match seems > > concerning. > > Agreed! We *should* fix it in the same way. The fact is that the existing code unconditionally sets the DPL bits to 3 only on x86_64 (why did we forgot to do it on 32-bit?). Thus, there is nothing to revert for null selectors on 32-bit. With your suggestion to normalize null selectors, we need to add the code only when IRET is NOT used to return to user level. Fortunately, Brian Gerst just posted a patch set https://lore.kernel.org/lkml/20230721161018.50214-1-brgerst@gmail.com/, which makes the case whether IRET is used or not explicit on both x86_64 and x86_32. As a result it is straightforward to add the normalization code afterwards. Thanks! Xin
diff --git a/arch/x86/kernel/signal_32.c b/arch/x86/kernel/signal_32.c index 9027fc088f97..7796cf84fca2 100644 --- a/arch/x86/kernel/signal_32.c +++ b/arch/x86/kernel/signal_32.c @@ -36,22 +36,27 @@ #ifdef CONFIG_IA32_EMULATION #include <asm/ia32_unistd.h> +static inline u16 usrseg(u16 sel) +{ + return sel <= 3 ? sel : sel | 3; +} + static inline void reload_segments(struct sigcontext_32 *sc) { unsigned int cur; savesegment(gs, cur); - if ((sc->gs | 0x03) != cur) - load_gs_index(sc->gs | 0x03); + if (usrseg(sc->gs) != cur) + load_gs_index(usrseg(sc->gs)); savesegment(fs, cur); - if ((sc->fs | 0x03) != cur) - loadsegment(fs, sc->fs | 0x03); + if (usrseg(sc->fs) != cur) + loadsegment(fs, usrseg(sc->fs)); savesegment(ds, cur); - if ((sc->ds | 0x03) != cur) - loadsegment(ds, sc->ds | 0x03); + if (usrseg(sc->ds) != cur) + loadsegment(ds, usrseg(sc->ds)); savesegment(es, cur); - if ((sc->es | 0x03) != cur) - loadsegment(es, sc->es | 0x03); + if (usrseg(sc->es) != cur) + loadsegment(es, usrseg(sc->es)); } #define sigset32_t compat_sigset_t