From patchwork Tue Oct 18 11:33:44 2022
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 4127
X-Mailing-List: linux-kernel@vger.kernel.org
From: "Kirill A. Shutemov"
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra
Cc: x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov, Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu", Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv10 01/15] x86/mm: Fix CR3_ADDR_MASK
Date: Tue, 18 Oct 2022 14:33:44 +0300
Message-Id: <20221018113358.7833-2-kirill.shutemov@linux.intel.com>
In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>
References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>

The mask must not include bits above physical address mask. These bits are reserved and can be used for other things.
Bits 61 and 62 are used for Linear Address Masking.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: Rick Edgecombe
Reviewed-by: Alexander Potapenko
Tested-by: Alexander Potapenko
Acked-by: Peter Zijlstra (Intel)
---
 arch/x86/include/asm/processor-flags.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h
index 02c2cbda4a74..a7f3d9100adb 100644
--- a/arch/x86/include/asm/processor-flags.h
+++ b/arch/x86/include/asm/processor-flags.h
@@ -35,7 +35,7 @@
  */
 #ifdef CONFIG_X86_64
 /* Mask off the address space ID and SME encryption bits. */
-#define CR3_ADDR_MASK __sme_clr(0x7FFFFFFFFFFFF000ull)
+#define CR3_ADDR_MASK __sme_clr(PHYSICAL_PAGE_MASK)
 #define CR3_PCID_MASK 0xFFFull
 #define CR3_NOFLUSH BIT_ULL(63)

From patchwork Tue Oct 18 11:33:45 2022
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 4124
X-Mailing-List: linux-kernel@vger.kernel.org
From: "Kirill A. Shutemov"
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra
Cc: x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov, Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu", Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv10 02/15] x86: CPUID and CR3/CR4 flags for Linear Address Masking
Date: Tue, 18 Oct 2022 14:33:45 +0300
Message-Id: <20221018113358.7833-3-kirill.shutemov@linux.intel.com>
In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>
References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>

Enumerate Linear Address Masking and provide defines for CR3 and CR4 flags.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: Alexander Potapenko
Tested-by: Alexander Potapenko
Acked-by: Peter Zijlstra (Intel)
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/processor-flags.h | 2 ++
 arch/x86/include/uapi/asm/processor-flags.h | 6 ++++++
 3 files changed, 9 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b71f4f2ecdd5..265805e71806 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -308,6 +308,7 @@
 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */
 #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */
+#define X86_FEATURE_LAM (12*32+26) /* Linear Address Masking */

 /* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */
 #define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */
diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h
index a7f3d9100adb..d8cccadc83a6 100644
--- a/arch/x86/include/asm/processor-flags.h
+++ b/arch/x86/include/asm/processor-flags.h
@@ -28,6 +28,8 @@
  * On systems with SME, one bit (in a variable position!) is stolen to indicate
  * that the top-level paging structure is encrypted.
  *
+ * On systems with LAM, bits 61 and 62 are used to indicate LAM mode.
+ *
  * All of the remaining bits indicate the physical address of the top-level
  * paging structure.
  *
diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index c47cc7f2feeb..d898432947ff 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -82,6 +82,10 @@
 #define X86_CR3_PCID_BITS 12
 #define X86_CR3_PCID_MASK (_AC((1UL << X86_CR3_PCID_BITS) - 1, UL))

+#define X86_CR3_LAM_U57_BIT 61 /* Activate LAM for userspace, 62:57 bits masked */
+#define X86_CR3_LAM_U57 _BITULL(X86_CR3_LAM_U57_BIT)
+#define X86_CR3_LAM_U48_BIT 62 /* Activate LAM for userspace, 62:48 bits masked */
+#define X86_CR3_LAM_U48 _BITULL(X86_CR3_LAM_U48_BIT)
 #define X86_CR3_PCID_NOFLUSH_BIT 63 /* Preserve old PCID */
 #define X86_CR3_PCID_NOFLUSH _BITULL(X86_CR3_PCID_NOFLUSH_BIT)
@@ -132,6 +136,8 @@
 #define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT)
 #define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement Technology */
 #define X86_CR4_CET _BITUL(X86_CR4_CET_BIT)
+#define X86_CR4_LAM_SUP_BIT 28 /* LAM for supervisor pointers */
+#define X86_CR4_LAM_SUP _BITUL(X86_CR4_LAM_SUP_BIT)

 /*
  * x86-64 Task Priority Register, CR8

From patchwork Tue Oct 18 11:33:46 2022
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 4128
X-Mailing-List: linux-kernel@vger.kernel.org
From: "Kirill A. Shutemov"
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra
Cc: x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov, Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu", Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv10 03/15] mm: Pass down mm_struct to untagged_addr()
Date: Tue, 18 Oct 2022 14:33:46 +0300
Message-Id: <20221018113358.7833-4-kirill.shutemov@linux.intel.com>
In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>
References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>

Intel Linear Address Masking (LAM) brings per-mm untagging rules. Pass down mm_struct to the untagging helper. It will help to apply untagging policy correctly.

In most cases, current->mm is the one to use, but there are some exceptions, such as get_user_pages_remote().

Move the dummy implementation of untagged_addr() from <linux/mm.h> to <linux/uaccess.h>. <asm/uaccess.h> can override the implementation. Moving the dummy implementation out of <linux/mm.h> helps to avoid header hell if you need to dereference mm_struct within the helper.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: Rick Edgecombe
Reviewed-by: Alexander Potapenko
Tested-by: Alexander Potapenko
Acked-by: Peter Zijlstra (Intel)
---
 arch/arm64/include/asm/memory.h | 4 ++--
 arch/arm64/include/asm/signal.h | 2 +-
 arch/arm64/include/asm/uaccess.h | 2 +-
 arch/arm64/kernel/hw_breakpoint.c | 2 +-
 arch/arm64/kernel/traps.c | 4 ++--
 arch/arm64/mm/fault.c | 10 +++++-----
 arch/sparc/include/asm/pgtable_64.h | 2 +-
 arch/sparc/include/asm/uaccess_64.h | 2 ++
 drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 +-
 drivers/gpu/drm/radeon/radeon_gem.c | 2 +-
 drivers/infiniband/hw/mlx4/mr.c | 2 +-
 drivers/media/common/videobuf2/frame_vector.c | 2 +-
 drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +-
 drivers/staging/media/atomisp/pci/hmm/hmm_bo.c | 2 +-
 drivers/tee/tee_shm.c | 2 +-
 drivers/vfio/vfio_iommu_type1.c | 2 +-
 fs/proc/task_mmu.c | 2 +-
 include/linux/mm.h | 11 -----------
 include/linux/uaccess.h | 15 +++++++++++++++
 lib/strncpy_from_user.c | 2 +-
 lib/strnlen_user.c | 2 +-
 mm/gup.c | 6 +++---
 mm/madvise.c | 2 +-
 mm/mempolicy.c | 6 +++---
 mm/migrate.c | 2 +-
 mm/mincore.c | 2 +-
 mm/mlock.c | 4 ++--
 mm/mmap.c | 2 +-
 mm/mprotect.c | 2 +-
 mm/mremap.c | 2 +-
 mm/msync.c | 2 +-
 virt/kvm/kvm_main.c | 2 +-
 33 files changed, 58 insertions(+), 52 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 9dd08cd339c3..5b24ef93c6b9 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -227,8 +227,8 @@ static inline unsigned long kaslr_offset(void)
 #define __untagged_addr(addr) \
 	((__force __typeof__(addr))sign_extend64((__force 
u64)(addr), 55))

-#define untagged_addr(addr) ({ \
-	u64 __addr = (__force u64)(addr); \
+#define untagged_addr(mm, addr) ({ \
+	u64 __addr = (__force u64)(addr); \
 	__addr &= __untagged_addr(__addr); \
 	(__force __typeof__(addr))__addr; \
 })
diff --git a/arch/arm64/include/asm/signal.h b/arch/arm64/include/asm/signal.h
index ef449f5f4ba8..0899c355c398 100644
--- a/arch/arm64/include/asm/signal.h
+++ b/arch/arm64/include/asm/signal.h
@@ -18,7 +18,7 @@ static inline void __user *arch_untagged_si_addr(void __user *addr,
 	if (sig == SIGTRAP && si_code == TRAP_BRKPT)
 		return addr;

-	return untagged_addr(addr);
+	return untagged_addr(current->mm, addr);
 }
 #define arch_untagged_si_addr arch_untagged_si_addr
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 5c7b2f9d5913..122d894a4136 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -44,7 +44,7 @@ static inline int access_ok(const void __user *addr, unsigned long size)
 	 */
 	if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) &&
 	    (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR)))
-		addr = untagged_addr(addr);
+		addr = untagged_addr(current->mm, addr);

 	return likely(__access_ok(addr, size));
 }
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index b29a311bb055..d637cee7b771 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -715,7 +715,7 @@ static u64 get_distance_from_watchpoint(unsigned long addr, u64 val,
 	u64 wp_low, wp_high;
 	u32 lens, lene;

-	addr = untagged_addr(addr);
+	addr = untagged_addr(current->mm, addr);

 	lens = __ffs(ctrl->len);
 	lene = __fls(ctrl->len);
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 23d281ed7621..f40f3885b674 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -477,7 +477,7 @@ void arm64_notify_segfault(unsigned long addr)
 	int code;

 	mmap_read_lock(current->mm);
-	if (find_vma(current->mm, untagged_addr(addr)) == NULL)
+	if (find_vma(current->mm, untagged_addr(current->mm, addr)) == NULL)
 		code = SEGV_MAPERR;
 	else
 		code = SEGV_ACCERR;
@@ -551,7 +551,7 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs)
 	int ret = 0;

 	tagged_address = pt_regs_read_reg(regs, rt);
-	address = untagged_addr(tagged_address);
+	address = untagged_addr(current->mm, tagged_address);

 	switch (crm) {
 	case ESR_ELx_SYS64_ISS_CRM_DC_CVAU: /* DC CVAU, gets promoted */
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 5b391490e045..b8799e9c7e1b 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -454,7 +454,7 @@ static void set_thread_esr(unsigned long address, unsigned long esr)
 static void do_bad_area(unsigned long far, unsigned long esr,
 			struct pt_regs *regs)
 {
-	unsigned long addr = untagged_addr(far);
+	unsigned long addr = untagged_addr(current->mm, far);

 	/*
 	 * If we are in kernel mode at this point, we have no context to
@@ -524,7 +524,7 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 	vm_fault_t fault;
 	unsigned long vm_flags;
 	unsigned int mm_flags = FAULT_FLAG_DEFAULT;
-	unsigned long addr = untagged_addr(far);
+	unsigned long addr = untagged_addr(mm, far);

 	if (kprobe_page_fault(regs, esr))
 		return 0;
@@ -679,7 +679,7 @@ static int __kprobes do_translation_fault(unsigned long far,
 					  unsigned long esr,
 					  struct pt_regs *regs)
 {
-	unsigned long addr = untagged_addr(far);
+	unsigned long addr = untagged_addr(current->mm, far);

 	if (is_ttbr0_addr(addr))
 		return do_page_fault(far, esr, regs);
@@ -726,7 +726,7 @@ static int do_sea(unsigned long far, unsigned long esr, struct pt_regs *regs)
 		 * UNKNOWN for synchronous external aborts. Mask them out now
 		 * so that userspace doesn't see them.
 		 */
-		siaddr = untagged_addr(far);
+		siaddr = untagged_addr(current->mm, far);
 	}
 	arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr);
@@ -816,7 +816,7 @@ static const struct fault_info fault_info[] = {
 void do_mem_abort(unsigned long far, unsigned long esr, struct pt_regs *regs)
 {
 	const struct fault_info *inf = esr_to_fault_info(esr);
-	unsigned long addr = untagged_addr(far);
+	unsigned long addr = untagged_addr(current->mm, far);

 	if (!inf->fn(far, esr, regs))
 		return;
diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h
index a779418ceba9..aa996ffe5c8c 100644
--- a/arch/sparc/include/asm/pgtable_64.h
+++ b/arch/sparc/include/asm/pgtable_64.h
@@ -1052,7 +1052,7 @@ static inline unsigned long __untagged_addr(unsigned long start)
 	return start;
 }
-#define untagged_addr(addr) \
+#define untagged_addr(mm, addr) \
 	((__typeof__(addr))(__untagged_addr((unsigned long)(addr))))

 static inline bool pte_access_permitted(pte_t pte, bool write)
diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h
index 94266a5c5b04..b825a5dd0210 100644
--- a/arch/sparc/include/asm/uaccess_64.h
+++ b/arch/sparc/include/asm/uaccess_64.h
@@ -8,8 +8,10 @@
 #include
 #include
+#include
 #include
 #include
+#include
 #include
 #include
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
index 978d3970b5cc..173f0b5ccba1 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c
@@ -1659,7 +1659,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu(
 	if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) {
 		if (!offset || !*offset)
 			return -EINVAL;
-		user_addr = untagged_addr(*offset);
+		user_addr = untagged_addr(current->mm, *offset);
 	} else if (flags & (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL |
 			KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) {
 		bo_type = ttm_bo_type_sg;
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c 
b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index 8ef31d687ef3..691dfb3f2c0e 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -382,7 +382,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data,
 	uint32_t handle;
 	int r;

-	args->addr = untagged_addr(args->addr);
+	args->addr = untagged_addr(current->mm, args->addr);

 	if (offset_in_page(args->addr | args->size))
 		return -EINVAL;
diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c
index 261fcbae88d7..cba2f4b19838 100644
--- a/drivers/gpu/drm/radeon/radeon_gem.c
+++ b/drivers/gpu/drm/radeon/radeon_gem.c
@@ -371,7 +371,7 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data,
 	uint32_t handle;
 	int r;

-	args->addr = untagged_addr(args->addr);
+	args->addr = untagged_addr(current->mm, args->addr);

 	if (offset_in_page(args->addr | args->size))
 		return -EINVAL;
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index a40bf58bcdd3..383ac9e40dfa 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -379,7 +379,7 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_device *device, u64 start,
 	 * again
 	 */
 	if (!ib_access_writable(access_flags)) {
-		unsigned long untagged_start = untagged_addr(start);
+		unsigned long untagged_start = untagged_addr(current->mm, start);
 		struct vm_area_struct *vma;

 		mmap_read_lock(current->mm);
diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c
index 542dde9d2609..7e62f7a2555d 100644
--- a/drivers/media/common/videobuf2/frame_vector.c
+++ b/drivers/media/common/videobuf2/frame_vector.c
@@ -47,7 +47,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
 	if (WARN_ON_ONCE(nr_frames > vec->nr_allocated))
 		nr_frames = vec->nr_allocated;

-	start = untagged_addr(start);
+	start = untagged_addr(mm, start);

 	ret = pin_user_pages_fast(start, nr_frames,
 				  FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM,
diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c
index 52312ce2ba05..a1444f8afa05 100644
--- a/drivers/media/v4l2-core/videobuf-dma-contig.c
+++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
@@ -157,8 +157,8 @@ static void videobuf_dma_contig_user_put(struct videobuf_dma_contig_memory *mem)
 static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem,
 					struct videobuf_buffer *vb)
 {
-	unsigned long untagged_baddr = untagged_addr(vb->baddr);
 	struct mm_struct *mm = current->mm;
+	unsigned long untagged_baddr = untagged_addr(mm, vb->baddr);
 	struct vm_area_struct *vma;
 	unsigned long prev_pfn, this_pfn;
 	unsigned long pages_done, user_address;
diff --git a/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c b/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c
index f50494123f03..a43c65950554 100644
--- a/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c
+++ b/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c
@@ -794,7 +794,7 @@ static int alloc_user_pages(struct hmm_buffer_object *bo,
 	 * and map to user space
 	 */

-	userptr = untagged_addr(userptr);
+	userptr = untagged_addr(current->mm, userptr);

 	if (vma->vm_flags & (VM_IO | VM_PFNMAP)) {
 		page_nr = pin_user_pages((unsigned long)userptr, bo->pgnr,
diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c
index 27295bda3e0b..5c85445f3a65 100644
--- a/drivers/tee/tee_shm.c
+++ b/drivers/tee/tee_shm.c
@@ -262,7 +262,7 @@ register_shm_helper(struct tee_context *ctx, unsigned long addr,
 	shm->flags = flags;
 	shm->ctx = ctx;
 	shm->id = id;
-	addr = untagged_addr(addr);
+	addr = untagged_addr(current->mm, addr);
 	start = rounddown(addr, PAGE_SIZE);
 	shm->offset = addr - start;
 	shm->size = length;
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 23c24fe98c00..74b6aecea8b0 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -573,7 +573,7 @@ static int vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr,
 		goto done;
 	}

-	vaddr = untagged_addr(vaddr);
+	vaddr = untagged_addr(mm, vaddr);

 retry:
 	vma = vma_lookup(mm, vaddr);
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 8b4f3073f8f5..665e36885f21 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1685,7 +1685,7 @@ static ssize_t pagemap_read(struct file *file, char __user *buf,
 	/* watch out for wraparound */
 	start_vaddr = end_vaddr;
 	if (svpfn <= (ULONG_MAX >> PAGE_SHIFT))
-		start_vaddr = untagged_addr(svpfn << PAGE_SHIFT);
+		start_vaddr = untagged_addr(mm, svpfn << PAGE_SHIFT);

 	/* Ensure the address is inside the task */
 	if (start_vaddr > mm->task_size)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8bbcccbc5565..bfac5a166cb8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -95,17 +95,6 @@ extern int mmap_rnd_compat_bits __read_mostly;
 #include
 #include

-/*
- * Architectures that support memory tagging (assigning tags to memory regions,
- * embedding these tags into addresses that point to these memory regions, and
- * checking that the memory and the pointer tags match on memory accesses)
- * redefine this macro to strip tags from pointers.
- * It's defined as noop for architectures that don't support memory tagging.
- */
-#ifndef untagged_addr
-#define untagged_addr(addr) (addr)
-#endif
-
 #ifndef __pa_symbol
 #define __pa_symbol(x) __pa(RELOC_HIDE((unsigned long)(x), 0))
 #endif
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index afb18f198843..46680189d761 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -10,6 +10,21 @@
 #include

+/*
+ * Architectures that support memory tagging (assigning tags to memory regions,
+ * embedding these tags into addresses that point to these memory regions, and
+ * checking that the memory and the pointer tags match on memory accesses)
+ * redefine this macro to strip tags from pointers.
+ *
+ * Passing down mm_struct allows to define untagging rules on per-process
+ * basis.
+ * + * It's defined as noop for architectures that don't support memory tagging. + */ +#ifndef untagged_addr +#define untagged_addr(mm, addr) (addr) +#endif + /* * Architectures should provide two primitives (raw_copy_{to,from}_user()) * and get rid of their private instances of copy_{to,from}_user() and diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c index 6432b8c3e431..6e1e2aa0c994 100644 --- a/lib/strncpy_from_user.c +++ b/lib/strncpy_from_user.c @@ -121,7 +121,7 @@ long strncpy_from_user(char *dst, const char __user *src, long count) return 0; max_addr = TASK_SIZE_MAX; - src_addr = (unsigned long)untagged_addr(src); + src_addr = (unsigned long)untagged_addr(current->mm, src); if (likely(src_addr < max_addr)) { unsigned long max = max_addr - src_addr; long retval; diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c index feeb935a2299..abc096a68f05 100644 --- a/lib/strnlen_user.c +++ b/lib/strnlen_user.c @@ -97,7 +97,7 @@ long strnlen_user(const char __user *str, long count) return 0; max_addr = TASK_SIZE_MAX; - src_addr = (unsigned long)untagged_addr(str); + src_addr = (unsigned long)untagged_addr(current->mm, str); if (likely(src_addr < max_addr)) { unsigned long max = max_addr - src_addr; long retval; diff --git a/mm/gup.c b/mm/gup.c index fe195d47de74..f585e4a185ca 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1168,7 +1168,7 @@ static long __get_user_pages(struct mm_struct *mm, if (!nr_pages) return 0; - start = untagged_addr(start); + start = untagged_addr(mm, start); VM_BUG_ON(!!pages != !!(gup_flags & (FOLL_GET | FOLL_PIN))); @@ -1342,7 +1342,7 @@ int fixup_user_fault(struct mm_struct *mm, struct vm_area_struct *vma; vm_fault_t ret; - address = untagged_addr(address); + address = untagged_addr(mm, address); if (unlocked) fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE; @@ -3027,7 +3027,7 @@ static int internal_get_user_pages_fast(unsigned long start, if (!(gup_flags & FOLL_FAST_ONLY)) might_lock_read(&current->mm->mmap_lock); - start =
untagged_addr(start) & PAGE_MASK; + start = untagged_addr(current->mm, start) & PAGE_MASK; len = nr_pages << PAGE_SHIFT; if (check_add_overflow(start, len, &end)) return 0; diff --git a/mm/madvise.c b/mm/madvise.c index 2baa93ca2310..1319a18da8bc 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -1382,7 +1382,7 @@ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh size_t len; struct blk_plug plug; - start = untagged_addr(start); + start = untagged_addr(mm, start); if (!madvise_behavior_valid(behavior)) return -EINVAL; diff --git a/mm/mempolicy.c b/mm/mempolicy.c index a937eaec5b68..4fdeef477fbd 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -1467,7 +1467,7 @@ static long kernel_mbind(unsigned long start, unsigned long len, int lmode = mode; int err; - start = untagged_addr(start); + start = untagged_addr(current->mm, start); err = sanitize_mpol_flags(&lmode, &mode_flags); if (err) return err; @@ -1491,7 +1491,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le int err = -ENOENT; VMA_ITERATOR(vmi, mm, start); - start = untagged_addr(start); + start = untagged_addr(mm, start); if (start & ~PAGE_MASK) return -EINVAL; /* @@ -1692,7 +1692,7 @@ static int kernel_get_mempolicy(int __user *policy, if (nmask != NULL && maxnode < nr_node_ids) return -EINVAL; - addr = untagged_addr(addr); + addr = untagged_addr(current->mm, addr); err = do_get_mempolicy(&pval, &nodes, addr, flags); diff --git a/mm/migrate.c b/mm/migrate.c index 1379e1912772..8e7823bef31d 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1795,7 +1795,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes, goto out_flush; if (get_user(node, nodes + i)) goto out_flush; - addr = (unsigned long)untagged_addr(p); + addr = (unsigned long)untagged_addr(mm, p); err = -ENODEV; if (node < 0 || node >= MAX_NUMNODES) diff --git a/mm/mincore.c b/mm/mincore.c index fa200c14185f..72c55bd9d184 100644 --- a/mm/mincore.c +++ b/mm/mincore.c @@ 
-236,7 +236,7 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len, unsigned long pages; unsigned char *tmp; - start = untagged_addr(start); + start = untagged_addr(current->mm, start); /* Check the start address: needs to be page-aligned.. */ if (start & ~PAGE_MASK) diff --git a/mm/mlock.c b/mm/mlock.c index 7032f6dd0ce1..d969703c08ff 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -570,7 +570,7 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla unsigned long lock_limit; int error = -ENOMEM; - start = untagged_addr(start); + start = untagged_addr(current->mm, start); if (!can_do_mlock()) return -EPERM; @@ -633,7 +633,7 @@ SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len) { int ret; - start = untagged_addr(start); + start = untagged_addr(current->mm, start); len = PAGE_ALIGN(len + (offset_in_page(start))); start &= PAGE_MASK; diff --git a/mm/mmap.c b/mm/mmap.c index bf2122af94e7..bb8037840160 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2796,7 +2796,7 @@ EXPORT_SYMBOL(vm_munmap); SYSCALL_DEFINE2(munmap, unsigned long, addr, size_t, len) { - addr = untagged_addr(addr); + addr = untagged_addr(current->mm, addr); return __vm_munmap(addr, len, true); } diff --git a/mm/mprotect.c b/mm/mprotect.c index 668bfaa6ed2a..dee44e3a0527 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -680,7 +680,7 @@ static int do_mprotect_pkey(unsigned long start, size_t len, struct mmu_gather tlb; MA_STATE(mas, &current->mm->mm_mt, 0, 0); - start = untagged_addr(start); + start = untagged_addr(current->mm, start); prot &= ~(PROT_GROWSDOWN|PROT_GROWSUP); if (grows == (PROT_GROWSDOWN|PROT_GROWSUP)) /* can't be both */ diff --git a/mm/mremap.c b/mm/mremap.c index e465ffe279bb..81c857281a52 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -909,7 +909,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, * * See Documentation/arm64/tagged-address-abi.rst for more information.
*/ - addr = untagged_addr(addr); + addr = untagged_addr(mm, addr); if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP)) return ret; diff --git a/mm/msync.c b/mm/msync.c index ac4c9bfea2e7..f941e9bb610f 100644 --- a/mm/msync.c +++ b/mm/msync.c @@ -37,7 +37,7 @@ SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, int, flags) int unmapped_error = 0; int error = -EINVAL; - start = untagged_addr(start); + start = untagged_addr(mm, start); if (flags & ~(MS_ASYNC | MS_INVALIDATE | MS_SYNC)) goto out; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index e30f1b4ecfa5..8c86b06b35da 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1945,7 +1945,7 @@ int __kvm_set_memory_region(struct kvm *kvm, return -EINVAL; /* We can read the guest memory with __xxx_user() later on. */ if ((mem->userspace_addr & (PAGE_SIZE - 1)) || - (mem->userspace_addr != untagged_addr(mem->userspace_addr)) || + (mem->userspace_addr != untagged_addr(kvm->mm, mem->userspace_addr)) || !access_ok((void __user *)(unsigned long)mem->userspace_addr, mem->memory_size)) return -EINVAL; From patchwork Tue Oct 18 11:33:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov" X-Patchwork-Id: 4112 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp1909233wrs; Tue, 18 Oct 2022 04:37:15 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7MEZEISri08Z5TvITjVuOIKEWuqDGCo6PvTCWH4YulPaqZ65c3F5EBqpV2KIPb0ckbvxYC X-Received: by 2002:a17:90a:d518:b0:20d:516d:67ee with SMTP id t24-20020a17090ad51800b0020d516d67eemr3118325pju.9.1666093034846; Tue, 18 Oct 2022 04:37:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666093034; cv=none; d=google.com; s=arc-20160816; b=bO1vOgYR87B97rF/wB0fVgf1YfL3fExugmmuY9NVWC9pYJH5X6Wlavw3bi5WLyJgwq D38URtCqVltgwZZ39cwA9YV7yaSmG5k+129FuuRn03Flj+R0K2RjGJfaWusbkVjCmWJx egr9UZj2MwtCCfemn+xfgYqfYDDv1JXUhR+3pTi6SfjSlJNtqhh9ITc9cMI/7m9HdLre +P20E9eacudzWBsrbgZzXyQ7XIB4diGSGm/xeWcJTPK63B/mj2t/owtIRmAylMmq1rbl 8R6pppptUHeheo908AcFwQe7t/UJsJkuNl5e9mlVVrA43Xi3gZ2WBl/ifahQuZTaLT9p dcFw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=mbi38Q7qzb0FXaYJrIMTQJ/s3D3DhSgL35iT4/qtB18=; b=zwKLraWtiJoQhTJ8aIIH+lBhdNTY4NP0et5PiFvn9dg+DKuFk9lkGOWoLqLbFmNYvg E+kSEvXPbTIVHgSFvCDdNeyVPc1jJKRs/6DW90DXZI6xP20aGWz+SDKsZQTckO+ndfzJ IS8wgGJHoTHylcej/tCPxbFg9Qp5MHSSD35T+C3IGooqga+8opLMWb6UA/e8vxz3MCYO OhvIncWPYjA+xmeOLeMPZHnXRUE9ztM8tN6+fOaamYYH0G8tJT+9xDUuN1ARGuwX5L/N /ZNrGEXqzAvW0Z77JFqJeo+qHKjmy7D5bHX8qJ9/EDB/9DmXGN9RsCTX19ZsYoz8miuh R2jg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=BRNKRhNv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id o12-20020aa7978c000000b0051f2b9f9b3bsi13790870pfp.243.2022.10.18.04.37.01; Tue, 18 Oct 2022 04:37:14 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=BRNKRhNv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230154AbiJRLfu (ORCPT + 99 others); Tue, 18 Oct 2022 07:35:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51846 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230246AbiJRLfY (ORCPT ); Tue, 18 Oct 2022 07:35:24 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B0A3929816 for ; Tue, 18 Oct 2022 04:34:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666092886; x=1697628886; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=QqwT2o6S3BL4cAti275H/+m86VnTF/Nrk+qSft4pme8=; b=BRNKRhNvrabXh9LbYKEl9a4HsUYenLtqGB0W8O3yzO5zIhLtCdipe/S+ xRnCyYEb26qtdjjsuCdbudsS3QyU4EnYKrhSitsfURPJj3TtfwDo+eVA/ Vo8u/dkZLdKvwlvhkFSZgthkNudtVu66WQdXoopvWWxlvWzCdqOBcm+9F /eZoER8Ocmy1ga/Lqv5TGpk2QNxHPxCrqul7U4gxHNepn9bTRNIPm2OJn 9wuXvW60wP1xYbOtiRafemUzlAEzYrbZt93KBbkmfG4/jTjW3w5mIHehV s+yIIDPKzBugBsc9ZcwCK0NV9f4v943UK/7TcCeo6iX6aso+P6RlbrXBc w==; X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="368105796" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="368105796" Received: from fmsmga008.fm.intel.com 
([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:11 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="691763144" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="691763144" Received: from vhavel-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.51.115]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:06 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 1A6F61046F9; Tue, 18 Oct 2022 14:34:04 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , Bharata B Rao , Jacob Pan , Ashok Raj , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv10 04/15] x86/mm: Handle LAM on context switch Date: Tue, 18 Oct 2022 14:33:47 +0300 Message-Id: <20221018113358.7833-5-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747025169807740999?= X-GMAIL-MSGID: =?utf-8?q?1747025169807740999?= Linear Address Masking mode for userspace pointers is encoded in CR3 bits. The mode is selected per-process and stored in mm_context_t. switch_mm_irqs_off() now respects the selected LAM mode and constructs CR3 accordingly.
The active LAM mode gets recorded in the tlb_state. Signed-off-by: Kirill A. Shutemov Tested-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/mmu.h | 3 ++ arch/x86/include/asm/mmu_context.h | 24 +++++++++++++++ arch/x86/include/asm/tlbflush.h | 35 ++++++++++++++++++++++ arch/x86/mm/tlb.c | 48 ++++++++++++++++++++---------- 4 files changed, 94 insertions(+), 16 deletions(-) diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 5d7494631ea9..002889ca8978 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -40,6 +40,9 @@ typedef struct { #ifdef CONFIG_X86_64 unsigned short flags; + + /* Active LAM mode: X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */ + unsigned long lam_cr3_mask; #endif struct mutex lock; diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index b8d40ddeab00..69c943b2ae90 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -91,6 +91,29 @@ static inline void switch_ldt(struct mm_struct *prev, struct mm_struct *next) } #endif +#ifdef CONFIG_X86_64 +static inline unsigned long mm_lam_cr3_mask(struct mm_struct *mm) +{ + return mm->context.lam_cr3_mask; +} + +static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) +{ + mm->context.lam_cr3_mask = oldmm->context.lam_cr3_mask; +} + +#else + +static inline unsigned long mm_lam_cr3_mask(struct mm_struct *mm) +{ + return 0; +} + +static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) +{ +} +#endif + #define enter_lazy_tlb enter_lazy_tlb extern void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk); @@ -168,6 +191,7 @@ static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm) { arch_dup_pkeys(oldmm, mm); paravirt_arch_dup_mmap(oldmm, mm); + dup_lam(oldmm, mm); return ldt_dup_context(oldmm, mm); } diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index 
cda3118f3b27..1ad080163363 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -101,6 +101,16 @@ struct tlb_state { */ bool invalidate_other; +#ifdef CONFIG_X86_64 + /* + * Active LAM mode. + * + * X86_CR3_LAM_U57/U48 shifted right by X86_CR3_LAM_U57_BIT or 0 if LAM + * disabled. + */ + u8 lam; +#endif + /* * Mask that contains TLB_NR_DYN_ASIDS+1 bits to indicate * the corresponding user PCID needs a flush next time we @@ -357,6 +367,30 @@ static inline bool huge_pmd_needs_flush(pmd_t oldpmd, pmd_t newpmd) } #define huge_pmd_needs_flush huge_pmd_needs_flush +#ifdef CONFIG_X86_64 +static inline unsigned long tlbstate_lam_cr3_mask(void) +{ + unsigned long lam = this_cpu_read(cpu_tlbstate.lam); + + return lam << X86_CR3_LAM_U57_BIT; +} + +static inline void set_tlbstate_cr3_lam_mask(unsigned long mask) +{ + this_cpu_write(cpu_tlbstate.lam, mask >> X86_CR3_LAM_U57_BIT); +} + +#else + +static inline unsigned long tlbstate_lam_cr3_mask(void) +{ + return 0; +} + +static inline void set_tlbstate_cr3_lam_mask(u64 mask) +{ +} +#endif #endif /* !MODULE */ static inline void __native_tlb_flush_global(unsigned long cr4) @@ -364,4 +398,5 @@ static inline void __native_tlb_flush_global(unsigned long cr4) native_write_cr4(cr4 ^ X86_CR4_PGE); native_write_cr4(cr4); } + #endif /* _ASM_X86_TLBFLUSH_H */ diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index c1e31e9a85d7..d6c9c15d2ad2 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -154,26 +154,30 @@ static inline u16 user_pcid(u16 asid) return ret; } -static inline unsigned long build_cr3(pgd_t *pgd, u16 asid) +static inline unsigned long build_cr3(pgd_t *pgd, u16 asid, unsigned long lam) { + unsigned long cr3 = __sme_pa(pgd) | lam; + if (static_cpu_has(X86_FEATURE_PCID)) { - return __sme_pa(pgd) | kern_pcid(asid); + VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); + cr3 |= kern_pcid(asid); } else { VM_WARN_ON_ONCE(asid != 0); - return __sme_pa(pgd); } + + return cr3; } -static inline unsigned 
long build_cr3_noflush(pgd_t *pgd, u16 asid) +static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid, + unsigned long lam) { - VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); /* * Use boot_cpu_has() instead of this_cpu_has() as this function * might be called during early boot. This should work even after * boot because all CPU's the have same capabilities: */ VM_WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_PCID)); - return __sme_pa(pgd) | kern_pcid(asid) | CR3_NOFLUSH; + return build_cr3(pgd, asid, lam) | CR3_NOFLUSH; } /* @@ -274,15 +278,16 @@ static inline void invalidate_user_asid(u16 asid) (unsigned long *)this_cpu_ptr(&cpu_tlbstate.user_pcid_flush_mask)); } -static void load_new_mm_cr3(pgd_t *pgdir, u16 new_asid, bool need_flush) +static void load_new_mm_cr3(pgd_t *pgdir, u16 new_asid, unsigned long lam, + bool need_flush) { unsigned long new_mm_cr3; if (need_flush) { invalidate_user_asid(new_asid); - new_mm_cr3 = build_cr3(pgdir, new_asid); + new_mm_cr3 = build_cr3(pgdir, new_asid, lam); } else { - new_mm_cr3 = build_cr3_noflush(pgdir, new_asid); + new_mm_cr3 = build_cr3_noflush(pgdir, new_asid, lam); } /* @@ -491,6 +496,8 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, { struct mm_struct *real_prev = this_cpu_read(cpu_tlbstate.loaded_mm); u16 prev_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid); + unsigned long prev_lam = tlbstate_lam_cr3_mask(); + unsigned long new_lam = mm_lam_cr3_mask(next); bool was_lazy = this_cpu_read(cpu_tlbstate_shared.is_lazy); unsigned cpu = smp_processor_id(); u64 next_tlb_gen; @@ -520,7 +527,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, * isn't free. */ #ifdef CONFIG_DEBUG_VM - if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev->pgd, prev_asid))) { + if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev->pgd, prev_asid, prev_lam))) { /* * If we were to BUG here, we'd be very likely to kill * the system so hard that we don't see the call trace. 
@@ -554,6 +561,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, if (real_prev == next) { VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) != next->context.ctx_id); + VM_WARN_ON(prev_lam != new_lam); /* * Even in lazy TLB mode, the CPU should stay set in the @@ -622,15 +630,16 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, barrier(); } + set_tlbstate_cr3_lam_mask(new_lam); if (need_flush) { this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id); this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen); - load_new_mm_cr3(next->pgd, new_asid, true); + load_new_mm_cr3(next->pgd, new_asid, new_lam, true); trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL); } else { /* The new ASID is already up to date. */ - load_new_mm_cr3(next->pgd, new_asid, false); + load_new_mm_cr3(next->pgd, new_asid, new_lam, false); trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, 0); } @@ -691,6 +700,10 @@ void initialize_tlbstate_and_flush(void) /* Assert that CR3 already references the right mm. */ WARN_ON((cr3 & CR3_ADDR_MASK) != __pa(mm->pgd)); + /* LAM expected to be disabled in CR3 and init_mm */ + WARN_ON(cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)); + WARN_ON(mm_lam_cr3_mask(&init_mm)); + /* * Assert that CR4.PCIDE is set if needed. (CR4.PCIDE initialization * doesn't work like other CR4 bits because it can only be set from @@ -699,8 +712,8 @@ void initialize_tlbstate_and_flush(void) WARN_ON(boot_cpu_has(X86_FEATURE_PCID) && !(cr4_read_shadow() & X86_CR4_PCIDE)); - /* Force ASID 0 and force a TLB flush. */ - write_cr3(build_cr3(mm->pgd, 0)); + /* Disable LAM, force ASID 0 and force a TLB flush. */ + write_cr3(build_cr3(mm->pgd, 0, 0)); /* Reinitialize tlbstate. 
*/ this_cpu_write(cpu_tlbstate.last_user_mm_spec, LAST_USER_MM_INIT); @@ -708,6 +721,7 @@ void initialize_tlbstate_and_flush(void) this_cpu_write(cpu_tlbstate.next_asid, 1); this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id); this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, tlb_gen); + set_tlbstate_cr3_lam_mask(0); for (i = 1; i < TLB_NR_DYN_ASIDS; i++) this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0); @@ -1071,8 +1085,10 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end) */ unsigned long __get_current_cr3_fast(void) { - unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, - this_cpu_read(cpu_tlbstate.loaded_mm_asid)); + unsigned long cr3 = + build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, + this_cpu_read(cpu_tlbstate.loaded_mm_asid), + tlbstate_lam_cr3_mask()); /* For now, be very restrictive about when this can be called. */ VM_WARN_ON(in_nmi() || preemptible()); From patchwork Tue Oct 18 11:33:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov" X-Patchwork-Id: 4116 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp1910088wrs; Tue, 18 Oct 2022 04:39:13 -0700 (PDT) X-Google-Smtp-Source: AMsMyM75Xr1WJd57seMDJivHIIyCbNAxl1wH4IXnrMb99qXsM+TeKkfQu+j1vtpBy4K2+hlzrMS4 X-Received: by 2002:a17:907:1c98:b0:78d:3b08:33ef with SMTP id nb24-20020a1709071c9800b0078d3b0833efmr2062586ejc.175.1666093153191; Tue, 18 Oct 2022 04:39:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666093153; cv=none; d=google.com; s=arc-20160816; b=Ge4jPSKEQrOXHLQJqWbHIGgQb4cJdZX0BydKj/RmXRZoUDIhbofEgrtTKQ9xbacmrL mznko10btlkX6frXmja7CBc1i3m5+Eo72LmpXUGhBtKT6vdfcCJV9kfpJ0KokE+6ULZd LdPZW2S5Tqd5C89UUagew1+rE2//Ntl7N8rdcCX9S9a47M7Uh2pthLtekvTsIWwLlOxQ 3y8A9Obyoss9uGD1RhkOzXkt7Q1eYG3e/eiVYGWzRz33iykQgedd3neB/C364oWDhoEz 6Emb9EgLg3E/JERBs5h3SfR7lvN6eTOCj6hk0bVIjks+rdSaoGVPv8agMPTn3oTBQKvG POOQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=DbV8cHAt/2UaBv88Klfmogq3Zr74tZzoImyK7jUvyRo=; b=GakwU08UIZLua7yxOUd78Y21k7SR9urj/j461hLPH60lRloXKT2lwDSj8Ru78K/hQx SNWis2+PAyYFr31z+OlEm65O9eowKvzFysCRpcCv3YThteu07iYcJOgh6Q6aLU2HiA7q LO4IjD/iUfWRfhIusd8O+okJB9pYWv3wsjNE5tC7cDRlnVD7xzjIUTJrTwqQn5OBtt1W NHd1SZe7O8Z5uzJkfEpR9MMZts69c3fbvfaxIzdfxL9Y7TjBNvg+y9Q1io1An76fs+39 QC8fXr/lkvsjRbUR7XsOO4tzcHIa6Pc28vOUvCUqm4v4+i5xynWAYfVTwKu+mxKukW54 6UMw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=D+UwM9cn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id dd17-20020a1709069b9100b00730bc62507csi12152487ejc.125.2022.10.18.04.38.47; Tue, 18 Oct 2022 04:39:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=D+UwM9cn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230255AbiJRLg2 (ORCPT + 99 others); Tue, 18 Oct 2022 07:36:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51034 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230101AbiJRLfo (ORCPT ); Tue, 18 Oct 2022 07:35:44 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DDEEDBA27D for ; Tue, 18 Oct 2022 04:35:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666092916; x=1697628916; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=URl4+yXuTZU7i5eHga9UDkYb72zi7S2whgUd+vZqeAk=; b=D+UwM9cnOKfHSbfnZXzEAOwkYc/ez5q3JUFT52pkj/PGAl6QIUbaMIOD z1P8zM72DEZRefp8wPNbcvvZZse4ic5R111jXENe5VMXHEN5/UqjIIzlC XLk7WoCwuKsmoIZloBPwUoeCzyYR02gbtC7laQvPG25O5JSfjFXtCm1ZE QvkwvMkxqpwpN3HdhGFbCVd+gLDgnafRxCZqwgJ9lX1tzjTzGeRUD8YDS m2qE2JyqNIx3LMjh7xCgY6Mgsx5FNMIMeZbtY042KtaIJBABhFoQKsZge qC6yRKR2XAEdM5Gh/hOWj/MhIPorUfQ+Q7ZB8fzWSJEZIyhOUgAdrbnO2 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="368105822" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="368105822" Received: from fmsmga008.fm.intel.com 
([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:17 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="691763173" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="691763173" Received: from vhavel-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.51.115]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:13 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 252BC1046FB; Tue, 18 Oct 2022 14:34:04 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , Bharata B Rao , Jacob Pan , Ashok Raj , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv10 05/15] x86/uaccess: Provide untagged_addr() and remove tags before address check Date: Tue, 18 Oct 2022 14:33:48 +0300 Message-Id: <20221018113358.7833-6-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747025293864806200?= X-GMAIL-MSGID: =?utf-8?q?1747025293864806200?= untagged_addr() is a helper used by the core-mm to strip tag bits and get the address to the canonical shape. It only handles userspace addresses.
The untagging mask is stored in mmu_context and will be set on enabling LAM for the process. The tags must not be included in the check of whether it is okay to access the userspace address. Strip tags in access_ok(). get_user() and put_user() don't use access_ok(), but check access against TASK_SIZE directly in assembly. Strip tags before calling into the assembly helper. Signed-off-by: Kirill A. Shutemov Tested-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/mmu.h | 3 +++ arch/x86/include/asm/mmu_context.h | 11 ++++++++ arch/x86/include/asm/uaccess.h | 42 +++++++++++++++++++++++++++--- arch/x86/kernel/process.c | 3 +++ 4 files changed, 56 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 002889ca8978..2fdb390040b5 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -43,6 +43,9 @@ typedef struct { /* Active LAM mode: X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */ unsigned long lam_cr3_mask; + + /* Significant bits of the virtual address. Excludes tag bits.
*/ + u64 untag_mask; #endif struct mutex lock; diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 69c943b2ae90..5bd3d46685dc 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -100,6 +100,12 @@ static inline unsigned long mm_lam_cr3_mask(struct mm_struct *mm) static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) { mm->context.lam_cr3_mask = oldmm->context.lam_cr3_mask; + mm->context.untag_mask = oldmm->context.untag_mask; +} + +static inline void mm_reset_untag_mask(struct mm_struct *mm) +{ + mm->context.untag_mask = -1UL; } #else @@ -112,6 +118,10 @@ static inline unsigned long mm_lam_cr3_mask(struct mm_struct *mm) static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) { } + +static inline void mm_reset_untag_mask(struct mm_struct *mm) +{ +} #endif #define enter_lazy_tlb enter_lazy_tlb @@ -138,6 +148,7 @@ static inline int init_new_context(struct task_struct *tsk, mm->context.execute_only_pkey = -1; } #endif + mm_reset_untag_mask(mm); init_new_context_ldt(mm); return 0; } diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 8bc614cfe21b..c6062c07ccd2 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -7,6 +7,7 @@ #include #include #include +#include #include #include #include @@ -21,6 +22,30 @@ static inline bool pagefault_disabled(void); # define WARN_ON_IN_IRQ() #endif +#ifdef CONFIG_X86_64 +/* + * Mask out tag bits from the address. + * + * Magic with the 'sign' allows to untag userspace pointer without any branches + * while leaving kernel addresses intact. 
+ */ +#define untagged_addr(mm, addr) ({ \ + u64 __addr = (__force u64)(addr); \ + s64 sign = (s64)__addr >> 63; \ + __addr &= (mm)->context.untag_mask | sign; \ + (__force __typeof__(addr))__addr; \ +}) + +#define untagged_ptr(mm, ptr) ({ \ + u64 __ptrval = (__force u64)(ptr); \ + __ptrval = untagged_addr(mm, __ptrval); \ + (__force __typeof__(*(ptr)) *)__ptrval; \ +}) +#else +#define untagged_addr(mm, addr) (addr) +#define untagged_ptr(mm, ptr) (ptr) +#endif + /** * access_ok - Checks if a user space pointer is valid * @addr: User space pointer to start of block to check @@ -41,7 +66,7 @@ static inline bool pagefault_disabled(void); #define access_ok(addr, size) \ ({ \ WARN_ON_IN_IRQ(); \ - likely(__access_ok(addr, size)); \ + likely(__access_ok(untagged_addr(current->mm, addr), size)); \ }) #include @@ -127,7 +152,13 @@ extern int __get_user_bad(void); * Return: zero on success, or -EFAULT on error. * On error, the variable @x is set to zero. */ -#define get_user(x,ptr) ({ might_fault(); do_get_user_call(get_user,x,ptr); }) +#define get_user(x,ptr) \ +({ \ + __typeof__(*(ptr)) __user *__ptr_clean; \ + __ptr_clean = untagged_ptr(current->mm, ptr); \ + might_fault(); \ + do_get_user_call(get_user,x,__ptr_clean); \ +}) /** * __get_user - Get a simple variable from user space, with less checking. @@ -227,7 +258,12 @@ extern void __put_user_nocheck_8(void); * * Return: zero on success, or -EFAULT on error. */ -#define put_user(x, ptr) ({ might_fault(); do_put_user_call(put_user,x,ptr); }) +#define put_user(x, ptr) ({ \ + __typeof__(*(ptr)) __user *__ptr_clean; \ + __ptr_clean = untagged_ptr(current->mm, ptr); \ + might_fault(); \ + do_put_user_call(put_user,x,__ptr_clean); \ +}) /** * __put_user - Write a simple value into user space, with less checking. 
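The branchless untagging done by untagged_addr() in the hunk above can be tried out in userspace. What follows is a minimal sketch, not kernel code: demo_untag(), demo_check() and DEMO_UNTAG_MASK_U57 are hypothetical names, the addresses are made up, and the mask value assumes the LAM_U57 layout with six tag bits in bits 62:57. The sketch uses unsigned negation rather than the arithmetic right shift from the macro, because right-shifting a negative value is implementation-defined in ISO C; the resulting 'sign' value is the same.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed LAM_U57 untag mask: every bit set except tag bits 62:57. */
#define DEMO_UNTAG_MASK_U57 (~(UINT64_C(0x3f) << 57))

/*
 * Hypothetical userspace stand-in for untagged_addr().
 * 'sign' is all ones when bit 63 is set (kernel address) and zero
 * otherwise, so kernel addresses pass through unchanged while user
 * addresses are ANDed with the mask, without any branch.
 */
static uint64_t demo_untag(uint64_t untag_mask, uint64_t addr)
{
	uint64_t sign = UINT64_C(0) - (addr >> 63);

	return addr & (untag_mask | sign);
}

/* Self-check with made-up addresses. */
static void demo_check(void)
{
	uint64_t user = UINT64_C(0x00007f0012345000);
	uint64_t tagged = user | (UINT64_C(0x2a) << 57);
	uint64_t kern = UINT64_C(0xffff888012345000);

	/* Tag bits are stripped from a userspace address. */
	assert(demo_untag(DEMO_UNTAG_MASK_U57, tagged) == user);
	/* A kernel address is left intact. */
	assert(demo_untag(DEMO_UNTAG_MASK_U57, kern) == kern);
	/* Mask of all ones (LAM not enabled): nothing is stripped. */
	assert(demo_untag(UINT64_MAX, tagged) == tagged);
}
```

The last assertion corresponds to the state after mm_reset_untag_mask(), which sets the mask to -1UL so that untagging is a no-op until LAM is enabled.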
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index c21b7347a26d..d1e83ba21130 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -47,6 +47,7 @@ #include #include #include +#include #include "process.h" @@ -367,6 +368,8 @@ void arch_setup_new_exec(void) task_clear_spec_ssb_noexec(current); speculation_ctrl_update(read_thread_flags()); } + + mm_reset_untag_mask(current->mm); } #ifdef CONFIG_X86_IOPL_IOPERM
From patchwork Tue Oct 18 11:33:49 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Kirill A. Shutemov"
X-Patchwork-Id: 4114
From: "Kirill A. Shutemov"
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra
Cc: x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov, Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu", Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov", Marc Zyngier
Subject: [PATCHv10 06/15] KVM: Serialize tagged address check against tagging enabling
Date: Tue, 18 Oct 2022 14:33:49 +0300
Message-Id: <20221018113358.7833-7-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.38.0
In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>
References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>

KVM forbids usage of tagged userspace addresses for memslots. It is done by checking if the address stays the same after untagging.

It works fine for ARM TBI, but the check gets racy for LAM. TBI enabling happens per-thread, so nobody can enable tagging for the thread while the memslot gets added.

LAM gets enabled per-process. If it gets enabled after the untagged_addr() check, but before the access_ok() check, the kernel can wrongly allow a tagged userspace_addr.

Use mmap lock to protect against parallel LAM enabling.

Signed-off-by: Kirill A.
Shutemov Reported-by: Rick Edgecombe Cc: Marc Zyngier --- virt/kvm/kvm_main.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 8c86b06b35da..833742c21c91 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1943,12 +1943,22 @@ int __kvm_set_memory_region(struct kvm *kvm, return -EINVAL; if (mem->guest_phys_addr & (PAGE_SIZE - 1)) return -EINVAL; + + /* Serialize against tagging enabling */ + if (mmap_read_lock_killable(kvm->mm)) + return -EINTR; + /* We can read the guest memory with __xxx_user() later on. */ if ((mem->userspace_addr & (PAGE_SIZE - 1)) || (mem->userspace_addr != untagged_addr(kvm->mm, mem->userspace_addr)) || !access_ok((void __user *)(unsigned long)mem->userspace_addr, - mem->memory_size)) + mem->memory_size)) { + mmap_read_unlock(kvm->mm); return -EINVAL; + } + + mmap_read_unlock(kvm->mm); + if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_MEM_SLOTS_NUM) return -EINVAL; if (mem->guest_phys_addr + mem->memory_size < mem->guest_phys_addr) From patchwork Tue Oct 18 11:33:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov"
X-Patchwork-Id: 4115
From: "Kirill A. Shutemov"
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra
Cc: x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov, Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu", Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv10 07/15] x86/mm: Provide arch_prctl() interface for LAM
Date: Tue, 18 Oct 2022 14:33:50 +0300
Message-Id: <20221018113358.7833-8-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.38.0
In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>
References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>

Add arch_prctl() handles:

 - ARCH_ENABLE_TAGGED_ADDR enables LAM. The argument is the required number of tag bits. It is rounded up to the nearest LAM mode that can provide it.
For now only LAM_U57 is supported, with 6 tag bits.
 - ARCH_GET_UNTAG_MASK returns the untag mask. It indicates where the tag bits are located in the address.
 - ARCH_GET_MAX_TAG_BITS returns the maximum number of tag bits the user can request. Zero if LAM is not supported.

Signed-off-by: Kirill A. Shutemov Tested-by: Alexander Potapenko Reviewed-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/x86/include/uapi/asm/prctl.h | 4 ++ arch/x86/kernel/process_64.c | 65 ++++++++++++++++++++++++++++++- 2 files changed, 68 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h index 500b96e71f18..a31e27b95b19 100644 --- a/arch/x86/include/uapi/asm/prctl.h +++ b/arch/x86/include/uapi/asm/prctl.h @@ -20,4 +20,8 @@ #define ARCH_MAP_VDSO_32 0x2002 #define ARCH_MAP_VDSO_64 0x2003 +#define ARCH_GET_UNTAG_MASK 0x4001 +#define ARCH_ENABLE_TAGGED_ADDR 0x4002 +#define ARCH_GET_MAX_TAG_BITS 0x4003 + #endif /* _ASM_X86_PRCTL_H */ diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 6b3418bff326..a98536101447 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -743,6 +743,60 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr) } #endif +static void enable_lam_func(void *mm) +{ + struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm); + unsigned long lam_mask; + unsigned long cr3; + + if (loaded_mm != mm) + return; + + lam_mask = READ_ONCE(loaded_mm->context.lam_cr3_mask); + + /* Update CR3 to get LAM active on the CPU */ + cr3 = __read_cr3(); + cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57); + cr3 |= lam_mask; + write_cr3(cr3); + set_tlbstate_cr3_lam_mask(lam_mask); +} + +#define LAM_U57_BITS 6 + +static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits) +{ + int ret = 0; + + if (!cpu_feature_enabled(X86_FEATURE_LAM)) + return -ENODEV; + + if (mmap_write_lock_killable(mm)) + return -EINTR; + + /* Already enabled?
*/ + if (mm->context.lam_cr3_mask) { + ret = -EBUSY; + goto out; + } + + if (!nr_bits) { + ret = -EINVAL; + goto out; + } else if (nr_bits <= LAM_U57_BITS) { + mm->context.lam_cr3_mask = X86_CR3_LAM_U57; + mm->context.untag_mask = ~GENMASK(62, 57); + } else { + ret = -EINVAL; + goto out; + } + + on_each_cpu_mask(mm_cpumask(mm), enable_lam_func, mm, true); +out: + mmap_write_unlock(mm); + return ret; +} + long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2) { int ret = 0; @@ -830,7 +884,16 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2) case ARCH_MAP_VDSO_64: return prctl_map_vdso(&vdso_image_64, arg2); #endif - + case ARCH_GET_UNTAG_MASK: + return put_user(task->mm->context.untag_mask, + (unsigned long __user *)arg2); + case ARCH_ENABLE_TAGGED_ADDR: + return prctl_enable_tagged_addr(task->mm, arg2); + case ARCH_GET_MAX_TAG_BITS: + if (!cpu_feature_enabled(X86_FEATURE_LAM)) + return put_user(0, (unsigned long __user *)arg2); + else + return put_user(LAM_U57_BITS, (unsigned long __user *)arg2); default: ret = -EINVAL; break; From patchwork Tue Oct 18 11:33:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov"
X-Patchwork-Id: 4118
From: "Kirill A. Shutemov"
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra
Cc: x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov, Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu", Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv10 08/15] x86/mm: Reduce untagged_addr() overhead until the first LAM user
Date: Tue, 18 Oct 2022 14:33:51 +0300
Message-Id: <20221018113358.7833-9-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.38.0
In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>
References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>

Use static key to reduce untagged_addr() overhead. The key only gets enabled when the first process enables LAM.

Signed-off-by: Kirill A.
Shutemov --- arch/x86/include/asm/uaccess.h | 8 ++++++-- arch/x86/kernel/process_64.c | 4 ++++ 2 files changed, 10 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index c6062c07ccd2..820234f1f750 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -23,6 +23,8 @@ static inline bool pagefault_disabled(void); #endif #ifdef CONFIG_X86_64 +DECLARE_STATIC_KEY_FALSE(tagged_addr_key); + /* * Mask out tag bits from the address. * @@ -31,8 +33,10 @@ static inline bool pagefault_disabled(void); */ #define untagged_addr(mm, addr) ({ \ u64 __addr = (__force u64)(addr); \ - s64 sign = (s64)__addr >> 63; \ - __addr &= (mm)->context.untag_mask | sign; \ + if (static_branch_likely(&tagged_addr_key)) { \ + s64 sign = (s64)__addr >> 63; \ + __addr &= (mm)->context.untag_mask | sign; \ + } \ (__force __typeof__(addr))__addr; \ }) diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index a98536101447..9952e9f517ec 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -743,6 +743,9 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr) } #endif +DEFINE_STATIC_KEY_FALSE(tagged_addr_key); +EXPORT_SYMBOL_GPL(tagged_addr_key); + static void enable_lam_func(void *mm) { struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm); @@ -792,6 +795,7 @@ static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits) } on_each_cpu_mask(mm_cpumask(mm), enable_lam_func, mm, true); + static_branch_enable(&tagged_addr_key); out: mmap_write_unlock(mm); return ret; From patchwork Tue Oct 18 11:33:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov"
X-Patchwork-Id: 4125
From: "Kirill A. Shutemov"
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra
Cc: x86@kernel.org, Kostya Serebryany, Andrey Ryabinin, Andrey Konovalov, Alexander Potapenko, Taras Madan, Dmitry Vyukov, "H . J . Lu", Andi Kleen, Rick Edgecombe, Bharata B Rao, Jacob Pan, Ashok Raj, linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov"
Subject: [PATCHv10 09/15] x86: Expose untagging mask in /proc/$PID/arch_status
Date: Tue, 18 Oct 2022 14:33:52 +0300
Message-Id: <20221018113358.7833-10-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.38.0
In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>
References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>

Add a line in /proc/$PID/arch_status to report untag_mask. It can be used to find out LAM status of the process from the outside. It is useful for debuggers.

Signed-off-by: Kirill A.
Shutemov Tested-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/mmu_context.h | 10 +++++ arch/x86/kernel/Makefile | 2 + arch/x86/kernel/fpu/xstate.c | 47 ----------------------- arch/x86/kernel/proc.c | 60 ++++++++++++++++++++++++++++++ 4 files changed, 72 insertions(+), 47 deletions(-) create mode 100644 arch/x86/kernel/proc.c diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 5bd3d46685dc..b0e9ea23758b 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -103,6 +103,11 @@ static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) mm->context.untag_mask = oldmm->context.untag_mask; } +static inline unsigned long mm_untag_mask(struct mm_struct *mm) +{ + return mm->context.untag_mask; +} + static inline void mm_reset_untag_mask(struct mm_struct *mm) { mm->context.untag_mask = -1UL; @@ -119,6 +124,11 @@ static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) { } +static inline unsigned long mm_untag_mask(struct mm_struct *mm) +{ + return -1UL; +} + static inline void mm_reset_untag_mask(struct mm_struct *mm) { } diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index f901658d9f7c..d99fd065aba8 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -143,6 +143,8 @@ obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o obj-$(CONFIG_CFI_CLANG) += cfi.o +obj-$(CONFIG_PROC_FS) += proc.o + ### # 64 bit specific files ifeq ($(CONFIG_X86_64),y) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index c8340156bfd2..838a6f0627fd 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -10,8 +10,6 @@ #include #include #include -#include -#include #include #include @@ -1745,48 +1743,3 @@ long fpu_xstate_prctl(int option, unsigned long arg2) return -EINVAL; } } - -#ifdef CONFIG_PROC_PID_ARCH_STATUS -/* - * Report the amount of time elapsed in millisecond since last AVX512 - * 
use in the task. - */ -static void avx512_status(struct seq_file *m, struct task_struct *task) -{ - unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp); - long delta; - - if (!timestamp) { - /* - * Report -1 if no AVX512 usage - */ - delta = -1; - } else { - delta = (long)(jiffies - timestamp); - /* - * Cap to LONG_MAX if time difference > LONG_MAX - */ - if (delta < 0) - delta = LONG_MAX; - delta = jiffies_to_msecs(delta); - } - - seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta); - seq_putc(m, '\n'); -} - -/* - * Report architecture specific information - */ -int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns, - struct pid *pid, struct task_struct *task) -{ - /* - * Report AVX512 state if the processor and build option supported. - */ - if (cpu_feature_enabled(X86_FEATURE_AVX512F)) - avx512_status(m, task); - - return 0; -} -#endif /* CONFIG_PROC_PID_ARCH_STATUS */ diff --git a/arch/x86/kernel/proc.c b/arch/x86/kernel/proc.c new file mode 100644 index 000000000000..9765b4d05ce4 --- /dev/null +++ b/arch/x86/kernel/proc.c @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include +#include + +/* + * Report the amount of time elapsed in millisecond since last AVX512 + * use in the task. 
+ */ +static void avx512_status(struct seq_file *m, struct task_struct *task) +{ + unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp); + long delta; + + if (!timestamp) { + /* + * Report -1 if no AVX512 usage + */ + delta = -1; + } else { + delta = (long)(jiffies - timestamp); + /* + * Cap to LONG_MAX if time difference > LONG_MAX + */ + if (delta < 0) + delta = LONG_MAX; + delta = jiffies_to_msecs(delta); + } + + seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta); + seq_putc(m, '\n'); +} + +/* + * Report architecture specific information + */ +int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns, + struct pid *pid, struct task_struct *task) +{ + struct mm_struct *mm; + unsigned long untag_mask = -1UL; + + /* + * Report AVX512 state if the processor and build option supported. + */ + if (cpu_feature_enabled(X86_FEATURE_AVX512F)) + avx512_status(m, task); + + mm = get_task_mm(task); + if (mm) { + untag_mask = mm_untag_mask(task->mm); + mmput(mm); + } + + seq_printf(m, "untag_mask:\t%#lx\n", untag_mask); + + return 0; +} From patchwork Tue Oct 18 11:33:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov" X-Patchwork-Id: 4117 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp1910341wrs; Tue, 18 Oct 2022 04:39:48 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6FlAOTzqDRidJ6NPV+A3FwzXmopw4fb0xFa2zCUpgkVgKktCtBJlhcHfN+cwsHxt31Gwd3 X-Received: by 2002:a17:907:6ea1:b0:78d:4c16:a68b with SMTP id sh33-20020a1709076ea100b0078d4c16a68bmr2058823ejc.447.1666093187855; Tue, 18 Oct 2022 04:39:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666093187; cv=none; d=google.com; s=arc-20160816; b=psBNy0aIL1RS96UHXMt6DmzMfO0yxjKppeFI3a0BIGE6DThrarXz09YiYpaxbT8pg7 8yak/rvUP+rhzeL562igSLIlyZQLfYyhufBKg5cqsMCMGTLK6R88/TxAvb2V9p6Hi3Ol vCB8mkkV0g8PesgL4qaLDyuwN9UnXIpEqNToPan8fsL2UEircA01gpAuAxgKjvl91Pxn UJYJ2xzIlUBbkotI6AzDUp214+0PWOsfs4E33ox8/QjtVJ+TMUgWP3LStaVVeA8pUJ6h dCxygjRxDYx7n0JfdIxzptfu/bT9yo7ALdYIVQPLdDFjZSy2VyJSX++3O/1lCmBCrRye nxXg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=BGtjROjtrNGL89GNMOsNzSU9TtCi6Bm0JSMzAJygmWg=; b=eQLi1MCFMSYGzmRcTTqCbzr/bLQcpFlqaTmi07xv5wY/j461Fp9mqqw9eEzisksLs6 7nlGDvmzM5E4D7ZEFhonlRdsN+mHaz1gkiPLP1aev3oyJViHpJeCq+AuRBDqS7cBMyp6 vqch4oIMyXolvhkNs2rgn3hY9XTLanY02Lkf5LNObYR407kP4BG8U9mXP+Uj+8v+L50M NheFY+GLshOhGZSTpZK8e4llBU/kKhG7ClI0dfVzTb0JUHVbyWu7/J+8BfKaTAsr4NsY alsu2TJcpA8A1zpKpDICKrdotlBoCqw102yDmpJhtslgyA8kPIAAi1JcENhlsQOEAFjm 1dIw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ie3iViYG; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id cn4-20020a0564020ca400b0045c4b1f4315si10467523edb.485.2022.10.18.04.39.22; Tue, 18 Oct 2022 04:39:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=ie3iViYG; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230261AbiJRLgf (ORCPT + 99 others); Tue, 18 Oct 2022 07:36:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60852 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229698AbiJRLfz (ORCPT ); Tue, 18 Oct 2022 07:35:55 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9373F4D4F3 for ; Tue, 18 Oct 2022 04:35:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666092926; x=1697628926; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=40WXelYo9LYaLeJacp6rhf6Ge84QFQDH6MSN4czUvf8=; b=ie3iViYGe3Wrbuh5C520sMNyl5clkBGcbVbZSpNDtsdm5edWGHQWC0Xx Jgpf9Bzd4kNYqCVNsRBdKpVPmIwcTX5Xe08CWVp4CqIfMXBOe4Lp8v3xj Pg8YPBM0oLu2FxrlmYDfU4a8eDjvsnDV+Dk5RQaNL8Qfbomt6nV22Vx9H /G5jckVN3y695fbQixYJLP6eUtkjfjSf/5HR/srtaP2mbU3Glcdv5s6YO 1TTzqSwG2f78Hrg+BO8Zdjo24NneHpeAcp3NpNqYidcAYygVpRqn29UHZ g/dTCnXAqRthafQoW/pFfKIO+JIOhunlA2eO5Kkol6hRH+E9GSonZ0jk2 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="392382135" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="392382135" Received: from orsmga001.jf.intel.com 
([10.7.209.18]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:18 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="661861186" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="661861186" Received: from vhavel-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.51.115]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:13 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 5A2A5104BA8; Tue, 18 Oct 2022 14:34:04 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , Bharata B Rao , Jacob Pan , Ashok Raj , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv10 10/15] x86/mm, iommu/sva: Make LAM and SVM mutually exclusive Date: Tue, 18 Oct 2022 14:33:53 +0300 Message-Id: <20221018113358.7833-11-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747025330346102444?= X-GMAIL-MSGID: =?utf-8?q?1747025330346102444?= IOMMU and SVM-capable devices know nothing about LAM and only expect canonical addresses. An attempt to pass down a tagged pointer will lead to an address translation failure.
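
For illustration only (not part of this patch): a minimal userspace sketch, assuming the LAM_U57 tag layout (bits 62:57), of what stripping a tag before handing an address to a device could look like. The helper names lam_u57_tag() and lam_untag() are hypothetical; in a real program the mask would be the value reported by ARCH_GET_UNTAG_MASK rather than a hard-coded constant.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: LAM_U57, a 6-bit tag held in pointer bits 62:57. */
#define LAM_U57_BITS 6
#define LAM_U57_MASK (0x3fULL << 57)

/* Hypothetical helper: store a 6-bit tag in bits 62:57 of addr. */
static inline uint64_t lam_u57_tag(uint64_t addr, unsigned int tag)
{
	assert(tag < (1u << LAM_U57_BITS));
	return (addr & ~LAM_U57_MASK) | ((uint64_t)tag << 57);
}

/*
 * Hypothetical helper: clear the tag bits using an untag mask.
 * The mask is passed in by the caller; it is assumed to be the
 * value obtained from ARCH_GET_UNTAG_MASK for this process.
 */
static inline uint64_t lam_untag(uint64_t addr, uint64_t untag_mask)
{
	return addr & untag_mask;
}
```

When LAM is not enabled the untag mask is all ones, so lam_untag() leaves the address unchanged.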
By default, do not allow a process to both enable LAM and use SVM.

The new ARCH_FORCE_TAGGED_SVM arch_prctl() overrides this limitation. By calling it, userspace takes responsibility for never passing a tagged address to the device.

Signed-off-by: Kirill A. Shutemov Reviewed-by: Ashok Raj --- arch/x86/include/asm/mmu.h | 6 ++++-- arch/x86/include/asm/mmu_context.h | 2 ++ arch/x86/include/uapi/asm/prctl.h | 1 + arch/x86/kernel/process_64.c | 13 +++++++++++++ drivers/iommu/iommu-sva-lib.c | 12 ++++++++++++ include/linux/mmu_context.h | 4 ++++ 6 files changed, 36 insertions(+), 2 deletions(-) diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 2fdb390040b5..cce9b32b0d6d 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -9,9 +9,11 @@ #include /* Uprobes on this MM assume 32-bit code */ -#define MM_CONTEXT_UPROBE_IA32 BIT(0) +#define MM_CONTEXT_UPROBE_IA32 BIT(0) /* vsyscall page is accessible on this MM */ -#define MM_CONTEXT_HAS_VSYSCALL BIT(1) +#define MM_CONTEXT_HAS_VSYSCALL BIT(1) +/* Allow LAM and SVM coexisting */ +#define MM_CONTEXT_FORCE_TAGGED_SVM BIT(2) /* * x86 has arch-specific MMU state beyond what lives in mm_struct.
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index b0e9ea23758b..6b9ac2c60cec 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -113,6 +113,8 @@ static inline void mm_reset_untag_mask(struct mm_struct *mm) mm->context.untag_mask = -1UL; } +#define arch_pgtable_dma_compat(mm) \ + (!mm_lam_cr3_mask(mm) || (mm->context.flags & MM_CONTEXT_FORCE_TAGGED_SVM)) #else static inline unsigned long mm_lam_cr3_mask(struct mm_struct *mm) diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h index a31e27b95b19..7bd22defb558 100644 --- a/arch/x86/include/uapi/asm/prctl.h +++ b/arch/x86/include/uapi/asm/prctl.h @@ -23,5 +23,6 @@ #define ARCH_GET_UNTAG_MASK 0x4001 #define ARCH_ENABLE_TAGGED_ADDR 0x4002 #define ARCH_GET_MAX_TAG_BITS 0x4003 +#define ARCH_FORCE_TAGGED_SVM 0x4004 #endif /* _ASM_X86_PRCTL_H */ diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 9952e9f517ec..8faa8774bb93 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -783,6 +783,13 @@ static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits) goto out; } +#ifdef CONFIG_IOMMU_SVA + if (pasid_valid(mm->pasid) && + !(mm->context.flags & MM_CONTEXT_FORCE_TAGGED_SVM)) { + ret = -EBUSY; + goto out; + } +#endif if (!nr_bits) { ret = -EINVAL; goto out; @@ -893,6 +900,12 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2) (unsigned long __user *)arg2); case ARCH_ENABLE_TAGGED_ADDR: return prctl_enable_tagged_addr(task->mm, arg2); + case ARCH_FORCE_TAGGED_SVM: + if (mmap_write_lock_killable(task->mm)) + return -EINTR; + task->mm->context.flags |= MM_CONTEXT_FORCE_TAGGED_SVM; + mmap_write_unlock(task->mm); + return 0; case ARCH_GET_MAX_TAG_BITS: if (!cpu_feature_enabled(X86_FEATURE_LAM)) return put_user(0, (unsigned long __user *)arg2); diff --git a/drivers/iommu/iommu-sva-lib.c b/drivers/iommu/iommu-sva-lib.c 
index 106506143896..593ae2472e2c 100644 --- a/drivers/iommu/iommu-sva-lib.c +++ b/drivers/iommu/iommu-sva-lib.c @@ -2,6 +2,8 @@ /* * Helpers for IOMMU drivers implementing SVA */ +#include +#include #include #include @@ -31,6 +33,15 @@ int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max) min == 0 || max < min) return -EINVAL; + /* Serialize against address tagging enabling */ + if (mmap_write_lock_killable(mm)) + return -EINTR; + + if (!arch_pgtable_dma_compat(mm)) { + mmap_write_unlock(mm); + return -EBUSY; + } + mutex_lock(&iommu_sva_lock); /* Is a PASID already associated with this mm? */ if (pasid_valid(mm->pasid)) { @@ -46,6 +57,7 @@ int iommu_sva_alloc_pasid(struct mm_struct *mm, ioasid_t min, ioasid_t max) mm_pasid_set(mm, pasid); out: mutex_unlock(&iommu_sva_lock); + mmap_write_unlock(mm); return ret; } EXPORT_SYMBOL_GPL(iommu_sva_alloc_pasid); diff --git a/include/linux/mmu_context.h b/include/linux/mmu_context.h index b9b970f7ab45..115e2b518079 100644 --- a/include/linux/mmu_context.h +++ b/include/linux/mmu_context.h @@ -28,4 +28,8 @@ static inline void leave_mm(int cpu) { } # define task_cpu_possible(cpu, p) cpumask_test_cpu((cpu), task_cpu_possible_mask(p)) #endif +#ifndef arch_pgtable_dma_compat +#define arch_pgtable_dma_compat(mm) true +#endif + #endif From patchwork Tue Oct 18 11:33:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov" X-Patchwork-Id: 4121 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp1911757wrs; Tue, 18 Oct 2022 04:43:37 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4btfsxtNqBMViHP8w4Jc8dsnbytT7hY1k3h1QTlIbp6ZZLxHBBx043fuRgVgLU8DKuFn7e X-Received: by 2002:a17:902:d4c6:b0:180:bdd5:1275 with SMTP id o6-20020a170902d4c600b00180bdd51275mr2795733plg.121.1666093417006; Tue, 18 Oct 2022 04:43:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666093417; cv=none; d=google.com; s=arc-20160816; b=ySCoSXEHa1hO/KUFuGL4Y/Ftt15YHVL1en2ouAncP1eJ+vUxwDLcW0/fHaQlEr0S9B PsVghpVtJEyi7wbKfpmK/WWzs8IzQ6Dg1HCsgqppibn3TCb/kGeZL1WowqYlm24jYpnM PtgIma8+V3Bjt+aVcM7/lsVMatnXk3GjXDNgIFx87d9lXlYvQ3HYML39oAmFrsjgc4Kh Z9/nJoXyG6PxaSwu3IXI/HAiRQ/INubO/iKYmu1n5hqD+ARLGdj7qF0e1qQuj6dBTHiS VTLOuOx/A+nuv77Rul5r21CPNBYFomRlxPQ6QOEkZKwQUwqqP8AGecNxTMKm1Mq41md1 0EmA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=XTsXpEpsDctwvn64LXsgjN942xnSN/j6wbHHE7wF4nQ=; b=embfflC5/ZGLIUYMMxGKF5GRBEIaijCXsDx4Jux/C2KZioxTGvzeI+pxRgDNgmckrc 8G32andMtpb2U8lB8LOTYHFwd9YGJsURV44+2tjr6DmmdfpGvM+06TrN8I6L25BeKw+D eemBGu5Mwf1sG1rv2uUre5RTcgjh8aP/3FxQrYitzqR9Rq5+0U+vZwroe2XwMpli1qPU hsi6lmiObh+f3QCdCbldzwmOC8uvXvNO2WSgae7NXnEPlWOfi6F92dYZcQYeUsQbeZWj 337uVPHt51jl6w7eBqJ/JZf2hJENYJEinwrrkKc4MeGzNELXlaKfoEuDQ5VjYUoBYpjm Y1gg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=mU454881; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id c13-20020a170902b68d00b0018040bdb798si14124220pls.242.2022.10.18.04.43.09; Tue, 18 Oct 2022 04:43:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=mU454881; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230292AbiJRLg5 (ORCPT + 99 others); Tue, 18 Oct 2022 07:36:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32878 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230281AbiJRLgs (ORCPT ); Tue, 18 Oct 2022 07:36:48 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C868DB9784 for ; Tue, 18 Oct 2022 04:35:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666092948; x=1697628948; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=A07f05+t0KWtt4ghU/0U2MmLVS25K/SAxXIfuDOarp4=; b=mU454881bQqQrDRujxjdiuwOyaTmqIiCp82Bwbfv0To+clvjbUgVoTuv sYVYQhtl9FL1MkKzByDJ++og7EwT7SXvcJJBFXmT652GgtE75RyjKukSn qFSkiY/hqCL7dnC6evyp/QdJoIaEcxwjYnEwP/tQIBhu45Hs6qoPx2RWe MK3O5GpGXPP0kZTiVZalAaksS9U+2Uo1oU5z6qy3KjF++h59r4ad7FCac HpmyWKBV/SOY8ONLsZvHrdozkcpslSEO25W07d4QJtOu4/066Uu3rwdoL uPvzf6SlrSOSANXDczs9F6L5mo2HDW1aig6GyxX5H2i07g3zfsjfdv/3P A==; X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="368105827" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="368105827" Received: from fmsmga008.fm.intel.com 
([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:17 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="691763180" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="691763180" Received: from vhavel-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.51.115]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:13 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 654E3104BA9; Tue, 18 Oct 2022 14:34:04 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , Bharata B Rao , Jacob Pan , Ashok Raj , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Weihong Zhang , "Kirill A . Shutemov" Subject: [PATCHv10 11/15] selftests/x86/lam: Add malloc and tag-bits test cases for linear-address masking Date: Tue, 18 Oct 2022 14:33:54 +0300 Message-Id: <20221018113358.7833-12-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747025571009926381?= X-GMAIL-MSGID: =?utf-8?q?1747025571009926381?= From: Weihong Zhang LAM is supported only in 64-bit mode and applies only to addresses used for data accesses. In 64-bit mode, linear addresses have 64 bits.
LAM is applied to 64-bit linear addresses and allows software to use the high bits for metadata. LAM supports configurations that differ in which pointer bits are masked and can be used for metadata.

LAM includes the following mode:

 - LAM_U57: pointer bits in positions 62:57 are masked (LAM width 6), which allows bits 62:57 of a user pointer to be used as metadata.

There are some arch_prctls:

ARCH_ENABLE_TAGGED_ADDR: enable LAM mode, mask high bits of a user pointer.
ARCH_GET_UNTAG_MASK: get the current untag mask.
ARCH_GET_MAX_TAG_BITS: the maximum number of tag bits the user can request; zero if LAM is not supported.

LAM mode is per-process, and a process has only one chance to set it. There is no API to disable LAM mode, so all test cases run in a child process.

Functions of this test:

MALLOC

 - LAM_U57 masks bits 62:57 of a user pointer. A userspace process can dereference such pointers.
 - With LAM disabled, dereferencing a pointer with metadata above bit 48 or bit 57 triggers SIGSEGV.

TAG_BITS

 - The maximum number of tag bits for LAM_U57 is 6.

Signed-off-by: Weihong Zhang
Signed-off-by: Kirill A.
Shutemov --- tools/testing/selftests/x86/Makefile | 2 +- tools/testing/selftests/x86/lam.c | 326 +++++++++++++++++++++++++++ 2 files changed, 327 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/x86/lam.c diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile index 0388c4d60af0..c1a16a9d4f2f 100644 --- a/tools/testing/selftests/x86/Makefile +++ b/tools/testing/selftests/x86/Makefile @@ -18,7 +18,7 @@ TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \ test_FCMOV test_FCOMI test_FISTTP \ vdso_restorer TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering \ - corrupt_xstate_header amx + corrupt_xstate_header amx lam # Some selftests require 32bit support enabled also on 64bit systems TARGETS_C_32BIT_NEEDED := ldt_gdt ptrace_syscall diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c new file mode 100644 index 000000000000..900a3a0fb709 --- /dev/null +++ b/tools/testing/selftests/x86/lam.c @@ -0,0 +1,326 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "../kselftest.h" + +#ifndef __x86_64__ +# error This test is 64-bit only +#endif + +/* LAM modes, these definitions were copied from kernel code */ +#define LAM_NONE 0 +#define LAM_U57_BITS 6 + +#define LAM_U57_MASK (0x3fULL << 57) +/* arch prctl for LAM */ +#define ARCH_GET_UNTAG_MASK 0x4001 +#define ARCH_ENABLE_TAGGED_ADDR 0x4002 +#define ARCH_GET_MAX_TAG_BITS 0x4003 + +/* Specified test function bits */ +#define FUNC_MALLOC 0x1 +#define FUNC_BITS 0x2 + +#define TEST_MASK 0x3 + +#define MALLOC_LEN 32 + +struct testcases { + unsigned int later; + int expected; /* 2: SIGSEGV Error; 1: other errors */ + unsigned long lam; + uint64_t addr; + int (*test_func)(struct testcases *test); + const char *msg; +}; + +int tests_cnt; +jmp_buf segv_env; + +static void segv_handler(int sig) +{ + ksft_print_msg("Get 
segmentation fault(%d).", sig); + siglongjmp(segv_env, 1); +} + +static inline int cpu_has_lam(void) +{ + unsigned int cpuinfo[4]; + + __cpuid_count(0x7, 1, cpuinfo[0], cpuinfo[1], cpuinfo[2], cpuinfo[3]); + + return (cpuinfo[0] & (1 << 26)); +} + +/* + * Set tagged address and read back untag mask. + * check if the untagged mask is expected. + * + * @return: + * 0: Set LAM mode successfully + * others: failed to set LAM + */ +static int set_lam(unsigned long lam) +{ + int ret = 0; + uint64_t ptr = 0; + + if (lam != LAM_U57_BITS && lam != LAM_NONE) + return -1; + + /* Skip check return */ + syscall(SYS_arch_prctl, ARCH_ENABLE_TAGGED_ADDR, lam); + + /* Get untagged mask */ + syscall(SYS_arch_prctl, ARCH_GET_UNTAG_MASK, &ptr); + + /* Check mask returned is expected */ + if (lam == LAM_U57_BITS) + ret = (ptr != ~(LAM_U57_MASK)); + else if (lam == LAM_NONE) + ret = (ptr != -1ULL); + + return ret; +} + +static unsigned long get_default_tag_bits(void) +{ + pid_t pid; + int lam = LAM_NONE; + int ret = 0; + + pid = fork(); + if (pid < 0) { + perror("Fork failed."); + } else if (pid == 0) { + /* Set LAM mode in child process */ + if (set_lam(LAM_U57_BITS) == 0) + lam = LAM_U57_BITS; + else + lam = LAM_NONE; + exit(lam); + } else { + wait(&ret); + lam = WEXITSTATUS(ret); + } + + return lam; +} + +/* According to LAM mode, set metadata in high bits */ +static uint64_t set_metadata(uint64_t src, unsigned long lam) +{ + uint64_t metadata; + + srand(time(NULL)); + /* Get a random value as metadata */ + metadata = rand(); + + switch (lam) { + case LAM_U57_BITS: /* Set metadata in bits 62:57 */ + metadata = (src & ~(LAM_U57_MASK)) | ((metadata & 0x3f) << 57); + break; + default: + metadata = src; + break; + } + + return metadata; +} + +/* + * Set metadata in user pointer, compare new pointer with original pointer. + * both pointers should point to the same address. + * + * @return: + * 0: value on the pointer with metadate and value on original are same + * 1: not same. 
+ */ +static int handle_lam_test(void *src, unsigned int lam) +{ + char *ptr; + + strcpy((char *)src, "USER POINTER"); + + ptr = (char *)set_metadata((uint64_t)src, lam); + if (src == ptr) + return 0; + + /* Copy a string into the pointer with metadata */ + strcpy((char *)ptr, "METADATA POINTER"); + + return (!!strcmp((char *)src, (char *)ptr)); +} + + +int handle_max_bits(struct testcases *test) +{ + unsigned long exp_bits = get_default_tag_bits(); + unsigned long bits = 0; + + if (exp_bits != LAM_NONE) + exp_bits = LAM_U57_BITS; + + /* Get LAM max tag bits */ + if (syscall(SYS_arch_prctl, ARCH_GET_MAX_TAG_BITS, &bits) == -1) + return 1; + + return (exp_bits != bits); +} + +/* + * Test lam feature through dereference pointer get from malloc. + * @return 0: Pass test. 1: Get failure during test 2: Get SIGSEGV + */ +static int handle_malloc(struct testcases *test) +{ + char *ptr = NULL; + int ret = 0; + + if (test->later == 0 && test->lam != 0) + if (set_lam(test->lam) == -1) + return 1; + + ptr = (char *)malloc(MALLOC_LEN); + if (ptr == NULL) { + perror("malloc() failure\n"); + return 1; + } + + /* Set signal handler */ + if (sigsetjmp(segv_env, 1) == 0) { + signal(SIGSEGV, segv_handler); + ret = handle_lam_test(ptr, test->lam); + } else { + ret = 2; + } + + if (test->later != 0 && test->lam != 0) + if (set_lam(test->lam) == -1 && ret == 0) + ret = 1; + + free(ptr); + + return ret; +} + +static int fork_test(struct testcases *test) +{ + int ret, child_ret; + pid_t pid; + + pid = fork(); + if (pid < 0) { + perror("Fork failed."); + ret = 1; + } else if (pid == 0) { + ret = test->test_func(test); + exit(ret); + } else { + wait(&child_ret); + ret = WEXITSTATUS(child_ret); + } + + return ret; +} + +static void run_test(struct testcases *test, int count) +{ + int i, ret = 0; + + for (i = 0; i < count; i++) { + struct testcases *t = test + i; + + /* fork a process to run test case */ + ret = fork_test(t); + if (ret != 0) + ret = (t->expected == ret); + else + ret = 
!(t->expected); + + tests_cnt++; + ksft_test_result(ret, t->msg); + } +} + +static struct testcases malloc_cases[] = { + { + .later = 0, + .lam = LAM_U57_BITS, + .test_func = handle_malloc, + .msg = "MALLOC: LAM_U57. Dereferencing pointer with metadata\n", + }, + { + .later = 1, + .expected = 2, + .lam = LAM_U57_BITS, + .test_func = handle_malloc, + .msg = "MALLOC:[Negative] Disable LAM. Dereferencing pointer with metadata.\n", + }, +}; + + +static struct testcases bits_cases[] = { + { + .test_func = handle_max_bits, + .msg = "BITS: Check default tag bits\n", + }, +}; + +static void cmd_help(void) +{ + printf("usage: lam [-h] [-t test list]\n"); + printf("\t-t test list: run tests specified in the test list, default:0x%x\n", TEST_MASK); + printf("\t\t0x1:malloc; 0x2:max_bits;\n"); + printf("\t-h: help\n"); +} + +int main(int argc, char **argv) +{ + int c = 0; + unsigned int tests = TEST_MASK; + + tests_cnt = 0; + + if (!cpu_has_lam()) { + ksft_print_msg("Unsupported LAM feature!\n"); + return -1; + } + + while ((c = getopt(argc, argv, "ht:")) != -1) { + switch (c) { + case 't': + tests = strtoul(optarg, NULL, 16); + if (!(tests & TEST_MASK)) { + ksft_print_msg("Invalid argument!\n"); + return -1; + } + break; + case 'h': + cmd_help(); + return 0; + default: + ksft_print_msg("Invalid argument\n"); + return -1; + } + } + + if (tests & FUNC_MALLOC) + run_test(malloc_cases, ARRAY_SIZE(malloc_cases)); + + if (tests & FUNC_BITS) + run_test(bits_cases, ARRAY_SIZE(bits_cases)); + + ksft_set_plan(tests_cnt); + + return ksft_exit_pass(); +} From patchwork Tue Oct 18 11:33:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov" X-Patchwork-Id: 4123 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp1911761wrs; Tue, 18 Oct 2022 04:43:37 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4UJ1vnInk1pUB0heHH0LVmxDmfHzxUUqdnF139v/gX5yK4vyydl4CkaV8VbNkk8B5lLTSv X-Received: by 2002:a63:8949:0:b0:46b:2f56:a910 with SMTP id v70-20020a638949000000b0046b2f56a910mr2398818pgd.158.1666093417153; Tue, 18 Oct 2022 04:43:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666093417; cv=none; d=google.com; s=arc-20160816; b=DMCCaWLLy3WIDmj3b0/sLPbWbRKee+sMvHGkGodXbGPp3ZBoe16E5bJvc61RVcJpiD O5VuyCVrX1a8mTKLvW6MLRgmWm/1k724IgL02Ad35RXlKhhWffs25G7evBo1OizmDU7d ggJLpZCKpNP4qTvIWfg6KqoFy6qyfGAEta0nAKv2wdOndGtIwwbRC93L+ypd0MWGV9B4 y0uSB8rkGHrLauZLzckVcfQQtmVkA3FPQYYWb5NEwWAiCHn7OKgI94+K3FQ39nmroyyI 7m475lnOLN/UlA463/w3WFHAAsCPgfOGGour+oraVFIRB0OwAC8eofZ753hG0ono6ICQ Ahbw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=kvClGH7UWdzL+Tdet8/mTebKDV36RRSGh+dWyFD9pTo=; b=dt0CyA0PaLBJmG02T9w5Bl5K8JSj15SVWjQWjSSBI+U7kKEtH0yJut3HOGpShlyQUq QPuuafv8JovaE821/mYx3T/CX0ycAaZGr4a7ALlpP9lbcJNm3cLn3o34wjylaRl44Hui 4ie8fM5N6MWcclr8N5/fRV/WAM+gMcZTUD4xdhIyA0BKXaOvjQ2FIbJWFDc+WOs+D0C9 fsVL543eYyvHshAT8RJXGXBnSOR5dFMTXBwm0Z1R2U5NqyqXsV1X8ty0TMI+RFPtzCys Yw4wDU7WLKyj/m8GTO6EHQufFYvWxRsM3lCfw9M0vk0za5D6ikA0su6RttjIKiNsm86s uA+w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=TGKFz7eR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id c3-20020a056a00248300b00536bbfa4994si13712895pfv.345.2022.10.18.04.43.24; Tue, 18 Oct 2022 04:43:37 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=TGKFz7eR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229665AbiJRLhL (ORCPT + 99 others); Tue, 18 Oct 2022 07:37:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33688 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230302AbiJRLgu (ORCPT ); Tue, 18 Oct 2022 07:36:50 -0400 Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6EA6CBA917 for ; Tue, 18 Oct 2022 04:36:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666092976; x=1697628976; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TuBz7tekTf1BWBxAT1QoET0tgEOR4TTS5pMFs+n0/Gc=; b=TGKFz7eRMZcZuJZalzLqmOKHo+U2/SN/Q1uW+JSlXWNy4Zk2qPPuA3td Pt6y6ReNG5c5ElxwRTDu01B2LTO/SKyI9M3dOgQo8uTJqE08yOBcMLtzk 9dro/henHXaR4n7+qkdy0XdWBrj/6VeVwciGVoYYZd63iuoNPYMkqbEFd op9alSU3eOCEReEB6SdMhdE9j7bMiAID0JkAUSpfvrLamDxd/QDXgDK6G huwFOaMh2oELt2XYXNVz0/xTBSf/H/IXD+0yXY0htd8AyMaQr+zBTsxv2 FGMa2wNu0rMfSrTJbl2/4Rr7dKL4qUUGxlk6lp6t5xBcuEgoRkVP38fJj g==; X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="368105828" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="368105828" Received: from fmsmga008.fm.intel.com 
([10.253.24.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:18 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="691763182" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="691763182" Received: from vhavel-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.51.115]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:13 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 70434104BAA; Tue, 18 Oct 2022 14:34:04 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , Bharata B Rao , Jacob Pan , Ashok Raj , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Weihong Zhang , "Kirill A . Shutemov" Subject: [PATCHv10 12/15] selftests/x86/lam: Add mmap and SYSCALL test cases for linear-address masking Date: Tue, 18 Oct 2022 14:33:55 +0300 Message-Id: <20221018113358.7833-13-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747025570889985860?= X-GMAIL-MSGID: =?utf-8?q?1747025570889985860?= From: Weihong Zhang Add mmap and SYSCALL test cases. 
SYSCALL test cases:

- With LAM_U57 enabled, set metadata in bits 62:57 of a user pointer and pass the pointer to a syscall. The syscall should dereference the pointer and return the correct result.
- With LAM disabled, pass a pointer with metadata in the high bits to a syscall. The syscall should fail with -1 (EFAULT).

MMAP test cases:

- Enable LAM_U57, mmap() a low address (below bit 47), and set metadata in the high bits of the address. Dereferencing the address should be allowed.
- Enable LAM_U57, mmap() a high address (above bit 47), and set metadata in the high bits of the address. Dereferencing the address should be allowed.

Signed-off-by: Weihong Zhang Signed-off-by: Kirill A. Shutemov --- tools/testing/selftests/x86/lam.c | 144 +++++++++++++++++++++++++++++- 1 file changed, 140 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c index 900a3a0fb709..cdc6e40e00e0 100644 --- a/tools/testing/selftests/x86/lam.c +++ b/tools/testing/selftests/x86/lam.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include @@ -29,11 +30,18 @@ /* Specified test function bits */ #define FUNC_MALLOC 0x1 #define FUNC_BITS 0x2 +#define FUNC_MMAP 0x4 +#define FUNC_SYSCALL 0x8 -#define TEST_MASK 0x3 +#define TEST_MASK 0xf + +#define LOW_ADDR (0x1UL << 30) +#define HIGH_ADDR (0x3UL << 48) #define MALLOC_LEN 32 +#define PAGE_SIZE (4 << 10) + struct testcases { unsigned int later; int expected; /* 2: SIGSEGV Error; 1: other errors */ @@ -49,6 +57,7 @@ jmp_buf segv_env; static void segv_handler(int sig) { ksft_print_msg("Get segmentation fault(%d).", sig); + siglongjmp(segv_env, 1); } @@ -61,6 +70,16 @@ static inline int cpu_has_lam(void) return (cpuinfo[0] & (1 << 26)); } +/* Check 5-level page table feature in CPUID.(EAX=07H, ECX=00H):ECX.[bit 16] */ +static inline int cpu_has_la57(void) +{ + unsigned int cpuinfo[4]; + + __cpuid_count(0x7, 0, cpuinfo[0], cpuinfo[1], cpuinfo[2], cpuinfo[3]); + + return (cpuinfo[2] & (1 << 16)); +} + /* * Set tagged address and
read back untag mask. * check if the untagged mask is expected. @@ -213,6 +232,68 @@ static int handle_malloc(struct testcases *test) return ret; } +static int handle_mmap(struct testcases *test) +{ + void *ptr; + unsigned int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED; + int ret = 0; + + if (test->later == 0 && test->lam != 0) + if (set_lam(test->lam) != 0) + return 1; + + ptr = mmap((void *)test->addr, PAGE_SIZE, PROT_READ | PROT_WRITE, + flags, -1, 0); + if (ptr == MAP_FAILED) { + if (test->addr == HIGH_ADDR) + if (!cpu_has_la57()) + return 3; /* unsupport LA57 */ + return 1; + } + + if (test->later != 0 && test->lam != 0) + if (set_lam(test->lam) != 0) + ret = 1; + + if (ret == 0) { + if (sigsetjmp(segv_env, 1) == 0) { + signal(SIGSEGV, segv_handler); + ret = handle_lam_test(ptr, test->lam); + } else { + ret = 2; + } + } + + munmap(ptr, PAGE_SIZE); + return ret; +} + +static int handle_syscall(struct testcases *test) +{ + struct utsname unme, *pu; + int ret = 0; + + if (test->later == 0 && test->lam != 0) + if (set_lam(test->lam) != 0) + return 1; + + if (sigsetjmp(segv_env, 1) == 0) { + signal(SIGSEGV, segv_handler); + pu = (struct utsname *)set_metadata((uint64_t)&unme, test->lam); + ret = uname(pu); + if (ret < 0) + ret = 1; + } else { + ret = 2; + } + + if (test->later != 0 && test->lam != 0) + if (set_lam(test->lam) != -1 && ret == 0) + ret = 1; + + return ret; +} + static int fork_test(struct testcases *test) { int ret, child_ret; @@ -241,13 +322,20 @@ static void run_test(struct testcases *test, int count) struct testcases *t = test + i; /* fork a process to run test case */ + tests_cnt++; ret = fork_test(t); + + /* return 3 is not support LA57, the case should be skipped */ + if (ret == 3) { + ksft_test_result_skip(t->msg); + continue; + } + if (ret != 0) ret = (t->expected == ret); else ret = !(t->expected); - tests_cnt++; ksft_test_result(ret, t->msg); } } @@ -268,7 +356,6 @@ static struct testcases malloc_cases[] = { }, }; - static struct 
testcases bits_cases[] = { { .test_func = handle_max_bits, @@ -276,11 +363,54 @@ static struct testcases bits_cases[] = { }, }; +static struct testcases syscall_cases[] = { + { + .later = 0, + .lam = LAM_U57_BITS, + .test_func = handle_syscall, + .msg = "SYSCALL: LAM_U57. syscall with metadata\n", + }, + { + .later = 1, + .expected = 1, + .lam = LAM_U57_BITS, + .test_func = handle_syscall, + .msg = "SYSCALL:[Negative] Disable LAM. Dereferencing pointer with metadata.\n", + }, +}; + +static struct testcases mmap_cases[] = { + { + .later = 1, + .expected = 0, + .lam = LAM_U57_BITS, + .addr = HIGH_ADDR, + .test_func = handle_mmap, + .msg = "MMAP: First mmap high address, then set LAM_U57.\n", + }, + { + .later = 0, + .expected = 0, + .lam = LAM_U57_BITS, + .addr = HIGH_ADDR, + .test_func = handle_mmap, + .msg = "MMAP: First LAM_U57, then High address.\n", + }, + { + .later = 0, + .expected = 0, + .lam = LAM_U57_BITS, + .addr = LOW_ADDR, + .test_func = handle_mmap, + .msg = "MMAP: First LAM_U57, then Low address.\n", + }, +}; + static void cmd_help(void) { printf("usage: lam [-h] [-t test list]\n"); printf("\t-t test list: run tests specified in the test list, default:0x%x\n", TEST_MASK); - printf("\t\t0x1:malloc; 0x2:max_bits;\n"); + printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall.\n"); printf("\t-h: help\n"); } @@ -320,6 +450,12 @@ int main(int argc, char **argv) if (tests & FUNC_BITS) run_test(bits_cases, ARRAY_SIZE(bits_cases)); + if (tests & FUNC_MMAP) + run_test(mmap_cases, ARRAY_SIZE(mmap_cases)); + + if (tests & FUNC_SYSCALL) + run_test(syscall_cases, ARRAY_SIZE(syscall_cases)); + ksft_set_plan(tests_cnt); return ksft_exit_pass(); From patchwork Tue Oct 18 11:33:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. 
Shutemov" X-Patchwork-Id: 4119 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp1910807wrs; Tue, 18 Oct 2022 04:40:54 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6vIIWKq3uq/qCUiI790oX5wLX+b3mIoW/is+7DCD1bFe8HIiv6cvMqnTEZtEx9Y8nSKvNH X-Received: by 2002:a17:90b:17cf:b0:20d:b274:6f50 with SMTP id me15-20020a17090b17cf00b0020db2746f50mr28846792pjb.231.1666093254302; Tue, 18 Oct 2022 04:40:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666093254; cv=none; d=google.com; s=arc-20160816; b=nwcPb67gMHmVABSI9knzXjLtSxg0xaCgXdvy038+vEVRS0iYrkYYBJQssFzxXWbOw+ EQHBdfs3FEjGhjj75o8moENtt2GnyPc+jU3yKefr5jjA0FXBuJjnGFf+v/NJruPHK4DG RtgvN5iIAjyuVNJVKCS4Ts5IYOojiPIhEF8sRmXEjbku6KhrmnHlxFTx8dHkEXeZah4j TUACkDcggYO34Ht+K4+rCnCxPkvdFzd/zLeEE9fbouu+RmjRwRLN2BcKWd/p1mxn2wVH 99Q0mCCXj0MTEXTw9zGMK35SxaehVArtObt1vfpt8u81zAsVE3LzCT0R8KXvPIGiviA7 JodA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=9KN/Re8kLSvGk9k/yFun3GFKx3YHJN773QAA//jRU1I=; b=x2/IQlPCT/76GoO+h0xVJg+p+p5niC5Yl6P4iMj7DDuh6LSLMKwAsKjENWWKGtr5pU XvMLENJTwoUVLZtJ0WaEVSY5Fd9dmoFiixKm9PC1wCrD0nI2bzCgf5/K7Z6dlhnUxH1A 0nw1HEOnMhelk03b7yOLMQGd17s+v7XWmD3iTOk4Hw+866hzJIFylzQmIbpaWPNkeHaM v780V1wFNW2oE69CtkQFGyIP7zQnrH4tPJ6FRgh2Fhz3QejbfMA3jKczMmNgM4sVatUb j6KefP6v9s5AIv4y/qN0U2WIZGq5nEt+Ep6beCSqHc6+410GlViCi9yGm9Upcpks5ZTU QVJQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=FoST7pHj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id w8-20020a63f508000000b0043486074814si18218669pgh.661.2022.10.18.04.40.40; Tue, 18 Oct 2022 04:40:54 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=FoST7pHj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230306AbiJRLgu (ORCPT + 99 others); Tue, 18 Oct 2022 07:36:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32814 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230201AbiJRLgG (ORCPT ); Tue, 18 Oct 2022 07:36:06 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6D9FBDF1D for ; Tue, 18 Oct 2022 04:35:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666092930; x=1697628930; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=br6Om71qzm65KaCJtdGB8t7hbJwdX+JrsLtwnbZitNE=; b=FoST7pHjv1AFgm8LDcEnCkftgOa3DKu95m7XpbxHmBbUlOApNeQA6kLz h4ObLWPhlxPOhF666MPVyaXjC8KF7NNtwIovM59g7cCXUhf7D1E5dWKFd 90jAbMl/0DF4kUNwZPuTVO/qKngjZJUWsNEVoTgA3+yU/tNzjqskeibV9 X/sXCng/DRs9h08Pxs1TdUI/8Fp0efUaKrSMsjWXGZ6qsrtqiHSlPqkCi GHNMVmbkiiDCjrAFwJyta2ACHwvQuDGuPhnK7r28Y0d8KzgzQJNGO1fuV oaiWdBeKX10Dk72TUyGlym2BH+SH9kaJM/rfs8erstqr1p6nNgpIBqxu5 w==; X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="392382137" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="392382137" Received: from orsmga001.jf.intel.com 
([10.7.209.18]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:18 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="661861189" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="661861189" Received: from vhavel-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.51.115]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:13 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 7BC29104BAB; Tue, 18 Oct 2022 14:34:04 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , Bharata B Rao , Jacob Pan , Ashok Raj , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Weihong Zhang , "Kirill A . Shutemov" Subject: [PATCHv10 13/15] selftests/x86/lam: Add io_uring test cases for linear-address masking Date: Tue, 18 Oct 2022 14:33:56 +0300 Message-Id: <20221018113358.7833-14-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747025400345092462?= X-GMAIL-MSGID: =?utf-8?q?1747025400345092462?= From: Weihong Zhang LAM should be supported in kernel thread, using io_uring to verify LAM feature. 
The test cases read a file through io_uring, using an iovec array as the receive buffers. Depending on the LAM mode, metadata is set in the high bits of the buffer pointers. io_uring should be able to handle buffers whose pointers carry metadata in the high bits.

Signed-off-by: Weihong Zhang Signed-off-by: Kirill A. Shutemov --- tools/testing/selftests/x86/lam.c | 341 +++++++++++++++++++++++++++++- 1 file changed, 339 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c index cdc6e40e00e0..8ea1fcef4c9f 100644 --- a/tools/testing/selftests/x86/lam.c +++ b/tools/testing/selftests/x86/lam.c @@ -9,8 +9,12 @@ #include #include #include +#include +#include #include +#include +#include #include "../kselftest.h" #ifndef __x86_64__ @@ -32,8 +36,9 @@ #define FUNC_BITS 0x2 #define FUNC_MMAP 0x4 #define FUNC_SYSCALL 0x8 +#define FUNC_URING 0x10 -#define TEST_MASK 0xf +#define TEST_MASK 0x1f #define LOW_ADDR (0x1UL << 30) #define HIGH_ADDR (0x3UL << 48) @@ -42,6 +47,13 @@ #define PAGE_SIZE (4 << 10) +#define barrier() ({ \ + __asm__ __volatile__("" : : : "memory"); \ +}) + +#define URING_QUEUE_SZ 1 +#define URING_BLOCK_SZ 2048 + struct testcases { unsigned int later; int expected; /* 2: SIGSEGV Error; 1: other errors */ @@ -51,6 +63,33 @@ struct testcases { const char *msg; }; +/* Used by CQ of uring, source file handler and file's size */ +struct file_io { + int file_fd; + off_t file_sz; + struct iovec iovecs[]; +}; + +struct io_uring_queue { + unsigned int *head; + unsigned int *tail; + unsigned int *ring_mask; + unsigned int *ring_entries; + unsigned int *flags; + unsigned int *array; + union { + struct io_uring_cqe *cqes; + struct io_uring_sqe *sqes; + } queue; + size_t ring_sz; +}; + +struct io_ring { + int ring_fd; + struct io_uring_queue sq_ring; + struct io_uring_queue cq_ring; +}; + int tests_cnt; jmp_buf segv_env; @@ -294,6 +333,285 @@ static int handle_syscall(struct
testcases *test) return ret; } +int sys_uring_setup(unsigned int entries, struct io_uring_params *p) +{ + return (int)syscall(__NR_io_uring_setup, entries, p); +} + +int sys_uring_enter(int fd, unsigned int to, unsigned int min, unsigned int flags) +{ + return (int)syscall(__NR_io_uring_enter, fd, to, min, flags, NULL, 0); +} + +/* Init submission queue and completion queue */ +int mmap_io_uring(struct io_uring_params p, struct io_ring *s) +{ + struct io_uring_queue *sring = &s->sq_ring; + struct io_uring_queue *cring = &s->cq_ring; + + sring->ring_sz = p.sq_off.array + p.sq_entries * sizeof(unsigned int); + cring->ring_sz = p.cq_off.cqes + p.cq_entries * sizeof(struct io_uring_cqe); + + if (p.features & IORING_FEAT_SINGLE_MMAP) { + if (cring->ring_sz > sring->ring_sz) + sring->ring_sz = cring->ring_sz; + + cring->ring_sz = sring->ring_sz; + } + + void *sq_ptr = mmap(0, sring->ring_sz, PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_POPULATE, s->ring_fd, + IORING_OFF_SQ_RING); + + if (sq_ptr == MAP_FAILED) { + perror("sub-queue!"); + return 1; + } + + void *cq_ptr = sq_ptr; + + if (!(p.features & IORING_FEAT_SINGLE_MMAP)) { + cq_ptr = mmap(0, cring->ring_sz, PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_POPULATE, s->ring_fd, + IORING_OFF_CQ_RING); + if (cq_ptr == MAP_FAILED) { + perror("cpl-queue!"); + munmap(sq_ptr, sring->ring_sz); + return 1; + } + } + + sring->head = sq_ptr + p.sq_off.head; + sring->tail = sq_ptr + p.sq_off.tail; + sring->ring_mask = sq_ptr + p.sq_off.ring_mask; + sring->ring_entries = sq_ptr + p.sq_off.ring_entries; + sring->flags = sq_ptr + p.sq_off.flags; + sring->array = sq_ptr + p.sq_off.array; + + /* Map a queue as mem map */ + s->sq_ring.queue.sqes = mmap(0, p.sq_entries * sizeof(struct io_uring_sqe), + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, + s->ring_fd, IORING_OFF_SQES); + if (s->sq_ring.queue.sqes == MAP_FAILED) { + munmap(sq_ptr, sring->ring_sz); + if (sq_ptr != cq_ptr) { + ksft_print_msg("failed to mmap uring queue!"); + 
munmap(cq_ptr, cring->ring_sz); + return 1; + } + } + + cring->head = cq_ptr + p.cq_off.head; + cring->tail = cq_ptr + p.cq_off.tail; + cring->ring_mask = cq_ptr + p.cq_off.ring_mask; + cring->ring_entries = cq_ptr + p.cq_off.ring_entries; + cring->queue.cqes = cq_ptr + p.cq_off.cqes; + + return 0; +} + +/* Init io_uring queues */ +int setup_io_uring(struct io_ring *s) +{ + struct io_uring_params para; + + memset(&para, 0, sizeof(para)); + s->ring_fd = sys_uring_setup(URING_QUEUE_SZ, &para); + if (s->ring_fd < 0) + return 1; + + return mmap_io_uring(para, s); +} + +/* + * Get data from completion queue. the data buffer saved the file data + * return 0: success; others: error; + */ +int handle_uring_cq(struct io_ring *s) +{ + struct file_io *fi = NULL; + struct io_uring_queue *cring = &s->cq_ring; + struct io_uring_cqe *cqe; + unsigned int head; + off_t len = 0; + + head = *cring->head; + + do { + barrier(); + if (head == *cring->tail) + break; + /* Get the entry */ + cqe = &cring->queue.cqes[head & *s->cq_ring.ring_mask]; + fi = (struct file_io *)cqe->user_data; + if (cqe->res < 0) + break; + + int blocks = (int)(fi->file_sz + URING_BLOCK_SZ - 1) / URING_BLOCK_SZ; + + for (int i = 0; i < blocks; i++) + len += fi->iovecs[i].iov_len; + + head++; + } while (1); + + *cring->head = head; + barrier(); + + return (len != fi->file_sz); +} + +/* + * Submit squeue. specify via IORING_OP_READV.
+ * The buffers must have metadata set according to the LAM mode + */ +int handle_uring_sq(struct io_ring *ring, struct file_io *fi, unsigned long lam) +{ + int file_fd = fi->file_fd; + struct io_uring_queue *sring = &ring->sq_ring; + unsigned int index = 0, cur_block = 0, tail = 0, next_tail = 0; + struct io_uring_sqe *sqe; + + off_t remain = fi->file_sz; + int blocks = (int)(remain + URING_BLOCK_SZ - 1) / URING_BLOCK_SZ; + + while (remain) { + off_t bytes = remain; + void *buf; + + if (bytes > URING_BLOCK_SZ) + bytes = URING_BLOCK_SZ; + + fi->iovecs[cur_block].iov_len = bytes; + + if (posix_memalign(&buf, URING_BLOCK_SZ, URING_BLOCK_SZ)) + return 1; + + fi->iovecs[cur_block].iov_base = (void *)set_metadata((uint64_t)buf, lam); + remain -= bytes; + cur_block++; + } + + next_tail = *sring->tail; + tail = next_tail; + next_tail++; + + barrier(); + + index = tail & *ring->sq_ring.ring_mask; + + sqe = &ring->sq_ring.queue.sqes[index]; + sqe->fd = file_fd; + sqe->flags = 0; + sqe->opcode = IORING_OP_READV; + sqe->addr = (unsigned long)fi->iovecs; + sqe->len = blocks; + sqe->off = 0; + sqe->user_data = (uint64_t)fi; + + sring->array[index] = index; + tail = next_tail; + + if (*sring->tail != tail) { + *sring->tail = tail; + barrier(); + } + + if (sys_uring_enter(ring->ring_fd, 1, 1, IORING_ENTER_GETEVENTS) < 0) + return 1; + + return 0; +} + +/* + * Test LAM in async I/O and io_uring, read current binary through io_uring + * Set metadata in pointers to iovecs buffer.
+ */ +int do_uring(unsigned long lam) +{ + struct io_ring *ring; + struct file_io *fi; + struct stat st; + int ret = 1; + char path[PATH_MAX]; + + /* get current process path */ + if (readlink("/proc/self/exe", path, PATH_MAX) <= 0) + return 1; + + int file_fd = open(path, O_RDONLY); + + if (file_fd < 0) + return 1; + + if (fstat(file_fd, &st) < 0) + return 1; + + off_t file_sz = st.st_size; + + int blocks = (int)(file_sz + URING_BLOCK_SZ - 1) / URING_BLOCK_SZ; + + fi = malloc(sizeof(*fi) + sizeof(struct iovec) * blocks); + if (!fi) + return 1; + + fi->file_sz = file_sz; + fi->file_fd = file_fd; + + ring = malloc(sizeof(*ring)); + if (!ring) + return 1; + + memset(ring, 0, sizeof(struct io_ring)); + + if (setup_io_uring(ring)) + goto out; + + if (handle_uring_sq(ring, fi, lam)) + goto out; + + ret = handle_uring_cq(ring); + +out: + free(ring); + + for (int i = 0; i < blocks; i++) { + if (fi->iovecs[i].iov_base) { + uint64_t addr = ((uint64_t)fi->iovecs[i].iov_base); + + switch (lam) { + case LAM_U57_BITS: /* Clear bits 62:57 */ + addr = (addr & ~(0x3fULL << 57)); + break; + } + free((void *)addr); + fi->iovecs[i].iov_base = NULL; + } + } + + free(fi); + + return ret; +} + +int handle_uring(struct testcases *test) +{ + int ret = 0; + + if (test->later == 0 && test->lam != 0) + if (set_lam(test->lam) != 0) + return 1; + + if (sigsetjmp(segv_env, 1) == 0) { + signal(SIGSEGV, segv_handler); + ret = do_uring(test->lam); + } else { + ret = 2; + } + + return ret; +} + static int fork_test(struct testcases *test) { int ret, child_ret; @@ -340,6 +658,22 @@ static void run_test(struct testcases *test, int count) } } +static struct testcases uring_cases[] = { + { + .later = 0, + .lam = LAM_U57_BITS, + .test_func = handle_uring, + .msg = "URING: LAM_U57. Dereferencing pointer with metadata\n", + }, + { + .later = 1, + .expected = 1, + .lam = LAM_U57_BITS, + .test_func = handle_uring, + .msg = "URING:[Negative] Disable LAM. 
Dereferencing pointer with metadata.\n", + }, +}; + static struct testcases malloc_cases[] = { { .later = 0, @@ -410,7 +744,7 @@ static void cmd_help(void) { printf("usage: lam [-h] [-t test list]\n"); printf("\t-t test list: run tests specified in the test list, default:0x%x\n", TEST_MASK); - printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall.\n"); + printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall; 0x10:io_uring.\n"); printf("\t-h: help\n"); } @@ -456,6 +790,9 @@ int main(int argc, char **argv) if (tests & FUNC_SYSCALL) run_test(syscall_cases, ARRAY_SIZE(syscall_cases)); + if (tests & FUNC_URING) + run_test(uring_cases, ARRAY_SIZE(uring_cases)); + ksft_set_plan(tests_cnt); return ksft_exit_pass(); From patchwork Tue Oct 18 11:33:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 4122 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp1911760wrs; Tue, 18 Oct 2022 04:43:37 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5ofrpZhPVbXyLP/TabgydEcsg1usjyDqvNtMaHSUP1xWlKkUxj4UclIQl0nTsH/7YoimhJ X-Received: by 2002:a17:90b:2691:b0:20c:d655:c67d with SMTP id pl17-20020a17090b269100b0020cd655c67dmr39072270pjb.36.1666093417005; Tue, 18 Oct 2022 04:43:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666093416; cv=none; d=google.com; s=arc-20160816; b=X4MMhL1myo9DxBuOWNkUiRVaf1zZgmmseB1x18w5JdhmzgtVupfvtpZIbKZyNFIRpM 7BXDGLi7Zi9d0O21BVls2sDQLckdSIAo4S4rsMMpXjvnxin6qa7yOfrg/NWfcXJtZqJG hFySomuaovMbY3Gbk0QVBzKh0GpusyMnMJmjsrAZ6Ea3HviEJHQX9xQiGTIRFGUUxYXp VNJcxlb41EyuTbWVqZMhn12DJi+YMT4Lgpt+hy1uQQg1czPkYpVdxusSqTXNx6zzf92A Q5wciAApeohCZGKCLFi5uP84cygdKISodrbWUQuUBrgdEXs/EDKh187vrdxrNNjD+o90 dprg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from 
:dkim-signature; bh=iRoUaAq1wGIcSIS4qkv8NrfubUNx/rtX6/sfQtwGBdE=; b=vSFT36oXrtI5SQ0ALoUnqYIkIvw9YfsAprCQa+KB5wpgyVAr74u3ksL/zeHpTtI1dY WQ/S5Ztttnl9eVynmEUfbBhbff+C5jKlORPHf7yjm0nyqEWN+lX+6KQsZc7ToKSXgm5x bqkNtKi3Jum/ckcddJ0JydYSM5LBIyEeZ1A22voMBKBQUrrCw0jCMrSOHR0lf0fq/oek BMpw2h1QquURCT7OcwV7I5JqEZjuGu2EDBnQGyi7kONypEddgup7RVF3ErQZsm4mQGCa bbSsYEOSFUBsouYRmMU7mljd7JcQ2K11oQ8HC2UK53rPR6BxFktcUui4rLwhVNmfpGyv 7Q/Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Fmi0cMfW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id pj4-20020a17090b4f4400b0020cedba54fcsi15328245pjb.55.2022.10.18.04.43.11; Tue, 18 Oct 2022 04:43:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=Fmi0cMfW; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230309AbiJRLhA (ORCPT + 99 others); Tue, 18 Oct 2022 07:37:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:32986 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230282AbiJRLgs (ORCPT ); Tue, 18 Oct 2022 07:36:48 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E7EDA0266 for ; Tue, 18 Oct 2022 04:35:48 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666092948; x=1697628948; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2PBgo8dzetbAgvNxQe4E7GoQrbgUDaybtLdJFJyGEss=; b=Fmi0cMfWUggtBEVDTPdQFjGyi4MGNN/kn5RRpLdcKfShxAx0kBovRf1H N1Y4zxmoN3w8sbDPZnsWCwiamdPwmIqnnfwccBuiguLMzP+XbLbw5L586 UAW2daYv5+q5kxaasFPyxyKJRbPlDzGvqIqXpBtKycYbqLdi8XiBIPxDU CzAe6hAkgRERfOgEbHQ4QM5243IEC2ilWDl6CrMfljNzS67YSmmF0CpkS cwqKqUvclOVVOufusB65V6HpXijTbXWImoPlXSypHq/83wHt5bD0z2d3d OpO5LEeY8s4OZuHlzWBXgjib0kPgjAD4r3C7DCgiKm8IFQD90o4Axiiq3 Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="392382138" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="392382138" Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:19 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="661861194" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="661861194" Received: from vhavel-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.51.115]) by orsmga001-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:13 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 86D21104BAC; Tue, 18 Oct 2022 14:34:04 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , Bharata B Rao , Jacob Pan , Ashok Raj , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Weihong Zhang , "Kirill A . 
Shutemov" Subject: [PATCHv10 14/15] selftests/x86/lam: Add inherit test cases for linear-address masking Date: Tue, 18 Oct 2022 14:33:57 +0300 Message-Id: <20221018113358.7833-15-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.0 In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.6 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747025571274805721?= X-GMAIL-MSGID: =?utf-8?q?1747025571274805721?= From: Weihong Zhang

LAM is enabled per-thread and is inherited on fork(2)/clone(2). exec() reverts LAM status to the default disabled state.

There are two test scenarios:

- Fork test cases: These cases test per-thread inheritance of LAM. A child process created by fork() should inherit the LAM mode of its parent, so the child reads back the same LAM mode as the parent.

- Execve test cases: Unlike a process created by fork(), a process that calls execve() has its LAM status reverted to disabled.

Signed-off-by: Weihong Zhang Signed-off-by: Kirill A.
Shutemov --- tools/testing/selftests/x86/lam.c | 125 +++++++++++++++++++++++++++++- 1 file changed, 121 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c index 8ea1fcef4c9f..cfc9073c0262 100644 --- a/tools/testing/selftests/x86/lam.c +++ b/tools/testing/selftests/x86/lam.c @@ -37,8 +37,9 @@ #define FUNC_MMAP 0x4 #define FUNC_SYSCALL 0x8 #define FUNC_URING 0x10 +#define FUNC_INHERITE 0x20 -#define TEST_MASK 0x1f +#define TEST_MASK 0x3f #define LOW_ADDR (0x1UL << 30) #define HIGH_ADDR (0x3UL << 48) @@ -174,6 +175,28 @@ static unsigned long get_default_tag_bits(void) return lam; } +/* + * Read back the untag mask for the current thread. + * Return the matching LAM mode, or -1 if the mask is unexpected. + */ +static int get_lam(void) +{ + uint64_t ptr = 0; + int ret = -1; + /* Get untagged mask */ + if (syscall(SYS_arch_prctl, ARCH_GET_UNTAG_MASK, &ptr) == -1) + return -1; + + /* Check mask returned is expected */ + if (ptr == ~(LAM_U57_MASK)) + ret = LAM_U57_BITS; + else if (ptr == -1ULL) + ret = LAM_NONE; + + + return ret; +} + /* According to LAM mode, set metadata in high bits */ static uint64_t set_metadata(uint64_t src, unsigned long lam) { @@ -581,7 +604,7 @@ int do_uring(unsigned long lam) switch (lam) { case LAM_U57_BITS: /* Clear bits 62:57 */ - addr = (addr & ~(0x3fULL << 57)); + addr = (addr & ~(LAM_U57_MASK)); break; } free((void *)addr); @@ -632,6 +655,72 @@ static int fork_test(struct testcases *test) return ret; } +static int handle_execve(struct testcases *test) +{ + int ret, child_ret; + int lam = test->lam; + pid_t pid; + + pid = fork(); + if (pid < 0) { + perror("Fork failed."); + return 1; + } else if (pid == 0) { + char path[PATH_MAX]; + + /* Set LAM mode in this process before calling exec */ + if (set_lam(lam) != 0) + return 1; + + /* Get current binary's path and the binary was run by execve */ + if (readlink("/proc/self/exe", path, PATH_MAX) <= 0) + exit(-1); + + /* run binary to get LAM mode and return to parent process */ + if
(execlp(path, path, "-t 0x0", NULL) < 0) { + perror("error on exec"); + exit(-1); + } + } else { + wait(&child_ret); + ret = WEXITSTATUS(child_ret); + if (ret != LAM_NONE) + return 1; + } + + return 0; +} + +static int handle_inheritance(struct testcases *test) +{ + int ret, child_ret; + int lam = test->lam; + pid_t pid; + + /* Set LAM mode in parent process */ + if (set_lam(lam) != 0) + return 1; + + pid = fork(); + if (pid < 0) { + perror("Fork failed."); + return 1; + } else if (pid == 0) { + /* Read back the LAM mode in the child process */ + int child_lam = get_lam(); + + exit(child_lam); + } else { + wait(&child_ret); + ret = WEXITSTATUS(child_ret); + + if (lam != ret) + return 1; + } + + return 0; +} + static void run_test(struct testcases *test, int count) { int i, ret = 0; @@ -740,11 +829,26 @@ static struct testcases mmap_cases[] = { }, }; +static struct testcases inheritance_cases[] = { + { + .expected = 0, + .lam = LAM_U57_BITS, + .test_func = handle_inheritance, + .msg = "FORK: LAM_U57, child process should get LAM mode same as parent\n", + }, + { + .expected = 0, + .lam = LAM_U57_BITS, + .test_func = handle_execve, + .msg = "EXECVE: LAM_U57, child process should get disabled LAM mode\n", + }, +}; + static void cmd_help(void) { printf("usage: lam [-h] [-t test list]\n"); printf("\t-t test list: run tests specified in the test list, default:0x%x\n", TEST_MASK); - printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall; 0x10:io_uring.\n"); + printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall; 0x10:io_uring; 0x20:inherit;\n"); printf("\t-h: help\n"); } @@ -764,7 +868,7 @@ int main(int argc, char **argv) switch (c) { case 't': tests = strtoul(optarg, NULL, 16); - if (!(tests & TEST_MASK)) { + if (tests && !(tests & TEST_MASK)) { ksft_print_msg("Invalid argument!\n"); return -1; } @@ -778,6 +882,16 @@ int main(int argc, char **argv) } } + /* + * When tests is 0, it is not a real test case; + * the option used by test case(execve) to check the lam mode in +
* process generated by execve, the process read back lam mode and + * check with lam mode in parent process. + */ + if (!tests) + return (get_lam()); + + /* Run test cases */ if (tests & FUNC_MALLOC) run_test(malloc_cases, ARRAY_SIZE(malloc_cases)); @@ -793,6 +907,9 @@ int main(int argc, char **argv) if (tests & FUNC_URING) run_test(uring_cases, ARRAY_SIZE(uring_cases)); + if (tests & FUNC_INHERITE) + run_test(inheritance_cases, ARRAY_SIZE(inheritance_cases)); + ksft_set_plan(tests_cnt); return ksft_exit_pass(); From patchwork Tue Oct 18 11:33:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 4120 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp1911756wrs; Tue, 18 Oct 2022 04:43:37 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4jcOywuYPUaK1+6nN0ETWQiNHRHbzV60uQvOzWwRKdXuG0v6PRavFNJWuMmH9l/rysi9uL X-Received: by 2002:a17:902:ef83:b0:17c:a2f:1e3 with SMTP id iz3-20020a170902ef8300b0017c0a2f01e3mr2581896plb.35.1666093417016; Tue, 18 Oct 2022 04:43:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666093417; cv=none; d=google.com; s=arc-20160816; b=BA5DZ6+i1h+FikOUqBZAfDS3XqUMYwhbCwHNKav4liwZZ+PrrXocDVBXciwLZncJDD wOTO6cCTcT6y5Dcg7/gqhsp+MaoZZzEOBpDF2Z81pYdy+3KoTQAHrliN/F5uDhCizIPH Zgs6uClykYwlKuJr5KvPMk3BW38QPzJR6zhKuTjG4qJVeN/SgNdCGavhVjfQfNey+diW 2FG4TKH15o+iXytrWTyvnNdVAhgPVZZftZWXmT9TjEkyhalhbOLKu5tUaBu/QxR4PxXF 8NMdPn9O9HLKSh4Wv1KTWuhh9YH2n8a3vwT2Sb54qhsXfBsdwGuiXpdpcpmBmxLCmLsP SOYA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=g7pzzdWRZn+FOVAsdtyumocBfEDyXZK/P39WTBQ3SdQ=; b=tjiaJxVbCxXlocuaIzfPs3r5m66sy5BSEhBlbR+5OigWCfNEuAQznIu6ftxrZXSl40 PRwxLJLLpxUdJITuLw6+POpIHZMTdxRTmWRcGQF7vvae8PZ8eqapfmjh5xwKmiz0ihFP 
VyW4jc1b/TF6U1CEZnhqCFZdSiEL5KwJ09BS4v7GkuFWl9wL2yCClXAqg47E0ltEB5/S Txaa5N3iEpkMGA7x049q9eRUujlmB/MRm9WdJk3Yfzbhdyq0vZ8AWgV5U5QACvBJMXUx aoFJ+HBIpZ7fPaWUfoyOvCUH0vTCsF92sy5aelZSb4OBXZW/aWFb6Hw9A6duYZ+5ti7D V2jA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iIKa+Iyw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id 33-20020a630f61000000b00462f17e560csi14795030pgp.878.2022.10.18.04.43.24; Tue, 18 Oct 2022 04:43:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=iIKa+Iyw; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230319AbiJRLhE (ORCPT + 99 others); Tue, 18 Oct 2022 07:37:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33728 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230215AbiJRLgt (ORCPT ); Tue, 18 Oct 2022 07:36:49 -0400 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9B2EC8768E for ; Tue, 18 Oct 2022 04:35:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1666092956; x=1697628956; h=from:to:cc:subject:date:message-id:in-reply-to: 
references:mime-version:content-transfer-encoding; bh=I7mUpTnORPvv75eTCJTRzwfjLc/N5wp2MuStyApPKRw=; b=iIKa+IywDQuadHIW7b3wKvJHuWHinekRHjR0670pSaQaUZqdu+NHVfKz 0RdEN7R4c2bUOyOKGt5/zRZg+sl0uVmy5KIzuDqpt+xhiJGjlde8FMVq8 85/gNztgeLxuBbQaV0iA8xTbFNEOl/pYvLbSoVNiEKj1gv+6wvoiQsEHb 2W6gPh8GYh2r861sWbLVk5eayOnh0tsvWKr9nmWS2F4utsqyRe9fyMGXX SackRN9NPOYJszgE6lY1RwvDWNBbJbG22HKGccmV0Edg2EJKYCrdhIijC kniM9N7c1BD7r8SRMvV0MZJqdmYomcv7tAyintrlquu/YBlsljr9Mwd+4 A==; X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="392382163" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="392382163" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:24 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10503"; a="691763237" X-IronPort-AV: E=Sophos;i="5.95,193,1661842800"; d="scan'208";a="691763237" Received: from vhavel-mobl.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.51.115]) by fmsmga008-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Oct 2022 04:34:19 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 91A72104BAD; Tue, 18 Oct 2022 14:34:04 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , Bharata B Rao , Jacob Pan , Ashok Raj , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Weihong Zhang , "Kirill A . 
Shutemov"
Subject: [PATCHv10 15/15] selftests/x86/lam: Add ARCH_FORCE_TAGGED_SVM test cases for linear-address masking
Date: Tue, 18 Oct 2022 14:33:58 +0300
Message-Id: <20221018113358.7833-16-kirill.shutemov@linux.intel.com>
In-Reply-To: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>
References: <20221018113358.7833-1-kirill.shutemov@linux.intel.com>

From: Weihong Zhang

By default, a process is not allowed to both enable LAM and use SVM.
The new ARCH_FORCE_TAGGED_SVM arch_prctl() overrides this limitation.

Add test cases for the new arch_prctl():

Before ARCH_FORCE_TAGGED_SVM is used, LAM and SVM must not be allowed
to coexist, so the corresponding test cases are negative.

The tests depend on the idxd driver and the IOMMU. Before running them,
add "intel_iommu=on,sm_on" to the kernel command line and load the idxd
driver.

Signed-off-by: Weihong Zhang
Signed-off-by: Kirill A.
Shutemov
---
 tools/testing/selftests/x86/lam.c | 237 +++++++++++++++++++++++++++++-
 1 file changed, 235 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c
index cfc9073c0262..4b9da41de5c8 100644
--- a/tools/testing/selftests/x86/lam.c
+++ b/tools/testing/selftests/x86/lam.c
@@ -30,6 +30,7 @@
 #define ARCH_GET_UNTAG_MASK 0x4001
 #define ARCH_ENABLE_TAGGED_ADDR 0x4002
 #define ARCH_GET_MAX_TAG_BITS 0x4003
+#define ARCH_FORCE_TAGGED_SVM 0x4004
 
 /* Specified test function bits */
 #define FUNC_MALLOC 0x1
@@ -38,8 +39,9 @@
 #define FUNC_SYSCALL 0x8
 #define FUNC_URING 0x10
 #define FUNC_INHERITE 0x20
+#define FUNC_PASID 0x40
 
-#define TEST_MASK 0x3f
+#define TEST_MASK 0x7f
 
 #define LOW_ADDR (0x1UL << 30)
 #define HIGH_ADDR (0x3UL << 48)
@@ -55,11 +57,19 @@
 #define URING_QUEUE_SZ 1
 #define URING_BLOCK_SZ 2048
 
+/* PASID test definitions */
+#define LAM_CMD_BIT 0x1
+#define PAS_CMD_BIT 0x2
+#define SVM_CMD_BIT 0x4
+
+#define PAS_CMD(cmd1, cmd2, cmd3) (((cmd3) << 8) | ((cmd2) << 4) | ((cmd1) << 0))
+
 struct testcases {
 	unsigned int later;
 	int expected; /* 2: SIGSEGV Error; 1: other errors */
 	unsigned long lam;
 	uint64_t addr;
+	uint64_t cmd;
 	int (*test_func)(struct testcases *test);
 	const char *msg;
 };
@@ -556,7 +566,7 @@ int do_uring(unsigned long lam)
 	struct file_io *fi;
 	struct stat st;
 	int ret = 1;
-	char path[PATH_MAX];
+	char path[PATH_MAX] = {0};
 
 	/* get current process path */
 	if (readlink("/proc/self/exe", path, PATH_MAX) <= 0)
@@ -852,6 +862,226 @@ static void cmd_help(void)
 	printf("\t-h: help\n");
 }
 
+/* Check for file existence */
+uint8_t file_Exists(const char *fileName)
+{
+	struct stat buffer;
+
+	uint8_t ret = (stat(fileName, &buffer) == 0);
+
+	return ret;
+}
+
+/* Sysfs idxd files */
+const char *dsa_configs[] = {
+	"echo 1 > /sys/bus/dsa/devices/dsa0/wq0.1/group_id",
+	"echo shared > /sys/bus/dsa/devices/dsa0/wq0.1/mode",
+	"echo 10 > /sys/bus/dsa/devices/dsa0/wq0.1/priority",
+	"echo 16 > /sys/bus/dsa/devices/dsa0/wq0.1/size",
+	"echo 15 > /sys/bus/dsa/devices/dsa0/wq0.1/threshold",
+	"echo user > /sys/bus/dsa/devices/dsa0/wq0.1/type",
+	"echo MyApp1 > /sys/bus/dsa/devices/dsa0/wq0.1/name",
+	"echo 1 > /sys/bus/dsa/devices/dsa0/engine0.1/group_id",
+	"echo dsa0 > /sys/bus/dsa/drivers/idxd/bind",
+	/* bind the WQ to the user driver; this generates a device file in /dev */
+	"echo wq0.1 > /sys/bus/dsa/drivers/user/bind",
+};
+
+/* DSA device file */
+const char *dsaDeviceFile = "/dev/dsa/wq0.1";
+/* sysfs file that reports whether PASID is enabled */
+const char *dsaPasidEnable = "/sys/bus/dsa/devices/dsa0/pasid_enabled";
+
+/*
+ * DSA depends on kernel cmdline "intel_iommu=on,sm_on"
+ * return pasid_enabled (0: disabled; 1: enabled)
+ */
+int Check_DSA_Kernel_Setting(void)
+{
+	char command[256] = "";
+	char buf[256] = "";
+	char *ptr;
+	int rv = -1;
+
+	snprintf(command, sizeof(command) - 1, "cat %s", dsaPasidEnable);
+
+	FILE *cmd = popen(command, "r");
+
+	if (cmd) {
+		while (fgets(buf, sizeof(buf) - 1, cmd) != NULL);
+
+		pclose(cmd);
+		rv = strtol(buf, &ptr, 16);
+	}
+
+	return rv;
+}
+
+/*
+ * Configure the DSA sysfs files to set up a shared WQ.
+ * This generates the device file /dev/dsa/wq0.1
+ * Return: 0 OK; 1 Failed; 3 Skip(SVM disabled).
+ */
+int Dsa_Init_Sysfs(void)
+{
+	uint len = ARRAY_SIZE(dsa_configs);
+	const char **p = dsa_configs;
+
+	if (file_Exists(dsaDeviceFile) == 1)
+		return 0;
+
+	/* check the idxd driver */
+	if (file_Exists(dsaPasidEnable) != 1) {
+		printf("Please make sure idxd driver was loaded\n");
+		return 3;
+	}
+
+	/* Check SVM feature */
+	if (Check_DSA_Kernel_Setting() != 1) {
+		printf("Please enable SVM.(Add intel_iommu=on,sm_on in kernel cmdline)\n");
+		return 3;
+	}
+
+	/* Run the sysfs configuration commands in order */
+	for (int i = 0; i < len; i++) {
+		if (system(p[i]))
+			return 1;
+	}
+
+	/* After config, /dev/dsa/wq0.1 should be generated */
+	return (file_Exists(dsaDeviceFile) != 1);
+}
+
+/*
+ * Open DSA device file, trigger API: iommu_sva_alloc_pasid
+ */
+void *allocate_dsa_pasid(void)
+{
+	int fd;
+	void *wq;
+
+	fd = open(dsaDeviceFile, O_RDWR);
+	if (fd < 0) {
+		perror("open");
+		return MAP_FAILED;
+	}
+
+	wq = mmap(NULL, 0x1000, PROT_WRITE,
+		  MAP_SHARED | MAP_POPULATE, fd, 0);
+	if (wq == MAP_FAILED)
+		perror("mmap");
+
+	return wq;
+}
+
+int set_force_svm(void)
+{
+	int ret = 0;
+
+	ret = syscall(SYS_arch_prctl, ARCH_FORCE_TAGGED_SVM);
+
+	return ret;
+}
+
+int handle_pasid(struct testcases *test)
+{
+	uint tmp = test->cmd;
+	uint runed = 0x0;
+	int ret = 0;
+	void *wq = NULL;
+
+	ret = Dsa_Init_Sysfs();
+	if (ret != 0)
+		return ret;
+
+	for (int i = 0; i < 3; i++) {
+		int err = 0;
+
+		if (tmp & 0x1) {
+			/* run set lam mode */
+			if ((runed & 0x1) == 0) {
+				err = set_lam(LAM_U57_BITS);
+				runed = runed | 0x1;
+			} else
+				err = 1;
+		} else if (tmp & 0x4) {
+			/* run force svm */
+			if ((runed & 0x4) == 0) {
+				err = set_force_svm();
+				runed = runed | 0x4;
+			} else
+				err = 1;
+		} else if (tmp & 0x2) {
+			/* run allocate pasid */
+			if ((runed & 0x2) == 0) {
+				runed = runed | 0x2;
+				wq = allocate_dsa_pasid();
+				if (wq == MAP_FAILED)
+					err = 1;
+			} else
+				err = 1;
+		}
+
+		ret = ret + err;
+		if (ret > 0)
+			break;
+
+		tmp = tmp >> 4;
+	}
+
+	if (wq != MAP_FAILED && wq != NULL)
+		if (munmap(wq, 0x1000))
+			printf("munmap failed %d\n", errno);
+
+	if (runed != 0x7)
+		ret = 1;
+
+	return (ret != 0);
+}
+
+/*
+ * The PASID tests depend on idxd and SVM; the kernel must enable the IOMMU
+ * and scalable mode on the command line (intel_iommu=on,sm_on).
+ */
+static struct testcases pasid_cases[] = {
+	{
+		.expected = 1,
+		.cmd = PAS_CMD(LAM_CMD_BIT, PAS_CMD_BIT, SVM_CMD_BIT),
+		.test_func = handle_pasid,
+		.msg = "PASID: [Negative] Execute LAM, PASID, SVM in sequence\n",
+	},
+	{
+		.expected = 0,
+		.cmd = PAS_CMD(LAM_CMD_BIT, SVM_CMD_BIT, PAS_CMD_BIT),
+		.test_func = handle_pasid,
+		.msg = "PASID: Execute LAM, SVM, PASID in sequence\n",
+	},
+	{
+		.expected = 1,
+		.cmd = PAS_CMD(PAS_CMD_BIT, LAM_CMD_BIT, SVM_CMD_BIT),
+		.test_func = handle_pasid,
+		.msg = "PASID: [Negative] Execute PASID, LAM, SVM in sequence\n",
+	},
+	{
+		.expected = 0,
+		.cmd = PAS_CMD(PAS_CMD_BIT, SVM_CMD_BIT, LAM_CMD_BIT),
+		.test_func = handle_pasid,
+		.msg = "PASID: Execute PASID, SVM, LAM in sequence\n",
+	},
+	{
+		.expected = 0,
+		.cmd = PAS_CMD(SVM_CMD_BIT, LAM_CMD_BIT, PAS_CMD_BIT),
+		.test_func = handle_pasid,
+		.msg = "PASID: Execute SVM, LAM, PASID in sequence\n",
+	},
+	{
+		.expected = 0,
+		.cmd = PAS_CMD(SVM_CMD_BIT, PAS_CMD_BIT, LAM_CMD_BIT),
+		.test_func = handle_pasid,
+		.msg = "PASID: Execute SVM, PASID, LAM in sequence\n",
+	},
+};
+
 int main(int argc, char **argv)
 {
 	int c = 0;
@@ -910,6 +1140,9 @@ int main(int argc, char **argv)
 	if (tests & FUNC_INHERITE)
 		run_test(inheritance_cases, ARRAY_SIZE(inheritance_cases));
 
+	if (tests & FUNC_PASID)
+		run_test(pasid_cases, ARRAY_SIZE(pasid_cases));
+
 	ksft_set_plan(tests_cnt);
 
 	return ksft_exit_pass();
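
Note on the command encoding used by pasid_cases: PAS_CMD() packs the three steps into one word, four bits per step, with the first step in the lowest nibble, and handle_pasid() consumes one nibble per loop iteration. The sketch below is a standalone illustration of that decoding only. The three *_CMD_BIT values and PAS_CMD() are copied from the patch; step_name() and decode_order() are hypothetical helpers that do not exist in lam.c, and nothing here touches LAM, SVM, or the idxd device.

```c
#include <stdio.h>

/* Step identifiers and packing macro, copied from the patch. */
#define LAM_CMD_BIT 0x1
#define PAS_CMD_BIT 0x2
#define SVM_CMD_BIT 0x4

#define PAS_CMD(cmd1, cmd2, cmd3) (((cmd3) << 8) | ((cmd2) << 4) | ((cmd1) << 0))

/*
 * Hypothetical helper (not in lam.c): name of the step held in one nibble.
 * The checks follow the same order as handle_pasid(): LAM, then SVM,
 * then PASID.
 */
static const char *step_name(unsigned int nibble)
{
	if (nibble & LAM_CMD_BIT)
		return "LAM";
	if (nibble & SVM_CMD_BIT)
		return "SVM";
	if (nibble & PAS_CMD_BIT)
		return "PASID";
	return "?";
}

/*
 * Hypothetical helper (not in lam.c): write the execution order encoded
 * in cmd into buf as a comma-separated list, lowest nibble first.
 */
static void decode_order(unsigned int cmd, char *buf, size_t len)
{
	size_t off = 0;

	if (len == 0)
		return;
	buf[0] = '\0';

	for (int i = 0; i < 3; i++) {
		int n = snprintf(buf + off, len - off, "%s%s",
				 i ? "," : "", step_name(cmd & 0xf));

		/* Stop on error or truncation. */
		if (n < 0 || (size_t)n >= len - off)
			break;

		off += (size_t)n;
		cmd >>= 4;	/* move to the next step, as handle_pasid() does */
	}
}
```

For example, PAS_CMD(LAM_CMD_BIT, SVM_CMD_BIT, PAS_CMD_BIT) evaluates to 0x241, and decode_order() on that value leaves "LAM,SVM,PASID" in buf, which matches the order named in the second test case's message.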