Message ID | 20221121195151.21812-3-decui@microsoft.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp1798432wrr; Mon, 21 Nov 2022 11:54:00 -0800 (PST) X-Google-Smtp-Source: AA0mqf6Xzsad6sUY/LRkqbXJd2+/vknzXi6GXujK5rwamRzPF1dLgF1Om9HGjbt4diMenmBsE8le X-Received: by 2002:a17:90a:b906:b0:213:b349:143e with SMTP id p6-20020a17090ab90600b00213b349143emr22020012pjr.114.1669060439798; Mon, 21 Nov 2022 11:53:59 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1669060439; cv=pass; d=google.com; s=arc-20160816; b=yKPt4LcoK2ykXc47XpORoNOKXXIYbuYtS6fDBiM2FgU/Zvu+ZaA1HljtXEH6VsN1Ow tB7wA4bZAUyGltyZpexFgZO0spvbHq4WUZloNz6d/x9+LG6v/a1PoSSCf/Jg55nfRMQQ aAgbA5seRBevVJTYBfZmEZuCpxq90yT9eKw17Kvq5fUtHHVynO3pxdkZyy2xnmE6o7oG x6Aak7M47iXjPOoeNehM6Gmy6Jcwie8sBdErDaYzZSQzEDQP3zIYCRHgg35G/Lzl+k5l 3XWUSeaVTJOYGnO4QqTW60ziAED89d3ddpIqiKiGY8U+CSOP0F6Aqr6rqsGDSWd9M3mb BpDQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=fA157Zk6ADQ5Q1UdtkapEDIAJPf8V706VHw/r6JPUJc=; b=VuOZH8AO6BLYk9J7TX4TOmgr8iOmowoGCycEMyvhkVyGZ2eRUD2tROa6cNPXRzAN5r EMNT3vbH4A1DK2TXEyVSUvOM/Vw9LTlapKHfy6h7Ob4CnjfOYN7M5ygNFMUqRnOybNqn mhtsMJ0ncNoRwxE8kJ/ENhAVqtonE6rij0vyHo63Q5TUixKKhXwL700hb/l+AsPyddnT txGrK57zHFMob2RkG5rb8Jm65RlG8Gwj7eS2BZAaDk+eTovIVsJkgqIidO1odEEyVvaK pmpIAjK+XvqicIzwexPkYUPgm4UyhvWYpvp6nj5qPfsBU/c6MC03DNlEoPpJHG/7Bksh Lnow== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@microsoft.com header.s=selector2 header.b=RBqg2Ytz; arc=pass (i=1 spf=pass spfdomain=microsoft.com dkim=pass dkdomain=microsoft.com dmarc=pass fromdomain=microsoft.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=microsoft.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id mr20-20020a17090b239400b001fac102fdeesi10554654pjb.95.2022.11.21.11.53.46; Mon, 21 Nov 2022 11:53:59 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@microsoft.com header.s=selector2 header.b=RBqg2Ytz; arc=pass (i=1 spf=pass spfdomain=microsoft.com dkim=pass dkdomain=microsoft.com dmarc=pass fromdomain=microsoft.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=microsoft.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231838AbiKUTx3 (ORCPT <rfc822;cjcooper78@gmail.com> + 99 others); Mon, 21 Nov 2022 14:53:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43948 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231726AbiKUTxC (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 21 Nov 2022 14:53:02 -0500 Received: from na01-obe.outbound.protection.outlook.com (mail-eastusazon11022019.outbound.protection.outlook.com [52.101.53.19]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4AE97429AC; Mon, 21 Nov 2022 11:53:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=XnmbqObCx0UvuAZAsu32Amp3RFi+0VD8y1RPTPXBgFjPpwNQaK8yLoEfBSvtw8DXXQU4C+eW03UJCduDu8KJbvAeyucRQqEib7EORJQxlCDR8t/gwda7erByJhe/vEkfwPeSiPSgaQ5ewXjX4ChYrKyie4D316XyNKlg9hzLgVvNf5g8s7hHMNMsGegyB25r1Zp9SG4pVpBJ0wb2G61oKXFue9/WCEHplDGAMOlKOn0ljTmtuxzMoAzCKV9q42Rwup7coAPAm0g1E6PBQ2Og7wWQntedHXh3OAO7CqgQeja3ehKodjlSsl/S3hxpfo77MsxGcxU+m8zT8HXiA6s74Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=fA157Zk6ADQ5Q1UdtkapEDIAJPf8V706VHw/r6JPUJc=; b=Lz9yXLNt6Kjw1Nr0DG61wIj0KhjTtabFzrrhrAOt9P6j66k9Cnq6DVi24uoK+TyEeurvUHR+FJQyhZfk2s9WijLbYWUr4+R9kTSBXNCTtsYZk5MTNH+vHvnj7VkD4iqPlnDWXi5krGEu3qgjJq40SVdChcep+Uc/axzuBo4bCFIpaUCwiN8jrob+giObWThdUvpFGEC7mST50ulqMGdAhIzqSG7ip7H+2iDC2jgXov/T+WyM0oromJkTYd2kVLjSvqKF/IxAnY83vEsZIed/MXUpbkI31v12fCFxugkDndmPJLXHzvhavJIIvnWNkdFKwxdDvQukoXqHkzdZKHkiGw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=microsoft.com; dmarc=pass action=none header.from=microsoft.com; dkim=pass header.d=microsoft.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=fA157Zk6ADQ5Q1UdtkapEDIAJPf8V706VHw/r6JPUJc=; b=RBqg2YtzBYCYjVMmUJnZMIB8uwnsGtT5r0d+TN/yGR0ru0kKAJC+PwLEsb5FzbI+TXNUI6SY1tdYQgYaJxQuNecP+E9XXxbYM3zWXKbCK2Gt3uVVSXpmCtuRrHbIdiuAuja85FPpO6W/CwylbLBVrXS0A9TBY4vmXmMY7JfZ7xI= Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=microsoft.com; Received: from BL0PR2101MB1092.namprd21.prod.outlook.com (2603:10b6:207:30::23) by BL1PR21MB3280.namprd21.prod.outlook.com (2603:10b6:208:398::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5880.2; Mon, 21 Nov 2022 19:52:53 +0000 Received: from BL0PR2101MB1092.namprd21.prod.outlook.com ([fe80::ae03:5b1e:755c:a020]) by BL0PR2101MB1092.namprd21.prod.outlook.com ([fe80::ae03:5b1e:755c:a020%7]) with mapi id 15.20.5880.001; Mon, 21 Nov 2022 19:52:53 +0000 From: Dexuan Cui <decui@microsoft.com> To: ak@linux.intel.com, arnd@arndb.de, bp@alien8.de, brijesh.singh@amd.com, dan.j.williams@intel.com, dave.hansen@linux.intel.com, haiyangz@microsoft.com, hpa@zytor.com, jane.chu@oracle.com, kirill.shutemov@linux.intel.com, kys@microsoft.com, linux-arch@vger.kernel.org, linux-hyperv@vger.kernel.org, luto@kernel.org, mingo@redhat.com, peterz@infradead.org, rostedt@goodmis.org, sathyanarayanan.kuppuswamy@linux.intel.com, seanjc@google.com, tglx@linutronix.de, tony.luck@intel.com, wei.liu@kernel.org, x86@kernel.org Cc: linux-kernel@vger.kernel.org, Dexuan Cui <decui@microsoft.com> Subject: [PATCH 2/6] x86/tdx: Retry TDVMCALL_MAP_GPA() when needed Date: Mon, 21 Nov 2022 11:51:47 -0800 Message-Id: <20221121195151.21812-3-decui@microsoft.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20221121195151.21812-1-decui@microsoft.com> References: <20221121195151.21812-1-decui@microsoft.com> Content-Type: text/plain X-ClientProxiedBy: BY3PR03CA0027.namprd03.prod.outlook.com (2603:10b6:a03:39a::32) To BL0PR2101MB1092.namprd21.prod.outlook.com (2603:10b6:207:30::23) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: BL0PR2101MB1092:EE_|BL1PR21MB3280:EE_ X-MS-Office365-Filtering-Correlation-Id: 5e480212-775f-49a1-d707-08dacbf9fa14 X-LD-Processed: 72f988bf-86f1-41af-91ab-2d7cd011db47,ExtAddr X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: VG7V8qlKtz9xvH4It+AljjSB724Ie+SvG8AWm8+WW+i/VWeLBnbw3i5r/heslVj0D6qBQlnS/o4nzQi7dB0TcR6fN0Grc6Jdb+8vP+6RJWoVDviGYHl1FVJ2g1sgETEzvhB5AB2c+gscf1O1lJ7wyXADE9SGh7igSi0h+euzxuQKMQfNhM4z0afxVPFJUw1RrAgID7D6ivmI4vsqV7IzMm0Zb3AhRH5fbWfmGFb/Oq7X5fiqflGfBMGZuqTjgHVG+SdPTUMMCnbP1cX2JSSMTpn1RftWr1YaNTZLDi+AOl05BDQMqu76R4aUas54MBB3tO6PPSgk7buoK8GXwKq5eo9wAeekMA+cNtRveqIk0GRlSMg31ETtfAWYpzvnxrqYifV0Pp0KoTMg2qPXF1m8yDwjYRdyN70m5XrQsZmZb7GQZeloxUEOpxqmDZW8ytv+eX/PzlsPf68HWoVicu0R7zOxPscoxSWy+IsH1vUALThJn9EdoBHuynsyEamKBhZE9hD8GIbOTt4VgCQNoNdFuQxP8FxD8KXCk2/XOR1+BeLLKCQL8DFm15RhR4BTkEnRYNeK45UPLa/uJgfNDXBfQDHsj3BNdYqWEVE9tjQFYnLa9g5jUP90Y7HyYsp57wJbNY3d2eWB8KN5NmNx0yxwB2qcsboiX2i+tAb7Uh5uNx+0Ha3FfJprpENZn9g8EcLpY6wchKp+rstgGeVJpy6/MVixaNodujmsy9PPJwxiuvA= X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:BL0PR2101MB1092.namprd21.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230022)(4636009)(346002)(366004)(396003)(136003)(376002)(39860400002)(451199015)(83380400001)(2906002)(7416002)(5660300002)(8936002)(2616005)(186003)(1076003)(36756003)(86362001)(41300700001)(82950400001)(38100700002)(921005)(82960400001)(6666004)(10290500003)(107886003)(6506007)(52116002)(6512007)(316002)(478600001)(66556008)(8676002)(66946007)(66476007)(6486002)(4326008);DIR:OUT;SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: RRiG/Zm/Z2PbHB3Pv996duy996aUtmvLNEH3D0hKkdPNs4p5l/WuqyXZiM5FLoBdsYTNsrSWzZuTW7xQxVQAwk6SYx11PVH49Ai/ZkcG1ukk+QbghfBqWgScIKm2Nrq2d38pm1FygcF/EP6knIMXwgyyk74P9fsS9xMKeyFx5nIIf+FyIvddemVRDh39YLNDIZEK9htUjD/0zamhP5AfQ30wdZFygrp22hLTNWdTHLhHIdMtEk2EiJXpS/13tuYEZG4uuX377uiUe286p+kd+YRIydhpNWEqPqTZy11pR3s0vNhtLIJq0Ncw1o2eV81+jfZx+E1B5QDTdywssQiGDemrlksAyCDzGLeU6y2KKce6bbi7AMLapsFWQrF++HfEvVb/gSzNyfI5eDIWPhodosDsqKO5OA8Us8ndr0eQ9DrgJ/njaN4w9S4Zh/Xbjxy2KyRRFKDHeIEK5041Vjw2EDJz7cM0XbUbk0G1WAuvWZ0SJRq6wmO5Q43i1h+9h3DPIW5/vTl460GpGZwo0bCQECBdn98OzKUDHRikGrP9hQOs8Z9BB5nwbWaRy2eh7gebTIPqeQzOqBnIrigTTLIe7tCu13biLJDGKwRIpM3j3tgCHL16gSZeuqQwkR5+WrJUedaAK1UuBtCmPFXuwVeRbAperC7vl9o5ZqHrcxRsT80EdchLrF72vPo++KHtXTbE0WmqbTZquZI3Xuoned6twaboikXbGDkEjZAFStTLfiaw4p1YKDPtL85sAOu5YP1rx7fwSEZrV2664RwFbQ8H2+X1GPMivqMMtxalE2Z/pJNNP9syDDquiz/y+bFMJ4YVvM6nbNNeIEUA4WYkMaZpCVTJgL0oHDbLwEW+uDcx/l+PDx2SeD2/bXvDJjZ4VP1FO12RbPlE6hjsct004kU/lF8s6DYp5PepmUvJMFNUykNansecOwh79Renm2CitNVe6hIptg1Sg4hRgNRxK5QyPAN+oH9uK60YhxAkAApP0kGtb0neSzyT8oH6//+UfFoWsQVrJIixHuv3CCaVW4c0ousRZvoxMImB9SpwiaFqITN1mMqs8PAXV0Yt9egHME3F5VhCROjE6hGaDgA/PS3ww3F8+2wt6qc2dKNEU5wNrK40zaw1evaZcJOh1knGW87owrXqQJopAY2GIerc6LCOjSTM9hfnJYN5mXGeOXatKEKW73VgQ1q7K7nTMpaM919LjOvZ6gg4gk8eIRHpGW98FAb2/UPdtnvR7qasePs1NR+9ixcPtmZgrcLNS6L139wR2ONFBWn/VqjwpTGjTt/NgtvL6JuAMElNm3K2LZwviUsjPdC1RjEr5acS9RKY3cqN8yA3+bidRhbfutt5ekea7Wstu8SUfuXf+26rlo41/m3t4umyclX+UL8E4h/J52Rj9msSs9fFbgKBLaYiLfcRdXQA+5VBAugwenXmEe05VnT7BdxmRpqLEh90beC8UvjAi/ddeFCYcL0UVMc6KoCge00RVi+ybBCX01ridzjNyheWMIkNV351rDzQkb2KW9VIvfw30hpp7WAl6dMtrtQ6+HQHF8+DOu3228Xg1z0tDRe7zMA9CNsAAwHqks1YeoQ43ee6JIEKDyp/pbj9GRF6K0d8k5RFbdOWeRtQ53RGVwRZ3JIJXpINRAPUEF6vGNpv X-OriginatorOrg: microsoft.com X-MS-Exchange-CrossTenant-Network-Message-Id: 5e480212-775f-49a1-d707-08dacbf9fa14 X-MS-Exchange-CrossTenant-AuthSource: BL0PR2101MB1092.namprd21.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 21 Nov 2022 19:52:53.1186 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 72f988bf-86f1-41af-91ab-2d7cd011db47 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: ZqxzgourQicJx5tudqMkpZXdRrrvVXCGjLQrD5beeMyOc1aPeJVScB8x0enYUswDknJc+flDVqW1Hg3e09qKZw== X-MS-Exchange-Transport-CrossTenantHeadersStamped: BL1PR21MB3280 X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FORGED_SPF_HELO, RCVD_IN_DNSWL_NONE,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_NONE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1750136719667601817?= X-GMAIL-MSGID: =?utf-8?q?1750136719667601817?= |
Series |
Support TDX guests on Hyper-V
|
|
Commit Message
Dexuan Cui
Nov. 21, 2022, 7:51 p.m. UTC
GHCI spec for TDX 1.0 says that the MapGPA call may fail with the R10
error code = TDG.VP.VMCALL_RETRY (1), and the guest must retry this
operation for the pages in the region starting at the GPA specified
in R11.
When a TDX guest runs on Hyper-V, Hyper-V returns the retry error
when hyperv_init() -> swiotlb_update_mem_attributes() ->
set_memory_decrypted() decrypts up to 1GB of swiotlb bounce buffers.
Signed-off-by: Dexuan Cui <decui@microsoft.com>
---
arch/x86/coco/tdx/tdx.c | 65 +++++++++++++++++++++++++++++++++++++----
1 file changed, 59 insertions(+), 6 deletions(-)
Comments
On 11/21/22 11:51, Dexuan Cui wrote: > +static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc) > +{ > + u64 ret, r11; 'r11' needs a real, logical name. > + while (1) { > + ret = _tdx_hypercall_output_r11(TDVMCALL_MAP_GPA, start, > + end - start, 0, 0, &r11); > + if (!ret) > + break; > + > + if (ret != TDVMCALL_STATUS_RETRY) > + break; > + > + /* > + * The guest must retry the operation for the pages in the > + * region starting at the GPA specified in R11. Make sure R11 > + * contains a sane value. > + */ > + if ((r11 & ~cc_mkdec(0)) < (start & ~cc_mkdec(0)) || > + (r11 & ~cc_mkdec(0)) >= (end & ~cc_mkdec(0))) > + return false; This statement is, um, a wee bit ugly. First, it's not obvious at all why the address masking is necessary. Second, it's utterly insane to do that mask to 'r11' twice. Third, it's silly do logically the same thing to start and end every time through the loop. This also seems to have built in the idea that cc_mkdec() *SETS* bits rather than clearing them. That's true for TDX today, but it's a horrible chunk of code to leave around because it'll confuse actual legitimate cc_enc/dec() users. The more I write about it, the more I dislike it. Why can't this just be: if ((map_fail_paddr < start) || (map_fail_paddr >= end)) return false; If the hypervisor returns some crazy address in r11 that isn't masked like the inputs, just fail. > + start = r11; > + > + /* Set the shared (decrypted) bit. */ > + if (!enc) > + start |= cc_mkdec(0); Why is only one side of this necessary? Shouldn't it need to be something like: if (enc) start = cc_mkenc(start); else start = cc_mkdec(start); ?? > + } > + > + return !ret; > +} > + > /* > * Inform the VMM of the guest's intent for this physical page: shared with > * the VMM or private to the guest. The VMM is expected to change its mapping > @@ -707,12 +765,7 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc) > end |= cc_mkdec(0); > } > > - /* > - * Notify the VMM about page mapping conversion. More info about ABI > - * can be found in TDX Guest-Host-Communication Interface (GHCI), > - * section "TDG.VP.VMCALL<MapGPA>" > - */ > - if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0)) > + if (!tdx_map_gpa(start, end, enc)) > return false; > > /* private->shared conversion requires only MapGPA call */
On Mon, Nov 21, 2022 at 11:51:47AM -0800, Dexuan Cui wrote: > GHCI spec for TDX 1.0 says that the MapGPA call may fail with the R10 > error code = TDG.VP.VMCALL_RETRY (1), and the guest must retry this > operation for the pages in the region starting at the GPA specified > in R11. > > When a TDX guest runs on Hyper-V, Hyper-V returns the retry error > when hyperv_init() -> swiotlb_update_mem_attributes() -> > set_memory_decrypted() decrypts up to 1GB of swiotlb bounce buffers. > > Signed-off-by: Dexuan Cui <decui@microsoft.com> > --- > arch/x86/coco/tdx/tdx.c | 65 +++++++++++++++++++++++++++++++++++++---- > 1 file changed, 59 insertions(+), 6 deletions(-) > > diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c > index 3fee96931ff5..46971cc7d006 100644 > --- a/arch/x86/coco/tdx/tdx.c > +++ b/arch/x86/coco/tdx/tdx.c > @@ -20,6 +20,8 @@ > /* TDX hypercall Leaf IDs */ > #define TDVMCALL_MAP_GPA 0x10001 > > +#define TDVMCALL_STATUS_RETRY 1 > + > /* MMIO direction */ > #define EPT_READ 0 > #define EPT_WRITE 1 > @@ -52,6 +54,25 @@ static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15) > return __tdx_hypercall(&args, 0); > } > > +static inline u64 _tdx_hypercall_output_r11(u64 fn, u64 r12, u64 r13, u64 r14, > + u64 r15, u64 *r11) > +{ > + struct tdx_hypercall_args args = { > + .r10 = TDX_HYPERCALL_STANDARD, > + .r11 = fn, > + .r12 = r12, > + .r13 = r13, > + .r14 = r14, > + .r15 = r15, > + }; > + > + u64 ret; > + > + ret = __tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT); > + *r11 = args.r11; > + return ret; > +} > + I'm not convinced it deserves a separate helper for one user. Does it look that ugly if tdx_map_gpa() uses __tdx_hypercall() directly? > /* Called from __tdx_hypercall() for unrecoverable failure */ > void __tdx_hypercall_failed(void) > { > @@ -691,6 +712,43 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len, > return true; > } > > +/* > + * Notify the VMM about page mapping conversion. More info about ABI > + * can be found in TDX Guest-Host-Communication Interface (GHCI), > + * section "TDG.VP.VMCALL<MapGPA>" > + */ > +static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc) > +{ > + u64 ret, r11; > + > + while (1) { Endless? Maybe an upper limit if no progress? > + ret = _tdx_hypercall_output_r11(TDVMCALL_MAP_GPA, start, > + end - start, 0, 0, &r11); > + if (!ret) > + break; > + > + if (ret != TDVMCALL_STATUS_RETRY) > + break; > + > + /* > + * The guest must retry the operation for the pages in the > + * region starting at the GPA specified in R11. Make sure R11 > + * contains a sane value. > + */ > + if ((r11 & ~cc_mkdec(0)) < (start & ~cc_mkdec(0)) || > + (r11 & ~cc_mkdec(0)) >= (end & ~cc_mkdec(0))) > + return false; Emm. All of them suppose to have shared bit set, why not compare directly without cc_mkdec() dance? > + > + start = r11; > + > + /* Set the shared (decrypted) bit. */ > + if (!enc) > + start |= cc_mkdec(0); > + } > + > + return !ret; > +} > + > /* > * Inform the VMM of the guest's intent for this physical page: shared with > * the VMM or private to the guest. The VMM is expected to change its mapping > @@ -707,12 +765,7 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc) > end |= cc_mkdec(0); > } > > - /* > - * Notify the VMM about page mapping conversion. More info about ABI > - * can be found in TDX Guest-Host-Communication Interface (GHCI), > - * section "TDG.VP.VMCALL<MapGPA>" > - */ > - if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0)) > + if (!tdx_map_gpa(start, end, enc)) > return false; > > /* private->shared conversion requires only MapGPA call */ > -- > 2.25.1 >
> Sent: Monday, November 21, 2022 12:56 PM > On 11/21/22 11:51, Dexuan Cui wrote: > > +static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc) > > +{ > > + u64 ret, r11; > > 'r11' needs a real, logical name. OK, will use "map_fail_paddr" (as you implied below). > > + while (1) { > > + ret = _tdx_hypercall_output_r11(TDVMCALL_MAP_GPA, start, > > + end - start, 0, 0, &r11); > > + if (!ret) > > + break; > > + > > + if (ret != TDVMCALL_STATUS_RETRY) > > + break; > > + > > + /* > > + * The guest must retry the operation for the pages in the > > + * region starting at the GPA specified in R11. Make sure R11 > > + * contains a sane value. > > + */ > > + if ((r11 & ~cc_mkdec(0)) < (start & ~cc_mkdec(0)) || > > + (r11 & ~cc_mkdec(0)) >= (end & ~cc_mkdec(0))) > > + return false; > > This statement is, um, a wee bit ugly. > > First, it's not obvious at all why the address masking is necessary. It turns out that the masking is completely unnecessary :-) I incorrectly assumed that if the input 'start' has the bit 47, Hyper-V always returns a physical address without bit 47. This is not the case. I'll remove the masking code in v2. > Second, it's utterly insane to do that mask to 'r11' twice. Third, it's > silly do logically the same thing to start and end every time through > the loop. > > This also seems to have built in the idea that cc_mkdec() *SETS* bits > rather than clearing them. That's true for TDX today, but it's a > horrible chunk of code to leave around because it'll confuse actual > legitimate cc_enc/dec() users. > > The more I write about it, the more I dislike it. > > Why can't this just be: > > if ((map_fail_paddr < start) || > (map_fail_paddr >= end)) > return false; > > If the hypervisor returns some crazy address in r11 that isn't masked > like the inputs, just fail. Will use your example code in v2. > > + start = r11; > > + > > + /* Set the shared (decrypted) bit. */ > > + if (!enc) > > + start |= cc_mkdec(0); > > Why is only one side of this necessary? Shouldn't it need to be > something like: > > if (enc) > start = cc_mkenc(start); > else > start = cc_mkdec(start); > > ?? The code is unnecessary. Will remove it in v2.
> From: Kirill A. Shutemov <kirill@shutemov.name> > Sent: Monday, November 21, 2022 4:01 PM > [...] > On Mon, Nov 21, 2022 at 11:51:47AM -0800, Dexuan Cui wrote: > [...] > I'm not convinced it deserves a separate helper for one user. > Does it look that ugly if tdx_map_gpa() uses __tdx_hypercall() directly? Will use __tdx_hypercall() directly. > > /* Called from __tdx_hypercall() for unrecoverable failure */ > > void __tdx_hypercall_failed(void) > > { > > @@ -691,6 +712,43 @@ static bool try_accept_one(phys_addr_t *start, > unsigned long len, > > return true; > > } > > > > +/* > > + * Notify the VMM about page mapping conversion. More info about ABI > > + * can be found in TDX Guest-Host-Communication Interface (GHCI), > > + * section "TDG.VP.VMCALL<MapGPA>" > > + */ > > +static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc) > > +{ > > + u64 ret, r11; > > + > > + while (1) { > > Endless? Maybe an upper limit if no progress? I'll add a max count of 1000, which should be far more than enough. > > + ret = _tdx_hypercall_output_r11(TDVMCALL_MAP_GPA, start, > > + end - start, 0, 0, &r11); > > + if (!ret) > > + break; > > + > > + if (ret != TDVMCALL_STATUS_RETRY) > > + break; > > + > > + /* > > + * The guest must retry the operation for the pages in the > > + * region starting at the GPA specified in R11. Make sure R11 > > + * contains a sane value. > > + */ > > + if ((r11 & ~cc_mkdec(0)) < (start & ~cc_mkdec(0)) || > > + (r11 & ~cc_mkdec(0)) >= (end & ~cc_mkdec(0))) > > + return false; > > Emm. All of them suppose to have shared bit set, why not compare directly > without cc_mkdec() dance? The above code is unnecessary and will be removed. So, I'll use the below in v2: /* * Notify the VMM about page mapping conversion. More info about ABI * can be found in TDX Guest-Host-Communication Interface (GHCI), * section "TDG.VP.VMCALL<MapGPA>" */ static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc) { int max_retry_cnt = 1000, retry_cnt = 0; struct tdx_hypercall_args args; u64 map_fail_paddr, ret; while (1) { args.r10 = TDX_HYPERCALL_STANDARD; args.r11 = TDVMCALL_MAP_GPA; args.r12 = start; args.r13 = end - start; args.r14 = 0; args.r15 = 0; ret = __tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT); if (!ret) break; if (ret != TDVMCALL_STATUS_RETRY) break; /* * The guest must retry the operation for the pages in the * region starting at the GPA specified in R11. Make sure R11 * contains a sane value. */ map_fail_paddr = args.r11 ; if (map_fail_paddr < start || map_fail_paddr >= end) return false; if (map_fail_paddr == start) { retry_cnt++; if (retry_cnt > max_retry_cnt) return false; } else { retry_cnt = 0;; start = map_fail_paddr; } } return !ret; }
From: Dexuan Cui <decui@microsoft.com> Sent: Tuesday, November 22, 2022 7:27 PM > > > From: Kirill A. Shutemov <kirill@shutemov.name> > > Sent: Monday, November 21, 2022 4:01 PM > > [...] > > On Mon, Nov 21, 2022 at 11:51:47AM -0800, Dexuan Cui wrote: > > [...] > > I'm not convinced it deserves a separate helper for one user. > > Does it look that ugly if tdx_map_gpa() uses __tdx_hypercall() directly? > > Will use __tdx_hypercall() directly. > > > > /* Called from __tdx_hypercall() for unrecoverable failure */ > > > void __tdx_hypercall_failed(void) > > > { > > > @@ -691,6 +712,43 @@ static bool try_accept_one(phys_addr_t *start, > > unsigned long len, > > > return true; > > > } > > > > > > +/* > > > + * Notify the VMM about page mapping conversion. More info about ABI > > > + * can be found in TDX Guest-Host-Communication Interface (GHCI), > > > + * section "TDG.VP.VMCALL<MapGPA>" > > > + */ > > > +static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc) > > > +{ > > > + u64 ret, r11; > > > + > > > + while (1) { > > > > Endless? Maybe an upper limit if no progress? > > I'll add a max count of 1000, which should be far more than enough. > > > > + ret = _tdx_hypercall_output_r11(TDVMCALL_MAP_GPA, start, > > > + end - start, 0, 0, &r11); > > > + if (!ret) > > > + break; > > > + > > > + if (ret != TDVMCALL_STATUS_RETRY) > > > + break; > > > + > > > + /* > > > + * The guest must retry the operation for the pages in the > > > + * region starting at the GPA specified in R11. Make sure R11 > > > + * contains a sane value. > > > + */ > > > + if ((r11 & ~cc_mkdec(0)) < (start & ~cc_mkdec(0)) || > > > + (r11 & ~cc_mkdec(0)) >= (end & ~cc_mkdec(0))) > > > + return false; > > > > Emm. All of them suppose to have shared bit set, why not compare directly > > without cc_mkdec() dance? > > The above code is unnecessary and will be removed. > > So, I'll use the below in v2: > > /* > * Notify the VMM about page mapping conversion. More info about ABI > * can be found in TDX Guest-Host-Communication Interface (GHCI), > * section "TDG.VP.VMCALL<MapGPA>" > */ > static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc) > { > int max_retry_cnt = 1000, retry_cnt = 0; > struct tdx_hypercall_args args; > u64 map_fail_paddr, ret; > > while (1) { > args.r10 = TDX_HYPERCALL_STANDARD; > args.r11 = TDVMCALL_MAP_GPA; > args.r12 = start; > args.r13 = end - start; > args.r14 = 0; > args.r15 = 0; > > ret = __tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT); > if (!ret) > break; The above test is redundant and can be removed. The "success" case is implicitly handled by the test below for != TDVMCALL_STATUS_RETRY. > > if (ret != TDVMCALL_STATUS_RETRY) > break; > /* > * The guest must retry the operation for the pages in the > * region starting at the GPA specified in R11. Make sure R11 > * contains a sane value. > */ > map_fail_paddr = args.r11 ; > if (map_fail_paddr < start || map_fail_paddr >= end) > return false; > > if (map_fail_paddr == start) { > retry_cnt++; > if (retry_cnt > max_retry_cnt) > return false; > } else { > retry_cnt = 0;; > start = map_fail_paddr; Just summarizing the code, we increment the retry count if the hypercall returns STATUS_RETRY but did nothing (i.e., map_fail_paddr == start). But if the hypercall returns STATUS_RETRY after making at least some progress, then we reset the retry count. So in the worst case, for example, if the hypercall processed only one page on each invocation, the loop will continue until completion, without hitting any retry limits. That scenario seems plausible and within the spec. Do we have any indication about the likelihood of the "RETRY but did nothing" case? The spec doesn't appear to disallow this case, but does Hyper-V actually do this? It seems like a weird case. Michael > } > } > > return !ret; > }
> From: Michael Kelley (LINUX) <mikelley@microsoft.com> > Sent: Wednesday, November 23, 2022 5:30 AM > > [...] > > static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc) > > { > > int max_retry_cnt = 1000, retry_cnt = 0; > > struct tdx_hypercall_args args; > > u64 map_fail_paddr, ret; > > > > while (1) { > > args.r10 = TDX_HYPERCALL_STANDARD; > > args.r11 = TDVMCALL_MAP_GPA; > > args.r12 = start; > > args.r13 = end - start; > > args.r14 = 0; > > args.r15 = 0; > > > > ret = __tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT); > > if (!ret) > > break; > > The above test is redundant and can be removed. The "success" case is > implicitly handled by the test below for != TDVMCALL_STATUS_RETRY. Good point. Will remove the redundant test. > > if (ret != TDVMCALL_STATUS_RETRY) > > break; > > /* > > * The guest must retry the operation for the pages in > the > > * region starting at the GPA specified in R11. Make sure > R11 > > * contains a sane value. > > */ > > map_fail_paddr = args.r11 ; > > if (map_fail_paddr < start || map_fail_paddr >= end) > > return false; > > > > if (map_fail_paddr == start) { > > retry_cnt++; > > if (retry_cnt > max_retry_cnt) > > return false; > > } else { > > retry_cnt = 0;; > > start = map_fail_paddr; > > Just summarizing the code, we increment the retry count if the hypercall > returns STATUS_RETRY but did nothing (i.e., map_fail_paddr == start). But > if the hypercall returns STATUS_RETRY after making at least some progress, > then we reset the retry count. So in the worst case, for example, if the > hypercall processed only one page on each invocation, the loop will continue > until completion, without hitting any retry limits. That scenario seems > plausible and within the spec. Exactly. > Do we have any indication about the likelihood of the "RETRY but did > nothing" case? The spec doesn't appear to disallow this case, but does > Hyper-V actually do this? It seems like a weird case. > > Michael Yes, Hyper-V does do this, according to my test. It looks like this is not because the operation is too time-consuming -- it looks like there is some Hyper-V specific activity going on.
diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c index 3fee96931ff5..46971cc7d006 100644 --- a/arch/x86/coco/tdx/tdx.c +++ b/arch/x86/coco/tdx/tdx.c @@ -20,6 +20,8 @@ /* TDX hypercall Leaf IDs */ #define TDVMCALL_MAP_GPA 0x10001 +#define TDVMCALL_STATUS_RETRY 1 + /* MMIO direction */ #define EPT_READ 0 #define EPT_WRITE 1 @@ -52,6 +54,25 @@ static inline u64 _tdx_hypercall(u64 fn, u64 r12, u64 r13, u64 r14, u64 r15) return __tdx_hypercall(&args, 0); } +static inline u64 _tdx_hypercall_output_r11(u64 fn, u64 r12, u64 r13, u64 r14, + u64 r15, u64 *r11) +{ + struct tdx_hypercall_args args = { + .r10 = TDX_HYPERCALL_STANDARD, + .r11 = fn, + .r12 = r12, + .r13 = r13, + .r14 = r14, + .r15 = r15, + }; + + u64 ret; + + ret = __tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT); + *r11 = args.r11; + return ret; +} + /* Called from __tdx_hypercall() for unrecoverable failure */ void __tdx_hypercall_failed(void) { @@ -691,6 +712,43 @@ static bool try_accept_one(phys_addr_t *start, unsigned long len, return true; } +/* + * Notify the VMM about page mapping conversion. More info about ABI + * can be found in TDX Guest-Host-Communication Interface (GHCI), + * section "TDG.VP.VMCALL<MapGPA>" + */ +static bool tdx_map_gpa(phys_addr_t start, phys_addr_t end, bool enc) +{ + u64 ret, r11; + + while (1) { + ret = _tdx_hypercall_output_r11(TDVMCALL_MAP_GPA, start, + end - start, 0, 0, &r11); + if (!ret) + break; + + if (ret != TDVMCALL_STATUS_RETRY) + break; + + /* + * The guest must retry the operation for the pages in the + * region starting at the GPA specified in R11. Make sure R11 + * contains a sane value. + */ + if ((r11 & ~cc_mkdec(0)) < (start & ~cc_mkdec(0)) || + (r11 & ~cc_mkdec(0)) >= (end & ~cc_mkdec(0))) + return false; + + start = r11; + + /* Set the shared (decrypted) bit. */ + if (!enc) + start |= cc_mkdec(0); + } + + return !ret; +} + /* * Inform the VMM of the guest's intent for this physical page: shared with * the VMM or private to the guest. The VMM is expected to change its mapping @@ -707,12 +765,7 @@ static bool tdx_enc_status_changed(unsigned long vaddr, int numpages, bool enc) end |= cc_mkdec(0); } - /* - * Notify the VMM about page mapping conversion. More info about ABI - * can be found in TDX Guest-Host-Communication Interface (GHCI), - * section "TDG.VP.VMCALL<MapGPA>" - */ - if (_tdx_hypercall(TDVMCALL_MAP_GPA, start, end - start, 0, 0)) + if (!tdx_map_gpa(start, end, enc)) return false; /* private->shared conversion requires only MapGPA call */