[v2,1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline
Message ID | 1684862062-51576-1-git-send-email-mikelley@microsoft.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp2311358vqo; Tue, 23 May 2023 10:39:47 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7vMZiJ1gEYWfr2vtaZQcTE2Cc2QCXyJonpUO7TXOPpX8Z76K4BtI9ivfWGKcep27EwbpkY X-Received: by 2002:a17:90a:f198:b0:24d:ee34:57b6 with SMTP id bv24-20020a17090af19800b0024dee3457b6mr13843215pjb.41.1684863587186; Tue, 23 May 2023 10:39:47 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1684863587; cv=pass; d=google.com; s=arc-20160816; b=ks/yuHkeXljhZinbok/BkrPTzh/rKCsoYcOk56QkQ9L2Cm5u27MLnReybCUeQc719o +Yp/1bZ+vKp27sE02X4IIo/ebpbRaJGpqGBWQHaftLWNRqalXBqWj/EHC5cyK5Wgo1Kn ONRYaBeeJtcuTPRIz2pt7YQdsn8/max9mRAVW/8vP6qPk1rN2wBz/1cXRwEfANCzWz2J fjGDVPdDc/mIZW/Jgn+s0W33O3I37cu5fdj5K7SaT07nyVktVjzQ9DGSdd08HpB8sG87 dofmQrVx0bl/62AiXsQRVStmCAzPNRmiMI0cKG69By9Z8TBriEU9cCWoOBepG0pcv42b +WDA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:message-id:date:subject:cc:to:from :dkim-signature; bh=bWkCBjMumEVdnOiFUDuUb7Y96m9hR7cTb6hMrFPOr3E=; b=qGMlSRAVlFxdfqLisHz6CNi+nDXq8GwYHsGlNWtyftvE7Cke88F8cA84CHgigsI6jV z3ACKd68XYUq5BzrmpZ09J7/F/W51+1/cMKCtQk0U6NVMkQ1habgW9eom8IHW3mRsjou 3sJH0Fw7gt677V+XK1U0V8QQmx+GJtuWkUNZmzWcBzhaR9nmN6hfuL8ZcHzGDkUeQp38 P3flia5nQUK6ZOba1YW+5Nwre2PYZHJPzXnqZnixgnwkcFyN3OxK3qH0BuNz2MKALWF/ hIkFSj638CoNCWELR/nodrSPNVwjNywQBLt7ZZ16EX5SzeoeJe6acUohiaVZoUt5ZnBX TZfg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@microsoft.com header.s=selector2 header.b=iMkbuy1P; arc=pass (i=1 spf=pass spfdomain=microsoft.com dkim=pass dkdomain=microsoft.com dmarc=pass fromdomain=microsoft.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=microsoft.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id mm16-20020a17090b359000b0024e1138cd9dsi951952pjb.157.2023.05.23.10.39.35; Tue, 23 May 2023 10:39:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@microsoft.com header.s=selector2 header.b=iMkbuy1P; arc=pass (i=1 spf=pass spfdomain=microsoft.com dkim=pass dkdomain=microsoft.com dmarc=pass fromdomain=microsoft.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=microsoft.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237713AbjEWRbx (ORCPT <rfc822;ahmedalshaiji.dev@gmail.com> + 99 others); Tue, 23 May 2023 13:31:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33334 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237576AbjEWRbr (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 23 May 2023 13:31:47 -0400 Received: from BN3PR00CU001.outbound.protection.outlook.com (mail-eastus2azon11020021.outbound.protection.outlook.com [52.101.56.21]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 34A5510D9; Tue, 23 May 2023 10:31:24 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=gFTEK1ppAPF4g4F6qc7LbuuCDYnwhNExj1KzhG2/xzoXMpYM/RRm819Rre0yE9GbhZ+UanMwkENDanfQ0o8jSbZMprOfiJdTPRGkBEK/16jgYsheGLxLei2XVt7huZsynzv9CMdiEnFSaxzDzluzjVQJUZg+3ort8xLjDweKKoYgMEvlReEpgsOKDUsqIbLthmQBlZt/4f1r4ADyD/4BnP/Gg8GZM+v0q+HfaaqwAnGt2YXckcmAVbxsTDJ9aWWs3Iy2oBK5BuSG5TTVvspcdQUGEg2OezjP6W185VitgduNyub2wpCFYjH0pW37c34+tO6hXcbGbphoQyZAlaHddQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=bWkCBjMumEVdnOiFUDuUb7Y96m9hR7cTb6hMrFPOr3E=; b=aHBXwy+xDkDsWgXPPy1iWYOE3mmrddK0VQVfWlXlKI4P7SYhxx8rvH+Weh7Xe+cQdcQcawnzyRXwWbSuXDgK/AOXrRyP1J2m0lGK6OUlvC7wQjEme9faoeDtO2RmFrSmSjBQugQvBZ8MFS9CMYAcPWfeegAUCaxV3JnAlb2aHzGjgIxaQNhlQnc3PjTr4EeatScKjEg8DhU5AKJRBkKFr/I7hcufspuNNYZk63bhFBhazT9CzFWWm1DcNb3NH6NwidMYP3JIviIHpuHThk3AMcjnF1FQDgg/QprntqUWDXre7VE+zJ1mDq1+d06gMDNdPTMgNOQvB0dxSE56JOKkaA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=microsoft.com; dmarc=pass action=none header.from=microsoft.com; dkim=pass header.d=microsoft.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=bWkCBjMumEVdnOiFUDuUb7Y96m9hR7cTb6hMrFPOr3E=; b=iMkbuy1PVhs5sR2KE03fhDdBXXFGgKHXhgBYpnGWAUTUAhwM7hcnjNT+x5qYMu2D+YiFcggioemXAIVOyK8oDLgYZgDSxnrVq4+k3aNlXhiD0qrueJmT4AyG/Y2UiYuQVPNIOKqa8PbM116jc9zsNrUTk/hkb4uSgTVEwGOTo/Q= Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=microsoft.com; Received: from DM6PR21MB1370.namprd21.prod.outlook.com (2603:10b6:5:16b::28) by SA3PR21MB3866.namprd21.prod.outlook.com (2603:10b6:806:2fb::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6433.6; Tue, 23 May 2023 17:14:59 +0000 Received: from DM6PR21MB1370.namprd21.prod.outlook.com ([fe80::b7e9:4da1:3c23:35f]) by DM6PR21MB1370.namprd21.prod.outlook.com ([fe80::b7e9:4da1:3c23:35f%6]) with mapi id 15.20.6433.013; Tue, 23 May 2023 17:14:59 +0000 From: Michael Kelley <mikelley@microsoft.com> To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com, catalin.marinas@arm.com, will@kernel.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com, linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org, linux-arm-kernel@lists.infradead.org, x86@kernel.org Cc: mikelley@microsoft.com Subject: [PATCH v2 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline Date: Tue, 23 May 2023 10:14:21 -0700 Message-Id: <1684862062-51576-1-git-send-email-mikelley@microsoft.com> X-Mailer: git-send-email 1.8.3.1 Content-Type: text/plain X-ClientProxiedBy: CYXPR03CA0065.namprd03.prod.outlook.com (2603:10b6:930:d1::7) To DM6PR21MB1370.namprd21.prod.outlook.com (2603:10b6:5:16b::28) MIME-Version: 1.0 X-MS-Exchange-MessageSentRepresentingType: 1 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DM6PR21MB1370:EE_|SA3PR21MB3866:EE_ X-MS-Office365-Filtering-Correlation-Id: 822a898c-5334-4b58-85a3-08db5bb13d07 X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: BxX5MV5mWgOqWIf/0YUmcuUUrJ8dYI/3+ZkyOKOXcGUG/D1KruFGl94lwjNP7K0fH2rHVkXFeI91eOCimSP759ak9HoBetVqNHIpqDyRbSqeevDFmRvquW6e0WOM/X1/LzFQiTV76m69yrSxJsjFsALnBQy96zosTuIGXetaPNCQ+AdaUSvpdFNflJ8/jI1e+Rn0pj/5K3kc1G5yktKBRztkxMVmsu2L/EOGg91zfbImUkLoOKHGMaUez2dkokwdcWs7heaDoEBx+SdgNA4zhsRujV6bAFwoG8Ixj+YTSbhMKyK+tDxPBL2LcBqRXY0iv7fP9YL6dMj311Yjy0iwD1hNg5JdnoOE6YLcAnu1uKfaZt0ABHyjEFL+onADAfyHFMin1rpM0Mtp043yJUWygLPDmhicIvlKzzHgsJkib0kzhTxDyRGjO8jz2/1kj7SDwdV9IXLtD9YRyqxEsdoZ7qf4nvouUXPhWsIV7DC8dMI+EInIFm58rr4jNdyOeworDzjXpS0RSnTSCFHLEDh8bFMIYCQqP1u7lDiZI0jRBmcm2Cv0TW7DbSZOKGytJ0h6tDbyuvS8L1psS1LZsHKWN0Kr5XOKb2R+qkdF7DqxgpQgxZjKFWDr6DbQprFgGaCgO+47DC3t0JNdTvPKYTeQSUdjYu5lu3kIrnvUF6GBzSY9sbs/LkUGZqxcZ4NdKJKz X-Forefront-Antispam-Report: CIP:255.255.255.255;CTRY:;LANG:en;SCL:1;SRV:;IPV:NLI;SFV:NSPM;H:DM6PR21MB1370.namprd21.prod.outlook.com;PTR:;CAT:NONE;SFS:(13230028)(4636009)(346002)(39860400002)(366004)(396003)(136003)(376002)(451199021)(5660300002)(6486002)(6666004)(66946007)(8936002)(786003)(52116002)(316002)(41300700001)(8676002)(10290500003)(107886003)(478600001)(7416002)(4326008)(66556008)(66476007)(2906002)(186003)(6506007)(6512007)(26005)(82950400001)(2616005)(83380400001)(82960400001)(36756003)(86362001)(38350700002)(921005)(38100700002);DIR:OUT;SFP:1102; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: WAErviylwowS1miCbzfZdElFWB2GTy1nC1cUZviSSBkZaJEt3QH2eSrxxlT9UL0pSM+mHm/1ZpoIAc+HfjbBHLDMvgM/LXyHQyVK+f99WJ2IRTo6wNUJqnYnoNQHhAKGSTXnVr5jb5v/3zuLTZsIpyz2wXv5y3EctLUkMMnxcyYU2O5naZ7WN5f4xP4VtC0D/HPsVkPI1QT+kbS4z6ADPXysqYlLpA7CZlczTFpPJf748fH5EQjCyqoZvJjUIvnrDqdGZvEHio0S1LZa2ndPUPJ06vd9ajaBm317DikHp5Hes7TOrm0mCyLiy/MrmhYYSbbtQ1Vts9MG+dmK6KxwHPfV1okJznWu4Tn7blR29mAv3yzjOCKPFxSWN6+LaTcR/9YYLvRyowcmwiQzcfUSbKevn2kiDruaXgwli4CvbED/iu2URdZ0urhz9J0gpfD6pfUZUCx2hZUm6eXoZA8Uy2yXcinTaP5NQkcVwOumKJjDd0Q4C7SkNkfoNBeUY4070BzaScfcRW16H1OqkJQq4ldv0jDH71LVRML+TuMp8pQlyHJVR+QSHYLJLTWLDZqFwHV2SZ19C9e5sljlqGkGlgZiUnGMwf945UBRd3xO1u3OQln2Orv60AZU3Y/xmduNYnDVi26sEl36Q2qmudBUDf3Yd+wNPkZcKVaudzrX0fyHzpz3AX0OAX2hyhlTQH1llfcIuRdVy3wxk29tpkVN/8D2AOdM7fOYwnvuPg1p2Kz7e6snqkWd4oIJgwjASU5DgvmlFLrUcHYEYObUfOVfwfiCwzWyL+H8xsGU41JOtOSOjzUH+kvR35ChieC51nzm7dZClV4flMwYXa64PlhlK+xCVfmrg6xZxNsJvFS3b+IB+L0Pi6b5JxcPC+hvN4Wi/XNmIxlmkonLZ9KAzqi2THkkKGPt1SyF8TXO70/UEbB/4rMFsmiixCczBCljztW0/hBpdB+2nJFxkJ2HKHiGXzdwlnBO5OhOlpDcO/w4+2Q5fDilZOvpY0qeIoORfPNNVboInQnxE+13+ErWbZd6g522bDrNt2w/TKHSGiojRu8AwsAp0+m9zj8mAHXf2pYxBj3d97cfqHWOOM7W6U2nbSZ8FmLVf/xVyPAVplDqDXuaBPqxNgV9TtuOB+mKFduF2PbY6WtQttvbiPlIDC1AZdgrQjF6KYR6iC0JjuaWj91gR75X6wZuYCEPU0xU9yKHIm4VkNmBisZKmZxEIXlRmi+d0WlTZreLaNyKMm5ayopYW0ndfoFpyY6naSRyVyEvvbnPgUe9mPp8VDDJHtgzGZBWKQCbLV8f8RD888j6xTQNP+3UivoL6xeg55BIzbXFEIvHTJvkpXwncJhk7evwM3cfbbbGgssCbFq72VHyXepgrvLiweCPfRs2faad9XcDZMEArgH+MkwJHV/ubTxDZWLHD3prO5Gbydff7NuIz0c/glTMEenI7PFrg5SIIPUTMt/fHSNDG2oauGvoPO2234g17vExOQUVVfGSTHkDrt+ZA3L2q8NBF7x+/789OgcWROwrFg5ZXrO2Z6zfADRJWPPYNnt8ENq7ZJbgAQm5g+Y3qC/SU3iMNmA3yl9zvMa2 X-OriginatorOrg: microsoft.com X-MS-Exchange-CrossTenant-Network-Message-Id: 822a898c-5334-4b58-85a3-08db5bb13d07 X-MS-Exchange-CrossTenant-AuthSource: DM6PR21MB1370.namprd21.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 23 May 2023 17:14:59.5735 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 72f988bf-86f1-41af-91ab-2d7cd011db47 X-MS-Exchange-CrossTenant-MailboxType: HOSTED X-MS-Exchange-CrossTenant-UserPrincipalName: aW2PJaBegIgR9tAGkeTO6J4kbsOa8qB10biWnzNxnqv1OI0cQGloi1RtZclMxBRX9ooP+HXUYGkyDDbDUyRqcQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA3PR21MB3866 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1766333535021063281?= X-GMAIL-MSGID: =?utf-8?q?1766707520825527435?= |
Series |
[v2,1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling when CPUs go online/offline
|
|
Commit Message
Michael Kelley (LINUX)
May 23, 2023, 5:14 p.m. UTC
These commits
a494aef23dfc ("PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg")
2c6ba4216844 ("PCI: hv: Enable PCI pass-thru devices in Confidential VMs")
update the Hyper-V virtual PCI driver to use the hyperv_pcpu_input_arg
because that memory will be correctly marked as decrypted or encrypted
for all VM types (CoCo or normal). But problems ensue when CPUs in the
VM go online or offline after virtual PCI devices have been configured.
When a CPU is brought online, the hyperv_pcpu_input_arg for that CPU is
initialized by hv_cpu_init() running under state CPUHP_AP_ONLINE_DYN.
But this state occurs after state CPUHP_AP_IRQ_AFFINITY_ONLINE, which
may call the virtual PCI driver and fault trying to use the as yet
uninitialized hyperv_pcpu_input_arg. A similar problem occurs in a CoCo
VM if the MMIO read and write hypercalls are used from state
CPUHP_AP_IRQ_AFFINITY_ONLINE.
When a CPU is taken offline, IRQs may be reassigned in state
CPUHP_TEARDOWN_CPU. Again, the virtual PCI driver may fault trying to
use the hyperv_pcpu_input_arg that has already been freed by a
higher state.
Fix the onlining problem by adding state CPUHP_AP_HYPERV_ONLINE
immediately after CPUHP_AP_ONLINE_IDLE (similar to CPUHP_AP_KVM_ONLINE)
and before CPUHP_AP_IRQ_AFFINITY_ONLINE. Use this new state for
Hyper-V initialization so that hyperv_pcpu_input_arg is allocated
early enough.
Fix the offlining problem by not freeing hyperv_pcpu_input_arg when
a CPU goes offline. Retain the allocated memory, and reuse it if
the CPU comes back online later.
Signed-off-by: Michael Kelley <mikelley@microsoft.com>
---
Changes in v2:
* Put CPUHP_AP_HYPERV_ONLINE before CPUHP_AP_KVM_ONLINE [Vitaly
Kuznetsov]
arch/x86/hyperv/hv_init.c | 2 +-
drivers/hv/hv_common.c | 48 +++++++++++++++++++++++-----------------------
include/linux/cpuhotplug.h | 1 +
3 files changed, 26 insertions(+), 25 deletions(-)
Comments
> From: Michael Kelley (LINUX) <mikelley@microsoft.com> > Sent: Tuesday, May 23, 2023 10:14 AM > To: KY Srinivasan <kys@microsoft.com>; Haiyang Zhang > <haiyangz@microsoft.com>; wei.liu@kernel.org; Dexuan Cui > <decui@microsoft.com>; catalin.marinas@arm.com; will@kernel.org; > tglx@linutronix.de; mingo@redhat.com; bp@alien8.de; > dave.hansen@linux.intel.com; hpa@zytor.com; linux-kernel@vger.kernel.org; > linux-hyperv@vger.kernel.org; linux-arm-kernel@lists.infradead.org; > x86@kernel.org > Cc: Michael Kelley (LINUX) <mikelley@microsoft.com> > Subject: [PATCH v2 1/2] x86/hyperv: Fix hyperv_pcpu_input_arg handling > when CPUs go online/offline > > These commits > > a494aef23dfc ("PCI: hv: Replace retarget_msi_interrupt_params with > hyperv_pcpu_input_arg") > 2c6ba4216844 ("PCI: hv: Enable PCI pass-thru devices in Confidential VMs") > > update the Hyper-V virtual PCI driver to use the hyperv_pcpu_input_arg > because that memory will be correctly marked as decrypted or encrypted > for all VM types (CoCo or normal). But problems ensue when CPUs in the > VM go online or offline after virtual PCI devices have been configured. > > When a CPU is brought online, the hyperv_pcpu_input_arg for that CPU is > initialized by hv_cpu_init() running under state CPUHP_AP_ONLINE_DYN. > But this state occurs after state CPUHP_AP_IRQ_AFFINITY_ONLINE, which > may call the virtual PCI driver and fault trying to use the as yet > uninitialized hyperv_pcpu_input_arg. A similar problem occurs in a CoCo > VM if the MMIO read and write hypercalls are used from state > CPUHP_AP_IRQ_AFFINITY_ONLINE. > > When a CPU is taken offline, IRQs may be reassigned in state > CPUHP_TEARDOWN_CPU. Again, the virtual PCI driver may fault trying to > use the hyperv_pcpu_input_arg that has already been freed by a > higher state. > > Fix the onlining problem by adding state CPUHP_AP_HYPERV_ONLINE > immediately after CPUHP_AP_ONLINE_IDLE (similar to > CPUHP_AP_KVM_ONLINE) > and before CPUHP_AP_IRQ_AFFINITY_ONLINE. Use this new state for > Hyper-V initialization so that hyperv_pcpu_input_arg is allocated > early enough. > > Fix the offlining problem by not freeing hyperv_pcpu_input_arg when > a CPU goes offline. Retain the allocated memory, and reuse it if > the CPU comes back online later. > > Signed-off-by: Michael Kelley <mikelley@microsoft.com> > --- > > Changes in v2: > * Put CPUHP_AP_HYPERV_ONLINE before CPUHP_AP_KVM_ONLINE [Vitaly > Kuznetsov] Reviewed-by: Dexuan Cui <decui@microsoft.com>
On Tue, May 23, 2023 at 10:14:21AM -0700, Michael Kelley wrote: [...] > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > index 0f1001d..3ceb9df 100644 > --- a/include/linux/cpuhotplug.h > +++ b/include/linux/cpuhotplug.h > @@ -200,6 +200,7 @@ enum cpuhp_state { > > /* Online section invoked on the hotplugged CPU from the hotplug thread */ > CPUHP_AP_ONLINE_IDLE, > + CPUHP_AP_HYPERV_ONLINE, x86 maintainers, are you okay with this? Thanks, Wei. > CPUHP_AP_KVM_ONLINE, > CPUHP_AP_SCHED_WAIT_EMPTY, > CPUHP_AP_SMPBOOT_THREADS, > -- > 1.8.3.1 >
Michael Kelley <mikelley@microsoft.com> writes: > These commits > > a494aef23dfc ("PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg") > 2c6ba4216844 ("PCI: hv: Enable PCI pass-thru devices in Confidential VMs") > > update the Hyper-V virtual PCI driver to use the hyperv_pcpu_input_arg > because that memory will be correctly marked as decrypted or encrypted > for all VM types (CoCo or normal). But problems ensue when CPUs in the > VM go online or offline after virtual PCI devices have been configured. > > When a CPU is brought online, the hyperv_pcpu_input_arg for that CPU is > initialized by hv_cpu_init() running under state CPUHP_AP_ONLINE_DYN. > But this state occurs after state CPUHP_AP_IRQ_AFFINITY_ONLINE, which > may call the virtual PCI driver and fault trying to use the as yet > uninitialized hyperv_pcpu_input_arg. A similar problem occurs in a CoCo > VM if the MMIO read and write hypercalls are used from state > CPUHP_AP_IRQ_AFFINITY_ONLINE. > > When a CPU is taken offline, IRQs may be reassigned in state > CPUHP_TEARDOWN_CPU. Again, the virtual PCI driver may fault trying to > use the hyperv_pcpu_input_arg that has already been freed by a > higher state. > > Fix the onlining problem by adding state CPUHP_AP_HYPERV_ONLINE > immediately after CPUHP_AP_ONLINE_IDLE (similar to CPUHP_AP_KVM_ONLINE) > and before CPUHP_AP_IRQ_AFFINITY_ONLINE. Use this new state for > Hyper-V initialization so that hyperv_pcpu_input_arg is allocated > early enough. > > Fix the offlining problem by not freeing hyperv_pcpu_input_arg when > a CPU goes offline. Retain the allocated memory, and reuse it if > the CPU comes back online later. > > Signed-off-by: Michael Kelley <mikelley@microsoft.com> Very minor nitpick: I would've changed the Subject of the patch to something a bit more meaningful, e.g. "x86/hyperv: Avoid freeing per-CPU hyperv_pcpu_{input,output}_arg upon CPU offlining" or something like that. > --- > > Changes in v2: > * Put CPUHP_AP_HYPERV_ONLINE before CPUHP_AP_KVM_ONLINE [Vitaly > Kuznetsov] > > arch/x86/hyperv/hv_init.c | 2 +- > drivers/hv/hv_common.c | 48 +++++++++++++++++++++++----------------------- > include/linux/cpuhotplug.h | 1 + > 3 files changed, 26 insertions(+), 25 deletions(-) > > diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c > index a5f9474..6c04b52 100644 > --- a/arch/x86/hyperv/hv_init.c > +++ b/arch/x86/hyperv/hv_init.c > @@ -416,7 +416,7 @@ void __init hyperv_init(void) > goto free_vp_assist_page; > } > > - cpuhp = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online", > + cpuhp = cpuhp_setup_state(CPUHP_AP_HYPERV_ONLINE, "x86/hyperv_init:online", > hv_cpu_init, hv_cpu_die); > if (cpuhp < 0) > goto free_ghcb_page; > diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c > index 64f9cec..542a1d5 100644 > --- a/drivers/hv/hv_common.c > +++ b/drivers/hv/hv_common.c > @@ -364,13 +364,20 @@ int hv_common_cpu_init(unsigned int cpu) > flags = irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL; > > inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg); > - *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags); > - if (!(*inputarg)) > - return -ENOMEM; > > - if (hv_root_partition) { > - outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg); > - *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE; > + /* > + * hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory is already > + * allocated if this CPU was previously online and then taken offline > + */ > + if (!*inputarg) { > + *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags); Nitpick: I see this was in the original code but we could probably switch to kcalloc() here. > + if (!(*inputarg)) > + return -ENOMEM; > + > + if (hv_root_partition) { > + outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg); > + *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE; > + } > } > > msr_vp_index = hv_get_register(HV_REGISTER_VP_INDEX); > @@ -385,24 +392,17 @@ int hv_common_cpu_init(unsigned int cpu) > > int hv_common_cpu_die(unsigned int cpu) > { > - unsigned long flags; > - void **inputarg, **outputarg; > - void *mem; > - > - local_irq_save(flags); > - > - inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg); > - mem = *inputarg; > - *inputarg = NULL; > - > - if (hv_root_partition) { > - outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg); > - *outputarg = NULL; > - } > - > - local_irq_restore(flags); > - > - kfree(mem); > + /* > + * The hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory > + * is not freed when the CPU goes offline as the hyperv_pcpu_input_arg > + * may be used by the Hyper-V vPCI driver in reassigning interrupts > + * as part of the offlining process. The interrupt reassignment > + * happens *after* the CPUHP_AP_HYPERV_ONLINE state has run and > + * called this function. > + * > + * If a previously offlined CPU is brought back online again, the > + * originally allocated memory is reused in hv_common_cpu_init(). > + */ > > return 0; > } > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > index 0f1001d..3ceb9df 100644 > --- a/include/linux/cpuhotplug.h > +++ b/include/linux/cpuhotplug.h > @@ -200,6 +200,7 @@ enum cpuhp_state { > > /* Online section invoked on the hotplugged CPU from the hotplug thread */ > CPUHP_AP_ONLINE_IDLE, > + CPUHP_AP_HYPERV_ONLINE, > CPUHP_AP_KVM_ONLINE, > CPUHP_AP_SCHED_WAIT_EMPTY, > CPUHP_AP_SMPBOOT_THREADS, Reviewed-by: Vitaly Kuznetsov <vkuznets@redhat.com>
From: Wei Liu <wei.liu@kernel.org> > > On Tue, May 23, 2023 at 10:14:21AM -0700, Michael Kelley wrote: > [...] > > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > > index 0f1001d..3ceb9df 100644 > > --- a/include/linux/cpuhotplug.h > > +++ b/include/linux/cpuhotplug.h > > @@ -200,6 +200,7 @@ enum cpuhp_state { > > > > /* Online section invoked on the hotplugged CPU from the hotplug thread */ > > CPUHP_AP_ONLINE_IDLE, > > + CPUHP_AP_HYPERV_ONLINE, > > x86 maintainers, are you okay with this? > Boris -- Are you OK with this, and could give an ACK? This small patch set fixes a problem introduced into 6.4-rc1 by other Confidential VM changes, so this fix needs to be incorporated before 6.4 is released. Michael > > CPUHP_AP_KVM_ONLINE, > > CPUHP_AP_SCHED_WAIT_EMPTY, > > CPUHP_AP_SMPBOOT_THREADS, > > -- > > 1.8.3.1 > >
> From: Michael Kelley (LINUX) <mikelley@microsoft.com> > Sent: Thursday, June 8, 2023 7:39 AM > To: bp@alien8.de > > From: Wei Liu <wei.liu@kernel.org> > > > > On Tue, May 23, 2023 at 10:14:21AM -0700, Michael Kelley wrote: > > [...] > > > diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h > > > index 0f1001d..3ceb9df 100644 > > > --- a/include/linux/cpuhotplug.h > > > +++ b/include/linux/cpuhotplug.h > > > @@ -200,6 +200,7 @@ enum cpuhp_state { > > > > > > /* Online section invoked on the hotplugged CPU from the hotplug > thread */ > > > CPUHP_AP_ONLINE_IDLE, > > > + CPUHP_AP_HYPERV_ONLINE, > > > > x86 maintainers, are you okay with this? > > Boris -- Are you OK with this, and could give an ACK? This small patch > set fixes a problem introduced into 6.4-rc1 by other Confidential VM > changes, so this fix needs to be incorporated before 6.4 is released. > > Michael Hi Boris, gentle ping -- I hope this patch can be accepted soon as I and Tianyu need to rebase our SNP/TDX patches on this patch. > > > CPUHP_AP_KVM_ONLINE, > > > CPUHP_AP_SCHED_WAIT_EMPTY, > > > CPUHP_AP_SMPBOOT_THREADS, > > > -- > > > 1.8.3.1 > > >
On Tue, May 23, 2023 at 10:14:21AM -0700, Michael Kelley wrote: > These commits > > a494aef23dfc ("PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg") > 2c6ba4216844 ("PCI: hv: Enable PCI pass-thru devices in Confidential VMs") > > update the Hyper-V virtual PCI driver to use the hyperv_pcpu_input_arg > because that memory will be correctly marked as decrypted or encrypted > for all VM types (CoCo or normal). But problems ensue when CPUs in the > VM go online or offline after virtual PCI devices have been configured. > > When a CPU is brought online, the hyperv_pcpu_input_arg for that CPU is > initialized by hv_cpu_init() running under state CPUHP_AP_ONLINE_DYN. > But this state occurs after state CPUHP_AP_IRQ_AFFINITY_ONLINE, which > may call the virtual PCI driver and fault trying to use the as yet > uninitialized hyperv_pcpu_input_arg. A similar problem occurs in a CoCo > VM if the MMIO read and write hypercalls are used from state > CPUHP_AP_IRQ_AFFINITY_ONLINE. > > When a CPU is taken offline, IRQs may be reassigned in state > CPUHP_TEARDOWN_CPU. Again, the virtual PCI driver may fault trying to > use the hyperv_pcpu_input_arg that has already been freed by a > higher state. > > Fix the onlining problem by adding state CPUHP_AP_HYPERV_ONLINE > immediately after CPUHP_AP_ONLINE_IDLE (similar to CPUHP_AP_KVM_ONLINE) > and before CPUHP_AP_IRQ_AFFINITY_ONLINE. Use this new state for > Hyper-V initialization so that hyperv_pcpu_input_arg is allocated > early enough. > > Fix the offlining problem by not freeing hyperv_pcpu_input_arg when > a CPU goes offline. Retain the allocated memory, and reuse it if > the CPU comes back online later. > > Signed-off-by: Michael Kelley <mikelley@microsoft.com> > --- > > Changes in v2: > * Put CPUHP_AP_HYPERV_ONLINE before CPUHP_AP_KVM_ONLINE [Vitaly > Kuznetsov] > > arch/x86/hyperv/hv_init.c | 2 +- > drivers/hv/hv_common.c | 48 +++++++++++++++++++++++----------------------- > include/linux/cpuhotplug.h | 1 + > 3 files changed, 26 insertions(+), 25 deletions(-) Acked-by: Borislav Petkov (AMD) <bp@alien8.de>
diff --git a/arch/x86/hyperv/hv_init.c b/arch/x86/hyperv/hv_init.c index a5f9474..6c04b52 100644 --- a/arch/x86/hyperv/hv_init.c +++ b/arch/x86/hyperv/hv_init.c @@ -416,7 +416,7 @@ void __init hyperv_init(void) goto free_vp_assist_page; } - cpuhp = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "x86/hyperv_init:online", + cpuhp = cpuhp_setup_state(CPUHP_AP_HYPERV_ONLINE, "x86/hyperv_init:online", hv_cpu_init, hv_cpu_die); if (cpuhp < 0) goto free_ghcb_page; diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c index 64f9cec..542a1d5 100644 --- a/drivers/hv/hv_common.c +++ b/drivers/hv/hv_common.c @@ -364,13 +364,20 @@ int hv_common_cpu_init(unsigned int cpu) flags = irqs_disabled() ? GFP_ATOMIC : GFP_KERNEL; inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg); - *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags); - if (!(*inputarg)) - return -ENOMEM; - if (hv_root_partition) { - outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg); - *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE; + /* + * hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory is already + * allocated if this CPU was previously online and then taken offline + */ + if (!*inputarg) { + *inputarg = kmalloc(pgcount * HV_HYP_PAGE_SIZE, flags); + if (!(*inputarg)) + return -ENOMEM; + + if (hv_root_partition) { + outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg); + *outputarg = (char *)(*inputarg) + HV_HYP_PAGE_SIZE; + } } msr_vp_index = hv_get_register(HV_REGISTER_VP_INDEX); @@ -385,24 +392,17 @@ int hv_common_cpu_init(unsigned int cpu) int hv_common_cpu_die(unsigned int cpu) { - unsigned long flags; - void **inputarg, **outputarg; - void *mem; - - local_irq_save(flags); - - inputarg = (void **)this_cpu_ptr(hyperv_pcpu_input_arg); - mem = *inputarg; - *inputarg = NULL; - - if (hv_root_partition) { - outputarg = (void **)this_cpu_ptr(hyperv_pcpu_output_arg); - *outputarg = NULL; - } - - local_irq_restore(flags); - - kfree(mem); + /* + * The hyperv_pcpu_input_arg and hyperv_pcpu_output_arg memory + * is not freed when the CPU goes offline as the hyperv_pcpu_input_arg + * may be used by the Hyper-V vPCI driver in reassigning interrupts + * as part of the offlining process. The interrupt reassignment + * happens *after* the CPUHP_AP_HYPERV_ONLINE state has run and + * called this function. + * + * If a previously offlined CPU is brought back online again, the + * originally allocated memory is reused in hv_common_cpu_init(). + */ return 0; } diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h index 0f1001d..3ceb9df 100644 --- a/include/linux/cpuhotplug.h +++ b/include/linux/cpuhotplug.h @@ -200,6 +200,7 @@ enum cpuhp_state { /* Online section invoked on the hotplugged CPU from the hotplug thread */ CPUHP_AP_ONLINE_IDLE, + CPUHP_AP_HYPERV_ONLINE, CPUHP_AP_KVM_ONLINE, CPUHP_AP_SCHED_WAIT_EMPTY, CPUHP_AP_SMPBOOT_THREADS,