Message ID | 20240229025759.1187910-1-stevensd@google.com |
---|---|
Headers |
From: David Stevens <stevensd@chromium.org>
To: Sean Christopherson <seanjc@google.com>, Paolo Bonzini <pbonzini@redhat.com>
Cc: Yu Zhang <yu.c.zhang@linux.intel.com>, Isaku Yamahata <isaku.yamahata@gmail.com>, Zhi Wang <zhi.wang.linux@gmail.com>, Maxim Levitsky <mlevitsk@redhat.com>, kvmarm@lists.linux.dev, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, David Stevens <stevensd@chromium.org>
Subject: [PATCH v11 0/8] KVM: allow mapping non-refcounted pages
Date: Thu, 29 Feb 2024 11:57:51 +0900
Message-ID: <20240229025759.1187910-1-stevensd@google.com> |
Series |
KVM: allow mapping non-refcounted pages
|
|
Message
David Stevens
Feb. 29, 2024, 2:57 a.m. UTC
From: David Stevens <stevensd@chromium.org>
This patch series adds support for mapping VM_IO and VM_PFNMAP memory
that is backed by struct pages that aren't currently being refcounted
(e.g. tail pages of non-compound higher order allocations) into the
guest.
Our use case is virtio-gpu blob resources [1], which directly map host
graphics buffers into the guest as "vram" for the virtio-gpu device.
This feature currently does not work on systems using the amdgpu driver,
as that driver allocates non-compound higher order pages via
ttm_pool_alloc_page().
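As a rough illustration of why such pages are a problem, the following userspace sketch models a non-compound higher order allocation in which only the first page carries a reference. All names here (`sketch_page`, `sketch_alloc_non_compound`, `sketch_get_page_unless_zero`) are hypothetical stand-ins rather than kernel code, and the assumption that tail pages sit at a refcount of zero is a simplification for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page; only the refcount is modelled. */
struct sketch_page {
    int refcount;
};

/*
 * Model an order-N non-compound allocation: the allocator hands out
 * 2^order contiguous pages, but only the first one holds a reference.
 * (Assumption for this sketch: tail pages are left at refcount 0.)
 */
static void sketch_alloc_non_compound(struct sketch_page *pages,
                                      unsigned int order)
{
    unsigned int n = 1u << order;

    for (unsigned int i = 0; i < n; i++)
        pages[i].refcount = (i == 0) ? 1 : 0;
}

/*
 * "Take a reference unless the count is already zero". A caller that
 * relies on this to pin a page cannot pin the tail pages above.
 */
static bool sketch_get_page_unless_zero(struct sketch_page *page)
{
    if (page->refcount == 0)
        return false;
    page->refcount++;
    return true;
}
```

In this model, taking a reference on the first page succeeds while the same attempt on any tail page fails, which is the situation a caller that insists on refcounted pages cannot handle.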
First, this series replaces the gfn_to_pfn_memslot() API with a more
extensible kvm_follow_pfn() API. The updated API rearranges
gfn_to_pfn_memslot()'s args into a struct and where possible packs the
bool arguments into a FOLL_ flags argument. The refactoring changes do
not change any behavior.
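The shape of that refactoring can be sketched in plain C as follows. This is only an illustration of the pattern (bool arguments folded into a flags word, remaining arguments gathered into a struct); the struct layout, the `SKETCH_FOLL_*` values, and both function names are assumptions made for the example and are not taken from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for FOLL_ flags; the values are arbitrary. */
#define SKETCH_FOLL_WRITE  (1u << 0)
#define SKETCH_FOLL_NOWAIT (1u << 1)

/* Hypothetical argument struct holding both inputs and outputs. */
struct sketch_follow_pfn {
    uint64_t gfn;        /* in: guest frame number to translate */
    unsigned int flags;  /* in: SKETCH_FOLL_* bits replacing bool args */
    bool writable;       /* out: whether the result may be written */
};

/* Old style: a growing list of positional bool arguments. */
static uint64_t sketch_old_lookup(uint64_t gfn, bool no_wait,
                                  bool write_fault, bool *writable)
{
    (void)no_wait; /* unused in this toy translation */
    if (writable)
        *writable = write_fault;
    return gfn + 0x1000; /* fake translation, for illustration only */
}

/* New style: same behavior, but everything travels in one struct. */
static uint64_t sketch_follow_pfn(struct sketch_follow_pfn *kfp)
{
    kfp->writable = (kfp->flags & SKETCH_FOLL_WRITE) != 0;
    return kfp->gfn + 0x1000; /* same fake translation as above */
}
```

With this layout, a later series can add a field or a flag bit without touching the signature seen by every caller.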
From there, this series extends the kvm_follow_pfn() API so that
non-refcounted pages can be safely handled. This involves adding an
input parameter to indicate whether the caller can safely use
non-refcounted pfns and an output parameter to tell the caller whether
or not the returned page is refcounted. This is a breaking change:
non-refcounted pfn mappings are now disallowed by default, as such
mappings are unsafe. To allow affected systems to continue to function,
an opt-in module parameter is added to allow the unsafe behavior.
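One way to picture the resulting contract is the userspace sketch below. The field and parameter names allow_non_refcounted_struct_page, refcounted_page, and allow_unsafe_mappings come from the changelog in this cover letter; everything else (the `sketch_` types, the return convention, and the exact order of the checks) is an assumption for illustration and may differ from the actual patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page. */
struct sketch_page {
    int refcount;
};

/* Hypothetical subset of the lookup arguments. */
struct sketch_kfp {
    /* in: caller can cope with pages it holds no reference on */
    bool allow_non_refcounted_struct_page;
    /* out: the page a reference was taken on, or NULL if none was */
    struct sketch_page *refcounted_page;
};

/* Stands in for the opt-in module parameter; off by default. */
static bool sketch_allow_unsafe_mappings = false;

/* Returns 0 on success, -1 if the mapping is refused. */
static int sketch_resolve(struct sketch_kfp *kfp, struct sketch_page *page,
                          bool page_is_refcounted)
{
    if (page_is_refcounted) {
        page->refcount++;
        kfp->refcounted_page = page;
        return 0;
    }

    /* No reference was taken, so report that to the caller. */
    kfp->refcounted_page = NULL;

    /* Default policy: refuse unless the caller or the admin opted in. */
    if (!kfp->allow_non_refcounted_struct_page &&
        !sketch_allow_unsafe_mappings)
        return -1;
    return 0;
}
```

A caller that receives a NULL refcounted_page in this model must not drop a reference later, which is why the output parameter matters as much as the input one.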
This series only adds support for non-refcounted pages to x86. Other
MMUs can likely be updated without too much difficulty, but it is not
needed at this point. Updating other parts of KVM (e.g. pfncache) is not
straightforward [2].
[1]
https://patchwork.kernel.org/project/dri-devel/cover/20200814024000.2485-1-gurchetansingh@chromium.org/
[2] https://lore.kernel.org/all/ZBEEQtmtNPaEqU1i@google.com/
v10 -> v11:
- Switch to u64 __read_mostly shadow_refcounted_mask.
- Update comments about allow_non_refcounted_struct_page.
v9 -> v10:
- Re-add FOLL_GET changes.
- Split x86/mmu spte+non-refcount-page patch into two patches.
- Rename 'foll' variables to 'kfp'.
- Properly gate usage of refcount spte bit when it's not available.
- Replace kvm_follow_pfn's is_refcounted_page output parameter with
a struct page *refcounted_page pointing to the page in question.
- Add patch downgrading BUG_ON to WARN_ON_ONCE.
v8 -> v9:
- Make paying attention to is_refcounted_page mandatory. This means
that FOLL_GET is no longer necessary. For compatibility with
un-migrated callers, add a temporary parameter to sidestep
ref-counting issues.
- Add allow_unsafe_mappings, which is a breaking change.
- Migrate kvm_vcpu_map and other callsites used by x86 to the new API.
- Drop arm and ppc changes.
v7 -> v8:
- Set access bits before releasing mmu_lock.
- Pass FOLL_GET on 32-bit x86 or !tdp_enabled.
- Refactor FOLL_GET handling, add kvm_follow_refcounted_pfn helper.
- Set refcounted bit on >4k pages.
- Add comments and apply formatting suggestions.
- Rebase on kvm next branch.
v6 -> v7:
- Replace __gfn_to_pfn_memslot with a more flexible __kvm_faultin_pfn,
and extend that API to support non-refcounted pages (complete
rewrite).
David Stevens (7):
KVM: Relax BUG_ON argument validation
KVM: mmu: Introduce kvm_follow_pfn()
KVM: mmu: Improve handling of non-refcounted pfns
KVM: Migrate kvm_vcpu_map() to kvm_follow_pfn()
KVM: x86: Migrate to kvm_follow_pfn()
KVM: x86/mmu: Track if sptes refer to refcounted pages
KVM: x86/mmu: Handle non-refcounted pages
Sean Christopherson (1):
KVM: Assert that a page's refcount is elevated when marking
accessed/dirty
arch/x86/kvm/mmu/mmu.c | 108 +++++++---
arch/x86/kvm/mmu/mmu_internal.h | 2 +
arch/x86/kvm/mmu/paging_tmpl.h | 7 +-
arch/x86/kvm/mmu/spte.c | 5 +-
arch/x86/kvm/mmu/spte.h | 16 +-
arch/x86/kvm/mmu/tdp_mmu.c | 22 +-
arch/x86/kvm/x86.c | 11 +-
include/linux/kvm_host.h | 58 +++++-
virt/kvm/guest_memfd.c | 8 +-
virt/kvm/kvm_main.c | 345 +++++++++++++++++++-------------
virt/kvm/kvm_mm.h | 3 +-
virt/kvm/pfncache.c | 11 +-
12 files changed, 399 insertions(+), 197 deletions(-)
base-commit: 54be6c6c5ae8e0d93a6c4641cb7528eb0b6ba478
Comments
On Thu, Feb 29, 2024 at 11:57:51AM +0900, David Stevens wrote:
> Our use case is virtio-gpu blob resources [1], which directly map host
> graphics buffers into the guest as "vram" for the virtio-gpu device.
> This feature currently does not work on systems using the amdgpu driver,
> as that driver allocates non-compound higher order pages via
> ttm_pool_alloc_page().
... and just as last time around, that is still the problem that needs to
be fixed, instead of creating a monster like this to map non-refcounted
pages.