From patchwork Tue Nov 29 19:35:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 27459
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton,
 Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song,
 Mike Kravetz, David Hildenbrand
Subject: [PATCH 01/10] mm/hugetlb: Let vma_offset_start() to return start
Date: Tue, 29 Nov 2022 14:35:17 -0500
Message-Id: <20221129193526.3588187-2-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

Despite its name, vma_offset_start() does not return "the start address
of the range"; it returns the offset that has to be added to
vma->vm_start.

Make it return the actual start virtual address instead.  This also
simplifies all the callers, because every use of the return value ends
up adding it to vma->vm_start anyway.
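As a rough illustration of the new return value, here is a minimal
user-space sketch.  It is not kernel code: the two-field struct
vm_area_struct, the pgoff_t typedef and PAGE_SHIFT = 12 are simplified
stand-ins assumed for the example, and only the body of
vma_offset_start() mirrors the patched helper.

```c
/*
 * User-space sketch only.  The definitions below are hypothetical,
 * simplified stand-ins for the kernel ones.
 */
#define PAGE_SHIFT 12			/* assumption: 4 KiB base pages */

typedef unsigned long pgoff_t;

struct vm_area_struct {
	unsigned long vm_start;		/* first virtual address of the vma */
	unsigned long vm_pgoff;		/* file offset of vm_start, in pages */
};

/*
 * Same logic as the patched helper: return the virtual address inside
 * the vma that corresponds to file page 'start', clamped to vm_start
 * when 'start' lies at or before the first page mapped by the vma.
 */
static unsigned long vma_offset_start(struct vm_area_struct *vma, pgoff_t start)
{
	unsigned long offset = 0;

	if (vma->vm_pgoff < start)
		offset = (start - vma->vm_pgoff) << PAGE_SHIFT;

	return vma->vm_start + offset;
}
```

With vm_start = 0x40000000 and vm_pgoff = 16, a start of 18 gives
0x40002000 (two pages into the vma), while a start of 4 gives vm_start
itself.  Callers can therefore pass the result straight to
unmap_hugepage_range() without adding vma->vm_start again.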

Reviewed-by: Mike Kravetz
Signed-off-by: Peter Xu
Reviewed-by: David Hildenbrand
---
 fs/hugetlbfs/inode.c | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 790d2727141a..fdb16246f46e 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -412,10 +412,12 @@ static bool hugetlb_vma_maps_page(struct vm_area_struct *vma,
  */
 static unsigned long vma_offset_start(struct vm_area_struct *vma, pgoff_t start)
 {
+	unsigned long offset = 0;
+
 	if (vma->vm_pgoff < start)
-		return (start - vma->vm_pgoff) << PAGE_SHIFT;
-	else
-		return 0;
+		offset = (start - vma->vm_pgoff) << PAGE_SHIFT;
+
+	return vma->vm_start + offset;
 }
 
 static unsigned long vma_offset_end(struct vm_area_struct *vma, pgoff_t end)
@@ -457,7 +459,7 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
 
-		if (!hugetlb_vma_maps_page(vma, vma->vm_start + v_start, page))
+		if (!hugetlb_vma_maps_page(vma, v_start, page))
 			continue;
 
 		if (!hugetlb_vma_trylock_write(vma)) {
@@ -473,8 +475,8 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 			break;
 		}
 
-		unmap_hugepage_range(vma, vma->vm_start + v_start, v_end,
-				NULL, ZAP_FLAG_DROP_MARKER);
+		unmap_hugepage_range(vma, v_start, v_end, NULL,
+				     ZAP_FLAG_DROP_MARKER);
 		hugetlb_vma_unlock_write(vma);
 	}
 
@@ -507,10 +509,9 @@ static void hugetlb_unmap_file_folio(struct hstate *h,
 		 */
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
-		if (hugetlb_vma_maps_page(vma, vma->vm_start + v_start, page))
-			unmap_hugepage_range(vma, vma->vm_start + v_start,
-						v_end, NULL,
-						ZAP_FLAG_DROP_MARKER);
+		if (hugetlb_vma_maps_page(vma, v_start, page))
+			unmap_hugepage_range(vma, v_start, v_end, NULL,
+					     ZAP_FLAG_DROP_MARKER);
 
 		kref_put(&vma_lock->refs, hugetlb_vma_lock_release);
 		hugetlb_vma_unlock_write(vma);
@@ -540,8 +541,7 @@ hugetlb_vmdelete_list(struct rb_root_cached *root, pgoff_t start, pgoff_t end,
 		v_start = vma_offset_start(vma, start);
 		v_end = vma_offset_end(vma, end);
 
-		unmap_hugepage_range(vma, vma->vm_start + v_start, v_end,
-				     NULL, zap_flags);
+		unmap_hugepage_range(vma, v_start, v_end, NULL, zap_flags);
 
 		/*
 		 * Note that vma lock only exists for shared/non-private

From patchwork Tue Nov 29 19:35:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 27460
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton,
 Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song,
 Mike Kravetz, David Hildenbrand
Subject: [PATCH 02/10] mm/hugetlb: Don't wait for migration entry during follow page
Date: Tue, 29 Nov 2022 14:35:18 -0500
Message-Id: <20221129193526.3588187-3-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

That is what the code already does for !hugetlb pages, so logically
hugetlb should do the same: a migration entry is treated as no page.

This is probably also the last piece of the follow_page code that may
sleep; the previous one should have been removed in cf994dd8af27
("mm/gup: remove FOLL_MIGRATION", 2022-11-16).

Signed-off-by: Peter Xu
Reviewed-by: Mike Kravetz
Reviewed-by: David Hildenbrand
---
 mm/hugetlb.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 9d97c9a2a15d..dfe677fadaf8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6234,7 +6234,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	if (WARN_ON_ONCE(flags & FOLL_PIN))
 		return NULL;
 
-retry:
 	pte = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (!pte)
 		return NULL;
@@ -6257,16 +6256,6 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 			page = NULL;
 			goto out;
 		}
-	} else {
-		if (is_hugetlb_entry_migration(entry)) {
-			spin_unlock(ptl);
-			__migration_entry_wait_huge(pte, ptl);
-			goto retry;
-		}
-		/*
-		 * hwpoisoned entry is treated as no_page_table in
-		 * follow_page_mask().
-		 */
 	}
 out:
 	spin_unlock(ptl);

From patchwork Tue Nov 29 19:35:19 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 27461
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton,
 Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song,
 Mike Kravetz, David Hildenbrand
Subject: [PATCH 03/10] mm/hugetlb: Document huge_pte_offset usage
Date: Tue, 29 Nov 2022 14:35:19 -0500
Message-Id: <20221129193526.3588187-4-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

huge_pte_offset() is potentially a pgtable walker, looking up pte_t* for
a hugetlb address.
Normally, it's always safe to walk a generic pgtable as long as we're with the mmap lock held for either read or write, because that guarantees the pgtable pages will always be valid during the process. But it's not true for hugetlbfs, especially shared: hugetlbfs can have its pgtable freed by pmd unsharing, it means that even with mmap lock held for current mm, the PMD pgtable page can still go away from under us if pmd unsharing is possible during the walk. So we have two ways to make it safe even for a shared mapping: (1) If we're with the hugetlb vma lock held for either read/write, it's okay because pmd unshare cannot happen at all. (2) If we're with the i_mmap_rwsem lock held for either read/write, it's okay because even if pmd unshare can happen, the pgtable page cannot be freed from under us. Document it. Signed-off-by: Peter Xu --- include/linux/hugetlb.h | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+) diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 551834cd5299..81efd9b9baa2 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -192,6 +192,38 @@ extern struct list_head huge_boot_pages; pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz); +/* + * huge_pte_offset(): Walk the hugetlb pgtable until the last level PTE. + * Returns the pte_t* if found, or NULL if the address is not mapped. + * + * Since this function will walk all the pgtable pages (including not only + * high-level pgtable page, but also PUD entry that can be unshared + * concurrently for VM_SHARED), the caller of this function should be + * responsible of its thread safety. One can follow this rule: + * + * (1) For private mappings: pmd unsharing is not possible, so it'll + * always be safe if we're with the mmap sem for either read or write. + * This is normally always the case, IOW we don't need to do anything + * special. 
+ * + * (2) For shared mappings: pmd unsharing is possible (so the PUD-ranged + * pgtable page can go away from under us! It can be done by a pmd + * unshare with a follow up munmap() on the other process), then we + * need either: + * + * (2.1) hugetlb vma lock read or write held, to make sure pmd unshare + * won't happen upon the range (it also makes sure the pte_t we + * read is the right and stable one), or, + * + * (2.2) hugetlb mapping i_mmap_rwsem lock held read or write, to make + * sure even if unshare happened the racy unmap() will wait until + * i_mmap_rwsem is released. + * + * Option (2.1) is the safest, which guarantees pte stability from pmd + * sharing pov, until the vma lock released. Option (2.2) doesn't protect + * a concurrent pmd unshare, but it makes sure the pgtable page is safe to + * access. + */ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long sz); unsigned long hugetlb_mask_last_page(struct hstate *h); From patchwork Tue Nov 29 19:35:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 27462 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp535321wrr; Tue, 29 Nov 2022 11:48:20 -0800 (PST) X-Google-Smtp-Source: AA0mqf4W3N1r0Q/oao8gFrXGFGciBlvUkmSoTFupCK5Ti/fOqkcdLaOosnfEKeaREzauy22MsEbz X-Received: by 2002:aa7:d85a:0:b0:46b:81a8:1ff6 with SMTP id f26-20020aa7d85a000000b0046b81a81ff6mr4162241eds.174.1669751300816; Tue, 29 Nov 2022 11:48:20 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669751300; cv=none; d=google.com; s=arc-20160816; b=OKXBfS7c8oFjzaWgy8/8xEdIV8eV6d3yWvDgIozP2Ny1aLXap7j6ZeuRkPZ3Pz7Bfs kmiXN1ga9MNmg96ZIB1MQkI2qinpiUb2P/NPObtCBoiRgekiOjLbrgd2q02y2lHbz24a baFnsBIImdey+YaWTowUzlC5yNEhp2oXEWWuAntCw06+rtgNze3908TDQYsUlYKKBvuz pCQ+fCleEq6VoRNqoqgkcqfHOISSGhnF5nfJst5a2utIlQjHP2rddJBIUVSSxddJoQ9x 
n8rM/0YfeK7ajJpQmRP5a+5OMIuKKLq64Ov/7XKJ4m28PUHW0gT9I02Oz424Dy4QBvb8 d9oQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Oh9DqPSDqDrKnK09kN0HQudmWQU6fy0ZbcaOAGSajb0=; b=XgsEiaCLZr++mJwzEfKcQz2Z3NS1YlD2eo2kOTAPPYTPfVs9whZpoCYiH/cAlg7QXh MpBaF1AdIvuByoJEucB0ZlXTXU0kQNf3Y/UM1EBwjzRLHTMfpMMqkB4D5CPCoLma6dR6 jZ86jdUTElmpvI8Yt/4OS1AE53RaMOGeeSNBY2UGLiHafiv/4k1Fr5fHQafZBIe4DzPH JLp642Peq1gq1ju8NwYZF2O/ibRhDEt0lQ0f6mkuYvGt3n8RDa6pwAVJJ5VjlYb3FwrL s52XJld0KlxY5/D9ly0uUzDVvgcFhRLVhPDySmSy6A8+8SEF3aVNRbXEIOYn/rNfZ2iw VMqA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=PgzLfkJN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id a4-20020a05640213c400b0046314dd20a0si10949684edx.3.2022.11.29.11.47.57; Tue, 29 Nov 2022 11:48:20 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=PgzLfkJN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237058AbiK2TiH (ORCPT + 99 others); Tue, 29 Nov 2022 14:38:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49272 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237106AbiK2ThH (ORCPT ); Tue, 29 Nov 2022 14:37:07 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 94DD132070 for ; Tue, 29 Nov 2022 11:35:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669750538; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Oh9DqPSDqDrKnK09kN0HQudmWQU6fy0ZbcaOAGSajb0=; b=PgzLfkJNxZ5d9+Yym5+DzY7RUvi8UjALwBt/Kotvz/PUfG98cfxnki0sKSd2iXeD5IORqu 215/yP8l8aviOV1tEGUsFJFZ5KhbsGbmUQw5gxQ/gotCFDpooWrIPr234jOxs8egttJDM0 DcbbkFt4uWMMgx6hgZ1vus1t6cUdL/A= Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id 
us-mta-275-BzWUi0sAMuaB9OIsoZw15A-1; Tue, 29 Nov 2022 14:35:37 -0500 X-MC-Unique: BzWUi0sAMuaB9OIsoZw15A-1 Received: by mail-qk1-f198.google.com with SMTP id i17-20020a05620a249100b006fa2e10a2ecso31818911qkn.16 for ; Tue, 29 Nov 2022 11:35:37 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Oh9DqPSDqDrKnK09kN0HQudmWQU6fy0ZbcaOAGSajb0=; b=fYsesGuO3X6DKk+bZXR1ph/Oc+rQBzHlbkIVrHdqTKpcmk2z6wu2g+HuBVahdSNrEv tjv/jvB8aLVlI3D0wI3pQ6fNZ4wz8CujnyYRovVdMF17drIyl8sitaMsroe2TSAbhT5U 0hs1Uxqwqxpp2+JaL0wirplPOorfgSNpaAjr4IgYpnxqIkBCc/SRD24RQ9fvShYq/nBu X9Nz66wQumk/HmsT5oz+aRseO9kG3rNw6zKQT3Ly8WhscejviAGZ/c27YKWZYCn8Hdxr +RYEdtFC3sTPr+1R/5TkO8tdc2kimyLVuYc8StFkkS9uKJpjOl82eF/kS8ndFlxMfVRY +Ftg== X-Gm-Message-State: ANoB5pnS4jbEklCd/VBsTAII3kKZjv0WaMtPsZnGrddD3MO2uBYI1THa r0znETUnD5QhO1YknmTCsGbh13SwxKagC0xJVsz9HJtw7znShDXwG96H7xTHQ8ADzJZS0j/HYwS Md5N+355JmuIuGYj6nY+DxSIL X-Received: by 2002:a05:620a:a07:b0:6fa:438d:c86f with SMTP id i7-20020a05620a0a0700b006fa438dc86fmr51165974qka.712.1669750536642; Tue, 29 Nov 2022 11:35:36 -0800 (PST) X-Received: by 2002:a05:620a:a07:b0:6fa:438d:c86f with SMTP id i7-20020a05620a0a0700b006fa438dc86fmr51165954qka.712.1669750536397; Tue, 29 Nov 2022 11:35:36 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. 
[70.31.27.79]) by smtp.gmail.com with ESMTPSA id n1-20020a05620a294100b006fa16fe93bbsm11313013qkp.15.2022.11.29.11.35.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Nov 2022 11:35:36 -0800 (PST)
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton, Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 04/10] mm/hugetlb: Move swap entry handling into vma lock when faulted
Date: Tue, 29 Nov 2022 14:35:20 -0500
Message-Id: <20221129193526.3588187-5-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>
MIME-Version: 1.0
Content-type: text/plain
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1750861140251992437?=
X-GMAIL-MSGID: =?utf-8?q?1750861140251992437?=

In hugetlb_fault(), there used to be a special path to handle swap
entries at the entrance using huge_pte_offset(). That's unsafe because
huge_pte_offset() for a pmd sharable range can access freed pgtables if
there is no lock to protect the pgtable from being freed after pmd
unshare.

Here the simplest solution to make it safe is to move the swap handling
to after the vma lock is taken. We may now need to take the fault mutex
on either migration or hwpoison entries (also the vma lock, but that's
really needed); however, neither of them is a hot path.

Note that the vma lock cannot be released in hugetlb_fault() when the
migration entry is detected, because in migration_entry_wait_huge() the
pgtable page will be used again (by taking the pgtable lock), so that
also needs to be protected by the vma lock. Modify
migration_entry_wait_huge() so that it must be called with the vma read
lock held, and properly release the lock in
__migration_entry_wait_huge().

Signed-off-by: Peter Xu
Reviewed-by: Mike Kravetz
---
 include/linux/swapops.h |  6 ++++--
 mm/hugetlb.c            | 32 +++++++++++++++-----------------
 mm/migrate.c            | 25 +++++++++++++++++++++----
 3 files changed, 40 insertions(+), 23 deletions(-)

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 27ade4f22abb..09b22b169a71 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -335,7 +335,8 @@ extern void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
 extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 					unsigned long address);
 #ifdef CONFIG_HUGETLB_PAGE
-extern void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl);
+extern void __migration_entry_wait_huge(struct vm_area_struct *vma,
+					pte_t *ptep, spinlock_t *ptl);
 extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte);
 #endif /* CONFIG_HUGETLB_PAGE */
 #else /* CONFIG_MIGRATION */
@@ -364,7 +365,8 @@ static inline void __migration_entry_wait(struct mm_struct *mm, pte_t *ptep,
 static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 					 unsigned long address) { }
 #ifdef CONFIG_HUGETLB_PAGE
-static inline void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl) { }
+static inline void __migration_entry_wait_huge(struct vm_area_struct *vma,
+					       pte_t *ptep, spinlock_t *ptl) { }
 static inline void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { }
 #endif /* CONFIG_HUGETLB_PAGE */
 static inline int is_writable_migration_entry(swp_entry_t entry)
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index dfe677fadaf8..776e34ccf029 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -5826,22 +5826,6 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	int need_wait_lock = 0;
 	unsigned long haddr = address & huge_page_mask(h);
 
-	ptep = huge_pte_offset(mm, haddr, huge_page_size(h));
-	if (ptep) {
-		/*
-		 * Since we hold no locks, ptep could be stale. That is
-		 * OK as we are only making decisions based on content and
-		 * not actually modifying content here.
-		 */
-		entry = huge_ptep_get(ptep);
-		if (unlikely(is_hugetlb_entry_migration(entry))) {
-			migration_entry_wait_huge(vma, ptep);
-			return 0;
-		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
-			return VM_FAULT_HWPOISON_LARGE |
-				VM_FAULT_SET_HINDEX(hstate_index(h));
-	}
-
 	/*
 	 * Serialize hugepage allocation and instantiation, so that we don't
 	 * get spurious allocation failures if two CPUs race to instantiate
@@ -5888,8 +5872,22 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * fault, and is_hugetlb_entry_(migration|hwpoisoned) check will
 	 * properly handle it.
 	 */
-	if (!pte_present(entry))
+	if (!pte_present(entry)) {
+		if (unlikely(is_hugetlb_entry_migration(entry))) {
+			/*
+			 * Release fault lock first because the vma lock is
+			 * needed to guard the huge_pte_lockptr() later in
+			 * migration_entry_wait_huge(). The vma lock will
+			 * be released there.
+			 */
+			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+			migration_entry_wait_huge(vma, ptep);
+			return 0;
+		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
+			ret = VM_FAULT_HWPOISON_LARGE |
+			    VM_FAULT_SET_HINDEX(hstate_index(h));
 		goto out_mutex;
+	}
 
 	/*
 	 * If we are going to COW/unshare the mapping later, we examine the
diff --git a/mm/migrate.c b/mm/migrate.c
index 267ad0d073ae..c13c828d34f3 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -326,24 +326,41 @@ void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 }
 
 #ifdef CONFIG_HUGETLB_PAGE
-void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl)
+void __migration_entry_wait_huge(struct vm_area_struct *vma,
+				 pte_t *ptep, spinlock_t *ptl)
 {
 	pte_t pte;
 
+	/*
+	 * The vma read lock must be taken, which will be released before
+	 * the function returns. It makes sure the pgtable page (along
+	 * with its spin lock) not be freed in parallel.
+	 */
+	hugetlb_vma_assert_locked(vma);
+
 	spin_lock(ptl);
 	pte = huge_ptep_get(ptep);
 
-	if (unlikely(!is_hugetlb_entry_migration(pte)))
+	if (unlikely(!is_hugetlb_entry_migration(pte))) {
 		spin_unlock(ptl);
-	else
+		hugetlb_vma_unlock_read(vma);
+	} else {
+		/*
+		 * If migration entry existed, safe to release vma lock
+		 * here because the pgtable page won't be freed without the
+		 * pgtable lock released. See comment right above pgtable
+		 * lock release in migration_entry_wait_on_locked().
+		 */
+		hugetlb_vma_unlock_read(vma);
 		migration_entry_wait_on_locked(pte_to_swp_entry(pte), NULL, ptl);
+	}
 }
 
 void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte)
 {
 	spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), vma->vm_mm, pte);
 
-	__migration_entry_wait_huge(pte, ptl);
+	__migration_entry_wait_huge(vma, pte, ptl);
 }
 
 #endif

From patchwork Tue Nov 29 19:35:21 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 27463
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp535374wrr; Tue, 29 Nov 2022 11:48:27 -0800 (PST)
X-Google-Smtp-Source: AA0mqf5mWmA7IWe9EvI2Aa8LD/ygdzsNrXCHN7PFG0rDR9VbI1N2BfhgcEuqE7qpKcidl19ac7a6
X-Received: by 2002:a17:906:a242:b0:7c0:8889:92b with SMTP id bi2-20020a170906a24200b007c08889092bmr3740701ejb.439.1669751307410; Tue, 29 Nov 2022 11:48:27 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1669751307; cv=none; d=google.com; s=arc-20160816; b=kKh37FbZZEzoc8JBUmFNTa//Y5MoWIf/zG/y0AyX3TSqUqegejBnKQFy7rYEllbY9h YoJBqIbSqnWSSYR4K+9K5UX/SXUJzrlcJ2NiDZlp1MRNIX1vscP4o51M7/7Ft9+D9tom TNEKypbWr5m6qpoXgdVEXsvYL+LcnhQi5XFU6EPQSr/lBF3jouszev5FljrsWPaxoJVO igMPwno6XPKCJvhktv2e6iaf1ZjCfz9/qmZFgeytAcWbAt5xCCQ6kxl3n3jV5Tu816Oo aVZxTghnPSMjFItMAw1K6fKTMcsbkTVkF2mJykn5i0kVIzMsV7471Foyr6nreDIMDqRw ILFw==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=U2GgktAyXhDRBB8fn5rbcYoK/H0Sll/0lbzArKYhQzY=; b=SdZxyE7PFQbT0RIZz+EgpN/axbj9j9aHC4dje6Q5E6uiftOp/oG6QQnG6dHd1pQuh5 UBPQGSK1nxlH2QUtBMY2DtA/hAwjj2gqhdbYmYKf/Af1bbGGEv6xj//3jEXPVAcZGsBC 376XBEy+TTuUO6zDnRoI69hIxwmUCl9GKgz+70cdjTBz7EDSybaT/DhN2HuvkyaN2E6A Y99nXO/XTpQq/d0zSiRO85aSwUvvNMya3WOQJWF2Sgtdq5afp6Yu0YYhGXPflOY9zYTe
uY+INwLaGSSkjc5lzu/3Tp8+CzDEGRvImE56iUdgWUKh+Va/EDxs+5KBwiXK3d5Vu0zg 0IFA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=HOXLYMVv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id lg15-20020a170906f88f00b007a8c58b51a1si10781413ejb.179.2022.11.29.11.48.03; Tue, 29 Nov 2022 11:48:27 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=HOXLYMVv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237097AbiK2Tib (ORCPT + 99 others); Tue, 29 Nov 2022 14:38:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49284 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S237141AbiK2ThP (ORCPT ); Tue, 29 Nov 2022 14:37:15 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0139950D58 for ; Tue, 29 Nov 2022 11:35:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669750540; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: 
content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=U2GgktAyXhDRBB8fn5rbcYoK/H0Sll/0lbzArKYhQzY=; b=HOXLYMVvhzztEaPIwe4aXANqPDSJyGxCzx5yIJu7YQPpDhqU0aG0azJHaxV0ifm3yaxoVL 8mbRB9uzCSomu6jCnX1j07MV+XwZJQDxFm+YjPzYngTDBW8WENpX0D26bXy8Y9GD+uImFl lbPn2T0j+HMv1vaT1/xMOYYTTCzOLd4= Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-505-lbRug1EjMHWUwpBv52iZPQ-1; Tue, 29 Nov 2022 14:35:38 -0500 X-MC-Unique: lbRug1EjMHWUwpBv52iZPQ-1 Received: by mail-qk1-f197.google.com with SMTP id w14-20020a05620a424e00b006fc46116f7dso30274291qko.12 for ; Tue, 29 Nov 2022 11:35:38 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=U2GgktAyXhDRBB8fn5rbcYoK/H0Sll/0lbzArKYhQzY=; b=SeQpGKgodsfAn5gimG4Bn9WV9811ovWBM6clWcnQoVuBM6aFb95EZqqL2nnM1BxT08 A2IIxjUQgEJoqeOIneFBDxL/C58H27pfRealAwkK657KA5eFlPZr2NPYZUeKB/khshDl FW6MdXbq8AztRmmcnYceeit2pBOZMDSSfSAlvNFTtGhzYJon/gwETCI3a51SWRJWVUoe BVYFo+b/yREWjuz1PiqoO7OSJwS89Nj6pG2fFtHksZjBfHuhifKQ5DobPUWT8ubB0Z+4 tformfiLKe04CdamWnkn/zr6T1Zf4iMo97U1U3aihkTH2u/D7mDwUfqW6yLuplvkkEZL 2mbw== X-Gm-Message-State: ANoB5pnVEkgI//4ENiMna2uxeE7kgpwOt8du4AmkNuKlg4WaXQsI0znq Cqw5BBiJQsPLaYKiPJ9eWR0KhQrfU7se8JiTCz7DP1lTCsdwi0eWLOmc78EKtHg4I6wZXVneczr 7v6VXn70VufCz9oXdLcbYgTAx X-Received: by 2002:ad4:5a12:0:b0:4c6:cfb3:461f with SMTP id ei18-20020ad45a12000000b004c6cfb3461fmr29830214qvb.18.1669750538313; Tue, 29 Nov 2022 11:35:38 -0800 (PST) X-Received: by 2002:ad4:5a12:0:b0:4c6:cfb3:461f with SMTP id ei18-20020ad45a12000000b004c6cfb3461fmr29830188qvb.18.1669750538086; Tue, 29 Nov 2022 11:35:38 -0800 (PST) Received: from x1n.redhat.com 
(bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id n1-20020a05620a294100b006fa16fe93bbsm11313013qkp.15.2022.11.29.11.35.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Nov 2022 11:35:37 -0800 (PST)
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton, Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 05/10] mm/hugetlb: Make userfaultfd_huge_must_wait() safe to pmd unshare
Date: Tue, 29 Nov 2022 14:35:21 -0500
Message-Id: <20221129193526.3588187-6-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>
MIME-Version: 1.0
Content-type: text/plain
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1750861146945800689?=
X-GMAIL-MSGID: =?utf-8?q?1750861146945800689?=

userfaultfd_huge_must_wait() walks the hugetlb pgtable, so it must be
protected against a concurrent pmd unshare. We could take the hugetlb
walker lock; here we take the vma lock directly.

Signed-off-by: Peter Xu
Reviewed-by: David Hildenbrand
Reviewed-by: Mike Kravetz
---
 fs/userfaultfd.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 07c81ab3fd4d..a602f008dde5 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -376,7 +376,8 @@ static inline unsigned int userfaultfd_get_blocking_state(unsigned int flags)
  */
 vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 {
-	struct mm_struct *mm = vmf->vma->vm_mm;
+	struct vm_area_struct *vma = vmf->vma;
+	struct mm_struct *mm = vma->vm_mm;
 	struct userfaultfd_ctx *ctx;
 	struct userfaultfd_wait_queue uwq;
 	vm_fault_t ret = VM_FAULT_SIGBUS;
@@ -403,7 +404,7 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	 */
 	mmap_assert_locked(mm);
 
-	ctx = vmf->vma->vm_userfaultfd_ctx.ctx;
+	ctx = vma->vm_userfaultfd_ctx.ctx;
 	if (!ctx)
 		goto out;
 
@@ -493,6 +494,13 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 
 	blocking_state = userfaultfd_get_blocking_state(vmf->flags);
 
+	/*
+	 * This stabilizes pgtable for hugetlb on e.g. pmd unsharing. Need
+	 * to be before setting current state.
+	 */
+	if (is_vm_hugetlb_page(vma))
+		hugetlb_vma_lock_read(vma);
+
 	spin_lock_irq(&ctx->fault_pending_wqh.lock);
 	/*
 	 * After the __add_wait_queue the uwq is visible to userland
@@ -507,13 +515,15 @@ vm_fault_t handle_userfault(struct vm_fault *vmf, unsigned long reason)
 	set_current_state(blocking_state);
 	spin_unlock_irq(&ctx->fault_pending_wqh.lock);
 
-	if (!is_vm_hugetlb_page(vmf->vma))
+	if (!is_vm_hugetlb_page(vma))
 		must_wait = userfaultfd_must_wait(ctx, vmf->address, vmf->flags,
 						  reason);
 	else
-		must_wait = userfaultfd_huge_must_wait(ctx, vmf->vma,
+		must_wait = userfaultfd_huge_must_wait(ctx, vma,
 						       vmf->address,
 						       vmf->flags, reason);
+	if (is_vm_hugetlb_page(vma))
+		hugetlb_vma_unlock_read(vma);
 	mmap_read_unlock(mm);
 
 	if (likely(must_wait && !READ_ONCE(ctx->released))) {

From patchwork Tue Nov 29 19:35:22 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 27467
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp535505wrr; Tue, 29 Nov 2022 11:48:41 -0800 (PST)
X-Google-Smtp-Source: AA0mqf5f6ZTvRnyTUyvlt9eS3SP1uiX6uU3QLcwQsPf/farfVg38nXvRTEabwPVE8wVMsAOJ3/4n
X-Received: by 2002:a17:906:8a4d:b0:7b5:73aa:9988 with SMTP id gx13-20020a1709068a4d00b007b573aa9988mr39413275ejc.597.1669751321345; Tue, 29 Nov 2022 11:48:41 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1669751321; cv=none; d=google.com; s=arc-20160816; b=0AmBomM2HTb7owCIJHscFYBDA3WCjf1oYtQgjaAqx3W59+2Ne4ytjuAvpE7lz0YDXK 8yvxgyjAPDvXFjCJCW1lbZbZnHahvSdG+j9mTzCWcsSqQoYOtV54e8PwFlQfA4JM7XZv QFJ2KsDhP1RGxcayoUVVW+vNUCXZtTeSPNtc9zkkCmmL874jfEX5dNuk7oOuxCRBEd6Q IhNESZXHYkeeabr+sZkEPqShy2rlb/daR3T77VEGVpzSqbaohEkO6C6sX67j6c6TO48p hI/2PD7agm5kOZnDMptE114byly2uGxW1h0TDy3I4jIhDxugboBqX67kYcvjJ9GO8y4E UN3A==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version
:references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=4wYg4N/GqmDYcLEMWI8FN15PYOVGEp4Ds74Kweju+do=; b=wdronpBooFIm2WKxt231PPqWQwYGz+LD5E440xNS6uVQUwqOj1bkTyEI0eVk91BH/Q vuN9bKsY7W/lNoZY3NpZHaST6JxX8KSsV1rPej/qfulZDxWWUR9eHs9fC5Wtmhnn1JWE fUpma1g9ngBMY8OwSRdoX9rlU8a0K0EQ6g4N9GqTvWm1pOnejm2/e6S8W6bb3JMacJMP dKqfulcEytdD2SL4FQMKUqm/PMXoCB3/lx51J48nRoUIe8CYm3rAaKA280L95LtF35dC jwDDA1avoHTco6nAPAEkXuSftn44yt16WLY1DjDA0Bj8JX7tS2Y3ubHttB/mRLAKOKd1 SkCQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="VL+TY/0M"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t1-20020a50d701000000b0046744d9fff8si1288721edi.348.2022.11.29.11.48.16; Tue, 29 Nov 2022 11:48:41 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="VL+TY/0M"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237107AbiK2Tiz (ORCPT + 99 others); Tue, 29 Nov 2022 14:38:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48424 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236981AbiK2Th1 (ORCPT ); Tue, 29 Nov 2022 14:37:27 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com 
[170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2EF7050D76 for ; Tue, 29 Nov 2022 11:35:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669750543; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=4wYg4N/GqmDYcLEMWI8FN15PYOVGEp4Ds74Kweju+do=; b=VL+TY/0MQDEO1WM70MooWGBlO02wzHzY2iLEhtaHgoKSvOnC8v70UuQ4g5Is9tS4JNPQms GEOtclbSalzJgO0yd1ac5Zvw1I2nfQrSLeQ1XWsSXHZhRk/HZ9VMNky5R6q1U+VNURX6dJ Q+cKD+JoCj2JDrVF3NMuhwiQ3w4ToyE= Received: from mail-qt1-f200.google.com (mail-qt1-f200.google.com [209.85.160.200]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-135-WlClVYzNNPCdJxpBjrExTw-1; Tue, 29 Nov 2022 14:35:40 -0500 X-MC-Unique: WlClVYzNNPCdJxpBjrExTw-1 Received: by mail-qt1-f200.google.com with SMTP id n12-20020ac85a0c000000b003a5849497f9so23431862qta.20 for ; Tue, 29 Nov 2022 11:35:40 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4wYg4N/GqmDYcLEMWI8FN15PYOVGEp4Ds74Kweju+do=; b=sOJXNsWpyq1/zwKn+hSfelw4dkfmTLAK6H6PtMKyDDcz/cnsmvJ+xOX+8eiQftk6ZJ YFs7Pnl5RrrFyD/6s4MfRKBRy+nXQxF/jUZj8zjVkLTOS+ZxnfDfa2nyIZN/0PNy0g9x NLSbxwhjuNbCcEAF6dgVjxwquMR+QVmi9cAFtdzaanGw/qbNSRd3udBj8RTKPZJ3DnLW xsGAV5MgiHg/Hx2zrT0BfG5c6ICwlcCWDdgo5fO1RS4vK7fi04bmqp1lFLscB10J7wZ4 hXJ/gS6hEM6lOfS1T/xuoI1JsT/FzjTyzZnJd7gPK2lJ8/OQi7FGGZjL6mGaYsnRb1gN M8Ow== X-Gm-Message-State: ANoB5pl2OkPa0TDmQq+prO9rd3syMMmW3GO8+HWGdrzXcVoLpdfKnKAW VegOEVpTx3JBuYs4dtjNaG//hloM+/jsjaZav2Va/9Y9WtDMm7Pj3a3CmC1vZr075U3eZcyOz8n aFYuB2r3HF0g+yevXk9SG9eOA X-Received: by 
2002:a05:6214:207:b0:4c6:4ac0:12c1 with SMTP id i7-20020a056214020700b004c64ac012c1mr37457103qvt.111.1669750539639; Tue, 29 Nov 2022 11:35:39 -0800 (PST)
X-Received: by 2002:a05:6214:207:b0:4c6:4ac0:12c1 with SMTP id i7-20020a056214020700b004c64ac012c1mr37457088qvt.111.1669750539374; Tue, 29 Nov 2022 11:35:39 -0800 (PST)
Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id n1-20020a05620a294100b006fa16fe93bbsm11313013qkp.15.2022.11.29.11.35.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Nov 2022 11:35:38 -0800 (PST)
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton, Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 06/10] mm/hugetlb: Make hugetlb_follow_page_mask() safe to pmd unshare
Date: Tue, 29 Nov 2022 14:35:22 -0500
Message-Id: <20221129193526.3588187-7-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>
MIME-Version: 1.0
Content-type: text/plain
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1750861161491625873?=
X-GMAIL-MSGID: =?utf-8?q?1750861161491625873?=

Since hugetlb_follow_page_mask() walks the pgtable, it needs the vma
lock to make sure the pgtable page will not be freed concurrently.

Signed-off-by: Peter Xu
Acked-by: David Hildenbrand
Reviewed-by: Mike Kravetz
---
 mm/hugetlb.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 776e34ccf029..d6bb1d22f1c4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6232,9 +6232,10 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	if (WARN_ON_ONCE(flags & FOLL_PIN))
 		return NULL;
 
+	hugetlb_vma_lock_read(vma);
 	pte = huge_pte_offset(mm, haddr, huge_page_size(h));
 	if (!pte)
-		return NULL;
+		goto out_unlock;
 
 	ptl = huge_pte_lock(h, mm, pte);
 	entry = huge_ptep_get(pte);
@@ -6257,6 +6258,8 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma,
 	}
 out:
 	spin_unlock(ptl);
+out_unlock:
+	hugetlb_vma_unlock_read(vma);
 	return page;
 }
 

From patchwork Tue Nov 29 19:35:23 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 27464
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp535451wrr; Tue, 29 Nov 2022 11:48:34 -0800 (PST)
X-Google-Smtp-Source: AA0mqf5yQNkwersUnDtcY4yfmFG1kvL5fRL5txvVNF04s43eBqH/05oJRWxEOc5BtLAMOXeHVfTs
X-Received: by 2002:a17:906:5fcc:b0:7c0:8d00:7da3 with SMTP id k12-20020a1709065fcc00b007c08d007da3mr2938846ejv.192.1669751314324; Tue, 29 Nov 2022 11:48:34 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1669751314; cv=none; d=google.com; s=arc-20160816; b=SyPM4LpjKatu4kLq8BjdAW5UmNDGGyP93DBkXHzq9BweIJ+EANVnkTRhXLeFu7tKT6 gr5CLsE+xi+IU7DG3iX0YdnXvITQhhYlJXkCasuYHOSYBPduYhrcvT3RU2AcfLX9nG6g ObRgENlI1IipdYGzT2u7wXkAUj20xwm7FbWlHq7he/Ddvak2lDoCHbmqCerBSSSXzUxj Kllcu9a/ROFx1+Z8j+jkA9c4t31w1JeIxfXxT/YNmItZ4hs0CWO4rhbHvHxevm5MMUon wg2eMz/9xFno2F2ituWNsyzpGJ82UzpxgqpWCw/DGVg87E2Yg3KkHbNuLY9cSbv41RUi ejrg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version
:references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=wQQtMmya3oIPlns9VTTG82x3u2HOgOUNM9IGJVSq3FI=; b=dP+ogj+r+HBAf3Io6oti2ttzBAHEXpDF7dA/l3EHtTiWIC2Zjc0JmSnW5hA0DmiroD cJ3l4KHjR8PyWeldD/cg+saTfMwet8ryxgsTk3kgka72yzJpR+AMt/wZaJwXfobmisJM bLZUyfCP4q/ts9uFwR7Y0mJrJQiBSckrbMWy4yMkb4lMpEkOYhp2+tsSN3Y2dANUnfuV jVAlK8o13+0kUro0NJr32fkoJpOqsxo9+xJZpOa9GTRUAS/lfl4HpllWELZb+XdKBDzw RlzI6oBKGYlJaVT+n+gEMfIwRAPUDOHXl+IKPnMfjCtpwQgBQj1Y7qLfoONEBofMCNqj z1Gg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BobxrCsn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gt31-20020a1709072d9f00b007bdf57f885esi9784477ejc.37.2022.11.29.11.48.09; Tue, 29 Nov 2022 11:48:34 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=BobxrCsn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236988AbiK2Tiv (ORCPT + 99 others); Tue, 29 Nov 2022 14:38:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49404 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236405AbiK2Th1 (ORCPT ); Tue, 29 Nov 2022 14:37:27 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com 
[170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D54F84B999 for ; Tue, 29 Nov 2022 11:35:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669750543; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wQQtMmya3oIPlns9VTTG82x3u2HOgOUNM9IGJVSq3FI=; b=BobxrCsnxiTXACK5mU7SM9z9bp+hW+KdF+RtjvhQu6v5oKLys1Bo3QfI5cAgHI+VQwDS6P qGDDYEr9fnxNod8GXbDd1EL4V2U6jDatGfMNXStrdI905QFo1fU+ZhZVCk2J1Q/YBOei0G mD5cvCqgiXqGjOfB0trffnCzbI3fgjo= Received: from mail-qv1-f72.google.com (mail-qv1-f72.google.com [209.85.219.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id us-mta-217-6HTHZ3SjP92ta5qqvLe3cw-1; Tue, 29 Nov 2022 14:35:41 -0500 X-MC-Unique: 6HTHZ3SjP92ta5qqvLe3cw-1 Received: by mail-qv1-f72.google.com with SMTP id ng1-20020a0562143bc100b004bb706b3a27so21657804qvb.20 for ; Tue, 29 Nov 2022 11:35:41 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wQQtMmya3oIPlns9VTTG82x3u2HOgOUNM9IGJVSq3FI=; b=0YeHLzClE0i0oo2WjGYyRr0gPapLOb91nob9GZ1L6Xk+ommYDuDWOaWCmjNAMZG0tP RhGlSTvELgKfYn7TxgBaDX6ZMw0Be0z2MBGcs77NZdMpYxITiPN4UryS6S2Q+J+N/vDi Q8GJGoALb+PrPmnzpjQJAg7ns28ZSShQNpst8RE3qXLRicOvRcmRtOR1gZDjUtCH1dYk lLS+AgVzq68CLCfH8cIDRH+nBf0m+l8ioqOVT8T8X6+hnijQL2rxwG1CI6OB8ORnTqNA n9ZxQlxouYuQZw+z4UouDpvF/MJkltRALNAcD6gQ7Tu12GIcca9LUm6DX9iJP0d+rrNy yn3g== X-Gm-Message-State: ANoB5pk3oGtAaHV1DBpYXPfdZITRIeB9p/zV8t2xye9XwqnY4Kndnqq5 pCLKZ/cuPZCqavkWrDYyH3VmHfSKAQOIX/XV3XTf7NIeHGjP6nZs86snnLk7aDNH2C7nwNOdP1b dB0vLt9gzkDEkeE5SwoLKCDfL X-Received: by 
2002:a05:622a:4891:b0:3a5:280a:3c9c with SMTP id fc17-20020a05622a489100b003a5280a3c9cmr37951512qtb.282.1669750540822; Tue, 29 Nov 2022 11:35:40 -0800 (PST)
X-Received: by 2002:a05:622a:4891:b0:3a5:280a:3c9c with SMTP id fc17-20020a05622a489100b003a5280a3c9cmr37951490qtb.282.1669750540508; Tue, 29 Nov 2022 11:35:40 -0800 (PST)
Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. [70.31.27.79]) by smtp.gmail.com with ESMTPSA id n1-20020a05620a294100b006fa16fe93bbsm11313013qkp.15.2022.11.29.11.35.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Nov 2022 11:35:40 -0800 (PST)
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton, Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 07/10] mm/hugetlb: Make follow_hugetlb_page() safe to pmd unshare
Date: Tue, 29 Nov 2022 14:35:23 -0500
Message-Id: <20221129193526.3588187-8-peterx@redhat.com>
X-Mailer: git-send-email 2.37.3
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>
MIME-Version: 1.0
Content-type: text/plain
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1750861154255847540?=
X-GMAIL-MSGID: =?utf-8?q?1750861154255847540?=

Since follow_hugetlb_page() walks the pgtable, it needs the vma lock to
make sure the pgtable page will not be freed concurrently.

Signed-off-by: Peter Xu
Acked-by: David Hildenbrand
Reviewed-by: Mike Kravetz
---
 mm/hugetlb.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d6bb1d22f1c4..df645a5824e3 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6290,6 +6290,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			break;
 		}
 
+		hugetlb_vma_lock_read(vma);
 		/*
 		 * Some archs (sparc64, sh*) have multiple pte_ts to
 		 * each hugepage. We have to make sure we get the
@@ -6314,6 +6315,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		    !hugetlbfs_pagecache_present(h, vma, vaddr)) {
 			if (pte)
 				spin_unlock(ptl);
+			hugetlb_vma_unlock_read(vma);
 			remainder = 0;
 			break;
 		}
@@ -6335,6 +6337,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 
 			if (pte)
 				spin_unlock(ptl);
+			hugetlb_vma_unlock_read(vma);
+
 			if (flags & FOLL_WRITE)
 				fault_flags |= FAULT_FLAG_WRITE;
 			else if (unshare)
@@ -6394,6 +6398,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			remainder -= pages_per_huge_page(h);
 			i += pages_per_huge_page(h);
 			spin_unlock(ptl);
+			hugetlb_vma_unlock_read(vma);
 			continue;
 		}
 
@@ -6421,6 +6426,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 			if (WARN_ON_ONCE(!try_grab_folio(pages[i], refs,
 							 flags))) {
 				spin_unlock(ptl);
+				hugetlb_vma_unlock_read(vma);
 				remainder = 0;
 				err = -ENOMEM;
 				break;
@@ -6432,6 +6438,7 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma,
 		i += refs;
 
 		spin_unlock(ptl);
+		hugetlb_vma_unlock_read(vma);
 	}
 	*nr_pages = remainder;
 	/*

From patchwork Tue Nov 29 19:35:24 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 27465
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp535457wrr; Tue, 29 Nov 2022 11:48:35 -0800 (PST)
X-Google-Smtp-Source:
AA0mqf4pfW06nYNZZkkG+KFyDAgLh0ASLPYDs2aqXZJ0RRINR+Kmu2t73tDNaysCqBv+sMXsggFv X-Received: by 2002:a17:902:e5c6:b0:189:a50d:2a1d with SMTP id u6-20020a170902e5c600b00189a50d2a1dmr1237386plf.18.1669751314903; Tue, 29 Nov 2022 11:48:34 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1669751314; cv=none; d=google.com; s=arc-20160816; b=a+nxXflQKumpfCI3XK2m3UOFzGjo8RJdXBaUbUcqtxq4Z7x/Z76jDFbqLxt6iHBa2z +aEMoKZH7iXi33Ut8RGptm/vCurliko08X1yqGVhOw25WtStufjlHwVfQGmxz1nDH7Qz nA8lPQlUmtTKnYyHL+I7SqGaZ6WPLJf31a2vUPHQlgcPj55A0y5MhOPCxwWsCpm6KNUQ GQt9U7jaBHXFEGhWARlRbCZQ74ZWO5WIosyKW77ffRj54d55JG8XbpZWqbmzqsVnDx2Y nH5OhV5ROCdu0TawzcOE4lux52c66OQty2tBfvRlKcENiQ/AtZXznLWNspXpJVUmfomq o6Ew== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=tlgp91ck5zE/ROfq0OY9j8l1q/C3T3fwuEiiEDoRd8M=; b=G8mNrDbOIYD7EJf11l/kZpPOMgE1qQgdOUTbi5/5yK7gUzkfW1pAxyhEolVIK1pfZb Y5oLVMhiUbPXrTfxq5z5EEef+bPsXNP4LKsuOrbwAOf1xTpwsGKzOvcceZbVF9NuEDAu WhPR15OqLzu/GpD3RGf0TAOgmSe+mB4xar5srCuJRowht58fBtOI574LG2OwKdGX1Yop 7Y1gFPLwg2cOv/GHNJ1lDaH/kNpCcX9eRHDfkBKhAyZhDlEfmlW/swZIeq3uOrRACQNA bBzFTOuRPou3I/EoGTOzLp7AoUH6yFufETMELBfwgmU+dmj5+7vK/ePStYdTckgB7AKa 3MrA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=PLmp9qAA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id d7-20020a656b87000000b0047829d1b8f0si5587086pgw.738.2022.11.29.11.48.20; Tue, 29 Nov 2022 11:48:34 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=PLmp9qAA; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237110AbiK2Ti6 (ORCPT + 99 others); Tue, 29 Nov 2022 14:38:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48670 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236996AbiK2Th3 (ORCPT ); Tue, 29 Nov 2022 14:37:29 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 952C15917E for ; Tue, 29 Nov 2022 11:35:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1669750544; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tlgp91ck5zE/ROfq0OY9j8l1q/C3T3fwuEiiEDoRd8M=; b=PLmp9qAAVoXZ/DLCrhHauwbfIJ8eucPnqbLK/nRBJIKfSy2xRYyOfG9cpnvH4yNsOCxf60 ONlKONJCwUJht1zh2iZ82SSNfzcMDGnyIEYpeBPBqEMkp7MCCQCSIvZDQqthMmEHJpg43b 9OyN7osSfnd1FAHdBbWORvnupPB2HUQ= Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_128_GCM_SHA256) id 
us-mta-413-6Mz9jCuLNK6tvdNUgjXp2w-1; Tue, 29 Nov 2022 14:35:43 -0500 X-MC-Unique: 6Mz9jCuLNK6tvdNUgjXp2w-1 Received: by mail-qk1-f197.google.com with SMTP id u5-20020a05620a0c4500b006fb30780443so32368012qki.22 for ; Tue, 29 Nov 2022 11:35:43 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tlgp91ck5zE/ROfq0OY9j8l1q/C3T3fwuEiiEDoRd8M=; b=lWfkIv9xg8ZejBv4f/6TxbEFtPTmtTJQhuFtb0mA6ngyiFi9eOLCwYX7bKa849ZW5r vw0KTQFp9XXfCAJhd8JZ4ECUv+Cs3PFq050gxKhdS4PWKB9cy+i0FMf+XIkyVUNTwvPX yTA8MlD/RWK1tBUw+vOJ41Y0VCXuu7Vs5Rn+eDmvxlBF3s3VOgtK4GDI3wbnWfLXME/M kakemfAKk3N/R3sUuQDPM7G/HCwCgNtIzg2j70OAvzhRYP03smopomcvKTm335plKh1K lqYTT5LLwtzZd38ceCAv5SVshAUjD9F5duU/dySEIbdLU5e5n17Q4b90t2D7HUdcnF02 +13w== X-Gm-Message-State: ANoB5pll1dXkrF4iD7lPIt4RtIeXwNBlYmGoJwfBKj1GNgSi6QkrP1Hq 6RnwMuBfiF5xrJ+xY8OXjlx7q+OF292wZ2rJqlxlFVhK22ZarBKgfIZ7/9qo8giEJeozByz/Wgg NPyTwmRvaGoNFsJUFm4QhNXXZ X-Received: by 2002:a05:6214:3607:b0:4c6:fb3e:4993 with SMTP id nv7-20020a056214360700b004c6fb3e4993mr12852435qvb.110.1669750542529; Tue, 29 Nov 2022 11:35:42 -0800 (PST) X-Received: by 2002:a05:6214:3607:b0:4c6:fb3e:4993 with SMTP id nv7-20020a056214360700b004c6fb3e4993mr12852419qvb.110.1669750542285; Tue, 29 Nov 2022 11:35:42 -0800 (PST) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-46-70-31-27-79.dsl.bell.ca. 
[70.31.27.79]) by smtp.gmail.com with ESMTPSA id n1-20020a05620a294100b006fa16fe93bbsm11313013qkp.15.2022.11.29.11.35.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 29 Nov 2022 11:35:41 -0800 (PST) From: Peter Xu To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: James Houghton , Jann Horn , peterx@redhat.com, Andrew Morton , Andrea Arcangeli , Rik van Riel , Nadav Amit , Miaohe Lin , Muchun Song , Mike Kravetz , David Hildenbrand Subject: [PATCH 08/10] mm/hugetlb: Make walk_hugetlb_range() safe to pmd unshare Date: Tue, 29 Nov 2022 14:35:24 -0500 Message-Id: <20221129193526.3588187-9-peterx@redhat.com> X-Mailer: git-send-email 2.37.3 In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com> References: <20221129193526.3588187-1-peterx@redhat.com> MIME-Version: 1.0 Content-type: text/plain X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1750861154382093342?= X-GMAIL-MSGID: =?utf-8?q?1750861154382093342?= Since walk_hugetlb_range() walks the pgtable, it needs the vma lock to make sure the pgtable page will not be freed concurrently. 
Signed-off-by: Peter Xu Acked-by: David Hildenbrand Reviewed-by: Mike Kravetz --- mm/pagewalk.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/mm/pagewalk.c b/mm/pagewalk.c index 7f1c9b274906..d98564a7be57 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -302,6 +302,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end, const struct mm_walk_ops *ops = walk->ops; int err = 0; + hugetlb_vma_lock_read(vma); do { next = hugetlb_entry_end(h, addr, end); pte = huge_pte_offset(walk->mm, addr & hmask, sz); @@ -314,6 +315,7 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end, if (err) break; } while (addr = next, addr != end); + hugetlb_vma_unlock_read(vma); return err; }

From patchwork Tue Nov 29 19:35:25 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 27466
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton, Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 09/10] mm/hugetlb: Make page_vma_mapped_walk() safe to pmd unshare
Date: Tue, 29 Nov 2022 14:35:25 -0500
Message-Id: <20221129193526.3588187-10-peterx@redhat.com>
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

Since page_vma_mapped_walk() walks the pgtable, it needs the vma lock to make sure the pgtable page will not be freed concurrently.
Signed-off-by: Peter Xu Acked-by: David Hildenbrand Reviewed-by: Mike Kravetz --- include/linux/rmap.h | 4 ++++ mm/page_vma_mapped.c | 5 ++++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/include/linux/rmap.h b/include/linux/rmap.h index bd3504d11b15..a50d18bb86aa 100644 --- a/include/linux/rmap.h +++ b/include/linux/rmap.h @@ -13,6 +13,7 @@ #include #include #include +#include /* * The anon_vma heads a list of private "related" vmas, to scan if @@ -408,6 +409,9 @@ static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw) pte_unmap(pvmw->pte); if (pvmw->ptl) spin_unlock(pvmw->ptl); + /* This needs to be after unlock of the spinlock */ + if (is_vm_hugetlb_page(pvmw->vma)) + hugetlb_vma_unlock_read(pvmw->vma); } bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw); diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c index 93e13fc17d3c..f94ec78b54ff 100644 --- a/mm/page_vma_mapped.c +++ b/mm/page_vma_mapped.c @@ -169,10 +169,13 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw) if (pvmw->pte) return not_found(pvmw); + hugetlb_vma_lock_read(vma); /* when pud is not present, pte will be NULL */ pvmw->pte = huge_pte_offset(mm, pvmw->address, size); - if (!pvmw->pte) + if (!pvmw->pte) { + hugetlb_vma_unlock_read(vma); return false; + } pvmw->ptl = huge_pte_lock(hstate, mm, pvmw->pte); if (!check_pte(pvmw))

From patchwork Tue Nov 29 19:35:26 2022
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 27468
From: Peter Xu
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: James Houghton, Jann Horn, peterx@redhat.com, Andrew Morton, Andrea Arcangeli, Rik van Riel, Nadav Amit, Miaohe Lin, Muchun Song, Mike Kravetz, David Hildenbrand
Subject: [PATCH 10/10] mm/hugetlb: Introduce hugetlb_walk()
Date: Tue, 29 Nov 2022 14:35:26 -0500
Message-Id: <20221129193526.3588187-11-peterx@redhat.com>
In-Reply-To: <20221129193526.3588187-1-peterx@redhat.com>
References: <20221129193526.3588187-1-peterx@redhat.com>

huge_pte_offset() is the main walker function for hugetlb pgtables. The name is not really representing what it does, though. Instead of renaming it, introduce a wrapper function called hugetlb_walk() which will use huge_pte_offset() inside. Assert on the locks when walking the pgtable. Note, the vma lock assertion will be a no-op for private mappings.
Signed-off-by: Peter Xu Reviewed-by: Mike Kravetz --- fs/hugetlbfs/inode.c | 4 +--- fs/userfaultfd.c | 6 ++---- include/linux/hugetlb.h | 37 +++++++++++++++++++++++++++++++++++++ mm/hugetlb.c | 34 ++++++++++++++-------------------- mm/page_vma_mapped.c | 2 +- mm/pagewalk.c | 4 +--- 6 files changed, 56 insertions(+), 31 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index fdb16246f46e..48f1a8ad2243 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -388,9 +388,7 @@ static bool hugetlb_vma_maps_page(struct vm_area_struct *vma, { pte_t *ptep, pte; - ptep = huge_pte_offset(vma->vm_mm, addr, - huge_page_size(hstate_vma(vma))); - + ptep = hugetlb_walk(vma, addr, huge_page_size(hstate_vma(vma))); if (!ptep) return false; diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index a602f008dde5..f31fe1a9f4c5 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -237,14 +237,12 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx, unsigned long flags, unsigned long reason) { - struct mm_struct *mm = ctx->mm; pte_t *ptep, pte; bool ret = true; - mmap_assert_locked(mm); - - ptep = huge_pte_offset(mm, address, vma_mmu_pagesize(vma)); + mmap_assert_locked(ctx->mm); + ptep = hugetlb_walk(vma, address, vma_mmu_pagesize(vma)); if (!ptep) goto out; diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 81efd9b9baa2..1a51c45fdf2e 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -196,6 +196,11 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, * huge_pte_offset(): Walk the hugetlb pgtable until the last level PTE. * Returns the pte_t* if found, or NULL if the address is not mapped. * + * IMPORTANT: we should normally not directly call this function, instead + * this is only a common interface to implement arch-specific walker. + * Please consider using the hugetlb_walk() helper to make sure of the + * correct locking is satisfied. 
+ * * Since this function will walk all the pgtable pages (including not only * high-level pgtable page, but also PUD entry that can be unshared * concurrently for VM_SHARED), the caller of this function should be @@ -1229,4 +1234,36 @@ bool want_pmd_share(struct vm_area_struct *vma, unsigned long addr); #define flush_hugetlb_tlb_range(vma, addr, end) flush_tlb_range(vma, addr, end) #endif +static inline bool +__vma_shareable_flags_pmd(struct vm_area_struct *vma) +{ + return vma->vm_flags & (VM_MAYSHARE | VM_SHARED) && + vma->vm_private_data; +} + +/* + * Safe version of huge_pte_offset() to check the locks. See comments + * above huge_pte_offset(). + */ +static inline pte_t * +hugetlb_walk(struct vm_area_struct *vma, unsigned long addr, unsigned long sz) +{ +#if defined(CONFIG_ARCH_WANT_HUGE_PMD_SHARE) && defined(CONFIG_LOCKDEP) + struct hugetlb_vma_lock *vma_lock = vma->vm_private_data; + + /* + * If pmd sharing possible, locking needed to safely walk the + * hugetlb pgtables. More information can be found at the comment + * above huge_pte_offset() in the same file. + * + * NOTE: lockdep_is_held() is only defined with CONFIG_LOCKDEP. + */ + if (__vma_shareable_flags_pmd(vma)) + WARN_ON_ONCE(!lockdep_is_held(&vma_lock->rw_sema) && + !lockdep_is_held( + &vma->vm_file->f_mapping->i_mmap_rwsem)); +#endif + return huge_pte_offset(vma->vm_mm, addr, sz); +} + #endif /* _LINUX_HUGETLB_H */ diff --git a/mm/hugetlb.c b/mm/hugetlb.c index df645a5824e3..05867e82b467 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -4816,7 +4816,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, } else { /* * For shared mappings the vma lock must be held before - * calling huge_pte_offset in the src vma. Otherwise, the + * calling hugetlb_walk() in the src vma. Otherwise, the * returned ptep could go away if part of a shared pmd and * another thread calls huge_pmd_unshare. 
*/ @@ -4826,7 +4826,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, last_addr_mask = hugetlb_mask_last_page(h); for (addr = src_vma->vm_start; addr < src_vma->vm_end; addr += sz) { spinlock_t *src_ptl, *dst_ptl; - src_pte = huge_pte_offset(src, addr, sz); + src_pte = hugetlb_walk(src_vma, addr, sz); if (!src_pte) { addr |= last_addr_mask; continue; @@ -5030,7 +5030,7 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma, hugetlb_vma_lock_write(vma); i_mmap_lock_write(mapping); for (; old_addr < old_end; old_addr += sz, new_addr += sz) { - src_pte = huge_pte_offset(mm, old_addr, sz); + src_pte = hugetlb_walk(vma, old_addr, sz); if (!src_pte) { old_addr |= last_addr_mask; new_addr |= last_addr_mask; @@ -5093,7 +5093,7 @@ static void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct last_addr_mask = hugetlb_mask_last_page(h); address = start; for (; address < end; address += sz) { - ptep = huge_pte_offset(mm, address, sz); + ptep = hugetlb_walk(vma, address, sz); if (!ptep) { address |= last_addr_mask; continue; @@ -5406,7 +5406,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, mutex_lock(&hugetlb_fault_mutex_table[hash]); hugetlb_vma_lock_read(vma); spin_lock(ptl); - ptep = huge_pte_offset(mm, haddr, huge_page_size(h)); + ptep = hugetlb_walk(vma, haddr, huge_page_size(h)); if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) goto retry_avoidcopy; @@ -5444,7 +5444,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, * before the page tables are altered */ spin_lock(ptl); - ptep = huge_pte_offset(mm, haddr, huge_page_size(h)); + ptep = hugetlb_walk(vma, haddr, huge_page_size(h)); if (likely(ptep && pte_same(huge_ptep_get(ptep), pte))) { /* Break COW or unshare */ huge_ptep_clear_flush(vma, haddr, ptep); @@ -5841,7 +5841,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * until finished with ptep. 
This prevents huge_pmd_unshare from * being called elsewhere and making the ptep no longer valid. * - * ptep could have already be assigned via huge_pte_offset. That + * ptep could have already be assigned via hugetlb_walk(). That * is OK, as huge_pte_alloc will return the same value unless * something has changed. */ @@ -6233,7 +6233,7 @@ struct page *hugetlb_follow_page_mask(struct vm_area_struct *vma, return NULL; hugetlb_vma_lock_read(vma); - pte = huge_pte_offset(mm, haddr, huge_page_size(h)); + pte = hugetlb_walk(vma, haddr, huge_page_size(h)); if (!pte) goto out_unlock; @@ -6298,8 +6298,8 @@ long follow_hugetlb_page(struct mm_struct *mm, struct vm_area_struct *vma, * * Note that page table lock is not held when pte is null. */ - pte = huge_pte_offset(mm, vaddr & huge_page_mask(h), - huge_page_size(h)); + pte = hugetlb_walk(vma, vaddr & huge_page_mask(h), + huge_page_size(h)); if (pte) ptl = huge_pte_lock(h, mm, pte); absent = !pte || huge_pte_none(huge_ptep_get(pte)); @@ -6485,7 +6485,7 @@ unsigned long hugetlb_change_protection(struct vm_area_struct *vma, last_addr_mask = hugetlb_mask_last_page(h); for (; address < end; address += psize) { spinlock_t *ptl; - ptep = huge_pte_offset(mm, address, psize); + ptep = hugetlb_walk(vma, address, psize); if (!ptep) { address |= last_addr_mask; continue; @@ -6863,12 +6863,6 @@ void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma, *end = ALIGN(*end, PUD_SIZE); } -static bool __vma_shareable_flags_pmd(struct vm_area_struct *vma) -{ - return vma->vm_flags & (VM_MAYSHARE | VM_SHARED) && - vma->vm_private_data; -} - void hugetlb_vma_lock_read(struct vm_area_struct *vma) { if (__vma_shareable_flags_pmd(vma)) { @@ -7034,8 +7028,8 @@ pte_t *huge_pmd_share(struct mm_struct *mm, struct vm_area_struct *vma, saddr = page_table_shareable(svma, vma, addr, idx); if (saddr) { - spte = huge_pte_offset(svma->vm_mm, saddr, - vma_mmu_pagesize(svma)); + spte = hugetlb_walk(svma, saddr, + vma_mmu_pagesize(svma)); if (spte) 
{ get_page(virt_to_page(spte)); break; @@ -7394,7 +7388,7 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma) hugetlb_vma_lock_write(vma); i_mmap_lock_write(vma->vm_file->f_mapping); for (address = start; address < end; address += PUD_SIZE) { - ptep = huge_pte_offset(mm, address, sz); + ptep = hugetlb_walk(vma, address, sz); if (!ptep) continue; ptl = huge_pte_lock(h, mm, ptep); diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c index f94ec78b54ff..bb782dea4b42 100644 --- a/mm/page_vma_mapped.c +++ b/mm/page_vma_mapped.c @@ -171,7 +171,7 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw) hugetlb_vma_lock_read(vma); /* when pud is not present, pte will be NULL */ - pvmw->pte = huge_pte_offset(mm, pvmw->address, size); + pvmw->pte = hugetlb_walk(vma, pvmw->address, size); if (!pvmw->pte) { hugetlb_vma_unlock_read(vma); return false; diff --git a/mm/pagewalk.c b/mm/pagewalk.c index d98564a7be57..cb23f8a15c13 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -305,13 +305,11 @@ static int walk_hugetlb_range(unsigned long addr, unsigned long end, hugetlb_vma_lock_read(vma); do { next = hugetlb_entry_end(h, addr, end); - pte = huge_pte_offset(walk->mm, addr & hmask, sz); - + pte = hugetlb_walk(vma, addr & hmask, sz); if (pte) err = ops->hugetlb_entry(pte, hmask, addr, next, walk); else if (ops->pte_hole) err = ops->pte_hole(addr, next, -1, walk); - if (err) break; } while (addr = next, addr != end);
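
Editorial illustration, not part of the patch: the pattern this series applies (take the vma lock for read, walk, drop the lock, and have the walker complain when no lock is held) can be sketched in user space. This is only a sketch under stated assumptions. Every toy_* name below is hypothetical and is not a kernel API; a spinning reader count stands in for the real rw_semaphore, a per-thread counter stands in for lockdep_is_held(), and the warning is counted on every offence instead of firing once as WARN_ON_ONCE() does.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy rwlock: state > 0 counts readers; a writer (not shown) would set -1. */
struct toy_rwlock {
	atomic_int state;
};

/* Hypothetical stand-in for a hugetlb VMA with a shareable pgtable page. */
struct toy_vma {
	struct toy_rwlock lock;	/* plays the role of the hugetlb vma lock */
	int *table;		/* plays the role of the pgtable page */
	size_t nr_entries;
};

/* Per-thread count of read locks held; a crude substitute for lockdep. */
static _Thread_local int toy_read_depth;
/* Number of walks attempted without holding the lock. */
static int toy_warn_count;

static void toy_vma_lock_read(struct toy_vma *vma)
{
	int old = atomic_load(&vma->lock.state);

	/* Spin while a writer holds the lock, then bump the reader count. */
	for (;;) {
		if (old < 0) {
			old = atomic_load(&vma->lock.state);
			continue;
		}
		if (atomic_compare_exchange_weak(&vma->lock.state, &old, old + 1))
			break;
	}
	toy_read_depth++;
}

static void toy_vma_unlock_read(struct toy_vma *vma)
{
	toy_read_depth--;
	atomic_fetch_sub(&vma->lock.state, 1);
}

/* Raw lookup, analogous to huge_pte_offset(): NULL when nothing is mapped. */
static int *toy_raw_offset(struct toy_vma *vma, size_t idx)
{
	if (idx >= vma->nr_entries || vma->table[idx] == 0)
		return NULL;
	return &vma->table[idx];
}

/* Checked wrapper, analogous to hugetlb_walk(): warn, then walk anyway. */
static int *toy_walk(struct toy_vma *vma, size_t idx)
{
	if (toy_read_depth == 0)
		toy_warn_count++;
	return toy_raw_offset(vma, idx);
}

/* Range walker in the style of walk_hugetlb_range(): lock, walk, unlock. */
static long toy_sum_range(struct toy_vma *vma, size_t start, size_t end)
{
	long sum = 0;

	toy_vma_lock_read(vma);
	for (size_t i = start; i < end; i++) {
		int *entry = toy_walk(vma, i);

		if (entry)
			sum += *entry;
	}
	toy_vma_unlock_read(vma);
	return sum;
}
```

Calling toy_sum_range() leaves toy_warn_count untouched because every toy_walk() happens under the read lock, while calling toy_walk() directly with no lock held bumps the counter but still returns the lookup result. That mirrors the design choice in hugetlb_walk(): the assertion only reports a locking bug, it does not change what the walker returns.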