From patchwork Mon Dec 4 17:26:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: James Houghton X-Patchwork-Id: 17510 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp2916508vqy; Mon, 4 Dec 2023 09:27:12 -0800 (PST) X-Google-Smtp-Source: AGHT+IFqfG1E9jQ6fZggJOsQQdxpf9C6Mn6/6CUn+hgTV4O1GRxyDokXNOosu7qHrg3jieNMjzt8 X-Received: by 2002:a05:6a00:9398:b0:6cb:d24c:4a9f with SMTP id ka24-20020a056a00939800b006cbd24c4a9fmr5564036pfb.29.1701710832114; Mon, 04 Dec 2023 09:27:12 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1701710832; cv=none; d=google.com; s=arc-20160816; b=EcQfyZ5G20sVVIez2OBWC1gmBw44zAnXRXl9boRKLP4uWIE8A4TRoI+Hr+w1LUM99d YkcWJelmEcGStTfLAqYZDM9EEZmHczqU21I0coPAKa+c+3zy99z2occYcL7FcoML4VQY RqtadWDqSS0fH+YlYotnvgbwpdGY8EQAosJaoMj7PwfGKRZ10P0kWjxWh5KxNNCvjC3c Mr36VRC2Lkz/m6uBgckNiqzdLlkY/m6W4bNPEnVottYiSJBS3itLcE9vKgjGEENqg1l/ 4m98aps31Sz2GVsFYGztRK2alWajqClrnrGnYj/D7SdyUYSj64Ed1feyZ/PSaPGqoWLX 9Nrw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:mime-version:date :dkim-signature; bh=M646tH9YYmwVxUliD4L7Z8ZBWWZDACIb65kUaN+JDEY=; fh=MJxrAMNk1xt5eTwPYlce6LT4l1O5DiRP7H6bsCVP2JY=; b=sdy7LS/TGyg/ViFkGq019K/BJ+nqTqY8YU+d9U4Tlq6xAgfdtf5jtYIfrYDap1BMfr YsazUxSqIoGGT7qik8CgpIjzWG/0irqrCJBPcArH4bTTBTjPIoauAl5Zualo2cskrQfs ZIb4A6fV3jtn8NPRefFKGB8poFZHmI4mr5hsLdcctG1V6pm/dm0bRTHnlkyMrfzBt+Ho cmqc3aK7FS5klcaN0rVVZu/T3XZoRXhP8kfxpLfXPb141VxEET3Afn7b8L0lbzGFvUdz Rj7yWfqJxDriyovb9UVKEJIFHNf6valzmVHz33dUezacoTQcLrJll8gBeVVPtjBsE0Rs gwLg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=lTt710ez; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from groat.vger.email (groat.vger.email. [23.128.96.35]) by mx.google.com with ESMTPS id f1-20020a056a000b0100b006ce403f239bsi3141733pfu.68.2023.12.04.09.27.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Dec 2023 09:27:12 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) client-ip=23.128.96.35; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=lTt710ez; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.35 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by groat.vger.email (Postfix) with ESMTP id E9DBD807665B; Mon, 4 Dec 2023 09:27:08 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at groat.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230513AbjLDR04 (ORCPT + 99 others); Mon, 4 Dec 2023 12:26:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55380 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229485AbjLDR0z (ORCPT ); Mon, 4 Dec 2023 12:26:55 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 312AE83 for ; Mon, 4 Dec 2023 09:27:02 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-5caf61210e3so66086277b3.0 for ; Mon, 04 Dec 2023 09:27:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1701710821; x=1702315621; darn=vger.kernel.org; h=cc:to:from:subject:message-id:mime-version:date:from:to:cc:subject :date:message-id:reply-to; bh=M646tH9YYmwVxUliD4L7Z8ZBWWZDACIb65kUaN+JDEY=; b=lTt710ezl89QNlOsCBl4a9+pweU89TFCWVh9hqGjr29urQKjCQaQEuuk3oHrYnR8ld 2GNveWe990ytW/1NxCWPohuoubvBzUqfem0UlypiZQU/eK01QD0PEkYE7bGcyomoFHF2 k4FW+4Yfshiv5SNqKVcc/qUdSBWvNodeZAyR+fU4CdrLluxjWDJ5Ja7KRjcZdXvu1Xpr UhAyo6hDs3ZdjNbwMTgpoNer4uOiBoOwQ87zHFWJlbqAR/ON/LYK1RTVaTCg7KRb8rtK WPKeoYHTNWaTwN5Tnz9mYCrdsb/g+zaYck4oPW+DZiGBQlN6CI/PpN4msxXJOOBP+Wsl YY4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701710821; x=1702315621; h=cc:to:from:subject:message-id:mime-version:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=M646tH9YYmwVxUliD4L7Z8ZBWWZDACIb65kUaN+JDEY=; b=aaKeX2LDnMT46cxs42k4ZPqdJRyYKLm42N2hukUOzuZupwe8kq6tOMng/3Fz6gkzXW SO+23N32KxwEJ6x7YpRzxKaBOONPLCpjaiaGdNcHRz2uGyJQUXVsm9EM0/ZPRCShoEFH Ok/tdumQLSBIXmtKXuHygOdS1B15LtBNSYM3CAmlrTr5n63Z+xFFO852QOdHMadBtceC w+ZSwlou5QhaNOAGf2kDgmaX+VNKTurI81th/cna20BH9xSQIpIWC4aSmwXUVp4oMaZ7 ojMyhEbdt1UPzMo4+tnMpBjRTVpcrBWkKXgYytvxa8ROqba06TLTn8BSVTmI1Uzuodfw Xpag== X-Gm-Message-State: AOJu0Yx55+kKIHw79GSnKBQc7036i5WOopn22hC6craxLcyCZIumW1pi R/jCf9AahbGrC6h25iONY3JP8PhSiDtWhzP5 X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a0d:d60d:0:b0:5d3:9222:a83e with SMTP id y13-20020a0dd60d000000b005d39222a83emr415074ywd.10.1701710821479; Mon, 04 Dec 2023 09:27:01 -0800 (PST) Date: Mon, 4 Dec 2023 17:26:44 +0000 Mime-Version: 1.0 X-Mailer: git-send-email 2.43.0.rc2.451.g8631bc7472-goog Message-ID: <20231204172646.2541916-1-jthoughton@google.com> Subject: [PATCH 0/2] arm64: hugetlb: Fix page fault loop for sw-dirty/hw-clean contiguous PTEs From: James Houghton To: Steve Capper , Will Deacon , Andrew Morton Cc: Mike Kravetz , Muchun Song , Anshuman Khandual , Catalin Marinas , Ryan Roberts , linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton X-Spam-Status: No, score=-8.4 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE, USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on groat.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (groat.vger.email [0.0.0.0]); Mon, 04 Dec 2023 09:27:09 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784373137218348865 X-GMAIL-MSGID: 1784373137218348865 It is currently possible for a userspace application to enter a page fault loop when using HugeTLB pages implemented with contiguous PTEs when HAFDBS is not available. This happens because: 1. The kernel may sometimes write PTEs that are sw-dirty but hw-clean (PTE_DIRTY | PTE_RDONLY | PTE_WRITE). 2. If, during a write, the CPU uses a sw-dirty, hw-clean PTE in handling the memory access on a system without HAFDBS, we will get a page fault. 3. HugeTLB will check if it needs to update the dirty bits on the PTE. For contiguous PTEs, it will check to see if the pgprot bits need updating. In this case, HugeTLB wants to write a sequence of sw-dirty, hw-dirty PTEs, but it finds that all the PTEs it is about to overwrite are all pte_dirty() (pte_sw_dirty() => pte_dirty()), so it thinks no update is necessary. Please see this[1] reproducer. I think (though I may be wrong) that both step (1) and step (3) are buggy. The first patch in this series fixes step (3); instead of checking if pte_dirty is matching in __cont_access_flags_changed, check pte_hw_dirty and pte_sw_dirty separately. The second patch in this series makes step (1) less likely to occur. Without this patch, we can get the kernel to write a sw-dirty, hw-clean PTE with the following steps (showing the relevant VMA flags and pgprot bits): i. Create a valid, writable contiguous PTE. VMA vmflags: VM_SHARED | VM_READ | VM_WRITE VMA pgprot bits: PTE_RDONLY | PTE_WRITE PTE pgprot bits: PTE_DIRTY | PTE_WRITE ii. mprotect the VMA to PROT_NONE. VMA vmflags: VM_SHARED VMA pgprot bits: PTE_RDONLY PTE pgprot bits: PTE_DIRTY | PTE_RDONLY iii. mprotect the VMA back to PROT_READ | PROT_WRITE. VMA vmflags: VM_SHARED | VM_READ | VM_WRITE VMA pgprot bits: PTE_RDONLY | PTE_WRITE PTE pgprot bits: PTE_DIRTY | PTE_WRITE | PTE_RDONLY Applying either one of the two patches in this patchset will fix the particular issue with HugeTLB pages implemented with contiguous PTEs. It's possible that only one of these patches should be taken, or that the right fix is something else entirely. [1]: https://gist.github.com/48ca/11d1e466deee032cb35aa8c2280f93b0 James Houghton (2): arm64: hugetlb: Distinguish between hw and sw dirtiness in __cont_access_flags_changed arm64: mm: Always make sw-dirty PTEs hw-dirty in pte_modify arch/arm64/include/asm/pgtable.h | 6 ++++++ arch/arm64/mm/hugetlbpage.c | 5 ++++- 2 files changed, 10 insertions(+), 1 deletion(-) base-commit: 645a9a454fdb7e698a63a275edca6a17ef97afc4