From patchwork Fri Oct 21 10:11:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 6608 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4242:0:0:0:0:0 with SMTP id s2csp608550wrr; Fri, 21 Oct 2022 03:14:34 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5PiYqe03ClYWnSejubp4dc2mAhTC+pFAU13iF5xbaxG5/Al0iKyo8HSVHAOrGD3XwOQ5Zv X-Received: by 2002:a17:907:7fa4:b0:791:9307:9d6a with SMTP id qk36-20020a1709077fa400b0079193079d6amr14780725ejc.464.1666347274591; Fri, 21 Oct 2022 03:14:34 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666347274; cv=none; d=google.com; s=arc-20160816; b=uSctW5OPuz7R3eNPLG+ZnWLPwyFxI36pZ+DII9Bmb6wvHKcxocCA1mLTldUa6YdVE1 xoCUrffonLzTktkdrs/dmdeW/Gaav5Ie8bMsB3IRA8T9zssgzBqV+bh2T2GKx6O4/nYU toV4R09CifyiLvx/Cxwk4D3Btnw1yisgkzf8jtNuulUMk+1GcRn106uuywZMFd03b49H 6z1AAOK/wY3vjaNazF16YHJcYPktyRaGy0K4xyrlsNHPi0V2w8e0X4Mwnzh/3ktsr7Mv 4jZ03eIEwBGPrsWuwJ4QbTsNoWWpavk79lcLAtyvLdwrbNQIHhlTFTLLFWhLnuJ8+SGI zNeQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=HpvAVKz3qbonI1lfYmBeIQnX7aY+jIE8nipT3HyDP+I=; b=cTiFuq5+Q/wJolgFkX09AFZlB4tGKq8Q7DwZKRGfjqqm8ACabApgTAN8RcJ8Uj9eEg qCiEEd0Uwl6/sXNglBztYpgFMOrvWuLeWtNTzcpQgxPqVpnfMtg6h8fJ3SP0Fq8XSIA1 7Mqec35a+qdnsmULbpE4IZBAXFE7UueoOs/qpjqlcHUEDe2NLzGcJeix1MecyfdIgcnV zuhWdF+cdaC9pwPaDszwsd99sL7Fc4U9z+XuFiqXDK2i9DvhARBSjFMPU8Uy3GiZb1OS xm3axsIPSsfpvDqaJM5bBfq88O0l8ujOpq7u41Qo9A7h5WRQ6fJffy3VUOQvTdbimVpx BYLg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=hGn4Rrsh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id qw22-20020a1709066a1600b0078dcf11ccf7si19051617ejc.802.2022.10.21.03.14.09; Fri, 21 Oct 2022 03:14:34 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=hGn4Rrsh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230283AbiJUKMT (ORCPT + 99 others); Fri, 21 Oct 2022 06:12:19 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53568 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230251AbiJUKMM (ORCPT ); Fri, 21 Oct 2022 06:12:12 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 991D07AB02 for ; Fri, 21 Oct 2022 03:12:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666347128; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HpvAVKz3qbonI1lfYmBeIQnX7aY+jIE8nipT3HyDP+I=; b=hGn4Rrsh3VNBWXJms6MSicok7N8j0Gus4wUhPP81jDPsGZJrzKb0sfpAjnir6udNQ0yKSf yw6KzCqiEBXSGr0jcRUwws6TFr5lwKqx+3Mfdtnyqb/39eDRoO3zh/A6fDiTYBNbHb32Wo Yyh0EQbrZWndNbBbt2Iru1++myf+W3w= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-614-HesOCGjDNr6QxqCIcGuzdw-1; Fri, 21 Oct 2022 06:12:05 -0400 X-MC-Unique: HesOCGjDNr6QxqCIcGuzdw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 5C98387A386; Fri, 21 Oct 2022 10:12:04 +0000 (UTC) Received: from t480s.fritz.box (unknown [10.39.193.99]) by smtp.corp.redhat.com (Postfix) with ESMTP id DB8CA40D2998; Fri, 21 Oct 2022 10:11:50 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v2 1/9] selftests/vm: add test to measure MADV_UNMERGEABLE performance Date: Fri, 21 Oct 2022 12:11:33 +0200 Message-Id: <20221021101141.84170-2-david@redhat.com> In-Reply-To: <20221021101141.84170-1-david@redhat.com> References: <20221021101141.84170-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747291759640465573?= X-GMAIL-MSGID: =?utf-8?q?1747291759640465573?= Let's add a test to measure performance of KSM breaking not triggered via COW, but triggered by disabling KSM on an area filled with KSM pages via MADV_UNMERGEABLE. Acked-by: Peter Xu Signed-off-by: David Hildenbrand --- tools/testing/selftests/vm/ksm_tests.c | 76 +++++++++++++++++++++++++- 1 file changed, 74 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/vm/ksm_tests.c b/tools/testing/selftests/vm/ksm_tests.c index 0d85be2350fa..f9eb4d67e0dd 100644 --- a/tools/testing/selftests/vm/ksm_tests.c +++ b/tools/testing/selftests/vm/ksm_tests.c @@ -40,6 +40,7 @@ enum ksm_test_name { CHECK_KSM_NUMA_MERGE, KSM_MERGE_TIME, KSM_MERGE_TIME_HUGE_PAGES, + KSM_UNMERGE_TIME, KSM_COW_TIME }; @@ -108,7 +109,10 @@ static void print_help(void) " -P evaluate merging time and speed.\n" " For this test, the size of duplicated memory area (in MiB)\n" " must be provided using -s option\n" - " -H evaluate merging time and speed of area allocated mostly with huge pages\n" + " -H evaluate merging time and speed of area allocated mostly with huge pages\n" + " For this test, the size of duplicated memory area (in MiB)\n" + " must be provided using -s option\n" + " -D evaluate unmerging time and speed when disabling KSM.\n" " For this test, the size of duplicated memory area (in MiB)\n" " must be provided using -s option\n" " -C evaluate the time required to break COW of merged pages.\n\n"); @@ -188,6 +192,16 @@ static int ksm_merge_pages(void *addr, size_t size, struct timespec start_time, return 0; } +static int ksm_unmerge_pages(void *addr, size_t size, + struct timespec start_time, int timeout) +{ + if (madvise(addr, size, MADV_UNMERGEABLE)) { + perror("madvise"); + return 1; + } + return 0; +} + static bool assert_ksm_pages_count(long dupl_page_count) { unsigned long max_page_sharing, pages_sharing, pages_shared; @@ -560,6 +574,53 @@ static int ksm_merge_time(int mapping, int prot, int timeout, size_t map_size) return KSFT_FAIL; } +static int ksm_unmerge_time(int mapping, int prot, int timeout, size_t map_size) +{ + void *map_ptr; + struct timespec start_time, end_time; + unsigned long scan_time_ns; + + map_size *= MB; + + map_ptr = allocate_memory(NULL, prot, mapping, '*', map_size); + if (!map_ptr) + return KSFT_FAIL; + if (clock_gettime(CLOCK_MONOTONIC_RAW, &start_time)) { + perror("clock_gettime"); + goto err_out; + } + if (ksm_merge_pages(map_ptr, map_size, start_time, timeout)) + goto err_out; + + if (clock_gettime(CLOCK_MONOTONIC_RAW, &start_time)) { + perror("clock_gettime"); + goto err_out; + } + if (ksm_unmerge_pages(map_ptr, map_size, start_time, timeout)) + goto err_out; + if (clock_gettime(CLOCK_MONOTONIC_RAW, &end_time)) { + perror("clock_gettime"); + goto err_out; + } + + scan_time_ns = (end_time.tv_sec - start_time.tv_sec) * NSEC_PER_SEC + + (end_time.tv_nsec - start_time.tv_nsec); + + printf("Total size: %lu MiB\n", map_size / MB); + printf("Total time: %ld.%09ld s\n", scan_time_ns / NSEC_PER_SEC, + scan_time_ns % NSEC_PER_SEC); + printf("Average speed: %.3f MiB/s\n", (map_size / MB) / + ((double)scan_time_ns / NSEC_PER_SEC)); + + munmap(map_ptr, map_size); + return KSFT_PASS; + +err_out: + printf("Not OK\n"); + munmap(map_ptr, map_size); + return KSFT_FAIL; +} + static int ksm_cow_time(int mapping, int prot, int timeout, size_t page_size) { void *map_ptr; @@ -644,7 +705,7 @@ int main(int argc, char *argv[]) bool merge_across_nodes = KSM_MERGE_ACROSS_NODES_DEFAULT; long size_MB = 0; - while ((opt = getopt(argc, argv, "ha:p:l:z:m:s:MUZNPCH")) != -1) { + while ((opt = getopt(argc, argv, "ha:p:l:z:m:s:MUZNPCHD")) != -1) { switch (opt) { case 'a': prot = str_to_prot(optarg); @@ -701,6 +762,9 @@ int main(int argc, char *argv[]) case 'H': test_name = KSM_MERGE_TIME_HUGE_PAGES; break; + case 'D': + test_name = KSM_UNMERGE_TIME; + break; case 'C': test_name = KSM_COW_TIME; break; @@ -762,6 +826,14 @@ int main(int argc, char *argv[]) ret = ksm_merge_hugepages_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, ksm_scan_limit_sec, size_MB); break; + case KSM_UNMERGE_TIME: + if (size_MB == 0) { + printf("Option '-s' is required.\n"); + return KSFT_FAIL; + } + ret = ksm_unmerge_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, + ksm_scan_limit_sec, size_MB); + break; case KSM_COW_TIME: ret = ksm_cow_time(MAP_PRIVATE | MAP_ANONYMOUS, prot, ksm_scan_limit_sec, page_size); From patchwork Fri Oct 21 10:11:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 6609 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4242:0:0:0:0:0 with SMTP id s2csp608656wrr; Fri, 21 Oct 2022 03:14:48 -0700 (PDT) X-Google-Smtp-Source: AMsMyM78e7pe7ekdNk2WjHZRXX6ILYkgwPKwhx/vI1OOcyByvdHQGeQU0pCw/OrmwF8Yv3oS3EYw X-Received: by 2002:a17:906:cc4d:b0:78d:fb86:3979 with SMTP id mm13-20020a170906cc4d00b0078dfb863979mr15223744ejb.421.1666347288347; Fri, 21 Oct 2022 03:14:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666347288; cv=none; d=google.com; s=arc-20160816; b=DozYt7BuKH0JySdBFR1qkAR/kyqAHFcq+HBE4lpVI5+2hZQleRbnDMJe09HZDTCo0V 0j55qXoRvWgxKU7XfdcPJP+3KFWKB0H+6sK7bWPmtgDZX1z7/jnv2hcK66sx3ufsDMqv L+gIsJi18+ZHtffRCb4CwyVXJzUUooHGarT8pyAS393aDkyZRjbsE3PbAQeFgn9V//fY oBTPtxti9RLdJzI5OsFYkEqC49jdSLM9qNecg7uLN7VxruNmYiTewn/a0fgHdbqMOjzL /Jy2EqbeNpt+KG1SGFikOLC1ng1liW6bLRekEGsnpONdeK5sZm4ixaAxBCPcEgM4yr2R 2I/w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=nRQc9H/82kxOKI2FpppEsEOSQhzrh5bTImBzbqeGf2g=; b=HZebfoEWQhnrJHuQYk2uq/XUNgoAn7UrnI95KI4Nv+Y/kbjL7257QAc9C7vVKZFWK1 12x30/yP/GcW+9f//mnHn/d4pYbLZkTsedYZ0giFliK13t07pwJl3dTU+qXVMToguPJZ n6bqMw2tuFwAUP7BwTquCw+Xlzrb/YIG6byK3DAvTHJoG6fbIBZJcUnyF8WDvtu1U3SV KguIMR/NQvyPQJQjTysSN2dSrZiLN3LAISRbKsMvYB20RRXLhlme9JtEAKm6uAPR/7JM VHacNqI3h6WP8zdZCr5oOZfofN53mQx0hHEQnUD/LXXK68ljrKN4njF3ttYlWOZ4B1Nf uh+A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JQj3pDlH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t25-20020a1709064f1900b00783d969f337si15672094eju.307.2022.10.21.03.14.23; Fri, 21 Oct 2022 03:14:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=JQj3pDlH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230336AbiJUKMk (ORCPT + 99 others); Fri, 21 Oct 2022 06:12:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55366 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230251AbiJUKMi (ORCPT ); Fri, 21 Oct 2022 06:12:38 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9BD037AB12 for ; Fri, 21 Oct 2022 03:12:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666347155; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nRQc9H/82kxOKI2FpppEsEOSQhzrh5bTImBzbqeGf2g=; b=JQj3pDlH+Skurj50/0/yVO8Y0o8VP+o/S4WEn4RGNxh0qOmdZRPg+5ESOELoCRxlj4GCxr dF/2eohpENEx3s06MRZbMSg3qpNdnvnWIxdw8ZTWGyogfVvchWEKPQb7c8v03IJDFP0utM aYci/S8AIdAr2kszGgrZTOXG2+WIvZ0= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-563-uUm3cRV3OvSB9KP2Undt3w-1; Fri, 21 Oct 2022 06:12:34 -0400 X-MC-Unique: uUm3cRV3OvSB9KP2Undt3w-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 66259857D0A; Fri, 21 Oct 2022 10:12:17 +0000 (UTC) Received: from t480s.fritz.box (unknown [10.39.193.99]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9C8A040C95B0; Fri, 21 Oct 2022 10:12:04 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v2 2/9] mm/ksm: simplify break_ksm() to not rely on VM_FAULT_WRITE Date: Fri, 21 Oct 2022 12:11:34 +0200 Message-Id: <20221021101141.84170-3-david@redhat.com> In-Reply-To: <20221021101141.84170-1-david@redhat.com> References: <20221021101141.84170-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747291774238105456?= X-GMAIL-MSGID: =?utf-8?q?1747291774238105456?= Now that GUP no longer requires VM_FAULT_WRITE, break_ksm() is the sole remaining user of VM_FAULT_WRITE. As we also want to stop triggering a fake write fault and instead use FAULT_FLAG_UNSHARE -- similar to GUP-triggered unsharing when taking a R/O pin on a shared anonymous page (including KSM pages), let's stop relying on VM_FAULT_WRITE. Let's rework break_ksm() to not rely on the return value of handle_mm_fault() anymore to figure out whether COW-breaking was successful. Simply perform another follow_page() lookup to verify the result. While this makes break_ksm() slightly less efficient, we can simplify handle_mm_fault() a little and easily switch to FAULT_FLAG_UNSHARE without introducing similar KSM-specific behavior for FAULT_FLAG_UNSHARE. In my setup (AMD Ryzen 9 3900X), running the KSM selftest to test unmerge performance on 2 GiB (taskset 0x8 ./ksm_tests -D -s 2048), this results in a performance degradation of ~4% -- 5% (old: ~5250 MiB/s, new: ~5010 MiB/s). I don't think that we particularly care about that performance drop when unmerging. If it ever turns out to be an actual performance issue, we can think about a better alternative for FAULT_FLAG_UNSHARE -- let's just keep it simple for now. Acked-by: Peter Xu Signed-off-by: David Hildenbrand --- mm/ksm.c | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index c19fcca9bc03..b884a22f3c3c 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -440,26 +440,27 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr) vm_fault_t ret = 0; do { + bool ksm_page = false; + cond_resched(); page = follow_page(vma, addr, FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE); if (IS_ERR_OR_NULL(page)) break; if (PageKsm(page)) - ret = handle_mm_fault(vma, addr, - FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE, - NULL); - else - ret = VM_FAULT_WRITE; + ksm_page = true; put_page(page); - } while (!(ret & (VM_FAULT_WRITE | VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | VM_FAULT_OOM))); + + if (!ksm_page) + return 0; + ret = handle_mm_fault(vma, addr, + FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE, + NULL); + } while (!(ret & (VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | VM_FAULT_OOM))); /* - * We must loop because handle_mm_fault() may back out if there's - * any difficulty e.g. if pte accessed bit gets updated concurrently. - * - * VM_FAULT_WRITE is what we have been hoping for: it indicates that - * COW has been broken, even if the vma does not permit VM_WRITE; - * but note that a concurrent fault might break PageKsm for us. + * We must loop until we no longer find a KSM page because + * handle_mm_fault() may back out if there's any difficulty e.g. if + * pte accessed bit gets updated concurrently. * * VM_FAULT_SIGBUS could occur if we race with truncation of the * backing file, which also invalidates anonymous pages: that's From patchwork Fri Oct 21 10:11:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 6612 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4242:0:0:0:0:0 with SMTP id s2csp609769wrr; Fri, 21 Oct 2022 03:16:36 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5e3OmQl2kYfJkSfBV4SYnuzHNiGCAnTlEnWw3iyH/5uN9h0de5CR/XucCS0UVFKrRVi+1G X-Received: by 2002:a17:907:1de1:b0:78d:9919:38cc with SMTP id og33-20020a1709071de100b0078d991938ccmr14572621ejc.311.1666347395864; Fri, 21 Oct 2022 03:16:35 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666347395; cv=none; d=google.com; s=arc-20160816; b=gupG/YtIOvmCcH/kEG6vxWnhq859+lBPGIflI6ZqE5hbo2BEIqCMO/WMH++a9g8ubK lw7ICsv3J6lKPzVtQhSuNc2o10gxtG30iaB8trVskRwMIbA3fzWPMkStShWGhp3pN+gF uWFnKNBWfSKXCmWUg3C/JHlyYbPlkTxRgdR0O6d1WKKcSRTI46FC4at/uj5c/LDfW+72 W8ltYUD4dSOdBs3+uFuhrHxOGEuf1DoTlceBMi4AB4ozztSHPaBEf1bWZF7CyUy09Yl5 OvwImerwbflkMUORmBDNaupXJsiGRKIjowCDR4UbgABYHafZFIKeMWCcsqR5k5KmsYEk It+w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=uL1RRyPgPkJis952ZjK0SFPNdUV1Kk+hsw2E/TYMifs=; b=x47YASaQ4MxKnKED2Wc29RaCIO/7WIvdFd8DHNGfY9eWJ9FbPbOiME5zTBZUHkxHzw SnMIHdWRn+RAeKZoCyXGzHRakX3tO8pUNFeU7frcp9wX9oRBA8/i2odtNRdNId8oCY68 I45yxZighu67tX7Dtn8wy5EhaszqhlYj67KLp1rq8Pbz37Kk5sfc5WBNiOJvd6f5Iu/v OimjVBf6lrTjlhY2/uNw8x4Y0gZ116FvIV+eqLG8sWAnl8ov2DEB7RDsQpphc2+H6k/9 IcRMTHL6zMOq1xVTjZGK264pZxcvuIg0GreYlDijbkBtwECA2lCUlAzaCHw+S4nlAnuX oANg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="ia0n/9Lm"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gn22-20020a1709070d1600b007429f0c69ccsi23946485ejc.579.2022.10.21.03.16.10; Fri, 21 Oct 2022 03:16:35 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="ia0n/9Lm"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230348AbiJUKMn (ORCPT + 99 others); Fri, 21 Oct 2022 06:12:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55246 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229489AbiJUKMi (ORCPT ); Fri, 21 Oct 2022 06:12:38 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F212E7AB06 for ; Fri, 21 Oct 2022 03:12:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666347155; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uL1RRyPgPkJis952ZjK0SFPNdUV1Kk+hsw2E/TYMifs=; b=ia0n/9LmLL+E7X5Fxzf++n8B2xy0+m31zQQDbijcvPkjmhpAyM8b4IBLEHRWR9J0bJ4iGu +omV9NWudGSrEm1e06xzmalRAtjJRki5m/8z+/aIQeDqa2P3xW7+TBp5uHo/ToYjifITXV faYl8YAXct/3h5Nw4YlNdEQe0KgoiQM= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-638--bIND0qnMVShTdrbuDHPqA-1; Fri, 21 Oct 2022 06:12:31 -0400 X-MC-Unique: -bIND0qnMVShTdrbuDHPqA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 1D72186EB31; Fri, 21 Oct 2022 10:12:26 +0000 (UTC) Received: from t480s.fritz.box (unknown [10.39.193.99]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6896D40D299B; Fri, 21 Oct 2022 10:12:07 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v2 3/9] mm: remove VM_FAULT_WRITE Date: Fri, 21 Oct 2022 12:11:35 +0200 Message-Id: <20221021101141.84170-4-david@redhat.com> In-Reply-To: <20221021101141.84170-1-david@redhat.com> References: <20221021101141.84170-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747291887056945953?= X-GMAIL-MSGID: =?utf-8?q?1747291887056945953?= All users -- GUP and KSM -- are gone, let's just remove it. Acked-by: Peter Xu Signed-off-by: David Hildenbrand --- include/linux/mm_types.h | 3 --- mm/huge_memory.c | 2 +- mm/memory.c | 9 ++++----- 3 files changed, 5 insertions(+), 9 deletions(-) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 500e536796ca..6bc3baced3e3 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -847,7 +847,6 @@ typedef __bitwise unsigned int vm_fault_t; * @VM_FAULT_OOM: Out Of Memory * @VM_FAULT_SIGBUS: Bad access * @VM_FAULT_MAJOR: Page read from storage - * @VM_FAULT_WRITE: Special case for get_user_pages * @VM_FAULT_HWPOISON: Hit poisoned small page * @VM_FAULT_HWPOISON_LARGE: Hit poisoned large page. Index encoded * in upper bits @@ -868,7 +867,6 @@ enum vm_fault_reason { VM_FAULT_OOM = (__force vm_fault_t)0x000001, VM_FAULT_SIGBUS = (__force vm_fault_t)0x000002, VM_FAULT_MAJOR = (__force vm_fault_t)0x000004, - VM_FAULT_WRITE = (__force vm_fault_t)0x000008, VM_FAULT_HWPOISON = (__force vm_fault_t)0x000010, VM_FAULT_HWPOISON_LARGE = (__force vm_fault_t)0x000020, VM_FAULT_SIGSEGV = (__force vm_fault_t)0x000040, @@ -894,7 +892,6 @@ enum vm_fault_reason { { VM_FAULT_OOM, "OOM" }, \ { VM_FAULT_SIGBUS, "SIGBUS" }, \ { VM_FAULT_MAJOR, "MAJOR" }, \ - { VM_FAULT_WRITE, "WRITE" }, \ { VM_FAULT_HWPOISON, "HWPOISON" }, \ { VM_FAULT_HWPOISON_LARGE, "HWPOISON_LARGE" }, \ { VM_FAULT_SIGSEGV, "SIGSEGV" }, \ diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 1cc4a5f4791e..be13fe55b798 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -1379,7 +1379,7 @@ vm_fault_t do_huge_pmd_wp_page(struct vm_fault *vmf) if (pmdp_set_access_flags(vma, haddr, vmf->pmd, entry, 1)) update_mmu_cache_pmd(vma, vmf->address, vmf->pmd); spin_unlock(vmf->ptl); - return VM_FAULT_WRITE; + return 0; } unlock_fallback: diff --git a/mm/memory.c b/mm/memory.c index f88c351aecd4..8e72f703ed99 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3242,7 +3242,7 @@ static vm_fault_t wp_page_copy(struct vm_fault *vmf) } delayacct_wpcopy_end(); - return (page_copied && !unshare) ? VM_FAULT_WRITE : 0; + return 0; oom_free_new: put_page(new_page); oom: @@ -3306,14 +3306,14 @@ static vm_fault_t wp_pfn_shared(struct vm_fault *vmf) return finish_mkwrite_fault(vmf); } wp_page_reuse(vmf); - return VM_FAULT_WRITE; + return 0; } static vm_fault_t wp_page_shared(struct vm_fault *vmf) __releases(vmf->ptl) { struct vm_area_struct *vma = vmf->vma; - vm_fault_t ret = VM_FAULT_WRITE; + vm_fault_t ret = 0; get_page(vmf->page); @@ -3464,7 +3464,7 @@ static vm_fault_t do_wp_page(struct vm_fault *vmf) return 0; } wp_page_reuse(vmf); - return VM_FAULT_WRITE; + return 0; } else if (unshare) { /* No anonymous page -> nothing to do. */ pte_unmap_unlock(vmf->pte, vmf->ptl); @@ -3983,7 +3983,6 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) if (vmf->flags & FAULT_FLAG_WRITE) { pte = maybe_mkwrite(pte_mkdirty(pte), vma); vmf->flags &= ~FAULT_FLAG_WRITE; - ret |= VM_FAULT_WRITE; } rmap_flags |= RMAP_EXCLUSIVE; } From patchwork Fri Oct 21 10:11:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 6613 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4242:0:0:0:0:0 with SMTP id s2csp609869wrr; Fri, 21 Oct 2022 03:16:46 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4ZgcI6Do2FktZ6VjzdExMfuenEGgIGosXwe3F7/a7Nh1bDAqOkGvGH70ow3ENKM7WFBPcs X-Received: by 2002:a05:6402:414d:b0:451:73f0:e113 with SMTP id x13-20020a056402414d00b0045173f0e113mr16906586eda.207.1666347406363; Fri, 21 Oct 2022 03:16:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666347406; cv=none; d=google.com; s=arc-20160816; b=dNDKXRu38bVCwG5dAvwnrgFVIforCxQmhKI4sVtL0bfPjrFKGgiUgw9LosBHBM+1oV DLGMkPT44+9gEZLqbd0NDg23PSSHkK+toJoBj+rR2+WMKCs8zURzpHLXdkQ91yXkXq12 U6vj7PpZBsfsdbAyKsUZOUxNrfA/YZrCCO1hGK/XWun1vhs7v4Rq9pFB7GZU1J9JYeco Z23uQhG4Gc+lr7a7tGOaoLe+oC8GdKjbdzh6l0C/avrgrWK1OVzDJYNeVbvGWusi8uSf 0n0FNZG/b2Nr7CZBVtF2SVDTJkJR0b3L1+mnsoa6KnMHClp6O7oTvU4+L2gPmlesoCcC nWsw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=uXtz/Zj1ptThtlYCmULXjCE80g7BkkS7MWtxrW61b/8=; b=J4GUGCS1uoElf515zlGSvmRtF3Rbqc3r5V+DcbrjW9qakqWH8iIoJTbIj9iiwr610S UXcQAbc+smK8aJzIM9i1YRs3XiYQkcVyI0pG1uJWlh0tDNwkxsaPY0bEn7Z+RNEkORFO tZcpMXYZTHcK15trvw+GLAJF2cLf75g6HEMIpM6BpTDpPJcI0vadiyzrAbC6kzvGLb+e XDPqqlwQDFWMQC+k4iVTvVyBDpGLuDpQhAd7uCMPt9ZWFsvc+wccIGcVGcvvWRbKO6vr 4koW5SHRBDR4Gc/FyAyLHCsOep1a72cF48to9Spy/kYEhmKeknJhruJXTkHbPXhPtYxH GIbQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Fjgu4sv1; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id xg14-20020a170907320e00b0078217f3a033si22251910ejb.652.2022.10.21.03.16.20; Fri, 21 Oct 2022 03:16:46 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Fjgu4sv1; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230357AbiJUKMs (ORCPT + 99 others); Fri, 21 Oct 2022 06:12:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55516 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230325AbiJUKMj (ORCPT ); Fri, 21 Oct 2022 06:12:39 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A1E077AB08 for ; Fri, 21 Oct 2022 03:12:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666347156; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=uXtz/Zj1ptThtlYCmULXjCE80g7BkkS7MWtxrW61b/8=; b=Fjgu4sv1/rOMA7veGfxs0A1diuaZTGj+y/eRW66zdzGCluqqAChFzb262UrsylH5rdL0hF bxH28kO9tWcP9ZpF8F+YyZ+MOoJVZXyhP1tLz00fRKAC0X8avsZ/iUMxpnOJhI/RliTnDV UxpGbkjDKrI+HEGdfxBcfwdr5JF0HMU= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-277-JeVJkWYEM-2dEQrYwBCSRw-1; Fri, 21 Oct 2022 06:12:33 -0400 X-MC-Unique: JeVJkWYEM-2dEQrYwBCSRw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A6D5B3C0F248; Fri, 21 Oct 2022 10:12:32 +0000 (UTC) Received: from t480s.fritz.box (unknown [10.39.193.99]) by smtp.corp.redhat.com (Postfix) with ESMTP id 62B1440B4976; Fri, 21 Oct 2022 10:12:15 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v2 4/9] selftests/vm: add KSM unmerge tests Date: Fri, 21 Oct 2022 12:11:36 +0200 Message-Id: <20221021101141.84170-5-david@redhat.com> In-Reply-To: <20221021101141.84170-1-david@redhat.com> References: <20221021101141.84170-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747291898143385525?= X-GMAIL-MSGID: =?utf-8?q?1747291898143385525?= Let's add three unmerge tests (MADV_UNMERGEABLE unmerging all pages in the range). test_unmerge(): basic unmerge tests test_unmerge_discarded(): have some pte_none() entries in the range test_unmerge_uffd_wp(): protect the merged pages using uffd-wp ksm_tests.c currently contains a mixture of benchmarks and tests, whereby each test is carried out by executing the ksm_tests binary with specific parameters. Let's add new ksm_functional_tests.c that performs multiple, smaller functional tests all at once. Signed-off-by: David Hildenbrand --- tools/testing/selftests/vm/Makefile | 2 + .../selftests/vm/ksm_functional_tests.c | 279 ++++++++++++++++++ tools/testing/selftests/vm/run_vmtests.sh | 2 + tools/testing/selftests/vm/vm_util.c | 10 + tools/testing/selftests/vm/vm_util.h | 1 + 5 files changed, 294 insertions(+) create mode 100644 tools/testing/selftests/vm/ksm_functional_tests.c diff --git a/tools/testing/selftests/vm/Makefile b/tools/testing/selftests/vm/Makefile index 163c2fde3cb3..2d640a48255c 100644 --- a/tools/testing/selftests/vm/Makefile +++ b/tools/testing/selftests/vm/Makefile @@ -52,6 +52,7 @@ TEST_GEN_FILES += userfaultfd TEST_GEN_PROGS += soft-dirty TEST_GEN_PROGS += split_huge_page_test TEST_GEN_FILES += ksm_tests +TEST_GEN_PROGS += ksm_functional_tests ifeq ($(MACHINE),x86_64) CAN_BUILD_I386 := $(shell ./../x86/check_cc.sh "$(CC)" ../x86/trivial_32bit_program.c -m32) @@ -96,6 +97,7 @@ TEST_FILES += va_128TBswitch.sh include ../lib.mk $(OUTPUT)/khugepaged: vm_util.c +$(OUTPUT)/ksm_functional_tests: vm_util.c $(OUTPUT)/madv_populate: vm_util.c $(OUTPUT)/soft-dirty: vm_util.c $(OUTPUT)/split_huge_page_test: vm_util.c diff --git a/tools/testing/selftests/vm/ksm_functional_tests.c b/tools/testing/selftests/vm/ksm_functional_tests.c new file mode 100644 index 000000000000..96644be68962 --- /dev/null +++ b/tools/testing/selftests/vm/ksm_functional_tests.c @@ -0,0 +1,279 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * KSM functional tests + * + * Copyright 2022, Red Hat, Inc. + * + * Author(s): David Hildenbrand + */ +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "../kselftest.h" +#include "vm_util.h" + +#define KiB 1024u +#define MiB (1024 * KiB) + +static int ksm_fd; +static int ksm_full_scans_fd; +static int pagemap_fd; +static size_t pagesize; + +static bool range_maps_duplicates(char *addr, unsigned long size) +{ + unsigned long offs_a, offs_b, pfn_a, pfn_b; + + /* + * There is no easy way to check if there are KSM pages mapped into + * this range. We only check that the range does not map the same PFN + * twice by comaring each pair of mapped pages. + */ + for (offs_a = 0; offs_a < size; offs_a += pagesize) { + pfn_a = pagemap_get_pfn(pagemap_fd, addr + offs_a); + /* Page not present or PFN not exposed by the kernel. */ + if (pfn_a == -1ull || !pfn_a) + continue; + + for (offs_b = offs_a + pagesize; offs_b < size; + offs_b += pagesize) { + pfn_b = pagemap_get_pfn(pagemap_fd, addr + offs_b); + if (pfn_b == -1ull || !pfn_b) + continue; + if (pfn_a == pfn_b) + return true; + } + } + return false; +} + +static long ksm_get_full_scans(void) +{ + char buf[10]; + ssize_t ret; + + ret = pread(ksm_full_scans_fd, buf, sizeof(buf) - 1, 0); + if (ret <= 0) + return -errno; + buf[ret] = 0; + + return strtol(buf, NULL, 10); +} + +static int ksm_merge(void) +{ + long start_scans, end_scans; + + /* Wait for two full scans such that any possible merging happened. */ + start_scans = ksm_get_full_scans(); + if (start_scans < 0) + return start_scans; + if (write(ksm_fd, "1", 1) != 1) + return -errno; + do { + end_scans = ksm_get_full_scans(); + if (end_scans < 0) + return end_scans; + } while (end_scans < start_scans + 2); + + return 0; +} + +static char *mmap_and_merge_range(char val, unsigned long size) +{ + char *map; + + map = mmap(NULL, size, PROT_READ|PROT_WRITE, + MAP_PRIVATE|MAP_ANON, -1, 0); + if (map == MAP_FAILED) { + ksft_test_result_fail("mmap() failed\n"); + return MAP_FAILED; + } + + /* Don't use THP. Ignore if THP are not around on a kernel. */ + if (madvise(map, size, MADV_NOHUGEPAGE) && errno != EINVAL) { + ksft_test_result_fail("MADV_NOHUGEPAGE failed\n"); + goto unmap; + } + + /* Make sure each page contains the same values to merge them. */ + memset(map, val, size); + if (madvise(map, size, MADV_MERGEABLE)) { + ksft_test_result_fail("MADV_MERGEABLE failed\n"); + goto unmap; + } + + /* Run KSM to trigger merging and wait. */ + if (ksm_merge()) { + ksft_test_result_fail("Running KSM failed\n"); + goto unmap; + } + return map; +unmap: + munmap(map, size); + return MAP_FAILED; +} + +static void test_unmerge(void) +{ + const unsigned int size = 2 * MiB; + char *map; + + ksft_print_msg("[RUN] %s\n", __func__); + + map = mmap_and_merge_range(0xcf, size); + if (map == MAP_FAILED) + return; + + if (madvise(map, size, MADV_UNMERGEABLE)) { + ksft_test_result_fail("MADV_UNMERGEABLE failed\n"); + goto unmap; + } + + ksft_test_result(!range_maps_duplicates(map, size), + "Pages were unmerged\n"); +unmap: + munmap(map, size); +} + +static void test_unmerge_discarded(void) +{ + const unsigned int size = 2 * MiB; + char *map; + + ksft_print_msg("[RUN] %s\n", __func__); + + map = mmap_and_merge_range(0xcf, size); + if (map == MAP_FAILED) + return; + + /* Discard half of all mapped pages so we have pte_none() entries. */ + if (madvise(map, size / 2, MADV_DONTNEED)) { + ksft_test_result_fail("MADV_DONTNEED failed\n"); + goto unmap; + } + + if (madvise(map, size, MADV_UNMERGEABLE)) { + ksft_test_result_fail("MADV_UNMERGEABLE failed\n"); + goto unmap; + } + + ksft_test_result(!range_maps_duplicates(map, size), + "Pages were unmerged\n"); +unmap: + munmap(map, size); +} + +#ifdef __NR_userfaultfd +static void test_unmerge_uffd_wp(void) +{ + struct uffdio_writeprotect uffd_writeprotect; + struct uffdio_register uffdio_register; + const unsigned int size = 2 * MiB; + struct uffdio_api uffdio_api; + char *map; + int uffd; + + ksft_print_msg("[RUN] %s\n", __func__); + + map = mmap_and_merge_range(0xcf, size); + if (map == MAP_FAILED) + return; + + /* See if UFFD is around. */ + uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK); + if (uffd < 0) { + ksft_test_result_skip("__NR_userfaultfd failed\n"); + goto unmap; + } + + /* See if UFFD-WP is around. */ + uffdio_api.api = UFFD_API; + uffdio_api.features = UFFD_FEATURE_PAGEFAULT_FLAG_WP; + if (ioctl(uffd, UFFDIO_API, &uffdio_api) < 0) { + ksft_test_result_fail("UFFDIO_API failed\n"); + goto close_uffd; + } + if (!(uffdio_api.features & UFFD_FEATURE_PAGEFAULT_FLAG_WP)) { + ksft_test_result_skip("UFFD_FEATURE_PAGEFAULT_FLAG_WP not available\n"); + goto close_uffd; + } + + /* Register UFFD-WP, no need for an actual handler. */ + uffdio_register.range.start = (unsigned long) map; + uffdio_register.range.len = size; + uffdio_register.mode = UFFDIO_REGISTER_MODE_WP; + if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) < 0) { + ksft_test_result_fail("UFFDIO_REGISTER_MODE_WP failed\n"); + goto close_uffd; + } + + /* Write-protect the range using UFFD-WP. */ + uffd_writeprotect.range.start = (unsigned long) map; + uffd_writeprotect.range.len = size; + uffd_writeprotect.mode = UFFDIO_WRITEPROTECT_MODE_WP; + if (ioctl(uffd, UFFDIO_WRITEPROTECT, &uffd_writeprotect)) { + ksft_test_result_fail("UFFDIO_WRITEPROTECT failed\n"); + goto close_uffd; + } + + if (madvise(map, size, MADV_UNMERGEABLE)) { + ksft_test_result_fail("MADV_UNMERGEABLE failed\n"); + goto close_uffd; + } + + ksft_test_result(!range_maps_duplicates(map, size), + "Pages were unmerged\n"); +close_uffd: + close(uffd); +unmap: + munmap(map, size); +} +#endif + +int main(int argc, char **argv) +{ + unsigned int tests = 2; + int err; + +#ifdef __NR_userfaultfd + tests++; +#endif + + ksft_print_header(); + ksft_set_plan(tests); + + pagesize = getpagesize(); + + ksm_fd = open("/sys/kernel/mm/ksm/run", O_RDWR); + if (ksm_fd < 0) + ksft_exit_skip("open(\"/sys/kernel/mm/ksm/run\") failed\n"); + ksm_full_scans_fd = open("/sys/kernel/mm/ksm/full_scans", O_RDONLY); + if (ksm_full_scans_fd < 0) + ksft_exit_skip("open(\"/sys/kernel/mm/ksm/full_scans\") failed\n"); + pagemap_fd = open("/proc/self/pagemap", O_RDONLY); + if (pagemap_fd < 0) + ksft_exit_skip("open(\"/proc/self/pagemap\") failed\n"); + + test_unmerge(); + test_unmerge_discarded(); +#ifdef __NR_userfaultfd + test_unmerge_uffd_wp(); +#endif + + err = ksft_get_fail_cnt(); + if (err) + ksft_exit_fail_msg("%d out of %d tests failed\n", + err, ksft_test_num()); + return ksft_exit_pass(); +} diff --git a/tools/testing/selftests/vm/run_vmtests.sh b/tools/testing/selftests/vm/run_vmtests.sh index e780e76c26b8..b8950891259b 100755 --- a/tools/testing/selftests/vm/run_vmtests.sh +++ b/tools/testing/selftests/vm/run_vmtests.sh @@ -184,6 +184,8 @@ run_test ./ksm_tests -N -m 1 # KSM test with 2 NUMA nodes and merge_across_nodes = 0 run_test ./ksm_tests -N -m 0 +run_test ./ksm_functional_tests + # protection_keys tests if [ -x ./protection_keys_32 ] then diff --git a/tools/testing/selftests/vm/vm_util.c b/tools/testing/selftests/vm/vm_util.c index f11f8adda521..dbd8889324e6 100644 --- a/tools/testing/selftests/vm/vm_util.c +++ b/tools/testing/selftests/vm/vm_util.c @@ -28,6 +28,16 @@ bool pagemap_is_softdirty(int fd, char *start) return entry & 0x0080000000000000ull; } +unsigned long pagemap_get_pfn(int fd, char *start) +{ + uint64_t entry = pagemap_get_entry(fd, start); + + /* If present (63th bit), PFN is at bit 0 -- 54. */ + if (entry & 0x8000000000000000ull) + return entry & 0x007fffffffffffffull; + return -1ull; +} + void clear_softdirty(void) { int ret; diff --git a/tools/testing/selftests/vm/vm_util.h b/tools/testing/selftests/vm/vm_util.h index 5c35de454e08..acecb5b6f8ca 100644 --- a/tools/testing/selftests/vm/vm_util.h +++ b/tools/testing/selftests/vm/vm_util.h @@ -4,6 +4,7 @@ uint64_t pagemap_get_entry(int fd, char *start); bool pagemap_is_softdirty(int fd, char *start); +unsigned long pagemap_get_pfn(int fd, char *start); void clear_softdirty(void); bool check_for_pattern(FILE *fp, const char *pattern, char *buf, size_t len); uint64_t read_pmd_pagesize(void); From patchwork Fri Oct 21 10:11:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 6617 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4242:0:0:0:0:0 with SMTP id s2csp610360wrr; Fri, 21 Oct 2022 03:17:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5ze5VAeipuaOya+wgVgtyeLQLBaAdYB0J71557Tg4+rzAcxWF/dDX7WZ5qmYOLu9hsmAyE X-Received: by 2002:a17:906:3fd2:b0:78d:b793:5ef9 with SMTP id k18-20020a1709063fd200b0078db7935ef9mr14353514ejj.496.1666347449007; Fri, 21 Oct 2022 03:17:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666347449; cv=none; d=google.com; s=arc-20160816; b=u2c4D3wfzWSp2cTK5x75y7oZfHuaRMNwMozZOd7qTxk8jAV6EHZXCdBk6r1ZI9OlaY ot9iZC0acuJSWQiCajwgs9cqGFX+Cvt1kDXZRlI3+4r0pJfAD4M4eBetKNI8Yl4l1eWa 0RYBhLoaOxakpWa7Iyvtt6adbX4MRHfPBERRz/+YjaPZl1BahND+PDpf1mo4/ShLcGTZ ROsrUn0hNbPrjLHo7V2z/LW2xAu6puPQLpPhBcWoYLt21IHdolq/pIaeGyyrO18pn2cq c/T9gYLz+nM0A4+swQBw4K7j/Afv5+KtL+kOsUSbJb/hD1mZTjlZw9B2oR3eqOiNnRfB S0wA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=jHrPlMhzTs5FfUPfL2J2Q5lLuPRkNLMRcyfMgFj4nT4=; b=Irjz2y5XuDPnMRBR+U7DctOpp2ZMNiC1F3Y84Qs4JrmjZkOrG/YyRZs7th0Dq+Cw7K PicimJSPAiisrY5FfssMd/22BcDpps2dhv4e5VDTV3fp3pSqON+W3PThv2h4Jt7ZThb8 HBiKYbnIPNdZjSnpG8h46mnNVs27l9IW+H0Zwd39y/DxGpYqYxTCtvuyYdjXmNaxnLld vve9OLIgioXKXz8X38db+PfePinRRhPRo6B2fPjci2/FBlhapQEMvPpxead0cYRrx9gV LBXSU+zSpiB0+01I5L324Znt8VMH40yAcCql8Oui4I4GO/eo1f06rAJTNJhem04i73/a b2Ww== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=N0Ce5rcV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q14-20020a056402032e00b004611d1ec715si2817832edw.293.2022.10.21.03.17.04; Fri, 21 Oct 2022 03:17:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=N0Ce5rcV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230393AbiJUKNG (ORCPT + 99 others); Fri, 21 Oct 2022 06:13:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55572 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230334AbiJUKMm (ORCPT ); Fri, 21 Oct 2022 06:12:42 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D72FF7AB12 for ; Fri, 21 Oct 2022 03:12:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666347158; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=jHrPlMhzTs5FfUPfL2J2Q5lLuPRkNLMRcyfMgFj4nT4=; b=N0Ce5rcVUBQr3gTezV1zh0XBJS1P/gMC0KTXioI/3eSQJG+DnXbbumJGMxAX+2gvVaiLS8 sp0PiYrQmi6MsFHf/21NsbIXhGffhXvssfr7dApHsqE5h1HNZX/HUXDe6pEsTy9IxQRECh YtAjQdjRHmmbMArpoyVQxxbnbR9LYPs= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-466-dn75feFuN_6Iop431YOpQQ-1; Fri, 21 Oct 2022 06:12:34 -0400 X-MC-Unique: dn75feFuN_6Iop431YOpQQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 36AF23815D31; Fri, 21 Oct 2022 10:12:34 +0000 (UTC) Received: from t480s.fritz.box (unknown [10.39.193.99]) by smtp.corp.redhat.com (Postfix) with ESMTP id D682940E42FB; Fri, 21 Oct 2022 10:12:22 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v2 5/9] mm/ksm: fix KSM COW breaking with userfaultfd-wp via FAULT_FLAG_UNSHARE Date: Fri, 21 Oct 2022 12:11:37 +0200 Message-Id: <20221021101141.84170-6-david@redhat.com> In-Reply-To: <20221021101141.84170-1-david@redhat.com> References: <20221021101141.84170-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747291942907404985?= X-GMAIL-MSGID: =?utf-8?q?1747291942907404985?= Let's stop breaking COW via a fake write fault and let's use FAULT_FLAG_UNSHARE instead. This avoids any wrong side effects of the fake write fault, such as mapping the PTE writable and marking the pte dirty/softdirty. Consequently, we will no longer trigger a fake write fault and break COW without any such side-effects. Also, this fixes KSM interaction with userfaultfd-wp: when we have a KSM page that's write-protected by userfaultfd, break_ksm()->handle_mm_fault() will fail with VM_FAULT_SIGBUS and will simply return in break_ksm() with 0 instead of actually breaking COW. For now, the KSM unmerge tests can trigger that: $ sudo ./ksm_functional_tests TAP version 13 1..3 # [RUN] test_unmerge ok 1 Pages were unmerged # [RUN] test_unmerge_discarded ok 2 Pages were unmerged # [RUN] test_unmerge_uffd_wp not ok 3 Pages were unmerged Bail out! 1 out of 3 tests failed # Planned tests != run tests (2 != 3) # Totals: pass:2 fail:1 xfail:0 xpass:0 skip:0 error:0 The warning in dmesg also indicates this wrong handling: [ 230.096368] FAULT_FLAG_ALLOW_RETRY missing 881 [ 230.100822] CPU: 1 PID: 1643 Comm: ksm-uffd-wp [...] [ 230.110124] Hardware name: [...] [ 230.117775] Call Trace: [ 230.120227] [ 230.122334] dump_stack_lvl+0x44/0x5c [ 230.126010] handle_userfault.cold+0x14/0x19 [ 230.130281] ? tlb_finish_mmu+0x65/0x170 [ 230.134207] ? uffd_wp_range+0x65/0xa0 [ 230.137959] ? _raw_spin_unlock+0x15/0x30 [ 230.141972] ? do_wp_page+0x50/0x590 [ 230.145551] __handle_mm_fault+0x9f5/0xf50 [ 230.149652] ? mmput+0x1f/0x40 [ 230.152712] handle_mm_fault+0xb9/0x2a0 [ 230.156550] break_ksm+0x141/0x180 [ 230.159964] unmerge_ksm_pages+0x60/0x90 [ 230.163890] ksm_madvise+0x3c/0xb0 [ 230.167295] do_madvise.part.0+0x10c/0xeb0 [ 230.171396] ? do_syscall_64+0x67/0x80 [ 230.175157] __x64_sys_madvise+0x5a/0x70 [ 230.179082] do_syscall_64+0x58/0x80 [ 230.182661] ? do_syscall_64+0x67/0x80 [ 230.186413] entry_SYSCALL_64_after_hwframe+0x63/0xcd This is primarily a fix for KSM+userfaultfd-wp, however, the fake write fault was always questionable. As this fix is not easy to backport and it's not very critical, let's not cc stable. Fixes: 529b930b87d9 ("userfaultfd: wp: hook userfault handler to write protection fault") Acked-by: Peter Xu Signed-off-by: David Hildenbrand --- mm/ksm.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index b884a22f3c3c..c6f58aa6e731 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -420,17 +420,15 @@ static inline bool ksm_test_exit(struct mm_struct *mm) } /* - * We use break_ksm to break COW on a ksm page: it's a stripped down + * We use break_ksm to break COW on a ksm page by triggering unsharing, + * such that the ksm page will get replaced by an exclusive anonymous page. * - * if (get_user_pages(addr, 1, FOLL_WRITE, &page, NULL) == 1) - * put_page(page); - * - * but taking great care only to touch a ksm page, in a VM_MERGEABLE vma, + * We take great care only to touch a ksm page, in a VM_MERGEABLE vma, * in case the application has unmapped and remapped mm,addr meanwhile. * Could a ksm page appear anywhere else? Actually yes, in a VM_PFNMAP * mmap of /dev/mem, where we would not want to touch it. * - * FAULT_FLAG/FOLL_REMOTE are because we do this outside the context + * FAULT_FLAG_REMOTE/FOLL_REMOTE are because we do this outside the context * of the process that owns 'vma'. We also do not want to enforce * protection keys here anyway. */ @@ -454,7 +452,7 @@ static int break_ksm(struct vm_area_struct *vma, unsigned long addr) if (!ksm_page) return 0; ret = handle_mm_fault(vma, addr, - FAULT_FLAG_WRITE | FAULT_FLAG_REMOTE, + FAULT_FLAG_UNSHARE | FAULT_FLAG_REMOTE, NULL); } while (!(ret & (VM_FAULT_SIGBUS | VM_FAULT_SIGSEGV | VM_FAULT_OOM))); /* From patchwork Fri Oct 21 10:11:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 6615 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4242:0:0:0:0:0 with SMTP id s2csp609989wrr; Fri, 21 Oct 2022 03:16:57 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4t2fmaKP9Xov+42u54IvB2ef4eV4yMC9Cf7hVjPpvD1J39eVUQjHRbQprl4SzPhRtxudnY X-Received: by 2002:a17:907:2cea:b0:78d:eac6:2d09 with SMTP id hz10-20020a1709072cea00b0078deac62d09mr15014282ejc.124.1666347416828; Fri, 21 Oct 2022 03:16:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666347416; cv=none; d=google.com; s=arc-20160816; b=DJxXyga6c1cqD5D2TyvoxKFzWPVtu37Mzb8V2oQEvDgS0Z85vvuo6Hudpy3HPQiD7y RpQ/Fp/+B9bbX1StSRgeEcpQ/JI4qm5MnPE1xuV2TkmCqkeon3qbh2tP4cR23pOM8x4X GLIO45Q8r6q3rUSzFjKNkcDztdKHpMkk4gEg9RRqIoA5Ng5hFaHSX07/uIW39ZqdQs6O bCHBLYI5ZsPfVSJBiEyIBFyI2Hy+4ZUOuR1tlwBAH3k+VIICI0aTeqh1QY6XKnjQg2xq 3SH5XvGwUhk6A/PprpL41kzHjpA2X98LeY2Rrss3IDuhy6yvQx4MH2vh74RKblUn/kXQ /tTg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=QvbHA1NlNVduFtzvLq26bYOPMz6FpLJa39idJS7z9qQ=; b=Nk9x83YgQ1uQDqCWtKxKoW26zBNPyhFrWmZIANVm06ihSaf6GMztru3B+lURVhIyLh LiGawEbeB1WdUE6cv2nUzy3T/jB9KoqySPPgr75EdcOlsF88YmmMUvcqMvXVJv2mf12V gdbR/C+Nyleeu/d2mRPruxhkRebvoJYFWQaM7Uq8pB0yoJE0onTQ5hkdb8BXDRTlDLm4 bKbdowkLBy0pmbHPW6yt/a4hbp9uTbAGLwdO0FLfgCL23AVE1daI7Gq6AxmZTA+8Wl/D kyg+G4KztrrA4fozw6+6cWhkfQ9zuUKNg/ozXyigtOalsfxZlRWxRVWBscnryRB8EChA yxCg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=D0fWQBBv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id fe8-20020a056402390800b004612505ece4si2776756edb.483.2022.10.21.03.16.32; Fri, 21 Oct 2022 03:16:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=D0fWQBBv; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230368AbiJUKMw (ORCPT + 99 others); Fri, 21 Oct 2022 06:12:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55366 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230337AbiJUKMk (ORCPT ); Fri, 21 Oct 2022 06:12:40 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2905A7AB17 for ; Fri, 21 Oct 2022 03:12:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666347157; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=QvbHA1NlNVduFtzvLq26bYOPMz6FpLJa39idJS7z9qQ=; b=D0fWQBBvQLAzhX2zjMOZNPgyUEE8F7g2aGliW+hHkYnMg5vL9pzFrRzAGm2lghQCUaoo8o pl9lwHe9CS/PJvjBOS8OD3hoPeSucVyz3lPMsRgNM8lVMdzU0aSLHVl7WWbf96gEd6Jtl5 zu5Iip2VSrucIfolCPzLpdIggfG8+zo= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-642-ViDLH6jQP72Zr-84yIPLcQ-1; Fri, 21 Oct 2022 06:12:33 -0400 X-MC-Unique: ViDLH6jQP72Zr-84yIPLcQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 2B6183C0F242; Fri, 21 Oct 2022 10:12:33 +0000 (UTC) Received: from t480s.fritz.box (unknown [10.39.193.99]) by smtp.corp.redhat.com (Postfix) with ESMTP id 31D7B40E80E7; Fri, 21 Oct 2022 10:12:29 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v2 6/9] mm/pagewalk: don't trigger test_walk() in walk_page_vma() Date: Fri, 21 Oct 2022 12:11:38 +0200 Message-Id: <20221021101141.84170-7-david@redhat.com> In-Reply-To: <20221021101141.84170-1-david@redhat.com> References: <20221021101141.84170-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747291909090485057?= X-GMAIL-MSGID: =?utf-8?q?1747291909090485057?= As Peter points out, the caller passes a single VMA and can just do that check itself. And in fact, no existing users rely on test_walk() getting called. So let's just remove it and make the implementation slightly more efficient. Signed-off-by: David Hildenbrand --- include/linux/pagewalk.h | 2 ++ mm/pagewalk.c | 7 ------- 2 files changed, 2 insertions(+), 7 deletions(-) diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h index f3fafb731ffd..37dc0208862d 100644 --- a/include/linux/pagewalk.h +++ b/include/linux/pagewalk.h @@ -27,6 +27,8 @@ struct mm_walk; * "do page table walk over the current vma", returning * a negative value means "abort current page table walk * right now" and returning 1 means "skip the current vma" + * Note that this callback is not called when the caller + * passes in a single VMA as for walk_page_vma(). * @pre_vma: if set, called before starting walk on a non-null vma. * @post_vma: if set, called after a walk on a non-null vma, provided * that @pre_vma and the vma walk succeeded. diff --git a/mm/pagewalk.c b/mm/pagewalk.c index 2ff3a5bebceb..0a5d71aaf9c7 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -526,18 +526,11 @@ int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops, .vma = vma, .private = private, }; - int err; if (!walk.mm) return -EINVAL; mmap_assert_locked(walk.mm); - - err = walk_page_test(vma->vm_start, vma->vm_end, &walk); - if (err > 0) - return 0; - if (err < 0) - return err; return __walk_page_range(vma->vm_start, vma->vm_end, &walk); } From patchwork Fri Oct 21 10:11:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 6616 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4242:0:0:0:0:0 with SMTP id s2csp610020wrr; Fri, 21 Oct 2022 03:16:59 -0700 (PDT) X-Google-Smtp-Source: AMsMyM5FxIoRqGb5Z4xZIkMn3v3qaq5gaoSrw8Ee3vnITuzKPBSOx5VJaeZVUwjONuCbrL7EdPjB X-Received: by 2002:a17:907:a4a:b0:77b:c1b2:479a with SMTP id be10-20020a1709070a4a00b0077bc1b2479amr11802445ejc.109.1666347419630; Fri, 21 Oct 2022 03:16:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666347419; cv=none; d=google.com; s=arc-20160816; b=isbXVwF3Gxqa5rE77fyeHEtjkhHoiGCr/oO17CiqvD4n3eHkQ9HIsepxlv1Tr9/y5m ww0lXhKFs1DZ3hfI39xPm6xuOfsD6aho3JDnPgDBS84OUbCAFxGqL2yJZQpot0RJuB2S k9/KOYfEqAjfBngZpmYox9L6qqJlwR2h+1jSHtRYhyX/Fuir6zgM3gVsYquaIEi8U5bH MAUlMouDQXElS2bnWM2MgylMgLo/W4E8BgX+Ja/DEuppxHJ5CqtJEsQSgJYOKBEGN62T Pd1Q21YbAwK97r7j52yYV/O079yoLXzVKwxJWjZjkFrXvhv41jtSrZQtRQaLIust7M8P K3Sw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=3jjsbfVX32kRXRzGJzqZMgspzFJroQ0YMSc5QEJ3H8s=; b=a+zl2CGj4mOfODJUfVP2aMd9GSDV09UHUJMbQE6vJYrVbdhYvDeAdrkwbewWxYAdHS A1BbfOeLfV0ULubQbiQS7in4ye3jUuhhZAEYNOKc0XYsTSenFX6aZQtn9JopU/kS/mL5 vn3w9I8d95pZ2SdlYQcxGZ7kQyZ4vpz0yU2TyF6Qro6gw78gohlXS/8Hyd26MH0fDrXa Dc9SPh8tRdoRU2p911rVBwsaV2OWVyxcN2JzFPAA38WdpB/PFbN1WpnS52JtarHNxG6C KzfgHHlPREDht1FLTUsz+V03YpskTlHlR2nmIiTMXxAdsCmW5DPC9vN5eRZVIC+sSUMv 5FUg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="gcO/3AFs"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id qf24-20020a1709077f1800b007416e100f3dsi20058904ejc.986.2022.10.21.03.16.34; Fri, 21 Oct 2022 03:16:59 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b="gcO/3AFs"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230373AbiJUKM5 (ORCPT + 99 others); Fri, 21 Oct 2022 06:12:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55574 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230339AbiJUKMm (ORCPT ); Fri, 21 Oct 2022 06:12:42 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 191637AB08 for ; Fri, 21 Oct 2022 03:12:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666347159; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=3jjsbfVX32kRXRzGJzqZMgspzFJroQ0YMSc5QEJ3H8s=; b=gcO/3AFsz3ar9p76VY8o1UnQXQvkCtMBXQ7xzXeP15/xultDfC4XSFDM4nLtf1RMQblHyL 2o0cLOtiOaOoeUMG22b0C3Q++xpq+VyJrCk+j3Mn1jf+GoiHkRtBrawmanDYnngpCjHn9q 16KMOHFr+gZsmlZ3adl2umOitQWPnq8= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-621-079V-auANpW0rqJrmq2chA-1; Fri, 21 Oct 2022 06:12:36 -0400 X-MC-Unique: 079V-auANpW0rqJrmq2chA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 97B9B3C0F23B; Fri, 21 Oct 2022 10:12:35 +0000 (UTC) Received: from t480s.fritz.box (unknown [10.39.193.99]) by smtp.corp.redhat.com (Postfix) with ESMTP id 6A76A40E80E9; Fri, 21 Oct 2022 10:12:33 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v2 7/9] mm/pagewalk: add walk_page_range_vma() Date: Fri, 21 Oct 2022 12:11:39 +0200 Message-Id: <20221021101141.84170-8-david@redhat.com> In-Reply-To: <20221021101141.84170-1-david@redhat.com> References: <20221021101141.84170-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747291911630404189?= X-GMAIL-MSGID: =?utf-8?q?1747291911630404189?= Let's add walk_page_range_vma(), which is similar to walk_page_vma(), however, is only interested in a subset of the VMA range. To be used in KSM code to stop using follow_page() next. Signed-off-by: David Hildenbrand --- include/linux/pagewalk.h | 3 +++ mm/pagewalk.c | 20 ++++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h index 37dc0208862d..959f52e5867d 100644 --- a/include/linux/pagewalk.h +++ b/include/linux/pagewalk.h @@ -101,6 +101,9 @@ int walk_page_range_novma(struct mm_struct *mm, unsigned long start, unsigned long end, const struct mm_walk_ops *ops, pgd_t *pgd, void *private); +int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start, + unsigned long end, const struct mm_walk_ops *ops, + void *private); int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops, void *private); int walk_page_mapping(struct address_space *mapping, pgoff_t first_index, diff --git a/mm/pagewalk.c b/mm/pagewalk.c index 0a5d71aaf9c7..7f1c9b274906 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -517,6 +517,26 @@ int walk_page_range_novma(struct mm_struct *mm, unsigned long start, return walk_pgd_range(start, end, &walk); } +int walk_page_range_vma(struct vm_area_struct *vma, unsigned long start, + unsigned long end, const struct mm_walk_ops *ops, + void *private) +{ + struct mm_walk walk = { + .ops = ops, + .mm = vma->vm_mm, + .vma = vma, + .private = private, + }; + + if (start >= end || !walk.mm) + return -EINVAL; + if (start < vma->vm_start || end > vma->vm_end) + return -EINVAL; + + mmap_assert_locked(walk.mm); + return __walk_page_range(start, end, &walk); +} + int walk_page_vma(struct vm_area_struct *vma, const struct mm_walk_ops *ops, void *private) { From patchwork Fri Oct 21 10:11:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 6614 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4242:0:0:0:0:0 with SMTP id s2csp609958wrr; Fri, 21 Oct 2022 03:16:54 -0700 (PDT) X-Google-Smtp-Source: AMsMyM61BadgutaWCRkZT9r5Fmczfi/aROpL+b6z4u6SrNo9vN3iDMNwoV3QDML9oBPnC+Xtyr1H X-Received: by 2002:a05:6402:191:b0:45c:83e8:d74a with SMTP id r17-20020a056402019100b0045c83e8d74amr15937445edv.329.1666347414206; Fri, 21 Oct 2022 03:16:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666347414; cv=none; d=google.com; s=arc-20160816; b=DPBSQhwbGKjIrAAtNanpNYfSRjzHsjzSoNOi3JfL3opA1YKnsZR36J+QnilXncSCkk 9V4XhdKCJa6/VdwfFwLijdMyH5B1gtbEUbHOwLzbiVDyheQy5z8cxUQkdlmUt/H1wv3B /hqLQ7VMjnMq/E6AiJb4HuKBbpzw5q0xb4KJx1jNr0SjhICf0I+zfD+QZncMvlLcnctw oVPuRIVpqx9zqmXxd5TUeScC3KPXHwXm+s4chOk/go2q7ShkozufucPL7uEpOaZ2noQo rAgC/Y0ck1jU3IGe2vLkace5H6+XvSJojTb91G879VqQmA3j3lM3J0ZB9/fu9MWnvkw5 bRGw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=RBzNhruieU6ZfUTXGoryk7ArPTHcpoWfg1dNbBcF6Js=; b=rYQM3lLZ1tvfS2ikd06g6X3jxmRiFdYu1ul4f5Eg3pLMK8PawvTgwV1OH735Ck4Tbh IWJS1vwetCLTGn1kcYn+eXEPZMumb9wpAsc/dfl3nX6B0uLrDF9fqerwVAAzvpgVTNI0 pZ9VsI2f8VbHfciKVplJD4XJulU3JC7khk0Z+dLTCc8WH43PSR0MnmUbfGiq8/LvzhgB hd2Xpl2tKIvHoK4DSFo4KLi6FxeSLZx33H9aqBrlVoXPRbfJj7JEPB7C/a70BOH1i2sd 1KqTHrNSanI0lfMiY5vb1rBsUrCrkRIgsmCCgLAnEsyHRrDcyCoWsO+vz4b4qWtRkHkz uh8Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=AZtaZF+4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id gb6-20020a170907960600b007414dda0c62si23665632ejc.817.2022.10.21.03.16.29; Fri, 21 Oct 2022 03:16:54 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=AZtaZF+4; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230414AbiJUKNU (ORCPT + 99 others); Fri, 21 Oct 2022 06:13:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56444 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230410AbiJUKNI (ORCPT ); Fri, 21 Oct 2022 06:13:08 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ECD0025B27E for ; Fri, 21 Oct 2022 03:12:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666347165; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RBzNhruieU6ZfUTXGoryk7ArPTHcpoWfg1dNbBcF6Js=; b=AZtaZF+4iGNg24ymVruN1Tf/FWJtig4Rvz4nFUQgm28nObf6Z1VSqS06XgjShtIOaGDsfc qW7/4CZYC2dvO8HJ4u/Q9jWHQJ9IkHuDJE49CUKkdAFnykSk4Z9VYE6EaS1Z1iXEky5VIQ q5ceA2E7N5bUrvOBzdPlIAPZCXSNPWo= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-632-nBZ72PQWNyWQLn-TMGl-EA-1; Fri, 21 Oct 2022 06:12:38 -0400 X-MC-Unique: nBZ72PQWNyWQLn-TMGl-EA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 230601C06EDC; Fri, 21 Oct 2022 10:12:38 +0000 (UTC) Received: from t480s.fritz.box (unknown [10.39.193.99]) by smtp.corp.redhat.com (Postfix) with ESMTP id D93DC40E80E3; Fri, 21 Oct 2022 10:12:35 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v2 8/9] mm/ksm: convert break_ksm() to use walk_page_range_vma() Date: Fri, 21 Oct 2022 12:11:40 +0200 Message-Id: <20221021101141.84170-9-david@redhat.com> In-Reply-To: <20221021101141.84170-1-david@redhat.com> References: <20221021101141.84170-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747291906379959528?= X-GMAIL-MSGID: =?utf-8?q?1747291906379959528?= FOLL_MIGRATION exists only for the purpose of break_ksm(), and actually, there is not even the need to wait for the migration to finish, we only want to know if we're dealing with a KSM page. Using follow_page() just to identify a KSM page overcomplicates GUP code. Let's use walk_page_range_vma() instead, because we don't actually care about the page itself, we only need to know a single property -- no need to even grab a reference. So, get rid of follow_page() usage such that we can get rid of FOLL_MIGRATION now and eventually be able to get rid of follow_page() in the future. In my setup (AMD Ryzen 9 3900X), running the KSM selftest to test unmerge performance on 2 GiB (taskset 0x8 ./ksm_tests -D -s 2048), this results in a performance degradation of ~2% (old: ~5010 MiB/s, new: ~4900 MiB/s). I don't think we particularly care for now. Interestingly, the benchmark reduction is due to the single callback. Adding a second callback (e.g., pud_entry()) reduces the benchmark by another 100-200 MiB/s. Signed-off-by: David Hildenbrand --- mm/ksm.c | 49 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 39 insertions(+), 10 deletions(-) diff --git a/mm/ksm.c b/mm/ksm.c index c6f58aa6e731..5cdb852ff132 100644 --- a/mm/ksm.c +++ b/mm/ksm.c @@ -39,6 +39,7 @@ #include #include #include +#include #include #include "internal.h" @@ -419,6 +420,39 @@ static inline bool ksm_test_exit(struct mm_struct *mm) return atomic_read(&mm->mm_users) == 0; } +static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long next, + struct mm_walk *walk) +{ + struct page *page = NULL; + spinlock_t *ptl; + pte_t *pte; + int ret; + + if (pmd_leaf(*pmd) || !pmd_present(*pmd)) + return 0; + + pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); + if (pte_present(*pte)) { + page = vm_normal_page(walk->vma, addr, *pte); + } else if (!pte_none(*pte)) { + swp_entry_t entry = pte_to_swp_entry(*pte); + + /* + * As KSM pages remain KSM pages until freed, no need to wait + * here for migration to end. + */ + if (is_migration_entry(entry)) + page = pfn_swap_entry_to_page(entry); + } + ret = page && PageKsm(page); + pte_unmap_unlock(pte, ptl); + return ret; +} + +static const struct mm_walk_ops break_ksm_ops = { + .pmd_entry = break_ksm_pmd_entry, +}; + /* * We use break_ksm to break COW on a ksm page by triggering unsharing, * such that the ksm page will get replaced by an exclusive anonymous page. @@ -434,21 +468,16 @@ static inline bool ksm_test_exit(struct mm_struct *mm) */ static int break_ksm(struct vm_area_struct *vma, unsigned long addr) { - struct page *page; vm_fault_t ret = 0; do { - bool ksm_page = false; + int ksm_page; cond_resched(); - page = follow_page(vma, addr, - FOLL_GET | FOLL_MIGRATION | FOLL_REMOTE); - if (IS_ERR_OR_NULL(page)) - break; - if (PageKsm(page)) - ksm_page = true; - put_page(page); - + ksm_page = walk_page_range_vma(vma, addr, addr + 1, + &break_ksm_ops, NULL); + if (WARN_ON_ONCE(ksm_page < 0)) + return ksm_page; if (!ksm_page) return 0; ret = handle_mm_fault(vma, addr, From patchwork Fri Oct 21 10:11:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Hildenbrand X-Patchwork-Id: 6610 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4242:0:0:0:0:0 with SMTP id s2csp609605wrr; Fri, 21 Oct 2022 03:16:18 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6M2g7wT8kYQV3mGY6xeBBGrCoa9dt5k6z+X4eMerjE0MtlsDfrMCzFgQ0//G2esQk7HXFX X-Received: by 2002:a17:907:2cd8:b0:78d:9c3c:d788 with SMTP id hg24-20020a1709072cd800b0078d9c3cd788mr15069071ejc.327.1666347378014; Fri, 21 Oct 2022 03:16:18 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666347378; cv=none; d=google.com; s=arc-20160816; b=tts45IipFZekHq5bqCgSvpf9UhXcsgvgxBxxVDajbUNdUnFTry/w9bMFDGmy4GzxDD 9G/UxD6EZxBnsnOKRw4jXKmbVnUCZK84D2BW/72WjsFN8Hcmg+Il41TikRO76et0zT9K 0oFepce2dGejYrOBNkmmI4jRVEvPjU+9fE0fHpIqtBqFuZrwQNuIMMIGprGTNEqv2HHF x4q6/NmTVClAMPqG+oD/V+peBL5xcpDXcAdQZkUjx3gIGOJk+tsCaY0PZc27xVPi+Tro HMHPj6gGyTKz+xAFJZfhoMkyExWnFOHna2U1GqnNwEG/2qCsRGVbbQketeWJIJ6N1hFA faYQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=F2jxpEehZg91tvyIg0dyRa8PzGRQ+ddZdBpCZCuDxx0=; b=BTgfSeTRyg0my3sV8ADy0AED5bR5xlo3lPDyaamjRnV3SIifU5INgWRFZZ0miKE90I ZHJ1wd50It1C6vlv+M668C5RQmlRssrNDzVZxlpKMMQeL6zoWRmWgIWOBO1d3INX0Yqd cqVwV7/FW/BfSe4ZChSVFjiKKDqqJ63K7tlUbKy+rxIqBkIfhbqtvdj2NrM9/fr3hAyd efbx0e8DSvPrncNV+XB2BsUB29DmIwP+PJstSjHBmgzSAuSCrdZBvuGO/Wz8U9zVpq3l HfU3JmRMBq5RUo1uf2hIyaNuXwRC9l2IZ+E3c8yovBU/+7o7XG6GQrqHD5ngepBe8Tb0 M9Yw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=VYTq5ysQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id wu13-20020a170906eecd00b0078daf101aa1si18884041ejb.813.2022.10.21.03.15.51; Fri, 21 Oct 2022 03:16:18 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=VYTq5ysQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230402AbiJUKNR (ORCPT + 99 others); Fri, 21 Oct 2022 06:13:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56380 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230398AbiJUKNG (ORCPT ); Fri, 21 Oct 2022 06:13:06 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1F9F924CCB7 for ; Fri, 21 Oct 2022 03:12:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1666347164; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=F2jxpEehZg91tvyIg0dyRa8PzGRQ+ddZdBpCZCuDxx0=; b=VYTq5ysQ9mtvp3sVWLrSZeBzOHa8Pv6ez/2Ci7EBnSTWyLo58ymtaZOF8BeIwuWtStx84y RhW0npNkREO6l1RhcW9zv9jl2d0OXfxJQ2QtAjAXebmbHVx4+wmcyf1BBFQ2zb/6HSPsGh DXEK5fDxsOBudwFLUKGB/DTbiJeUPHI= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-91-tMySy2NHMr6eu_lQNLJ3Qw-1; Fri, 21 Oct 2022 06:12:40 -0400 X-MC-Unique: tMySy2NHMr6eu_lQNLJ3Qw-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 319E83C0F240; Fri, 21 Oct 2022 10:12:40 +0000 (UTC) Received: from t480s.fritz.box (unknown [10.39.193.99]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5F9BF40E80E4; Fri, 21 Oct 2022 10:12:38 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Shuah Khan , Hugh Dickins , Vlastimil Babka , Peter Xu , Andrea Arcangeli , "Matthew Wilcox (Oracle)" , Jason Gunthorpe , John Hubbard Subject: [PATCH v2 9/9] mm/gup: remove FOLL_MIGRATION Date: Fri, 21 Oct 2022 12:11:41 +0200 Message-Id: <20221021101141.84170-10-david@redhat.com> In-Reply-To: <20221021101141.84170-1-david@redhat.com> References: <20221021101141.84170-1-david@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spam-Status: No, score=-2.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747291868222175178?= X-GMAIL-MSGID: =?utf-8?q?1747291868222175178?= Fortunately, the last user (KSM) is gone, so let's just remove this rather special code from generic GUP handling -- especially because KSM never required the PMD handling as KSM only deals with individual base pages. Signed-off-by: David Hildenbrand --- include/linux/mm.h | 1 - mm/gup.c | 55 +++++----------------------------------------- 2 files changed, 5 insertions(+), 51 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 8bbcccbc5565..a63415ac9dc2 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -2950,7 +2950,6 @@ struct page *follow_page(struct vm_area_struct *vma, unsigned long address, * and return without waiting upon it */ #define FOLL_NOFAULT 0x80 /* do not fault in pages */ #define FOLL_HWPOISON 0x100 /* check page is hwpoisoned */ -#define FOLL_MIGRATION 0x400 /* wait for page to replace migration entry */ #define FOLL_TRIED 0x800 /* a retry, previous pass started an IO */ #define FOLL_REMOTE 0x2000 /* we are working on non-current tsk/mm */ #define FOLL_ANON 0x8000 /* don't do file mappings */ diff --git a/mm/gup.c b/mm/gup.c index fe195d47de74..bcb46e9d496e 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -549,30 +549,13 @@ static struct page *follow_page_pte(struct vm_area_struct *vma, return no_page_table(vma, flags); } -retry: if (unlikely(pmd_bad(*pmd))) return no_page_table(vma, flags); ptep = pte_offset_map_lock(mm, pmd, address, &ptl); pte = *ptep; - if (!pte_present(pte)) { - swp_entry_t entry; - /* - * KSM's break_ksm() relies upon recognizing a ksm page - * even while it is being migrated, so for that case we - * need migration_entry_wait(). - */ - if (likely(!(flags & FOLL_MIGRATION))) - goto no_page; - if (pte_none(pte)) - goto no_page; - entry = pte_to_swp_entry(pte); - if (!is_migration_entry(entry)) - goto no_page; - pte_unmap_unlock(ptep, ptl); - migration_entry_wait(mm, pmd, address); - goto retry; - } + if (!pte_present(pte)) + goto no_page; if (pte_protnone(pte) && !gup_can_follow_protnone(flags)) goto no_page; @@ -694,28 +677,8 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma, return page; return no_page_table(vma, flags); } -retry: - if (!pmd_present(pmdval)) { - /* - * Should never reach here, if thp migration is not supported; - * Otherwise, it must be a thp migration entry. - */ - VM_BUG_ON(!thp_migration_supported() || - !is_pmd_migration_entry(pmdval)); - - if (likely(!(flags & FOLL_MIGRATION))) - return no_page_table(vma, flags); - - pmd_migration_entry_wait(mm, pmd); - pmdval = READ_ONCE(*pmd); - /* - * MADV_DONTNEED may convert the pmd to null because - * mmap_lock is held in read mode - */ - if (pmd_none(pmdval)) - return no_page_table(vma, flags); - goto retry; - } + if (!pmd_present(pmdval)) + return no_page_table(vma, flags); if (pmd_devmap(pmdval)) { ptl = pmd_lock(mm, pmd); page = follow_devmap_pmd(vma, address, pmd, flags, &ctx->pgmap); @@ -729,18 +692,10 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma, if (pmd_protnone(pmdval) && !gup_can_follow_protnone(flags)) return no_page_table(vma, flags); -retry_locked: ptl = pmd_lock(mm, pmd); - if (unlikely(pmd_none(*pmd))) { - spin_unlock(ptl); - return no_page_table(vma, flags); - } if (unlikely(!pmd_present(*pmd))) { spin_unlock(ptl); - if (likely(!(flags & FOLL_MIGRATION))) - return no_page_table(vma, flags); - pmd_migration_entry_wait(mm, pmd); - goto retry_locked; + return no_page_table(vma, flags); } if (unlikely(!pmd_trans_huge(*pmd))) { spin_unlock(ptl);