From patchwork Fri Feb 2 02:20:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ming Lei
X-Patchwork-Id: 195526
From: Ming Lei
To: Andrew Morton, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Ming Lei, David Hildenbrand,
 Matthew Wilcox, Alexander Viro, Christian Brauner, Don Dutile,
 Rafael Aquini, Dave Chinner, Mike Snitzer
Subject: [PATCH] mm/madvise: set ra_pages as device max request size during
 ADV_POPULATE_READ
Date: Fri, 2 Feb 2024 10:20:29 +0800
Message-ID: <20240202022029.1903629-1-ming.lei@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

madvise(MADV_POPULATE_READ) tries to populate all page tables in the
specified range, so the resulting IO is usually sequential when the VMA
is backed by a file.

Set ra_pages to the device's max request size for the readahead
involved in MADV_POPULATE_READ. This reduces the latency of
madvise(MADV_POPULATE_READ) to 1/10 when it is run over one 1GB file
with the usual (default) read_ahead_kb of 128KB.

Cc: David Hildenbrand
Cc: Matthew Wilcox
Cc: Alexander Viro
Cc: Christian Brauner
Cc: Don Dutile
Cc: Rafael Aquini
Cc: Dave Chinner
Cc: Mike Snitzer
Cc: Andrew Morton
Signed-off-by: Ming Lei
---
 mm/madvise.c | 52 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 51 insertions(+), 1 deletion(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 912155a94ed5..db5452c8abdd 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -900,6 +900,37 @@ static long madvise_dontneed_free(struct vm_area_struct *vma,
 		return -EINVAL;
 }
 
+static void madvise_restore_ra_win(struct file **file, unsigned int ra_pages)
+{
+	if (*file) {
+		struct file *f = *file;
+
+		f->f_ra.ra_pages = ra_pages;
+		fput(f);
+		*file = NULL;
+	}
+}
+
+static struct file *madvise_override_ra_win(struct file *f,
+		unsigned long start, unsigned long end,
+		unsigned int *old_ra_pages)
+{
+	unsigned int io_pages;
+
+	if (!f || !f->f_mapping || !f->f_mapping->host)
+		return NULL;
+
+	io_pages = inode_to_bdi(f->f_mapping->host)->io_pages;
+	if (((end - start) >> PAGE_SHIFT) < io_pages)
+		return NULL;
+
+	f = get_file(f);
+	*old_ra_pages = f->f_ra.ra_pages;
+	f->f_ra.ra_pages = io_pages;
+
+	return f;
+}
+
 static long madvise_populate(struct vm_area_struct *vma,
 			     struct vm_area_struct **prev,
 			     unsigned long start, unsigned long end,
@@ -908,9 +939,21 @@ static long madvise_populate(struct vm_area_struct *vma,
 	const bool write = behavior == MADV_POPULATE_WRITE;
 	struct mm_struct *mm = vma->vm_mm;
 	unsigned long tmp_end;
+	unsigned int ra_pages;
+	struct file *file;
 	int locked = 1;
 	long pages;
 
+	/*
+	 * In case of file backing mapping, increase readahead window
+	 * for reducing the whole populate latency, and restore it
+	 * after the populate is done
+	 */
+	if (behavior == MADV_POPULATE_READ)
+		file = madvise_override_ra_win(vma->vm_file, start, end,
+				&ra_pages);
+	else
+		file = NULL;
 	*prev = vma;
 
 	while (start < end) {
@@ -920,8 +963,10 @@ static long madvise_populate(struct vm_area_struct *vma,
 		 */
 		if (!vma || start >= vma->vm_end) {
 			vma = vma_lookup(mm, start);
-			if (!vma)
+			if (!vma) {
+				madvise_restore_ra_win(&file, ra_pages);
 				return -ENOMEM;
+			}
 		}
 
 		tmp_end = min_t(unsigned long, end, vma->vm_end);
@@ -935,6 +980,9 @@ static long madvise_populate(struct vm_area_struct *vma,
 			vma = NULL;
 		}
 		if (pages < 0) {
+			/* restore ra pages back in case of any failure */
+			madvise_restore_ra_win(&file, ra_pages);
+
 			switch (pages) {
 			case -EINTR:
 				return -EINTR;
@@ -954,6 +1002,8 @@ static long madvise_populate(struct vm_area_struct *vma,
 		}
 		start += pages * PAGE_SIZE;
 	}
+
+	madvise_restore_ra_win(&file, ra_pages);
 	return 0;
 }
 
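
For anyone who wants to exercise this path from userspace, the following is a minimal sketch, not part of the patch. It assumes a Linux system with MADV_POPULATE_READ support (5.14 or later). The helper names populate_file() and ra_windows() are hypothetical and exist only for illustration. ra_windows() is a back-of-envelope count of how many readahead windows cover a file; it ignores async readahead and window ramp-up, so it only roughly models the effect described in the commit message.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Older libc headers may lack the constant; 22 is the generic uapi value. */
#ifndef MADV_POPULATE_READ
#define MADV_POPULATE_READ 22
#endif

/*
 * Hypothetical helper: rough number of readahead windows needed to
 * cover len bytes. Illustration only, not how the kernel accounts it.
 */
static unsigned long ra_windows(unsigned long len, unsigned long window_bytes)
{
	assert(window_bytes != 0);
	return (len + window_bytes - 1) / window_bytes;
}

/*
 * Hypothetical helper: map a file read-only and prefault the whole
 * mapping with MADV_POPULATE_READ.
 * Returns 0 on success or a negative errno value.
 */
static int populate_file(const char *path)
{
	struct stat st;
	void *p;
	int ret = 0;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -errno;
	if (fstat(fd, &st) < 0) {
		ret = -errno;
		goto out_close;
	}
	if (st.st_size == 0) {
		ret = -EINVAL;	/* nothing to map */
		goto out_close;
	}
	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	if (p == MAP_FAILED) {
		ret = -errno;
		goto out_close;
	}
	/* Kernels without MADV_POPULATE_READ reject the advice with EINVAL. */
	if (madvise(p, (size_t)st.st_size, MADV_POPULATE_READ) < 0)
		ret = -errno;
	munmap(p, (size_t)st.st_size);
out_close:
	close(fd);
	return ret;
}
```

With the 128KB default from the description, ra_windows(1UL << 30, 128UL << 10) gives 8192 windows for a 1GB file. A hypothetical 1MB max request size would give 1024 under the same model. Timing populate_file() on a cold page cache before and after the patch would be one way to check the latency claim on a given device.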