Message ID | 20221016150656.5803-1-fmdefrancesco@gmail.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org>
From: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>
To: Alexander Viro <viro@zeniv.linux.org.uk>, Benjamin LaHaise <bcrl@kvack.org>, linux-fsdevel@vger.kernel.org, linux-aio@kvack.org, linux-kernel@vger.kernel.org
Cc: "Fabio M. De Francesco" <fmdefrancesco@gmail.com>, "Venkataramanan, Anirudh" <anirudh.venkataramanan@intel.com>, Ira Weiny <ira.weiny@intel.com>
Subject: [RESEND PATCH] fs/aio: Replace kmap{,_atomic}() with kmap_local_page()
Date: Sun, 16 Oct 2022 17:06:56 +0200
Message-Id: <20221016150656.5803-1-fmdefrancesco@gmail.com>
X-Mailer: git-send-email 2.37.3
List-ID: <linux-kernel.vger.kernel.org>
(transport, DKIM/ARC, and spam-scoring headers omitted)
Series | [RESEND] fs/aio: Replace kmap{,_atomic}() with kmap_local_page() |
Commit Message
Fabio M. De Francesco
Oct. 16, 2022, 3:06 p.m. UTC
The use of kmap() and kmap_atomic() is being deprecated in favor of kmap_local_page().

There are two main problems with kmap(): (1) it comes with an overhead, as the mapping space is restricted and protected by a global lock for synchronization, and (2) it requires global TLB invalidation when the kmap's pool wraps, and it might block until a slot becomes available when the mapping space is fully utilized.

With kmap_local_page() the mappings are per thread, CPU local, can take page faults, and can be called from any context (including interrupts). It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore, tasks can be preempted and, when they are scheduled to run again, the kernel virtual addresses are restored and still valid.

Since its use in fs/aio.c is safe everywhere, it should be preferred.

Therefore, replace kmap() and kmap_atomic() with kmap_local_page() in fs/aio.c.

Tested with xfstests on a QEMU/KVM x86_32 VM with 6 GB of RAM, booting a kernel with HIGHMEM64GB enabled.

Cc: "Venkataramanan, Anirudh" <anirudh.venkataramanan@intel.com>
Suggested-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
---
I've tested with "./check -g aio". The tests in this group fail 3 out of 26 times, both with and without my patch; therefore, these changes don't introduce further errors. I'm not aware of any further tests I could run, so any suggestions would be much appreciated :-)

I'm resending this patch because some recipients were missing from the previous submissions. In the meantime I've also added some more information to the commit message. There are no changes in the code.

 fs/aio.c | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)
Comments
"Fabio M. De Francesco" <fmdefrancesco@gmail.com> writes: > The use of kmap() and kmap_atomic() are being deprecated in favor of > kmap_local_page(). > > There are two main problems with kmap(): (1) It comes with an overhead as > the mapping space is restricted and protected by a global lock for > synchronization and (2) it also requires global TLB invalidation when the > kmap’s pool wraps and it might block when the mapping space is fully > utilized until a slot becomes available. > > With kmap_local_page() the mappings are per thread, CPU local, can take > page faults, and can be called from any context (including interrupts). > It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore, > the tasks can be preempted and, when they are scheduled to run again, the > kernel virtual addresses are restored and still valid. > > Since its use in fs/aio.c is safe everywhere, it should be preferred. That sentence is very ambiguous. I don't know what "its" refers to, and I'm not sure what "safe" means in this context. The patch looks okay to me. Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
On Wednesday, October 19, 2022 5:41:21 PM CEST Jeff Moyer wrote:
> "Fabio M. De Francesco" <fmdefrancesco@gmail.com> writes:
>
> > The use of kmap() and kmap_atomic() are being deprecated in favor of
> > kmap_local_page().
> >
> > [remainder of commit message quoted; snipped]
> >
> > Since its use in fs/aio.c is safe everywhere, it should be preferred.
>
> That sentence is very ambiguous. I don't know what "its" refers to, and
> I'm not sure what "safe" means in this context.

I'm sorry for not being clearer.

"its use" means "the use of kmap_local_page()". A few lines above you may also see "It is faster", meaning "kmap_local_page() is faster".

The "safety" is a very concise way to assert that I've checked, by code inspection and by testing (as detailed some lines below), that these conversions (1) don't break any of the rules of use of local mappings when converting kmap() (please read highmem.rst about these) and (2) that the kmap_atomic() call sites didn't rely on its side effects (page faults and, potentially, preemption being disabled).

Therefore, you may read it as if it said: "The use of kmap_local_page() in fs/aio.c has been carefully checked to assure that the conversions won't break the code; therefore, the newer API is preferred".

I hope this makes my argument clearer.

> The patch looks okay to me.
>
> Reviewed-by: Jeff Moyer <jmoyer@redhat.com>

Thank you so much for the "Reviewed-by:" tag.

Regards,

Fabio
"Fabio M. De Francesco" <fmdefrancesco@gmail.com> writes:
> On Wednesday, October 19, 2022 5:41:21 PM CEST Jeff Moyer wrote:
>> That sentence is very ambiguous. I don't know what "its" refers to, and
>> I'm not sure what "safe" means in this context.
>
> I'm sorry for not being clearer.
>
> "its use" means "the use of kmap_local_page()". A few lines above you may
> also see "It is faster", meaning "kmap_local_page() is faster".

Got it, thanks.

> The "safety" is a very concise way to assert that I've checked, by code
> inspection and by testing (as detailed some lines below), that these
> conversions (1) don't break any of the rules of use of local mappings when
> converting kmap() (please read highmem.rst about these) and (2) that the
> kmap_atomic() call sites didn't rely on its side effects (page faults and,
> potentially, preemption being disabled).

OK, good. I agree that the aio code wasn't relying on the side effects of kmap_atomic().

> Therefore, you may read it as if it said: "The use of kmap_local_page() in
> fs/aio.c has been carefully checked to assure that the conversions won't
> break the code; therefore, the newer API is preferred".
>
> I hope this makes my argument clearer.

Yes, thank you for explaining!

-Jeff
On Sunday, 16 October 2022 at 17:06:56 CET, Fabio M. De Francesco wrote:
> The use of kmap() and kmap_atomic() are being deprecated in favor of
> kmap_local_page().
>
> [remainder of commit message and patch quoted in full; snipped]

Al, Benjamin,

I'm sending a gentle ping for this old patch too (and thanking Al again for pointing out to me how the fs/ufs and fs/sysv conversions must be reworked and their mistakes fixed).

For this patch I've had Ira's and Jeff's "Reviewed-by:" tags. I also responded in this thread to a couple of objections from Jeff regarding some ambiguities in the commit message.

Please let me know whether here, too, there are mistakes to fix and code to rework. I'm currently little more than a hobbyist, so please be patient: until mid-February 2023 I'll only be able to work a few hours per week, in my spare time.

I'm looking forward to hearing from you.

Regards,

Fabio
On Sunday, 16 October 2022 at 17:06:56 CET, Fabio M. De Francesco wrote:
> The use of kmap() and kmap_atomic() are being deprecated in favor of
> kmap_local_page().
>
> [remainder of commit message snipped]
>
> Cc: "Venkataramanan, Anirudh" <anirudh.venkataramanan@intel.com>
> Suggested-by: Ira Weiny <ira.weiny@intel.com>
> Reviewed-by: Ira Weiny <ira.weiny@intel.com>

Reviewed-by: Jeff Moyer <jmoyer@redhat.com>

> Signed-off-by: Fabio M. De Francesco <fmdefrancesco@gmail.com>
> ---

I'm sorry to resend again. Last time I forgot to forward the "Reviewed-by:" tag from Jeff (thanks!).

> [patch notes and hunks quoted in full; snipped]
On Thursday, 1 December 2022 at 15:29:17 CET, Fabio M. De Francesco wrote:
> On Sunday, 16 October 2022 at 17:06:56 CET, Fabio M. De Francesco wrote:
> > [commit message and patch quoted in full; snipped]

Please disregard this patch: I just sent a v2 [1] with some additional information in the commit message and Jeff's "Reviewed-by:" tag added.

Thanks,

Fabio

[1] https://lore.kernel.org/lkml/20230109175629.9482-1-fmdefrancesco@gmail.com/
On Sun, Oct 16, 2022 at 05:06:56PM +0200, Fabio M. De Francesco wrote:
> The use of kmap() and kmap_atomic() is being deprecated in favor of
> kmap_local_page().
>
> There are two main problems with kmap(): (1) it comes with an overhead,
> as the mapping space is restricted and protected by a global lock for
> synchronization, and (2) it requires global TLB invalidation when the
> kmap's pool wraps, and it might block until a slot becomes available
> when the mapping space is fully utilized.
>
> With kmap_local_page() the mappings are per thread, CPU local, can take
> page faults, and can be called from any context (including interrupts).
> It is faster than kmap() in kernels with HIGHMEM enabled. Furthermore,
> tasks can be preempted and, when they are scheduled to run again, the
> kernel virtual addresses are restored and still valid.
>
> Since its use in fs/aio.c is safe everywhere, kmap_local_page() should
> be preferred.
>
> Therefore, replace kmap() and kmap_atomic() with kmap_local_page() in
> fs/aio.c.
>
> Tested with xfstests on a QEMU/KVM x86_32 VM with 6 GB of RAM, booting
> a kernel with HIGHMEM64GB enabled.

I was just looking over this code and made the same kmap -> kmap_local
change, but you've done a more complete version - nice.

For context, I was the one who added the kmap() call, because
copy_to_user() can sleep - anything else at the time would've been more
awkward, and highmem machines were already on the way out. But
kmap_local is perfect here :)

Reviewed-by: Kent Overstreet <kent.overstreet@linux.dev>
diff --git a/fs/aio.c b/fs/aio.c
index 3c249b938632..343fea0c6d1a 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -567,7 +567,7 @@ static int aio_setup_ring(struct kioctx *ctx, unsigned int nr_events)
 	ctx->user_id = ctx->mmap_base;
 	ctx->nr_events = nr_events; /* trusted copy */
 
-	ring = kmap_atomic(ctx->ring_pages[0]);
+	ring = kmap_local_page(ctx->ring_pages[0]);
 	ring->nr = nr_events;	/* user copy */
 	ring->id = ~0U;
 	ring->head = ring->tail = 0;
@@ -575,7 +575,7 @@
 	ring->compat_features = AIO_RING_COMPAT_FEATURES;
 	ring->incompat_features = AIO_RING_INCOMPAT_FEATURES;
 	ring->header_length = sizeof(struct aio_ring);
-	kunmap_atomic(ring);
+	kunmap_local(ring);
 	flush_dcache_page(ctx->ring_pages[0]);
 
 	return 0;
@@ -678,9 +678,9 @@ static int ioctx_add_table(struct kioctx *ctx, struct mm_struct *mm)
 				 * we are protected from page migration
 				 * changes ring_pages by ->ring_lock.
 				 */
-				ring = kmap_atomic(ctx->ring_pages[0]);
+				ring = kmap_local_page(ctx->ring_pages[0]);
 				ring->id = ctx->id;
-				kunmap_atomic(ring);
+				kunmap_local(ring);
 				return 0;
 			}
@@ -1024,9 +1024,9 @@ static void user_refill_reqs_available(struct kioctx *ctx)
 		 * against ctx->completed_events below will make sure we do the
 		 * safe/right thing.
 		 */
-		ring = kmap_atomic(ctx->ring_pages[0]);
+		ring = kmap_local_page(ctx->ring_pages[0]);
 		head = ring->head;
-		kunmap_atomic(ring);
+		kunmap_local(ring);
 
 		refill_reqs_available(ctx, head, ctx->tail);
 	}
@@ -1132,12 +1132,12 @@ static void aio_complete(struct aio_kiocb *iocb)
 	if (++tail >= ctx->nr_events)
 		tail = 0;
 
-	ev_page = kmap_atomic(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
+	ev_page = kmap_local_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
 	event = ev_page + pos % AIO_EVENTS_PER_PAGE;
 
 	*event = iocb->ki_res;
 
-	kunmap_atomic(ev_page);
+	kunmap_local(ev_page);
 	flush_dcache_page(ctx->ring_pages[pos / AIO_EVENTS_PER_PAGE]);
 
 	pr_debug("%p[%u]: %p: %p %Lx %Lx %Lx\n", ctx, tail, iocb,
@@ -1151,10 +1151,10 @@ static void aio_complete(struct aio_kiocb *iocb)
 	ctx->tail = tail;
 
-	ring = kmap_atomic(ctx->ring_pages[0]);
+	ring = kmap_local_page(ctx->ring_pages[0]);
 	head = ring->head;
 	ring->tail = tail;
-	kunmap_atomic(ring);
+	kunmap_local(ring);
 	flush_dcache_page(ctx->ring_pages[0]);
 
 	ctx->completed_events++;
@@ -1214,10 +1214,10 @@ static long aio_read_events_ring(struct kioctx *ctx,
 	mutex_lock(&ctx->ring_lock);
 
 	/* Access to ->ring_pages here is protected by ctx->ring_lock. */
-	ring = kmap_atomic(ctx->ring_pages[0]);
+	ring = kmap_local_page(ctx->ring_pages[0]);
 	head = ring->head;
 	tail = ring->tail;
-	kunmap_atomic(ring);
+	kunmap_local(ring);
 
 	/*
 	 * Ensure that once we've read the current tail pointer, that
@@ -1249,10 +1249,10 @@ static long aio_read_events_ring(struct kioctx *ctx,
 		avail = min(avail, nr - ret);
 		avail = min_t(long, avail, AIO_EVENTS_PER_PAGE - pos);
 
-		ev = kmap(page);
+		ev = kmap_local_page(page);
 		copy_ret = copy_to_user(event + ret, ev + pos,
 					sizeof(*ev) * avail);
-		kunmap(page);
+		kunmap_local(ev);
 
 		if (unlikely(copy_ret)) {
 			ret = -EFAULT;
@@ -1264,9 +1264,9 @@ static long aio_read_events_ring(struct kioctx *ctx,
 		head %= ctx->nr_events;
 	}
 
-	ring = kmap_atomic(ctx->ring_pages[0]);
+	ring = kmap_local_page(ctx->ring_pages[0]);
 	ring->head = head;
-	kunmap_atomic(ring);
+	kunmap_local(ring);
 	flush_dcache_page(ctx->ring_pages[0]);
 
 	pr_debug("%li h%u t%u\n", ret, head, tail);