From patchwork Sun Mar 19 00:20:09 2023
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 71688
From: Lorenzo Stoakes
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Andrew Morton
Cc: Baoquan He, Uladzislau Rezki, Matthew Wilcox, David Hildenbrand,
    Liu Shixin, Jiri Olsa, Lorenzo Stoakes
Subject: [PATCH 1/4] fs/proc/kcore: Avoid bounce buffer for ktext data
Date: Sun, 19 Mar 2023 00:20:09 +0000
Message-Id: <2ed992d6604965fd9eea05fed4473ddf54540989.1679183626.git.lstoakes@gmail.com>
X-Mailer: git-send-email 2.39.2

Commit df04abfd181a ("fs/proc/kcore.c: Add bounce buffer for ktext data")
introduced the use of a bounce buffer to retrieve kernel text data for
/proc/kcore, in order to avoid failures arising from hardened user copies
enabled by CONFIG_HARDENED_USERCOPY in check_kernel_text_object().

We can avoid doing this if, instead of copy_to_user(), we use
_copy_to_user(), which bypasses the hardening check. This is more
efficient than using a bounce buffer and simplifies the code.

We do so as part of an overall effort to eliminate bounce buffer usage in
the function, with an eye to converting it to an iterator read.
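For reference, the distinction being relied upon here is that
copy_to_user() runs the hardened usercopy check before copying, while
_copy_to_user() performs the same copy without it. A paraphrased sketch of
the uaccess helper (illustrative, not a verbatim quote of
include/linux/uaccess.h):

	static __always_inline unsigned long __must_check
	copy_to_user(void __user *to, const void *from, unsigned long n)
	{
		/*
		 * check_copy_size() is where CONFIG_HARDENED_USERCOPY
		 * rejects copies sourced from kernel text.
		 */
		if (check_copy_size(from, n, true))
			n = _copy_to_user(to, from, n);
		return n;
	}

Since /proc/kcore exists precisely to expose kernel memory, the check
rejects exactly the copy we intend to make, so skipping it is safe here.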
Signed-off-by: Lorenzo Stoakes
---
 fs/proc/kcore.c | 17 +++++------------
 1 file changed, 5 insertions(+), 12 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 71157ee35c1a..556f310d6aa4 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -541,19 +541,12 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 		case KCORE_VMEMMAP:
 		case KCORE_TEXT:
 			/*
-			 * Using bounce buffer to bypass the
-			 * hardened user copy kernel text checks.
+			 * We use _copy_to_user() to bypass usermode hardening
+			 * which would otherwise prevent this operation.
 			 */
-			if (copy_from_kernel_nofault(buf, (void *)start, tsz)) {
-				if (clear_user(buffer, tsz)) {
-					ret = -EFAULT;
-					goto out;
-				}
-			} else {
-				if (copy_to_user(buffer, buf, tsz)) {
-					ret = -EFAULT;
-					goto out;
-				}
+			if (_copy_to_user(buffer, (char *)start, tsz)) {
+				ret = -EFAULT;
+				goto out;
 			}
 			break;
 		default:

From patchwork Sun Mar 19 00:20:10 2023
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 71677
From: Lorenzo Stoakes
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Andrew Morton
Cc: Baoquan He, Uladzislau Rezki, Matthew Wilcox, David Hildenbrand,
    Liu Shixin, Jiri Olsa, Lorenzo Stoakes
Subject: [PATCH 2/4] mm: vmalloc: use rwsem, mutex for vmap_area_lock and vmap_block->lock
Date: Sun, 19 Mar 2023 00:20:10 +0000
Message-Id: <6c7f1ac0aeb55faaa46a09108d3999e4595870d9.1679183626.git.lstoakes@gmail.com>
X-Mailer: git-send-email 2.39.2

vmalloc() is, by design, not permitted to be used in atomic context and
already contains components which may sleep, so avoiding spin locks is not
a problem from the perspective of atomic context.

The global vmap_area_lock is held when the red/black tree rooted in
vmap_area_root is accessed, and is thus rather long-held and under
potentially high contention. It is likely to be contended for reads rather
than writes, so replace it with an rwsem.

Each individual vmap_block->lock is likely to be held for less time but
under low contention, so a mutex is not an outrageous choice here.

A subset of test_vmalloc.sh performance results:

  fix_size_alloc_test             0.40%
  full_fit_alloc_test             2.08%
  long_busy_list_alloc_test       0.34%
  random_size_alloc_test         -0.25%
  random_size_align_alloc_test    0.06%
  ...
  all tests cycles                0.2%

This represents a tiny reduction in performance that sits barely above
noise. The reason for making this change is to build a basis for vread()
to be usable asynchronously, thus eliminating the need for a bounce buffer
when copying data to userland in read_kcore() and allowing that to be
converted to an iterator form.
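To illustrate the locking pattern this change moves to, a minimal sketch
(the calls mirror those used in the diff below; surrounding code is
elided):

	static DECLARE_RWSEM(vmap_area_lock);

	/* Lookups take the lock shared and may run concurrently... */
	down_read(&vmap_area_lock);
	va = __find_vmap_area(addr, &vmap_area_root);
	up_read(&vmap_area_lock);

	/* ...while tree/list modification takes it exclusive. */
	down_write(&vmap_area_lock);
	insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
	up_write(&vmap_area_lock);

Unlike with a spinlock, contended waiters sleep rather than spin, and the
holder itself may sleep, which is what later permits vread() to fault in
user memory while the lock is held.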
Signed-off-by: Lorenzo Stoakes
---
 mm/vmalloc.c | 77 +++++++++++++++++++++++++++-------------------------
 1 file changed, 40 insertions(+), 37 deletions(-)

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 978194dc2bb8..c24b27664a97 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include <linux/rwsem.h>
 #include
 #include
@@ -725,7 +726,7 @@ EXPORT_SYMBOL(vmalloc_to_pfn);
 #define DEBUG_AUGMENT_LOWEST_MATCH_CHECK 0
 
-static DEFINE_SPINLOCK(vmap_area_lock);
+static DECLARE_RWSEM(vmap_area_lock);
 static DEFINE_SPINLOCK(free_vmap_area_lock);
 /* Export for kexec only */
 LIST_HEAD(vmap_area_list);
@@ -1537,9 +1538,9 @@ static void free_vmap_area(struct vmap_area *va)
 	/*
 	 * Remove from the busy tree/list.
 	 */
-	spin_lock(&vmap_area_lock);
+	down_write(&vmap_area_lock);
 	unlink_va(va, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+	up_write(&vmap_area_lock);
 
 	/*
 	 * Insert/Merge it back to the free tree/list.
@@ -1627,9 +1628,9 @@ static struct vmap_area *alloc_vmap_area(unsigned long size,
 	va->vm = NULL;
 	va->flags = va_flags;
 
-	spin_lock(&vmap_area_lock);
+	down_write(&vmap_area_lock);
 	insert_vmap_area(va, &vmap_area_root, &vmap_area_list);
-	spin_unlock(&vmap_area_lock);
+	up_write(&vmap_area_lock);
 
 	BUG_ON(!IS_ALIGNED(va->va_start, align));
 	BUG_ON(va->va_start < vstart);
@@ -1854,9 +1855,9 @@ struct vmap_area *find_vmap_area(unsigned long addr)
 {
 	struct vmap_area *va;
 
-	spin_lock(&vmap_area_lock);
+	down_read(&vmap_area_lock);
 	va = __find_vmap_area(addr, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+	up_read(&vmap_area_lock);
 
 	return va;
 }
@@ -1865,11 +1866,11 @@ static struct vmap_area *find_unlink_vmap_area(unsigned long addr)
 {
 	struct vmap_area *va;
 
-	spin_lock(&vmap_area_lock);
+	down_write(&vmap_area_lock);
 	va = __find_vmap_area(addr, &vmap_area_root);
 	if (va)
 		unlink_va(va, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+	up_write(&vmap_area_lock);
 
 	return va;
 }
@@ -1914,7 +1915,7 @@ struct vmap_block_queue {
 };
 
 struct vmap_block {
-	spinlock_t lock;
+	struct mutex lock;
 	struct vmap_area *va;
 	unsigned long free, dirty;
 	DECLARE_BITMAP(used_map, VMAP_BBMAP_BITS);
@@ -1991,7 +1992,7 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
 	}
 
 	vaddr = vmap_block_vaddr(va->va_start, 0);
-	spin_lock_init(&vb->lock);
+	mutex_init(&vb->lock);
 	vb->va = va;
 	/* At least something should be left free */
 	BUG_ON(VMAP_BBMAP_BITS <= (1UL << order));
@@ -2026,9 +2027,9 @@ static void free_vmap_block(struct vmap_block *vb)
 	tmp = xa_erase(&vmap_blocks, addr_to_vb_idx(vb->va->va_start));
 	BUG_ON(tmp != vb);
 
-	spin_lock(&vmap_area_lock);
+	down_write(&vmap_area_lock);
 	unlink_va(vb->va, &vmap_area_root);
-	spin_unlock(&vmap_area_lock);
+	up_write(&vmap_area_lock);
 
 	free_vmap_area_noflush(vb->va);
 	kfree_rcu(vb, rcu_head);
@@ -2047,7 +2048,7 @@ static void purge_fragmented_blocks(int cpu)
 		if (!(vb->free + vb->dirty == VMAP_BBMAP_BITS && vb->dirty != VMAP_BBMAP_BITS))
 			continue;
 
-		spin_lock(&vb->lock);
+		mutex_lock(&vb->lock);
 		if (vb->free + vb->dirty == VMAP_BBMAP_BITS && vb->dirty != VMAP_BBMAP_BITS) {
 			vb->free = 0; /* prevent further allocs after releasing lock */
 			vb->dirty = VMAP_BBMAP_BITS; /* prevent purging it again */
@@ -2056,10 +2057,10 @@ static void purge_fragmented_blocks(int cpu)
 			spin_lock(&vbq->lock);
 			list_del_rcu(&vb->free_list);
 			spin_unlock(&vbq->lock);
-			spin_unlock(&vb->lock);
+			mutex_unlock(&vb->lock);
 			list_add_tail(&vb->purge, &purge);
 		} else
-			spin_unlock(&vb->lock);
+			mutex_unlock(&vb->lock);
 	}
 	rcu_read_unlock();
@@ -2101,9 +2102,9 @@ static void *vb_alloc(unsigned long size, gfp_t gfp_mask)
 	list_for_each_entry_rcu(vb, &vbq->free, free_list) {
 		unsigned long pages_off;
 
-		spin_lock(&vb->lock);
+		mutex_lock(&vb->lock);
 		if (vb->free < (1UL << order)) {
-			spin_unlock(&vb->lock);
+			mutex_unlock(&vb->lock);
 			continue;
 		}
@@ -2117,7 +2118,7 @@ static void *vb_alloc(unsigned long size, gfp_t gfp_mask)
 			spin_unlock(&vbq->lock);
 		}
 
-		spin_unlock(&vb->lock);
+		mutex_unlock(&vb->lock);
 		break;
 	}
@@ -2144,16 +2145,16 @@ static void vb_free(unsigned long addr, unsigned long size)
 	order = get_order(size);
 	offset = (addr & (VMAP_BLOCK_SIZE - 1)) >> PAGE_SHIFT;
 	vb = xa_load(&vmap_blocks, addr_to_vb_idx(addr));
 
-	spin_lock(&vb->lock);
+	mutex_lock(&vb->lock);
 	bitmap_clear(vb->used_map, offset, (1UL << order));
-	spin_unlock(&vb->lock);
+	mutex_unlock(&vb->lock);
 
 	vunmap_range_noflush(addr, addr + size);
 
 	if (debug_pagealloc_enabled_static())
 		flush_tlb_kernel_range(addr, addr + size);
 
-	spin_lock(&vb->lock);
+	mutex_lock(&vb->lock);
 
 	/* Expand dirty range */
 	vb->dirty_min = min(vb->dirty_min, offset);
@@ -2162,10 +2163,10 @@ static void vb_free(unsigned long addr, unsigned long size)
 	vb->dirty += 1UL << order;
 	if (vb->dirty == VMAP_BBMAP_BITS) {
 		BUG_ON(vb->free);
-		spin_unlock(&vb->lock);
+		mutex_unlock(&vb->lock);
 		free_vmap_block(vb);
 	} else
-		spin_unlock(&vb->lock);
+		mutex_unlock(&vb->lock);
 }
 
 static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
@@ -2183,7 +2184,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
 	rcu_read_lock();
 	list_for_each_entry_rcu(vb, &vbq->free, free_list) {
-		spin_lock(&vb->lock);
+		mutex_lock(&vb->lock);
 		if (vb->dirty && vb->dirty != VMAP_BBMAP_BITS) {
 			unsigned long va_start = vb->va->va_start;
 			unsigned long s, e;
@@ -2196,7 +2197,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
 			flush = 1;
 		}
-		spin_unlock(&vb->lock);
+		mutex_unlock(&vb->lock);
 	}
 	rcu_read_unlock();
 }
@@ -2451,9 +2452,9 @@ static inline void setup_vmalloc_vm_locked(struct vm_struct *vm,
 static void setup_vmalloc_vm(struct vm_struct *vm, struct vmap_area *va,
 			     unsigned long flags, const void *caller)
 {
-	spin_lock(&vmap_area_lock);
+	down_write(&vmap_area_lock);
 	setup_vmalloc_vm_locked(vm, va, flags, caller);
-	spin_unlock(&vmap_area_lock);
+	up_write(&vmap_area_lock);
 }
 
 static void clear_vm_uninitialized_flag(struct vm_struct *vm)
@@ -3507,9 +3508,9 @@ static void vmap_ram_vread(char *buf, char *addr, int count, unsigned long flags
 	if (!vb)
 		goto finished;
 
-	spin_lock(&vb->lock);
+	mutex_lock(&vb->lock);
 	if (bitmap_empty(vb->used_map, VMAP_BBMAP_BITS)) {
-		spin_unlock(&vb->lock);
+		mutex_unlock(&vb->lock);
 		goto finished;
 	}
 	for_each_set_bitrange(rs, re, vb->used_map, VMAP_BBMAP_BITS) {
@@ -3536,7 +3537,7 @@ static void vmap_ram_vread(char *buf, char *addr, int count, unsigned long flags
 		count -= n;
 	}
 unlock:
-	spin_unlock(&vb->lock);
+	mutex_unlock(&vb->lock);
 
 finished:
 	/* zero-fill the left dirty or free regions */
@@ -3576,13 +3577,15 @@ long vread(char *buf, char *addr, unsigned long count)
 	unsigned long buflen = count;
 	unsigned long n, size, flags;
 
+	might_sleep();
+
 	addr = kasan_reset_tag(addr);
 
 	/* Don't allow overflow */
 	if ((unsigned long) addr + count < count)
 		count = -(unsigned long) addr;
 
-	spin_lock(&vmap_area_lock);
+	down_read(&vmap_area_lock);
 	va = find_vmap_area_exceed_addr((unsigned long)addr);
 	if (!va)
 		goto finished;
@@ -3639,7 +3642,7 @@ long vread(char *buf, char *addr, unsigned long count)
 		count -= n;
 	}
 finished:
-	spin_unlock(&vmap_area_lock);
+	up_read(&vmap_area_lock);
 
 	if (buf == buf_start)
 		return 0;
@@ -3980,14 +3983,14 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
 	}
 
 	/* insert all vm's */
-	spin_lock(&vmap_area_lock);
+	down_write(&vmap_area_lock);
 	for (area = 0; area < nr_vms; area++) {
 		insert_vmap_area(vas[area], &vmap_area_root, &vmap_area_list);
 
 		setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
					pcpu_get_vm_areas);
 	}
-	spin_unlock(&vmap_area_lock);
+	up_write(&vmap_area_lock);
 
 	/*
	 * Mark allocated areas as accessible. Do it now as a best-effort
@@ -4114,7 +4117,7 @@ static void *s_start(struct seq_file *m, loff_t *pos)
 	__acquires(&vmap_area_lock)
 {
 	mutex_lock(&vmap_purge_lock);
-	spin_lock(&vmap_area_lock);
+	down_read(&vmap_area_lock);
 
 	return seq_list_start(&vmap_area_list, *pos);
 }
@@ -4128,7 +4131,7 @@ static void s_stop(struct seq_file *m, void *p)
 	__releases(&vmap_area_lock)
 	__releases(&vmap_purge_lock)
 {
-	spin_unlock(&vmap_area_lock);
+	up_read(&vmap_area_lock);
 	mutex_unlock(&vmap_purge_lock);
 }

From patchwork Sun Mar 19 00:20:11 2023
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 71656
From: Lorenzo Stoakes
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Andrew Morton
Cc: Baoquan He, Uladzislau Rezki, Matthew Wilcox, David Hildenbrand,
    Liu Shixin, Jiri Olsa, Lorenzo Stoakes
Subject: [PATCH 3/4] fs/proc/kcore: convert read_kcore() to read_kcore_iter()
Date: Sun, 19 Mar 2023 00:20:11 +0000
Message-Id: <32f8fad50500d0cd0927a66638c5890533725d30.1679183626.git.lstoakes@gmail.com>
X-Mailer: git-send-email 2.39.2

Now that we have eliminated spinlocks from the vread() case, convert
read_kcore() to read_kcore_iter().

For the time being we still use a bounce buffer for vread(); in the next
patch we will convert this to interact directly with the iterator and
eliminate the bounce buffer altogether.
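Note the change of return convention this conversion entails:
copy_to_user() returns the number of bytes that could *not* be copied (so
zero means success), whereas copy_to_iter() and iov_iter_zero() return the
number of bytes actually copied or zeroed. Hence the `!= tsz` checks
throughout the diff below; the pattern is (with src standing in for
whichever source pointer each call site uses):

	size_t copied = copy_to_iter(src, tsz, iter); /* bytes written */

	if (copied != tsz) { /* short copy => fault on the destination */
		ret = -EFAULT;
		goto out;
	}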
Signed-off-by: Lorenzo Stoakes
---
 fs/proc/kcore.c | 58 ++++++++++++++++++++++++-------------------------
 1 file changed, 29 insertions(+), 29 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 556f310d6aa4..25e0eeb8d498 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -24,7 +24,7 @@
 #include
 #include
 #include
-#include
+#include <linux/uio.h>
 #include
 #include
 #include
@@ -308,9 +308,12 @@ static void append_kcore_note(char *notes, size_t *i, const char *name,
 }
 
 static ssize_t
-read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
+read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
 {
+	struct file *file = iocb->ki_filp;
 	char *buf = file->private_data;
+	loff_t *ppos = &iocb->ki_pos;
+
 	size_t phdrs_offset, notes_offset, data_offset;
 	size_t page_offline_frozen = 1;
 	size_t phdrs_len, notes_len;
@@ -318,6 +321,7 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 	size_t tsz;
 	int nphdr;
 	unsigned long start;
+	size_t buflen = iov_iter_count(iter);
 	size_t orig_buflen = buflen;
 	int ret = 0;
@@ -333,7 +337,7 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 	notes_offset = phdrs_offset + phdrs_len;
 
 	/* ELF file header. */
-	if (buflen && *fpos < sizeof(struct elfhdr)) {
+	if (buflen && *ppos < sizeof(struct elfhdr)) {
 		struct elfhdr ehdr = {
 			.e_ident = {
 				[EI_MAG0] = ELFMAG0,
@@ -355,19 +359,18 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 			.e_phnum = nphdr,
 		};
 
-		tsz = min_t(size_t, buflen, sizeof(struct elfhdr) - *fpos);
-		if (copy_to_user(buffer, (char *)&ehdr + *fpos, tsz)) {
+		tsz = min_t(size_t, buflen, sizeof(struct elfhdr) - *ppos);
+		if (copy_to_iter((char *)&ehdr + *ppos, tsz, iter) != tsz) {
 			ret = -EFAULT;
 			goto out;
 		}
 
-		buffer += tsz;
 		buflen -= tsz;
-		*fpos += tsz;
+		*ppos += tsz;
 	}
 
 	/* ELF program headers. */
-	if (buflen && *fpos < phdrs_offset + phdrs_len) {
+	if (buflen && *ppos < phdrs_offset + phdrs_len) {
 		struct elf_phdr *phdrs, *phdr;
 
 		phdrs = kzalloc(phdrs_len, GFP_KERNEL);
@@ -397,22 +400,21 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 			phdr++;
 		}
 
-		tsz = min_t(size_t, buflen, phdrs_offset + phdrs_len - *fpos);
-		if (copy_to_user(buffer, (char *)phdrs + *fpos - phdrs_offset,
-				 tsz)) {
+		tsz = min_t(size_t, buflen, phdrs_offset + phdrs_len - *ppos);
+		if (copy_to_iter((char *)phdrs + *ppos - phdrs_offset, tsz,
+				 iter) != tsz) {
 			kfree(phdrs);
 			ret = -EFAULT;
 			goto out;
 		}
 		kfree(phdrs);
 
-		buffer += tsz;
 		buflen -= tsz;
-		*fpos += tsz;
+		*ppos += tsz;
 	}
 
 	/* ELF note segment. */
-	if (buflen && *fpos < notes_offset + notes_len) {
+	if (buflen && *ppos < notes_offset + notes_len) {
 		struct elf_prstatus prstatus = {};
 		struct elf_prpsinfo prpsinfo = {
 			.pr_sname = 'R',
@@ -447,24 +449,23 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 				  vmcoreinfo_data,
 				  min(vmcoreinfo_size, notes_len - i));
 
-		tsz = min_t(size_t, buflen, notes_offset + notes_len - *fpos);
-		if (copy_to_user(buffer, notes + *fpos - notes_offset, tsz)) {
+		tsz = min_t(size_t, buflen, notes_offset + notes_len - *ppos);
+		if (copy_to_iter(notes + *ppos - notes_offset, tsz, iter) != tsz) {
 			kfree(notes);
 			ret = -EFAULT;
 			goto out;
 		}
 		kfree(notes);
 
-		buffer += tsz;
 		buflen -= tsz;
-		*fpos += tsz;
+		*ppos += tsz;
 	}
 
 	/*
 	 * Check to see if our file offset matches with any of
 	 * the addresses in the elf_phdr on our list.
 	 */
-	start = kc_offset_to_vaddr(*fpos - data_offset);
+	start = kc_offset_to_vaddr(*ppos - data_offset);
 	if ((tsz = (PAGE_SIZE - (start & ~PAGE_MASK))) > buflen)
 		tsz = buflen;
@@ -497,7 +498,7 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 		}
 
 		if (!m) {
-			if (clear_user(buffer, tsz)) {
+			if (iov_iter_zero(tsz, iter) != tsz) {
 				ret = -EFAULT;
 				goto out;
 			}
@@ -508,14 +509,14 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 		case KCORE_VMALLOC:
 			vread(buf, (char *)start, tsz);
 			/* we have to zero-fill user buffer even if no read */
-			if (copy_to_user(buffer, buf, tsz)) {
+			if (copy_to_iter(buf, tsz, iter) != tsz) {
 				ret = -EFAULT;
 				goto out;
 			}
 			break;
 		case KCORE_USER:
 			/* User page is handled prior to normal kernel page: */
-			if (copy_to_user(buffer, (char *)start, tsz)) {
+			if (copy_to_iter((char *)start, tsz, iter) != tsz) {
 				ret = -EFAULT;
 				goto out;
 			}
@@ -531,7 +532,7 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 			 */
 			if (!page || PageOffline(page) ||
 			    is_page_hwpoison(page) || !pfn_is_ram(pfn)) {
-				if (clear_user(buffer, tsz)) {
+				if (iov_iter_zero(tsz, iter) != tsz) {
 					ret = -EFAULT;
 					goto out;
 				}
@@ -541,25 +542,24 @@ read_kcore(struct file *file, char __user *buffer, size_t buflen, loff_t *fpos)
 		case KCORE_VMEMMAP:
 		case KCORE_TEXT:
 			/*
-			 * We use _copy_to_user() to bypass usermode hardening
+			 * We use _copy_to_iter() to bypass usermode hardening
 			 * which would otherwise prevent this operation.
 			 */
-			if (_copy_to_user(buffer, (char *)start, tsz)) {
+			if (_copy_to_iter((char *)start, tsz, iter) != tsz) {
 				ret = -EFAULT;
 				goto out;
 			}
 			break;
 		default:
 			pr_warn_once("Unhandled KCORE type: %d\n", m->type);
-			if (clear_user(buffer, tsz)) {
+			if (iov_iter_zero(tsz, iter) != tsz) {
 				ret = -EFAULT;
 				goto out;
 			}
 		}
 skip:
 		buflen -= tsz;
-		*fpos += tsz;
-		buffer += tsz;
+		*ppos += tsz;
 		start += tsz;
 		tsz = (buflen > PAGE_SIZE ? PAGE_SIZE : buflen);
 	}
@@ -603,7 +603,7 @@ static int release_kcore(struct inode *inode, struct file *file)
 }
 
 static const struct proc_ops kcore_proc_ops = {
-	.proc_read	= read_kcore,
+	.proc_read_iter	= read_kcore_iter,
 	.proc_open	= open_kcore,
 	.proc_release	= release_kcore,
 	.proc_lseek	= default_llseek,

From patchwork Sun Mar 19 00:20:12 2023
X-Patchwork-Submitter: Lorenzo Stoakes
X-Patchwork-Id: 71663
From: Lorenzo Stoakes
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, Andrew Morton
Cc: Baoquan He, Uladzislau Rezki, Matthew Wilcox, David Hildenbrand,
    Liu Shixin, Jiri Olsa, Lorenzo Stoakes
Subject: [PATCH 4/4] mm: vmalloc: convert vread() to vread_iter()
Date: Sun, 19 Mar 2023 00:20:12 +0000
Message-Id: <119871ea9507eac7be5d91db38acdb03981e049e.1679183626.git.lstoakes@gmail.com>
X-Mailer: git-send-email 2.39.2

Having previously laid the foundation for converting vread() to an
iterator function, pull the trigger and do so.

This patch attempts to provide minimal refactoring and to reflect the
existing logic as best we can, with the exception of aligned_vread_iter(),
which drops the use of the deprecated kmap_atomic() in favour of
kmap_local_page().

All existing logic to zero portions of memory not read remains, and there
should be no functional difference other than a performance improvement in
/proc/kcore access to vmalloc regions.

Now that we have done away with the need for a bounce buffer in
read_kcore_iter() altogether, we dispense with the one allocated there.
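The kmap_atomic() to kmap_local_page() switch is what makes copying
directly to the iterator possible: an atomic mapping disables page faults,
so a copy which may fault on the user-backed iterator cannot be issued
under it, whereas a local mapping carries no such restriction. The heart
of aligned_vread_iter() in the diff below thus becomes (a simplified
sketch of the hunk, not additional code):

	size_t copied = 0;
	struct page *page = vmalloc_to_page(addr);

	if (page) {
		/* mapping may now be held across a faulting copy */
		void *map = kmap_local_page(page);

		copied = copy_to_iter(map + offset, length, iter);
		kunmap_local(map);
	}

	/* unmapped page or short copy: zero-fill the remainder */
	if (copied < length)
		iov_iter_zero(length - copied, iter);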
Signed-off-by: Lorenzo Stoakes
---
 fs/proc/kcore.c         |  21 +--------
 include/linux/vmalloc.h |   3 +-
 mm/vmalloc.c            | 101 +++++++++++++++++++++-------------------
 3 files changed, 57 insertions(+), 68 deletions(-)

diff --git a/fs/proc/kcore.c b/fs/proc/kcore.c
index 25e0eeb8d498..8a07f04c9203 100644
--- a/fs/proc/kcore.c
+++ b/fs/proc/kcore.c
@@ -307,13 +307,9 @@ static void append_kcore_note(char *notes, size_t *i, const char *name,
 	*i = ALIGN(*i + descsz, 4);
 }
 
-static ssize_t
-read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
+static ssize_t read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
 {
-	struct file *file = iocb->ki_filp;
-	char *buf = file->private_data;
 	loff_t *ppos = &iocb->ki_pos;
-
 	size_t phdrs_offset, notes_offset, data_offset;
 	size_t page_offline_frozen = 1;
 	size_t phdrs_len, notes_len;
@@ -507,9 +503,7 @@ read_kcore_iter(struct kiocb *iocb, struct iov_iter *iter)
 
 		switch (m->type) {
 		case KCORE_VMALLOC:
-			vread(buf, (char *)start, tsz);
-			/* we have to zero-fill user buffer even if no read */
-			if (copy_to_iter(buf, tsz, iter) != tsz) {
+			if (vread_iter((char *)start, tsz, iter) != tsz) {
 				ret = -EFAULT;
 				goto out;
 			}
@@ -582,10 +576,6 @@ static int open_kcore(struct inode *inode, struct file *filp)
 	if (ret)
 		return ret;
 
-	filp->private_data = kmalloc(PAGE_SIZE, GFP_KERNEL);
-	if (!filp->private_data)
-		return -ENOMEM;
-
 	if (kcore_need_update)
 		kcore_update_ram();
 	if (i_size_read(inode) != proc_root_kcore->size) {
@@ -596,16 +586,9 @@ static int open_kcore(struct inode *inode, struct file *filp)
 	return 0;
 }
 
-static int release_kcore(struct inode *inode, struct file *file)
-{
-	kfree(file->private_data);
-	return 0;
-}
-
 static const struct proc_ops kcore_proc_ops = {
 	.proc_read_iter	= read_kcore_iter,
 	.proc_open	= open_kcore,
-	.proc_release	= release_kcore,
 	.proc_lseek	= default_llseek,
 };

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 69250efa03d1..f70ebdf21f22 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -9,6 +9,7 @@
 #include	/* pgprot_t */
 #include
 #include
+#include <linux/uio.h>
 
 #include
@@ -251,7 +252,7 @@ static inline void set_vm_flush_reset_perms(void *addr)
 #endif
 
 /* for /proc/kcore */
-extern long vread(char *buf, char *addr, unsigned long count);
+extern long vread_iter(char *addr, size_t count, struct iov_iter *iter);
 
 /*
 *	Internals.  Don't use..

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index c24b27664a97..3a32754266dc 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -37,7 +37,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -3446,20 +3445,20 @@ EXPORT_SYMBOL(vmalloc_32_user);
 * small helper routine , copy contents to buf from addr.
 * If the page is not present, fill zero.
 */
-
-static int aligned_vread(char *buf, char *addr, unsigned long count)
+static void aligned_vread_iter(char *addr, size_t count,
+			       struct iov_iter *iter)
 {
-	struct page *p;
-	int copied = 0;
+	struct page *page;
 
-	while (count) {
+	while (count > 0) {
 		unsigned long offset, length;
+		size_t copied = 0;
 
 		offset = offset_in_page(addr);
 		length = PAGE_SIZE - offset;
 		if (length > count)
 			length = count;
-		p = vmalloc_to_page(addr);
+		page = vmalloc_to_page(addr);
 		/*
 		 * To do safe access to this _mapped_ area, we need
 		 * lock. But adding lock here means that we need to add
@@ -3467,23 +3466,24 @@ static int aligned_vread(char *buf, char *addr, unsigned long count)
 		 * interface, rarely used. Instead of that, we'll use
 		 * kmap() and get small overhead in this access function.
 		 */
-		if (p) {
+		if (page) {
 			/* We can expect USER0 is not used -- see vread() */
-			void *map = kmap_atomic(p);
-			memcpy(buf, map + offset, length);
-			kunmap_atomic(map);
-		} else
-			memset(buf, 0, length);
+			void *map = kmap_local_page(page);
+
+			copied = copy_to_iter(map + offset, length, iter);
+			kunmap_local(map);
+		}
+
+		if (copied < length)
+			iov_iter_zero(length - copied, iter);
 
 		addr += length;
-		buf += length;
-		copied += length;
 		count -= length;
 	}
-	return copied;
 }
 
-static void vmap_ram_vread(char *buf, char *addr, int count, unsigned long flags)
+static void vmap_ram_vread_iter(char *addr, int count, unsigned long flags,
+				struct iov_iter *iter)
 {
 	char *start;
 	struct vmap_block *vb;
@@ -3496,7 +3496,7 @@ static void vmap_ram_vread(char *buf, char *addr, int count, unsigned long flags
 	 * handle it here.
 	 */
 	if (!(flags & VMAP_BLOCK)) {
-		aligned_vread(buf, addr, count);
+		aligned_vread_iter(addr, count, iter);
 		return;
 	}
@@ -3517,22 +3517,24 @@ static void vmap_ram_vread(char *buf, char *addr, int count, unsigned long flags
 		if (!count)
 			break;
 		start = vmap_block_vaddr(vb->va->va_start, rs);
-		while (addr < start) {
+
+		if (addr < start) {
+			size_t to_zero = min_t(size_t, start - addr, count);
+
+			iov_iter_zero(to_zero, iter);
+			addr += to_zero;
+			count -= (int)to_zero;
 			if (count == 0)
 				goto unlock;
-			*buf = '\0';
-			buf++;
-			addr++;
-			count--;
 		}
+
 		/*it could start reading from the middle of used region*/
 		offset = offset_in_page(addr);
 		n = ((re - rs + 1) << PAGE_SHIFT) - offset;
 		if (n > count)
 			n = count;
-		aligned_vread(buf, start+offset, n);
+		aligned_vread_iter(start + offset, n, iter);
 
-		buf += n;
 		addr += n;
 		count -= n;
 	}
@@ -3541,15 +3543,15 @@ static void vmap_ram_vread(char *buf, char *addr, int count, unsigned long flags
 
 finished:
 	/* zero-fill the left dirty or free regions */
-	if (count)
-		memset(buf, 0, count);
+	if (count > 0)
+		iov_iter_zero(count, iter);
 }
 
 /**
- * vread() - read vmalloc area in a safe way.
- * @buf: buffer for reading data
- * @addr: vm address.
- * @count: number of bytes to be read.
+ * vread_iter() - read vmalloc area in a safe way to an iterator.
+ * @addr: vm address.
+ * @count: number of bytes to be read.
+ * @iter: the iterator to which data should be written.
 *
 * This function checks that addr is a valid vmalloc'ed area, and
 * copy data from that area to a given buffer. If the given memory range
@@ -3569,13 +3571,13 @@ static void vmap_ram_vread(char *buf, char *addr, int count, unsigned long flags
 * (same number as @count) or %0 if [addr...addr+count) doesn't
 * include any intersection with valid vmalloc area
 */
-long vread(char *buf, char *addr, unsigned long count)
+long vread_iter(char *addr, size_t count, struct iov_iter *iter)
 {
 	struct vmap_area *va;
 	struct vm_struct *vm;
-	char *vaddr, *buf_start = buf;
-	unsigned long buflen = count;
-	unsigned long n, size, flags;
+	char *vaddr;
+	size_t buflen = count;
+	size_t n, size, flags;
 
 	might_sleep();
@@ -3595,7 +3597,7 @@ long vread(char *buf, char *addr, unsigned long count)
 		goto finished;
 
 	list_for_each_entry_from(va, &vmap_area_list, list) {
-		if (!count)
+		if (count == 0)
 			break;
 
 		vm = va->vm;
@@ -3619,36 +3621,39 @@ long vread(char *buf, char *addr, unsigned long count)
 		if (addr >= vaddr + size)
 			continue;
-		while (addr < vaddr) {
+
+		if (addr < vaddr) {
+			size_t to_zero = min_t(size_t, vaddr - addr, count);
+
+			iov_iter_zero(to_zero, iter);
+			addr += to_zero;
+			count -= to_zero;
 			if (count == 0)
 				goto finished;
-			*buf = '\0';
-			buf++;
-			addr++;
-			count--;
 		}
+
 		n = vaddr + size - addr;
 		if (n > count)
 			n = count;
 
 		if (flags & VMAP_RAM)
-			vmap_ram_vread(buf, addr, n, flags);
+			vmap_ram_vread_iter(addr, n, flags, iter);
 		else if (!(vm->flags & VM_IOREMAP))
-			aligned_vread(buf, addr, n);
+			aligned_vread_iter(addr, n, iter);
 		else /* IOREMAP area is treated as memory hole */
-			memset(buf, 0, n);
-		buf += n;
+			iov_iter_zero(n, iter);
+
 		addr += n;
 		count -= n;
 	}
 finished:
 	up_read(&vmap_area_lock);
 
-	if (buf == buf_start)
+	if (count == buflen)
 		return 0;
 
 	/* zero-fill memory holes */
-	if (buf != buf_start + buflen)
-		memset(buf, 0, buflen - (buf - buf_start));
+	if (count > 0)
+		iov_iter_zero(count, iter);
 
 	return buflen;
 }