From patchwork Mon Jan 22 07:13:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 189930
Date: Sun, 21 Jan 2024 23:13:22 -0800
X-Mailing-List: linux-kernel@vger.kernel.org
X-Mailer: git-send-email 2.43.0.429.g432eaa2c6b-goog
Message-ID: <20240122071324.2099712-1-surenb@google.com>
Subject: [PATCH 1/3] mm: make vm_area_struct anon_name field RCU-safe
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, dchinner@redhat.com, casey@schaufler-ca.com, ben.wolsieffer@hefring.com, paulmck@kernel.org, david@redhat.com, avagin@google.com, usama.anjum@collabora.com, peterx@redhat.com, hughd@google.com, ryan.roberts@arm.com, wangkefeng.wang@huawei.com,
 Liam.Howlett@Oracle.com, yuzhao@google.com, axelrasmussen@google.com, lstoakes@gmail.com, talumbau@google.com, willy@infradead.org, vbabka@suse.cz, mgorman@techsingularity.net, jhubbard@nvidia.com, vishal.moola@gmail.com, mathieu.desnoyers@efficios.com, dhowells@redhat.com, jgg@ziepe.ca, sidhartha.kumar@oracle.com, andriy.shevchenko@linux.intel.com, yangxingui@huawei.com, keescook@chromium.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, kernel-team@android.com, surenb@google.com

For lockless /proc/pid/maps reading we have to ensure all the fields
used when generating the output are RCU-safe. The only pointer fields
in vm_area_struct which are used to generate that file's output are
vm_file and anon_name. vm_file is RCU-safe but anon_name is not. Make
anon_name RCU-safe as well.

Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm_inline.h | 10 +++++++++-
 include/linux/mm_types.h  |  3 ++-
 mm/madvise.c              | 30 ++++++++++++++++++++++++++----
 3 files changed, 37 insertions(+), 6 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index f4fe593c1400..bbdb0ca857f1 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -389,7 +389,7 @@ static inline void dup_anon_vma_name(struct vm_area_struct *orig_vma,
 	struct anon_vma_name *anon_name = anon_vma_name(orig_vma);
 
 	if (anon_name)
-		new_vma->anon_name = anon_vma_name_reuse(anon_name);
+		rcu_assign_pointer(new_vma->anon_name, anon_vma_name_reuse(anon_name));
 }
 
 static inline void free_anon_vma_name(struct vm_area_struct *vma)
@@ -411,6 +411,8 @@ static inline bool anon_vma_name_eq(struct anon_vma_name *anon_name1,
 		!strcmp(anon_name1->name, anon_name2->name);
 }
 
+struct anon_vma_name *anon_vma_name_get_rcu(struct vm_area_struct *vma);
+
 #else /* CONFIG_ANON_VMA_NAME */
 static inline void anon_vma_name_get(struct anon_vma_name *anon_name) {}
 static inline void anon_vma_name_put(struct anon_vma_name *anon_name) {}
@@ -424,6 +426,12 @@ static inline bool anon_vma_name_eq(struct anon_vma_name *anon_name1,
 	return true;
 }
 
+static inline
+struct anon_vma_name *anon_vma_name_get_rcu(struct vm_area_struct *vma)
+{
+	return NULL;
+}
+
 #endif /* CONFIG_ANON_VMA_NAME */
 
 static inline void init_tlb_flush_pending(struct mm_struct *mm)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 8b611e13153e..bbe1223cd992 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -545,6 +545,7 @@ struct vm_userfaultfd_ctx {};
 
 struct anon_vma_name {
 	struct kref kref;
+	struct rcu_head rcu;
 	/* The name needs to be at the end because it is dynamically sized. */
 	char name[];
 };
@@ -699,7 +700,7 @@ struct vm_area_struct {
 	 * terminated string containing the name given to the vma, or NULL if
 	 * unnamed. Serialized by mmap_lock. Use anon_vma_name to access.
 	 */
-	struct anon_vma_name *anon_name;
+	struct anon_vma_name __rcu *anon_name;
 #endif
 #ifdef CONFIG_SWAP
 	atomic_long_t swap_readahead_info;
diff --git a/mm/madvise.c b/mm/madvise.c
index 912155a94ed5..0f222d464254 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -88,14 +88,15 @@ void anon_vma_name_free(struct kref *kref)
 {
 	struct anon_vma_name *anon_name =
 			container_of(kref, struct anon_vma_name, kref);
-	kfree(anon_name);
+	kfree_rcu(anon_name, rcu);
 }
 
 struct anon_vma_name *anon_vma_name(struct vm_area_struct *vma)
 {
 	mmap_assert_locked(vma->vm_mm);
 
-	return vma->anon_name;
+	return rcu_dereference_protected(vma->anon_name,
+		rwsem_is_locked(&vma->vm_mm->mmap_lock));
 }
 
 /* mmap_lock should be write-locked */
@@ -105,7 +106,7 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
 	struct anon_vma_name *orig_name = anon_vma_name(vma);
 
 	if (!anon_name) {
-		vma->anon_name = NULL;
+		rcu_assign_pointer(vma->anon_name, NULL);
 		anon_vma_name_put(orig_name);
 		return 0;
 	}
@@ -113,11 +114,32 @@ static int replace_anon_vma_name(struct vm_area_struct *vma,
 	if (anon_vma_name_eq(orig_name, anon_name))
 		return 0;
 
-	vma->anon_name = anon_vma_name_reuse(anon_name);
+	rcu_assign_pointer(vma->anon_name, anon_vma_name_reuse(anon_name));
 	anon_vma_name_put(orig_name);
 
 	return 0;
 }
+
+/*
+ * Returned anon_vma_name is stable due to elevated refcount but not guaranteed
+ * to be assigned to the original VMA after the call.
+ */
+struct anon_vma_name *anon_vma_name_get_rcu(struct vm_area_struct *vma)
+{
+	struct anon_vma_name __rcu *anon_name;
+
+	WARN_ON_ONCE(!rcu_read_lock_held());
+
+	anon_name = rcu_dereference(vma->anon_name);
+	if (!anon_name)
+		return NULL;
+
+	if (unlikely(!kref_get_unless_zero(&anon_name->kref)))
+		return NULL;
+
+	return anon_name;
+}
+
 #else /* CONFIG_ANON_VMA_NAME */
 static int replace_anon_vma_name(struct vm_area_struct *vma,
 				 struct anon_vma_name *anon_name)

From patchwork Mon Jan 22 07:13:23 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 189932
Date: Sun, 21 Jan 2024 23:13:23 -0800
In-Reply-To: <20240122071324.2099712-1-surenb@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240122071324.2099712-1-surenb@google.com>
X-Mailer: git-send-email 2.43.0.429.g432eaa2c6b-goog
Message-ID: <20240122071324.2099712-2-surenb@google.com>
Subject: [PATCH 2/3] mm: add mm_struct sequence number to detect write locks
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, dchinner@redhat.com, casey@schaufler-ca.com, ben.wolsieffer@hefring.com, paulmck@kernel.org, david@redhat.com, avagin@google.com, usama.anjum@collabora.com, peterx@redhat.com, hughd@google.com, ryan.roberts@arm.com, wangkefeng.wang@huawei.com, Liam.Howlett@Oracle.com, yuzhao@google.com, axelrasmussen@google.com, lstoakes@gmail.com, talumbau@google.com, willy@infradead.org,
 vbabka@suse.cz, mgorman@techsingularity.net, jhubbard@nvidia.com, vishal.moola@gmail.com, mathieu.desnoyers@efficios.com, dhowells@redhat.com, jgg@ziepe.ca, sidhartha.kumar@oracle.com, andriy.shevchenko@linux.intel.com, yangxingui@huawei.com, keescook@chromium.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, kernel-team@android.com, surenb@google.com

Provide a way for lockless mm_struct users to detect whether mm might
have been changed since some specific point in time. The API provided
allows the user to record a counter when it starts using the mm and
later use that counter to check if anyone write-locked mmap_lock since
the counter was recorded. Recording the counter value should be done
while holding mmap_lock at least for reading to prevent the counter
from concurrent changes. Every time mmap_lock is write-locked mm_struct
updates its mm_wr_seq counter so that checks against counters recorded
before that would fail, indicating a possibility of mm being modified.
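
As a rough illustration of the record/check protocol described above,
here is a minimal userspace sketch using C11 atomics. It is not the
kernel implementation: struct toy_mm, toy_write_lock(),
toy_seq_record(), toy_seq_unchanged() and toy_seq_demo() are
hypothetical names invented for this example, and the lock itself is
elided, so the comments only mark where mmap_lock would be held.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for mm_struct; only the counter is modelled. */
struct toy_mm {
	atomic_ulong wr_seq;	/* plays the role of mm_wr_seq */
};

/* Writer side: would run right after mmap_lock is taken for writing. */
static void toy_write_lock(struct toy_mm *mm)
{
	unsigned long seq = atomic_load_explicit(&mm->wr_seq,
						 memory_order_relaxed);

	/* Release store, pairs with the acquire load in toy_seq_unchanged() */
	atomic_store_explicit(&mm->wr_seq, seq + 1, memory_order_release);
}

/* Reader side: would run with mmap_lock held at least for reading. */
static unsigned long toy_seq_record(struct toy_mm *mm)
{
	return atomic_load_explicit(&mm->wr_seq, memory_order_relaxed);
}

/* Lockless check: true if nobody write-locked since 'recorded' was taken. */
static bool toy_seq_unchanged(struct toy_mm *mm, unsigned long recorded)
{
	return atomic_load_explicit(&mm->wr_seq,
				    memory_order_acquire) == recorded;
}

/* Single-threaded walk through the protocol; returns 1 on success. */
static int toy_seq_demo(void)
{
	struct toy_mm mm;
	unsigned long recorded;

	atomic_init(&mm.wr_seq, 0);

	recorded = toy_seq_record(&mm);
	if (!toy_seq_unchanged(&mm, recorded))
		return 0;	/* no writer yet, so the check must pass */

	toy_write_lock(&mm);	/* a writer comes in */
	if (toy_seq_unchanged(&mm, recorded))
		return 0;	/* the stale counter must now fail the check */

	recorded = toy_seq_record(&mm);	/* re-record, check passes again */
	return toy_seq_unchanged(&mm, recorded);
}
```

A failed check in this sketch only says that a writer took the lock at
some point after the counter was recorded; it does not say that the
data the reader looked at was actually changed.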

Signed-off-by: Suren Baghdasaryan
---
 include/linux/mm_types.h  |  2 ++
 include/linux/mmap_lock.h | 22 ++++++++++++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index bbe1223cd992..e749f7f09314 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -846,6 +846,8 @@ struct mm_struct {
 		 */
 		int mm_lock_seq;
 #endif
+		/* Counter incremented each time mm gets write-locked */
+		unsigned long mm_wr_seq;
 
 
 		unsigned long hiwater_rss; /* High-watermark of RSS usage */
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 8d38dcb6d044..0197079cb6fe 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -106,6 +106,8 @@ static inline void mmap_write_lock(struct mm_struct *mm)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write(&mm->mmap_lock);
+	/* Pairs with ACQUIRE semantics in mmap_write_seq_read */
+	smp_store_release(&mm->mm_wr_seq, mm->mm_wr_seq + 1);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
@@ -113,6 +115,8 @@ static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write_nested(&mm->mmap_lock, subclass);
+	/* Pairs with ACQUIRE semantics in mmap_write_seq_read */
+	smp_store_release(&mm->mm_wr_seq, mm->mm_wr_seq + 1);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
@@ -122,6 +126,10 @@ static inline int mmap_write_lock_killable(struct mm_struct *mm)
 
 	__mmap_lock_trace_start_locking(mm, true);
 	ret = down_write_killable(&mm->mmap_lock);
+	if (!ret) {
+		/* Pairs with ACQUIRE semantics in mmap_write_seq_read */
+		smp_store_release(&mm->mm_wr_seq, mm->mm_wr_seq + 1);
+	}
 	__mmap_lock_trace_acquire_returned(mm, true, ret == 0);
 	return ret;
 }
@@ -140,6 +148,20 @@ static inline void mmap_write_downgrade(struct mm_struct *mm)
 	downgrade_write(&mm->mmap_lock);
 }
 
+static inline unsigned long mmap_write_seq_read(struct mm_struct *mm)
+{
+	/* Pairs with RELEASE semantics in mmap_write_lock */
+	return smp_load_acquire(&mm->mm_wr_seq);
+}
+
+static inline void mmap_write_seq_record(struct mm_struct *mm,
+					 unsigned long *mm_wr_seq)
+{
+	mmap_assert_locked(mm);
+	/* Nobody can concurrently modify since we hold the mmap_lock */
+	*mm_wr_seq = mm->mm_wr_seq;
+}
+
 static inline void mmap_read_lock(struct mm_struct *mm)
 {
 	__mmap_lock_trace_start_locking(mm, false);

From patchwork Mon Jan 22 07:13:24 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 189931
Date: Sun, 21 Jan 2024 23:13:24 -0800
In-Reply-To: <20240122071324.2099712-1-surenb@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240122071324.2099712-1-surenb@google.com>
X-Mailer: git-send-email 2.43.0.429.g432eaa2c6b-goog
Message-ID: <20240122071324.2099712-3-surenb@google.com>
Subject: [PATCH 3/3] mm/maps: read proc/pid/maps under RCU
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, dchinner@redhat.com, casey@schaufler-ca.com, ben.wolsieffer@hefring.com, paulmck@kernel.org, david@redhat.com, avagin@google.com, usama.anjum@collabora.com, peterx@redhat.com, hughd@google.com, ryan.roberts@arm.com, wangkefeng.wang@huawei.com, Liam.Howlett@Oracle.com, yuzhao@google.com, axelrasmussen@google.com, lstoakes@gmail.com, talumbau@google.com, willy@infradead.org, vbabka@suse.cz, mgorman@techsingularity.net, jhubbard@nvidia.com, vishal.moola@gmail.com, mathieu.desnoyers@efficios.com, dhowells@redhat.com, jgg@ziepe.ca, sidhartha.kumar@oracle.com, andriy.shevchenko@linux.intel.com, yangxingui@huawei.com, keescook@chromium.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, kernel-team@android.com, surenb@google.com
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1788774297417565482
X-GMAIL-MSGID: 1788774297417565482

With maple_tree supporting vma tree traversal under RCU and per-vma locks
making vma access RCU-safe, /proc/pid/maps can be read under RCU without
the need to read-lock mmap_lock. However, vma content can change from
under us, so we make a copy of the vma and pin the pointer fields used
when generating the output (currently only vm_file and anon_name).
Afterwards we check for concurrent address space modifications, wait for
them to end and retry. That last check is needed to avoid the possibility
of missing a vma during concurrent maple_tree node replacement, which
might report a NULL when a vma is replaced with another one. While we
take the mmap_lock for reading during such contention, we do that
momentarily, only to record the new mm_wr_seq counter.
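The record/copy/validate/retry flow described above can be illustrated with a small userspace sketch. This is only an assumption-laden illustration of the control flow, not the kernel code: `struct region`, `struct space`, `space_resize()`, `space_seq_record()` and `region_snapshot()` are hypothetical names, the example is single-threaded, and it does not model RCU, per-vma locks, or the brief mmap_lock acquisition the patch uses to wait for writers.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a VMA; not the kernel's struct vm_area_struct. */
struct region {
	unsigned long start;
	unsigned long end;
};

/* Hypothetical address space: a write sequence counter plus one region. */
struct space {
	atomic_ulong wr_seq;	/* advanced by every modification */
	struct region region;
};

/* Writer side: modify the region, then advance the counter. */
static void space_resize(struct space *s, unsigned long start, unsigned long end)
{
	s->region.start = start;
	s->region.end = end;
	atomic_fetch_add(&s->wr_seq, 1);
}

/* Reader side: record the counter value to validate against later. */
static unsigned long space_seq_record(struct space *s)
{
	return atomic_load(&s->wr_seq);
}

/*
 * Copy the region, then compare the counter with the recorded value.
 * Returns true if no modification was seen since *seq was recorded.
 * Otherwise refreshes *seq and returns false so the caller can retry.
 */
static bool region_snapshot(struct space *s, unsigned long *seq,
			    struct region *copy)
{
	unsigned long now;

	assert(s && seq && copy);
	memcpy(copy, &s->region, sizeof(*copy));
	now = atomic_load(&s->wr_seq);
	if (now == *seq)
		return true;

	*seq = now;	/* the copy may be stale; record and retry */
	return false;
}
```

A reader would call `space_seq_record()` once, then loop on `region_snapshot()` until it returns true, using only the private copy afterwards.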

This change is designed to reduce mmap_lock contention and prevent a
process reading /proc/pid/maps files (often a low-priority task, such as
monitoring/data collection services) from blocking address space updates.

Note that this change has a userspace-visible disadvantage: it allows for
sub-page data tearing, as opposed to the previous mechanism where data
tearing could happen only between pages of generated output data. Since
current userspace considers data tearing between pages to be acceptable,
we assume it will be able to handle sub-page data tearing as well.

Signed-off-by: Suren Baghdasaryan
---
 fs/proc/internal.h |   2 +
 fs/proc/task_mmu.c | 114 ++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/fs/proc/internal.h b/fs/proc/internal.h
index a71ac5379584..e0247225bb68 100644
--- a/fs/proc/internal.h
+++ b/fs/proc/internal.h
@@ -290,6 +290,8 @@ struct proc_maps_private {
 	struct task_struct *task;
 	struct mm_struct *mm;
 	struct vma_iterator iter;
+	unsigned long mm_wr_seq;
+	struct vm_area_struct vma_copy;
 #ifdef CONFIG_NUMA
 	struct mempolicy *task_mempolicy;
 #endif
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 3f78ebbb795f..3886d04afc01 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -126,11 +126,96 @@ static void release_task_mempolicy(struct proc_maps_private *priv)
 }
 #endif
 
-static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv,
-						loff_t *ppos)
+#ifdef CONFIG_PER_VMA_LOCK
+
+static const struct seq_operations proc_pid_maps_op;
+/*
+ * Take VMA snapshot and pin vm_file and anon_name as they are used by
+ * show_map_vma.
+ */
+static int get_vma_snapshot(struct proc_maps_private *priv, struct vm_area_struct *vma)
 {
+	struct vm_area_struct *copy = &priv->vma_copy;
+	int ret = -EAGAIN;
+
+	memcpy(copy, vma, sizeof(*vma));
+	if (copy->vm_file && !get_file_rcu(&copy->vm_file))
+		goto out;
+
+	if (copy->anon_name && !anon_vma_name_get_rcu(copy))
+		goto put_file;
+
+	if (priv->mm_wr_seq == mmap_write_seq_read(priv->mm))
+		return 0;
+
+	/* Address space got modified, vma might be stale. Wait and retry. */
+	rcu_read_unlock();
+	ret = mmap_read_lock_killable(priv->mm);
+	if (!ret) {
+		mmap_write_seq_record(priv->mm, &priv->mm_wr_seq);
+		mmap_read_unlock(priv->mm);
+		ret = -EAGAIN; /* no other errors, ok to retry */
+	}
+	rcu_read_lock();
+
+	if (copy->anon_name)
+		anon_vma_name_put(copy->anon_name);
+put_file:
+	if (copy->vm_file)
+		fput(copy->vm_file);
+out:
+	return ret;
+}
+
+static void put_vma_snapshot(struct proc_maps_private *priv)
+{
+	struct vm_area_struct *vma = &priv->vma_copy;
+
+	if (vma->anon_name)
+		anon_vma_name_put(vma->anon_name);
+	if (vma->vm_file)
+		fput(vma->vm_file);
+}
+
+static inline bool needs_mmap_lock(struct seq_file *m)
+{
+	/*
+	 * smaps and numa_maps perform page table walk, therefore require
+	 * mmap_lock but maps can be read under RCU.
+	 */
+	return m->op != &proc_pid_maps_op;
+}
+
+#else /* CONFIG_PER_VMA_LOCK */
+
+/* Without per-vma locks VMA access is not RCU-safe */
+static inline bool needs_mmap_lock(struct seq_file *m) { return true; }
+
+#endif /* CONFIG_PER_VMA_LOCK */
+
+static struct vm_area_struct *proc_get_vma(struct seq_file *m, loff_t *ppos)
+{
+	struct proc_maps_private *priv = m->private;
 	struct vm_area_struct *vma = vma_next(&priv->iter);
 
+#ifdef CONFIG_PER_VMA_LOCK
+	if (vma && !needs_mmap_lock(m)) {
+		int ret;
+
+		put_vma_snapshot(priv);
+		while ((ret = get_vma_snapshot(priv, vma)) == -EAGAIN) {
+			/* lookup the vma at the last position again */
+			vma_iter_init(&priv->iter, priv->mm, *ppos);
+			vma = vma_next(&priv->iter);
+		}
+
+		if (ret) {
+			put_vma_snapshot(priv);
+			return NULL;
+		}
+		vma = &priv->vma_copy;
+	}
+#endif
 	if (vma) {
 		*ppos = vma->vm_start;
 	} else {
@@ -169,12 +254,20 @@ static void *m_start(struct seq_file *m, loff_t *ppos)
 		return ERR_PTR(-EINTR);
 	}
 
+	/* Drop mmap_lock if possible */
+	if (!needs_mmap_lock(m)) {
+		mmap_write_seq_record(priv->mm, &priv->mm_wr_seq);
+		mmap_read_unlock(priv->mm);
+		rcu_read_lock();
+		memset(&priv->vma_copy, 0, sizeof(priv->vma_copy));
+	}
+
 	vma_iter_init(&priv->iter, mm, last_addr);
 	hold_task_mempolicy(priv);
 	if (last_addr == -2UL)
 		return get_gate_vma(mm);
 
-	return proc_get_vma(priv, ppos);
+	return proc_get_vma(m, ppos);
 }
 
 static void *m_next(struct seq_file *m, void *v, loff_t *ppos)
@@ -183,7 +276,7 @@ static void *m_next(struct seq_file *m, void *v, loff_t *ppos)
 		*ppos = -1UL;
 		return NULL;
 	}
-	return proc_get_vma(m->private, ppos);
+	return proc_get_vma(m, ppos);
 }
 
 static void m_stop(struct seq_file *m, void *v)
@@ -195,7 +288,10 @@ static void m_stop(struct seq_file *m, void *v)
 		return;
 
 	release_task_mempolicy(priv);
-	mmap_read_unlock(mm);
+	if (needs_mmap_lock(m))
+		mmap_read_unlock(mm);
+	else
+		rcu_read_unlock();
 	mmput(mm);
 	put_task_struct(priv->task);
 	priv->task = NULL;
@@ -283,8 +379,10 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma)
 	start = vma->vm_start;
 	end = vma->vm_end;
 	show_vma_header_prefix(m, start, end, flags, pgoff, dev, ino);
-	if (mm)
-		anon_name = anon_vma_name(vma);
+	if (mm) {
+		anon_name = needs_mmap_lock(m) ? anon_vma_name(vma) :
+				anon_vma_name_get_rcu(vma);
+	}
 
 	/*
 	 * Print the dentry name for named mappings, and a
@@ -338,6 +436,8 @@ show_map_vma(struct seq_file *m, struct vm_area_struct *vma)
 		seq_puts(m, name);
 	}
 	seq_putc(m, '\n');
+	if (anon_name && !needs_mmap_lock(m))
+		anon_vma_name_put(anon_name);
 }
 
 static int show_map(struct seq_file *m, void *v)
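
The diff pins vm_file and anon_name before using the copied pointers, and drops those references afterwards. As a rough userspace illustration of that pin/unpin discipline, the sketch below shows a generic "take a reference unless the count already reached zero" helper built on C11 atomics. The names `struct obj`, `obj_get_unless_zero()` and `obj_put()` are hypothetical, and this is an assumption about the general pattern rather than a description of how get_file_rcu() or anon_vma_name_get_rcu() are implemented.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical refcounted object; not a kernel structure. */
struct obj {
	atomic_int refs;
};

/*
 * Try to pin the object. Fails if the count has already dropped to zero,
 * i.e. the object is being torn down and must not be used.
 */
static bool obj_get_unless_zero(struct obj *o)
{
	int old = atomic_load(&o->refs);

	while (old > 0) {
		/* On failure, old is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference. Returns true if this was the last one. */
static bool obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refs, 1);

	assert(old > 0);	/* dropping an unheld reference is a bug */
	return old == 1;
}
```

Every successful `obj_get_unless_zero()` must be balanced by exactly one `obj_put()`, which mirrors how the snapshot helpers pair each successful pin with a later put.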