Message ID | 20230620235726.3873043-1-surenb@google.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org>
Date: Tue, 20 Jun 2023 16:57:24 -0700
Message-ID: <20230620235726.3873043-1-surenb@google.com>
Subject: [PATCH 1/3] mm: change vma_start_read to fail if VMA got detached from under it
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: willy@infradead.org, torvalds@linuxfoundation.org, vegard.nossum@oracle.com, mpe@ellerman.id.au, Liam.Howlett@oracle.com, lrh2000@pku.edu.cn, mgorman@suse.de, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com, surenb@google.com
Content-Type: text/plain; charset="UTF-8"
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org |
Series | [1/3] mm: change vma_start_read to fail if VMA got detached from under it |
Commit Message
Suren Baghdasaryan
June 20, 2023, 11:57 p.m. UTC
The current implementation of vma_start_read() checks whether the VMA is
locked before taking vma->vm_lock, and then checks that again after the
lock is taken. This mechanism fails to detect the case where the VMA gets
write-locked, modified and unlocked after the first check but before
vma->vm_lock is obtained. While this is not strictly a problem
(vma_start_read would not produce a false unlocked result), it allows
vma_start_read to successfully lock a VMA which got detached from the
VMA tree while vma_start_read was locking it.
The new condition checks for any change in vma->vm_lock_seq after we
obtain vma->vm_lock, and causes vma_start_read() to fail if the above
race occurs.
Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
Suggested-by: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
---
include/linux/mm.h | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
Comments
On Tue, Jun 20, 2023 at 4:57 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> By the time VMA is freed it has to be detached with the exception of
> exit_mmap which is destroying the whole VMA tree. Enforce this
> requirement before freeing the VMA. exit_mmap is the only user calling
> __vm_area_free directly, therefore it won't trigger the new check.
> Change VMA initialization to mark new VMAs as detached and change that
> flag once the VMA is added into a tree.
>
> Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>

My tests did not generate the warning but the test coverage is far from
perfect, so if someone can run extensive testing on this one that would
be greatly appreciated.
Thanks,
Suren.

> ---
>  include/linux/mm.h | 4 ++--
>  kernel/fork.c      | 2 ++
>  mm/internal.h      | 1 +
>  3 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 74e3033c9fc2..9a10fcdb134e 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -247,7 +247,7 @@ void setup_initial_init_mm(void *start_code, void *end_code,
>  struct vm_area_struct *vm_area_alloc(struct mm_struct *);
>  struct vm_area_struct *vm_area_dup(struct vm_area_struct *);
>  void vm_area_free(struct vm_area_struct *);
> -/* Use only if VMA has no other users */
> +/* Use only if VMA has no other users and might still be attached to a tree */
>  void __vm_area_free(struct vm_area_struct *vma);
>
>  #ifndef CONFIG_MMU
> @@ -751,7 +751,7 @@ static inline void vma_init(struct vm_area_struct *vma, struct mm_struct *mm)
>  	vma->vm_mm = mm;
>  	vma->vm_ops = &dummy_vm_ops;
>  	INIT_LIST_HEAD(&vma->anon_vma_chain);
> -	vma_mark_detached(vma, false);
> +	vma->detached = true;
>  	vma_numab_state_init(vma);
>  }
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 41c964104b58..000fc429345c 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -540,6 +540,7 @@ static void vm_area_free_rcu_cb(struct rcu_head *head)
>
>  	/* The vma should not be locked while being destroyed. */
>  	VM_BUG_ON_VMA(rwsem_is_locked(&vma->vm_lock->lock), vma);
> +	WARN_ON_ONCE(!vma->detached);
>  	__vm_area_free(vma);
>  }
>  #endif
> @@ -549,6 +550,7 @@ void vm_area_free(struct vm_area_struct *vma)
>  #ifdef CONFIG_PER_VMA_LOCK
>  	call_rcu(&vma->vm_rcu, vm_area_free_rcu_cb);
>  #else
> +	WARN_ON_ONCE(!vma->detached);
>  	__vm_area_free(vma);
>  #endif
>  }
> diff --git a/mm/internal.h b/mm/internal.h
> index 68410c6d97ac..728189e6c703 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -1068,6 +1068,7 @@ static inline void vma_iter_store(struct vma_iterator *vmi,
>  	vmi->mas.index = vma->vm_start;
>  	vmi->mas.last = vma->vm_end - 1;
>  	mas_store_prealloc(&vmi->mas, vma);
> +	vma_mark_detached(vma, false);
>  }
>
>  static inline int vma_iter_store_gfp(struct vma_iterator *vmi,
> --
> 2.41.0.162.gfafddb0af9-goog
>
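The invariant enforced in the reply above — new objects start life detached, insertion into the tree attaches them, and freeing an attached object is a bug — can be sketched in plain userspace C. The names (node_t, node_alloc, node_free_checked) are hypothetical, and returning false stands in for the kernel's WARN_ON_ONCE:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct {
	bool detached; /* analogue of vma->detached */
} node_t;

/* New nodes start detached, mirroring the change to vma_init(). */
static node_t *node_alloc(void)
{
	node_t *n = malloc(sizeof(*n));
	if (n)
		n->detached = true;
	return n;
}

/* Insertion into the tree clears the flag (cf. vma_iter_store()). */
static void node_attach(node_t *n)
{
	n->detached = false;
}

static void node_detach(node_t *n)
{
	n->detached = true;
}

/*
 * Refuses to free a node that is still attached, returning false where
 * the kernel would WARN_ON_ONCE; frees and returns true otherwise.
 */
static bool node_free_checked(node_t *n)
{
	if (!n->detached)
		return false; /* caller forgot to detach first */
	free(n);
	return true;
}
```

The point of the pattern is that the lifetime rule becomes checkable at the single free path instead of being an unverified convention at every call site.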
On Tue, Jun 20, 2023 at 11:57 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> Current implementation of vma_start_read() checks VMA for being locked
> before taking vma->vm_lock and then checks that again. This mechanism
> fails to detect a case when the VMA gets write-locked, modified and
> unlocked after the first check but before the vma->vm_lock is obtained.
> While this is not strictly a problem (vma_start_read would not produce
> a false unlocked result), this allows it to successfully lock a VMA which
> got detached from the VMA tree while vma_start_read was locking it.
> New condition checks for any change in vma->vm_lock_seq after we obtain
> vma->vm_lock and will cause vma_start_read() to fail if the above race
> occurs.

Just a friendly ping for feedback. I know most people were busy fixing
the stack expansion problem and these patches are not urgent, so no
rush. If nobody reviews them, I'll ping again next week.

> Fixes: 5e31275cc997 ("mm: add per-VMA lock and helper functions to control it")
> Suggested-by: Matthew Wilcox <willy@infradead.org>
> Signed-off-by: Suren Baghdasaryan <surenb@google.com>
> ---
>  include/linux/mm.h | 19 ++++++++++---------
>  1 file changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 27ce77080c79..8410da79c570 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -639,23 +639,24 @@ static inline void vma_numab_state_free(struct vm_area_struct *vma) {}
>   */
>  static inline bool vma_start_read(struct vm_area_struct *vma)
>  {
> -	/* Check before locking. A race might cause false locked result. */
> -	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
> +	int vm_lock_seq = READ_ONCE(vma->vm_lock_seq);
> +
> +	/*
> +	 * Check if VMA is locked before taking vma->vm_lock. A race or
> +	 * mm_lock_seq overflow might cause false locked result.
> +	 */
> +	if (vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
>  		return false;
>
>  	if (unlikely(down_read_trylock(&vma->vm_lock->lock) == 0))
>  		return false;
>
> -	/*
> -	 * Overflow might produce false locked result.
> -	 * False unlocked result is impossible because we modify and check
> -	 * vma->vm_lock_seq under vma->vm_lock protection and mm->mm_lock_seq
> -	 * modification invalidates all existing locks.
> -	 */
> -	if (unlikely(vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
> +	/* Fail if VMA was write-locked after we checked it earlier */
> +	if (unlikely(vm_lock_seq != READ_ONCE(vma->vm_lock_seq))) {
>  		up_read(&vma->vm_lock->lock);
>  		return false;
>  	}
> +
>  	return true;
>  }
>
> --
> 2.41.0.162.gfafddb0af9-goog
>
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 27ce77080c79..8410da79c570 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -639,23 +639,24 @@ static inline void vma_numab_state_free(struct vm_area_struct *vma) {}
  */
 static inline bool vma_start_read(struct vm_area_struct *vma)
 {
-	/* Check before locking. A race might cause false locked result. */
-	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
+	int vm_lock_seq = READ_ONCE(vma->vm_lock_seq);
+
+	/*
+	 * Check if VMA is locked before taking vma->vm_lock. A race or
+	 * mm_lock_seq overflow might cause false locked result.
+	 */
+	if (vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))
 		return false;
 
 	if (unlikely(down_read_trylock(&vma->vm_lock->lock) == 0))
 		return false;
 
-	/*
-	 * Overflow might produce false locked result.
-	 * False unlocked result is impossible because we modify and check
-	 * vma->vm_lock_seq under vma->vm_lock protection and mm->mm_lock_seq
-	 * modification invalidates all existing locks.
-	 */
-	if (unlikely(vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq))) {
+	/* Fail if VMA was write-locked after we checked it earlier */
+	if (unlikely(vm_lock_seq != READ_ONCE(vma->vm_lock_seq))) {
 		up_read(&vma->vm_lock->lock);
 		return false;
 	}
+
 	return true;
 }
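To see why the recheck had to change from comparing against mm_lock_seq to comparing against a snapshot of vma->vm_lock_seq, the racy interleaving from the commit message can be replayed deterministically. This is a single-threaded simulation with invented helper names, not kernel code:

```c
#include <stdbool.h>

static int mm_lock_seq; /* stand-in for mm->mm_lock_seq */
static int vm_lock_seq; /* stand-in for vma->vm_lock_seq */

/* Writer side: write-lock the VMA, modify/detach it, then unlock.
 * Write-locking sets vm_lock_seq equal to mm_lock_seq; the mmap write
 * unlock bumps mm_lock_seq, releasing all VMA write locks at once. */
static void write_lock_cycle(void)
{
	vm_lock_seq = mm_lock_seq;
	/* ... VMA modified and possibly detached here ... */
	mm_lock_seq++;
}

/* Old recheck: "is the VMA write-locked right now?" A complete write
 * cycle between the reader's two checks leaves the sequences unequal
 * again, so the race goes unnoticed. */
static bool old_recheck_catches_race(void)
{
	mm_lock_seq = 1;
	vm_lock_seq = 0;
	/* reader's first check passes (0 != 1), then the race hits */
	write_lock_cycle();
	return vm_lock_seq == mm_lock_seq; /* 1 == 2, so false */
}

/* New recheck: "did vm_lock_seq change since our first check?" The
 * snapshot taken before the race makes the write cycle visible. */
static bool new_recheck_catches_race(void)
{
	mm_lock_seq = 1;
	vm_lock_seq = 0;
	int snapshot = vm_lock_seq; /* reader's first check */
	write_lock_cycle();
	return snapshot != vm_lock_seq; /* 0 != 1, so true */
}
```

Run in this order, the simulation shows the old condition passing through the race window while the snapshot comparison in the patched vma_start_read() rejects it.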