Message ID | 20230914163019.4050530-2-ben.wolsieffer@hefring.com |
---|---|
State | New |
Headers |
From: Ben Wolsieffer <ben.wolsieffer@hefring.com>
To: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: Greg Ungerer <gerg@uclinux.org>, Andrew Morton <akpm@linux-foundation.org>, Oleg Nesterov <oleg@redhat.com>, Giulio Benetti <giulio.benetti@benettiengineering.com>, Ben Wolsieffer <Ben.Wolsieffer@hefring.com>, Ben Wolsieffer <ben.wolsieffer@hefring.com>
Subject: [PATCH] proc: nommu: /proc/<pid>/maps: release mmap read lock
Date: Thu, 14 Sep 2023 12:30:20 -0400
Message-ID: <20230914163019.4050530-2-ben.wolsieffer@hefring.com>
X-Mailer: git-send-email 2.42.0
List-ID: <linux-kernel.vger.kernel.org> |
Series |
proc: nommu: /proc/<pid>/maps: release mmap read lock
|
|
Commit Message
Ben Wolsieffer
Sept. 14, 2023, 4:30 p.m. UTC
From: Ben Wolsieffer <Ben.Wolsieffer@hefring.com>

The no-MMU implementation of /proc/<pid>/maps doesn't normally release
the mmap read lock, because it uses !IS_ERR_OR_NULL(_vml) to determine
whether to release the lock. Since _vml is NULL when the end of the
mappings is reached, the lock is not released.

This code was incorrectly adapted from the MMU implementation, which
at the time released the lock in m_next() before returning the last entry.

The MMU implementation has diverged further from the no-MMU version
since then, so this patch brings their locking and error handling into
sync, fixing the bug and hopefully avoiding similar issues in the
future.

Fixes: 47fecca15c09 ("fs/proc/task_nommu.c: don't use priv->task->mm")
Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
---
 fs/proc/task_nommu.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)
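To illustrate the failure mode described in the commit message, here is a minimal userspace sketch of a start/next/stop iteration. It is not kernel code: every `toy_*` name, the `read_holds` counter, and the `started` flag (standing in for `priv->task` being non-NULL) are assumptions made up for this example. The only behavior taken from the thread is that the stop callback receives NULL once the end of the mappings is reached, and that the old check skips the unlock in that case.

```c
#include <stddef.h>

/* Hypothetical stand-in for the mmap read lock: counts outstanding holds. */
struct toy_mm {
	int read_holds;
};

/* Hypothetical iterator state, loosely analogous to proc_maps_private. */
struct toy_priv {
	struct toy_mm *mm;
	int started;	/* plays the role of priv->task being non-NULL */
};

static int toy_items[] = { 10, 20, 30 };
#define TOY_NITEMS (sizeof(toy_items) / sizeof(toy_items[0]))

/* Take the "lock" and return the first entry, or NULL past the end. */
static void *toy_start(struct toy_priv *p, size_t *pos)
{
	p->mm->read_holds++;
	p->started = 1;
	return *pos < TOY_NITEMS ? &toy_items[*pos] : NULL;
}

/* Advance the cursor; NULL signals the end of the entries. */
static void *toy_next(size_t *pos)
{
	++*pos;
	return *pos < TOY_NITEMS ? &toy_items[*pos] : NULL;
}

/* Old scheme: unlock only when the cursor is a valid pointer. */
static void toy_stop_buggy(struct toy_priv *p, void *v)
{
	if (v != NULL)
		p->mm->read_holds--;
	p->started = 0;
}

/* Patched scheme: unlock whenever start() actually took the lock. */
static void toy_stop_fixed(struct toy_priv *p, void *v)
{
	(void)v;	/* the cursor value no longer matters */
	if (!p->started)
		return;
	p->mm->read_holds--;
	p->started = 0;
}

/* Walk every entry, call stop with the final cursor, return leaked holds. */
static int toy_walk(void (*stop)(struct toy_priv *, void *))
{
	struct toy_mm mm = { 0 };
	struct toy_priv p = { &mm, 0 };
	size_t pos = 0;
	void *v;

	for (v = toy_start(&p, &pos); v != NULL; v = toy_next(&pos))
		;	/* a real show callback would print the entry here */
	stop(&p, v);	/* v is NULL: the iteration ran off the end */
	return mm.read_holds;
}
```

In this model, `toy_walk(toy_stop_buggy)` leaves one hold outstanding after a full read, while `toy_walk(toy_stop_fixed)` leaves none, because the fixed variant keys the unlock on whether the lock was taken rather than on the cursor value.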
Comments
On Thu, 14 Sep 2023 12:30:20 -0400 Ben Wolsieffer <ben.wolsieffer@hefring.com> wrote:

> The no-MMU implementation of /proc/<pid>/maps doesn't normally release
> the mmap read lock, because it uses !IS_ERR_OR_NULL(_vml) to determine
> whether to release the lock. Since _vml is NULL when the end of the
> mappings is reached, the lock is not released.
>
> This code was incorrectly adapted from the MMU implementation, which
> at the time released the lock in m_next() before returning the last entry.
>
> The MMU implementation has diverged further from the no-MMU version
> since then, so this patch brings their locking and error handling into
> sync, fixing the bug and hopefully avoiding similar issues in the
> future.

Thanks. Is this bug demonstrable from userspace? If so, how?
On Thu, Sep 14, 2023 at 10:02:03AM -0700, Andrew Morton wrote:
> On Thu, 14 Sep 2023 12:30:20 -0400 Ben Wolsieffer <ben.wolsieffer@hefring.com> wrote:
>
> > The no-MMU implementation of /proc/<pid>/maps doesn't normally release
> > the mmap read lock, because it uses !IS_ERR_OR_NULL(_vml) to determine
> > whether to release the lock. Since _vml is NULL when the end of the
> > mappings is reached, the lock is not released.
> >
>
> Thanks. Is this bug demonstrable from userspace? If so, how?

Yes, run "cat /proc/1/maps" twice. You should observe that the second
run hangs.
On 09/14, Ben Wolsieffer wrote:
>
> Fixes: 47fecca15c09 ("fs/proc/task_nommu.c: don't use priv->task->mm")
> Signed-off-by: Ben Wolsieffer <ben.wolsieffer@hefring.com>
> ---
>  fs/proc/task_nommu.c | 27 +++++++++++++++------------
>  1 file changed, 15 insertions(+), 12 deletions(-)

Acked-by: Oleg Nesterov <oleg@redhat.com>

-------------------------------------------------------------------------------
Sorry for the off-topic question. I know NOTHING about nommu, and when I tried to
review this patch I was puzzled by

	/* See m_next(). Zero at the start or after lseek. */
	if (addr == -1UL)
		return NULL;

at the start of m_start(). OK, let's look at

	static void *m_next(struct seq_file *m, void *_p, loff_t *pos)
	{
		struct vm_area_struct *vma = _p;

		*pos = vma->vm_end;
		return find_vma(vma->vm_mm, vma->vm_end);
	}

Where does this -1UL come from? Does this mean that on nommu

	last_vma->vm_end == -1UL

or what?

fs/proc/task_mmu.c has the same check at the start, but in this case
the "See m_next()" comment actually helps.

Just curious, thanks.

Oleg.
On Thu, Sep 14, 2023 at 01:30:08PM -0400, Ben Wolsieffer wrote:
> On Thu, Sep 14, 2023 at 10:02:03AM -0700, Andrew Morton wrote:
> > On Thu, 14 Sep 2023 12:30:20 -0400 Ben Wolsieffer <ben.wolsieffer@hefring.com> wrote:
> >
> > > The no-MMU implementation of /proc/<pid>/maps doesn't normally release
> > > the mmap read lock, because it uses !IS_ERR_OR_NULL(_vml) to determine
> > > whether to release the lock. Since _vml is NULL when the end of the
> > > mappings is reached, the lock is not released.
> > >
> >
> > Thanks. Is this bug demonstrable from userspace? If so, how?
>
> Yes, run "cat /proc/1/maps" twice. You should observe that the
> second run hangs.

Hi Andrew,

I apologize: I realized that I provided an incorrect reproducer for this
bug. I responded from what I remembered of it (I originally wrote the
patch over a year ago) and did not test it.

Reading /proc/1/maps twice doesn't reproduce the bug because it only
takes the read lock, which can be taken multiple times and therefore
doesn't show any problem if the lock isn't released. Instead, you need
to perform some operation that attempts to take the write lock after
reading /proc/<pid>/maps. To actually reproduce the bug, compile the
following code as 'proc_maps_bug':

	#include <stdio.h>
	#include <unistd.h>
	#include <sys/mman.h>

	int main(int argc, char *argv[])
	{
		void *buf;

		/* Give the shell time to read /proc/<pid>/maps first. */
		sleep(1);
		/* mmap() needs the mmap write lock, so it blocks if the
		 * read lock taken by the maps reader was never released. */
		buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		puts("mmap returned");
		return 0;
	}

Then, run:

	./proc_maps_bug & cat /proc/$!/maps; fg

Without this patch, mmap() will hang and the command will never
complete.

Additionally, it turns out you cannot reproduce this bug on recent
kernels because 0c563f148043 ("proc: remove VMA rbtree use from nommu")
introduces a second bug that completely breaks /proc/<pid>/maps and
prevents the locking bug from being triggered. I will have a second
patch for that soon.

Thanks, Ben
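The distinction drawn above (repeated readers do not notice a leaked read lock, but a writer does) can be sketched with a toy reader/writer lock. This is a simplified userspace model under stated assumptions, not the kernel's rwsem: the `toy_rwsem` type and its functions are hypothetical, and the trylock-style helpers report failure where the real lock would block.

```c
#include <stdbool.h>

/* Hypothetical reader/writer lock: many readers or one writer. */
struct toy_rwsem {
	int readers;
	bool writer;
};

/* Readers only conflict with a writer, so several may hold it at once. */
static bool toy_read_trylock(struct toy_rwsem *sem)
{
	if (sem->writer)
		return false;
	sem->readers++;
	return true;
}

static void toy_read_unlock(struct toy_rwsem *sem)
{
	sem->readers--;
}

/* A writer needs exclusive access: no readers and no other writer. */
static bool toy_write_trylock(struct toy_rwsem *sem)
{
	if (sem->writer || sem->readers > 0)
		return false;
	sem->writer = true;
	return true;
}

/*
 * Model of the first, incorrect reproducer: two reads in a row, each
 * leaking its read hold. Returns true if the second read still succeeds.
 */
static bool toy_second_read_succeeds(struct toy_rwsem *sem)
{
	bool first = toy_read_trylock(sem);	/* leaked: never unlocked */
	bool second = toy_read_trylock(sem);	/* also leaked */

	return first && second;
}

/*
 * Model of the corrected reproducer: one read whose hold is either
 * leaked or released, followed by an operation that needs the write lock.
 */
static bool toy_write_after_read(struct toy_rwsem *sem, bool release)
{
	if (!toy_read_trylock(sem))
		return false;
	if (release)
		toy_read_unlock(sem);
	return toy_write_trylock(sem);
}
```

In this model, `toy_second_read_succeeds()` returns true even though both holds leak, which matches the observation that running `cat` twice shows no problem. `toy_write_after_read(sem, false)` fails on a fresh lock, corresponding to the hanging `mmap()`, while `toy_write_after_read(sem, true)` succeeds.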
On Fri, Sep 15, 2023 at 02:15:15PM +0200, Oleg Nesterov wrote:
> Sorry for the offtopic question. I know NOTHING about nommu and when I tried to
> review this patch I was puzzled by
>
> 	/* See m_next(). Zero at the start or after lseek. */
> 	if (addr == -1UL)
> 		return NULL;
>
> at the start of m_start(). OK, lets look at
>
> 	static void *m_next(struct seq_file *m, void *_p, loff_t *pos)
> 	{
> 		struct vm_area_struct *vma = _p;
>
> 		*pos = vma->vm_end;
> 		return find_vma(vma->vm_mm, vma->vm_end);
> 	}
>
> where does this -1UL come from? Does this mean that on nommu
>
> 	last_vma->vm_end == -1UL
>
> or what?
>
> fs/proc/task_mmu.c has the same check at the start, but in this case
> the "See m_next()" comment actually helps.

Yes, this is another copying mistake from the MMU implementation. In
fact, it turns out that no-MMU /proc/<pid>/maps is completely broken
after 0c563f148043 ("proc: remove VMA rbtree use from nommu"). It just
returns an empty file.

This happens because find_vma() doesn't do what we want here. It
"look[s] up the first VMA in which addr resides, NULL if none", and the
address will be zero in m_start(), which makes find_vma() return NULL
(unless presumably the zero address is actually part of the process's
address space).

I didn't run into this because I developed my patch against an older
kernel, and didn't test the latest version until today. I'm preparing a
second patch to fix this bug.

>
> Just curious, thanks.
>
> Oleg.
>

Thanks, Ben
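The empty-file behavior described above can be sketched in userspace. This is only a model of the quoted "first VMA in which addr resides" semantics, not the kernel's find_vma(): the `toy_vma` type, the example address ranges, and both functions are hypothetical names invented for this illustration.

```c
#include <stddef.h>

/* Hypothetical mapping record covering [vm_start, vm_end). */
struct toy_vma {
	unsigned long vm_start;
	unsigned long vm_end;
};

/* Made-up address ranges; note that none of them covers address zero. */
static struct toy_vma toy_vmas[] = {
	{ 0x20000000UL, 0x20004000UL },
	{ 0x20010000UL, 0x20018000UL },
};
#define TOY_NVMAS (sizeof(toy_vmas) / sizeof(toy_vmas[0]))

/* Return the mapping in which addr resides, or NULL if there is none. */
static struct toy_vma *toy_find_vma_containing(unsigned long addr)
{
	size_t i;

	for (i = 0; i < TOY_NVMAS; i++) {
		if (addr >= toy_vmas[i].vm_start && addr < toy_vmas[i].vm_end)
			return &toy_vmas[i];
	}
	return NULL;
}

/*
 * Count the entries an iterator would emit if it began with a
 * containing-address lookup at start_addr and continued from each
 * mapping's end address.
 */
static size_t toy_count_entries(unsigned long start_addr)
{
	size_t n = 0;
	struct toy_vma *vma = toy_find_vma_containing(start_addr);

	while (vma) {
		n++;
		vma = toy_find_vma_containing(vma->vm_end);
	}
	return n;
}
```

With these assumed ranges, `toy_find_vma_containing(0)` returns NULL, so `toy_count_entries(0)` is zero: an iteration that starts its lookup at address zero emits nothing, even though mappings exist.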
diff --git a/fs/proc/task_nommu.c b/fs/proc/task_nommu.c
index 2c8b62265981..061bd3f82756 100644
--- a/fs/proc/task_nommu.c
+++ b/fs/proc/task_nommu.c
@@ -205,11 +205,16 @@ static void *m_start(struct seq_file *m, loff_t *pos)
 		return ERR_PTR(-ESRCH);
 
 	mm = priv->mm;
-	if (!mm || !mmget_not_zero(mm))
+	if (!mm || !mmget_not_zero(mm)) {
+		put_task_struct(priv->task);
+		priv->task = NULL;
 		return NULL;
+	}
 
 	if (mmap_read_lock_killable(mm)) {
 		mmput(mm);
+		put_task_struct(priv->task);
+		priv->task = NULL;
 		return ERR_PTR(-EINTR);
 	}
 
@@ -218,23 +223,21 @@ static void *m_start(struct seq_file *m, loff_t *pos)
 	if (vma)
 		return vma;
 
-	mmap_read_unlock(mm);
-	mmput(mm);
 	return NULL;
 }
 
-static void m_stop(struct seq_file *m, void *_vml)
+static void m_stop(struct seq_file *m, void *v)
 {
 	struct proc_maps_private *priv = m->private;
+	struct mm_struct *mm = priv->mm;
 
-	if (!IS_ERR_OR_NULL(_vml)) {
-		mmap_read_unlock(priv->mm);
-		mmput(priv->mm);
-	}
-	if (priv->task) {
-		put_task_struct(priv->task);
-		priv->task = NULL;
-	}
+	if (!priv->task)
+		return;
+
+	mmap_read_unlock(mm);
+	mmput(mm);
+	put_task_struct(priv->task);
+	priv->task = NULL;
 }
 
 static void *m_next(struct seq_file *m, void *_p, loff_t *pos)