Message ID | 20230520000049.2226926-27-dhowells@redhat.com |
---|---|
State | New |
Headers |
From: David Howells <dhowells@redhat.com>
To: Jens Axboe <axboe@kernel.dk>, Al Viro <viro@zeniv.linux.org.uk>, Christoph Hellwig <hch@infradead.org>
Cc: David Howells <dhowells@redhat.com>, Matthew Wilcox <willy@infradead.org>, Jan Kara <jack@suse.cz>, Jeff Layton <jlayton@kernel.org>, David Hildenbrand <david@redhat.com>, Jason Gunthorpe <jgg@nvidia.com>, Logan Gunthorpe <logang@deltatee.com>, Hillf Danton <hdanton@sina.com>, Christian Brauner <brauner@kernel.org>, Linus Torvalds <torvalds@linux-foundation.org>, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Christoph Hellwig <hch@lst.de>, Steven Rostedt <rostedt@goodmis.org>, Masami Hiramatsu <mhiramat@kernel.org>, linux-trace-kernel@vger.kernel.org
Subject: [PATCH v21 26/30] splice: Convert trace/seq to use copy_splice_read()
Date: Sat, 20 May 2023 01:00:45 +0100
Message-Id: <20230520000049.2226926-27-dhowells@redhat.com>
In-Reply-To: <20230520000049.2226926-1-dhowells@redhat.com>
References: <20230520000049.2226926-1-dhowells@redhat.com>
List-ID: <linux-kernel.vger.kernel.org> |
Series | splice: Kill ITER_PIPE |
Commit Message
David Howells
May 20, 2023, midnight UTC
For the splice from the trace seq buffer, just use copy_splice_read().
In the future, something better can probably be done by gifting pages from
seq->buf into the pipe, but that would require changing seq->buf into a
vmap over an array of pages.
Signed-off-by: David Howells <dhowells@redhat.com>
cc: Christoph Hellwig <hch@lst.de>
cc: Al Viro <viro@zeniv.linux.org.uk>
cc: Jens Axboe <axboe@kernel.dk>
cc: Steven Rostedt <rostedt@goodmis.org>
cc: Masami Hiramatsu <mhiramat@kernel.org>
cc: linux-kernel@vger.kernel.org
cc: linux-trace-kernel@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-block@vger.kernel.org
cc: linux-mm@kvack.org
---
kernel/trace/trace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
s/splice/trace/ ?
Hi David,

On Sat, 20 May 2023 01:00:45 +0100
David Howells <dhowells@redhat.com> wrote:

> For the splice from the trace seq buffer, just use copy_splice_read().

So this is because you will remove generic_file_splice_read() (since
it's buggy), right?

>
> In the future, something better can probably be done by gifting pages from
> seq->buf into the pipe, but that would require changing seq->buf into a
> vmap over an array of pages.

So what we need is to introduce a vmap?
We introduced splice support for avoiding copy ringbuffer pages, but
this drops it. Thus this will drop performance of splice on ring buffer
(trace file). If it is correct, can you also add a note about that?

Thank you,

>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Steven Rostedt <rostedt@goodmis.org>
> cc: Masami Hiramatsu <mhiramat@kernel.org>
> cc: linux-kernel@vger.kernel.org
> cc: linux-trace-kernel@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org
> ---
>  kernel/trace/trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index ebc59781456a..c210d02fac97 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -5171,7 +5171,7 @@ static const struct file_operations tracing_fops = {
>  	.open		= tracing_open,
>  	.read		= seq_read,
>  	.read_iter	= seq_read_iter,
> -	.splice_read	= generic_file_splice_read,
> +	.splice_read	= copy_splice_read,
>  	.write		= tracing_write_stub,
>  	.llseek		= tracing_lseek,
>  	.release	= tracing_release,
>
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> David Howells <dhowells@redhat.com> wrote:
>
> > For the splice from the trace seq buffer, just use copy_splice_read().
>
> So this is because you will remove generic_file_splice_read() (since
> it's buggy), right?

An ITER_PIPE iterator has a problem if it gets reverted with other changes I
want to make. The problem is that it may not be valid to control the lifetime
of the data in the buffer with get_page(). The pages may need a pin taking
(FOLL_PIN) or the lifetime might be controlled with kfree() or rmmod.

> > In the future, something better can probably be done by gifting pages from
> > seq->buf into the pipe, but that would require changing seq->buf into a
> > vmap over an array of pages.
>
> ... We introduced splice support for avoiding copy ringbuffer pages, but
> this drops it. Thus this will drop performance of splice on ring buffer
> (trace file). If it is correct, can you also add a note about that?

Actually, no. There is no special splice support for tracing_fops. You
currently use generic_file_splice_read(), which wends its way down into
seq_read_iter. However, the seqfile stuff uses kvmalloc() to allocate the
buffer, so you are not allowed to splice page refs from kmalloc'd or vmalloc'd
memory into a pipe, so it doesn't. It calls copy_to_iter() which will cause
ITER_PIPE to allocate bufferage on an as-needed basis.

copy_splice_read() instead creates an ITER_BVEC and populates it up front
using the bulk allocator, so if you're splicing a lot of data, this ought to
be marginally faster.

> So what we need is to introduce a vmap?

We could implement seq_splice_read(). What we would need to do is to change
how the buffer is allocated: bulk allocate a bunch of arbitrary pages which we
then vmap(). When we need to splice, we read into the buffer, do a vunmap()
and then splice the pages holding the data we used into the pipe.

If we don't manage to splice all the data, we can continue splicing from the
pages we have left next time. If a read() comes along to view partially
spliced data, we would need to copy from the individual pages.

When we use up all the data, we discard all the pages we might have spliced
from and shuffle down the other pages, call the bulk allocator to replenish
the buffer and then vmap() it again.

Any pages we've spliced from must be discarded and replaced and not rewritten.

If a read() comes without the buffer having been spliced from, it can do as it
does now.

David
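The page-gifting scheme described in the reply above can be modelled in userspace as a rough sketch. This is not kernel code and not a proposed implementation: `struct page_buf`, `struct seg`, `page_buf_replenish()`, `page_buf_write()`, `page_buf_splice()`, `PAGE_SZ` and `NR_PAGES` are all hypothetical names, `calloc()` stands in for the bulk page allocator, and the pointer array stands in for the vmap()'d page array. It only illustrates the ownership rule: a page that has been spliced out is never written again, its slot is vacated, the remaining pages shuffle down, and the buffer is replenished with fresh pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ  4096u  /* assumed page size for the model */
#define NR_PAGES 4u     /* assumed buffer size, in pages */

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct seg {
	unsigned char *page;  /* page handed over to the "pipe" */
	size_t len;           /* bytes of valid data in that page */
};

struct page_buf {
	unsigned char *pages[NR_PAGES]; /* stand-in for the vmap()'d page array */
	size_t used;                    /* bytes of unread data, from pages[0] */
};

/* Stand-in for the bulk allocator: give every empty slot a fresh page. */
static int page_buf_replenish(struct page_buf *b)
{
	for (size_t i = 0; i < NR_PAGES; i++) {
		if (!b->pages[i]) {
			b->pages[i] = calloc(1, PAGE_SZ);
			if (!b->pages[i])
				return -1;
		}
	}
	return 0;
}

/* Append data across the page array (all slots must be populated). */
static size_t page_buf_write(struct page_buf *b, const void *data, size_t len)
{
	const unsigned char *src = data;
	size_t space = (size_t)NR_PAGES * PAGE_SZ - b->used;
	size_t done = 0;

	if (len > space)
		len = space;
	while (done < len) {
		size_t pos = b->used + done;
		size_t off = pos % PAGE_SZ;
		size_t n = PAGE_SZ - off;

		if (n > len - done)
			n = len - done;
		memcpy(b->pages[pos / PAGE_SZ] + off, src + done, n);
		done += n;
	}
	b->used += len;
	return len;
}

/*
 * Hand up to max pages holding data over to out[]. Ownership moves with
 * the page: the slot is vacated rather than reused, the remaining pages
 * are shuffled down, and the caller replenishes afterwards.
 */
static size_t page_buf_splice(struct page_buf *b, struct seg *out, size_t max)
{
	size_t npages = (b->used + PAGE_SZ - 1) / PAGE_SZ;

	if (npages > max)
		npages = max;
	for (size_t i = 0; i < npages; i++) {
		size_t len = b->used < PAGE_SZ ? b->used : PAGE_SZ;

		out[i].page = b->pages[i];
		out[i].len = len;
		b->used -= len;
	}
	memmove(b->pages, b->pages + npages,
		(NR_PAGES - npages) * sizeof(b->pages[0]));
	for (size_t i = NR_PAGES - npages; i < NR_PAGES; i++)
		b->pages[i] = NULL;
	return npages;
}
```

A caller would replenish, write, splice, then replenish again before the next write. The model leaves out the read() path for partially spliced data and the vmap()/vunmap() steps themselves, both of which a real seq_splice_read() would have to handle.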
On Sat, 20 May 2023 01:00:45 +0100
David Howells <dhowells@redhat.com> wrote:

> For the splice from the trace seq buffer, just use copy_splice_read().
>
> In the future, something better can probably be done by gifting pages from
> seq->buf into the pipe, but that would require changing seq->buf into a
> vmap over an array of pages.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Christoph Hellwig <hch@lst.de>
> cc: Al Viro <viro@zeniv.linux.org.uk>
> cc: Jens Axboe <axboe@kernel.dk>
> cc: Steven Rostedt <rostedt@goodmis.org>
> cc: Masami Hiramatsu <mhiramat@kernel.org>
> cc: linux-kernel@vger.kernel.org
> cc: linux-trace-kernel@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org
> ---
>  kernel/trace/trace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index ebc59781456a..c210d02fac97 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -5171,7 +5171,7 @@ static const struct file_operations tracing_fops = {
>  	.open		= tracing_open,
>  	.read		= seq_read,
>  	.read_iter	= seq_read_iter,
> -	.splice_read	= generic_file_splice_read,
> +	.splice_read	= copy_splice_read,

Anyway, for this change:

Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve

>  	.write		= tracing_write_stub,
>  	.llseek		= tracing_lseek,
>  	.release	= tracing_release,
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index ebc59781456a..c210d02fac97 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -5171,7 +5171,7 @@ static const struct file_operations tracing_fops = {
 	.open		= tracing_open,
 	.read		= seq_read,
 	.read_iter	= seq_read_iter,
-	.splice_read	= generic_file_splice_read,
+	.splice_read	= copy_splice_read,
 	.write		= tracing_write_stub,
 	.llseek		= tracing_lseek,
 	.release	= tracing_release,
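
For context, `.splice_read` is the file operation that services splice(2) when the file is the input side and a pipe is the output side. The userspace sketch below shows that call path from the caller's point of view. It is illustrative only: `splice_to_pipe()` is a hypothetical helper name, and the descriptor passed in is assumed to be any readable file (the patch concerns files using tracing_fops, but nothing in the helper is specific to tracing).

```c
#define _GNU_SOURCE          /* splice() is a Linux-specific extension */
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: move up to len bytes from in_fd into the write
 * end of a pipe using splice(2). In the kernel this lands in the file's
 * .splice_read operation. Returns bytes moved, 0 at end of file, or -1
 * with errno set.
 */
static ssize_t splice_to_pipe(int in_fd, int pipe_wr, size_t len)
{
	ssize_t n;

	do {
		/* NULL offsets: use and advance the file's own position. */
		n = splice(in_fd, NULL, pipe_wr, NULL, len, 0);
	} while (n < 0 && errno == EINTR);

	return n;
}
```

From userspace the call looks the same before and after this patch; only the kernel-side routine that fills the pipe changes.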