From patchwork Thu Oct 6 19:02:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?q?Charles-Fran=C3=A7ois_Natali?= X-Patchwork-Id: 1792 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4ac7:0:0:0:0:0 with SMTP id y7csp462891wrs; Thu, 6 Oct 2022 12:04:25 -0700 (PDT) X-Google-Smtp-Source: AMsMyM7ZhbFTUvq3TsbIm9PAORL1o0q32SnoAXxwX4LgQImM76QOvmElH66l331bBhngiboRjuzU X-Received: by 2002:a05:6402:d43:b0:459:b4a:18b5 with SMTP id ec3-20020a0564020d4300b004590b4a18b5mr1207080edb.171.1665083065188; Thu, 06 Oct 2022 12:04:25 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1665083065; cv=none; d=google.com; s=arc-20160816; b=OLGqvE4P76b8SeTqGOvhtMdlGYS6OMcS/8I2CKciR+6sgTMp/UM97yItSLIaCHr4P7 am05p7qryO7BVizwov6CnzhmwXQusc7/iZb40OGZjckWzWjHDwvvt6EjCj65wULirAX6 9uHv0a+oN4U3JpAu+f4wxVtunBBXgYwd7l/4N4y3J6GEiDAEL/VXKghBSmOH6RFY9BUR LBx/bujepFpg6JY16+WpdXrStbN+73q3gw5+3OeUSdplzWzge4pb3RlU6lPvJWljd37/ FgBfFebIJ+/8KX+Wh6bAZ1DqaAx7Qip1t3ZUPXkovi5DfSXmk3K0pkumG8Iz9dHN/f7W o1vw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:reply-to:from:list-subscribe:list-help :list-post:list-archive:list-unsubscribe:list-id:precedence :content-transfer-encoding:mime-version:message-id:date:subject:to :dmarc-filter:delivered-to:dkim-signature:dkim-filter; bh=eiB90ZpBa7H1dEGfY3MVigpQKkCUFi1kVYkZ3kdAwNs=; b=hPPvMZva8bUhZw6+5ABOfmCNcohDbAmbTCe0ybxZ6m/WRIpcfJjRdNMUOY14oOYiPf m7C6pEyo06DLBEm4DxvlkCWl9gmWtSV6coXSr6ByggEnc9Mwp5FAtAURzugC0+IP1U6W 7l88997fgC4k2nrfpwI52mnLnkvBjhhmu4MlPGAJsnCB8Ho7qKlQrIxPvU6SfOinJXnp tr4a+p/YyLh62aQfDE8HVHiVWNUmVA5cKxl9jR4vnue/sgbC9m7bAKDXhVKLVCbWzdeG Y0RZTzPlg86vyG+EK9Zw3WdnzC7O33F2RD0E6ErtGB9+IzeQLGk7VfHkjJ92RIPUrwFh MKhA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=kBtClgGV; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id c1-20020a170906170100b00783286f155fsi170972eje.311.2022.10.06.12.04.25 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Oct 2022 12:04:25 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; dkim=pass header.i=@gcc.gnu.org header.s=default header.b=kBtClgGV; spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=gnu.org Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 9268C38A8146 for ; Thu, 6 Oct 2022 19:04:06 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 9268C38A8146 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1665083046; bh=eiB90ZpBa7H1dEGfY3MVigpQKkCUFi1kVYkZ3kdAwNs=; h=To:Subject:Date:List-Id:List-Unsubscribe:List-Archive:List-Post: List-Help:List-Subscribe:From:Reply-To:Cc:From; b=kBtClgGVPS5xBXflTLSZq8q/lH7nftGWaVCE16UQ7YK6Uw9AhNjFfeAPnhWtL0gIF 43J0vWNnBuClKltYJmywXHyxqHWgsWO4OzCXinz5L2BOvQXneLyw6DS0ZQT+eLFU09 1j5g/LfcGq2hV+B4PF6+hWuziHjqfXXXckvXQjTk= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-wr1-x42a.google.com (mail-wr1-x42a.google.com [IPv6:2a00:1450:4864:20::42a]) by sourceware.org (Postfix) with ESMTPS id 189C73887F5E; Thu, 6 Oct 2022 19:03:09 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org 189C73887F5E Received: by mail-wr1-x42a.google.com with SMTP id n12so4073688wrp.10; Thu, 06 Oct 2022 12:03:09 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=eiB90ZpBa7H1dEGfY3MVigpQKkCUFi1kVYkZ3kdAwNs=; b=K1Nv0liiWdOFEGcBf+edcE2xLRz1BvTMRD1wRO1+96XlRbJB9sor7MyGaLGBx44zEU Y1VAqQkyGbTO+wdW10vQsXxv7a20dVGZzOhe1WmwyyTCvpxBIB1tbZ2vVcaqpmSbO0i0 pAqN6mBCoNHEZ5FiWTJSKRnpcNJEGBD8rPPwAS8TRvby5nkyak/cC9rNHcuIPfmf8Cvt 67+ePSF7l0oS3Ng68rMEBGD9i7wooielAs59PL++y8ez+hwwoyu9znnkOtmKwod7TmsD os2HFttn1KvCd2kpZ8yqoy4XR4/wsgDFpkZpH3nMeQ4K+CykKyGSCFmAwetuH+iVXinF 62/g== X-Gm-Message-State: ACrzQf3Qsnf1UymHZC1brBBeiXwco10jdBmuONO5LVSv4PYRVd03U1pb h0ycV1J9ZfjNwNnQTg93nITNPQw9+Fo= X-Received: by 2002:a5d:51c2:0:b0:22e:3c45:9eff with SMTP id n2-20020a5d51c2000000b0022e3c459effmr925943wrv.118.1665082987863; Thu, 06 Oct 2022 12:03:07 -0700 (PDT) Received: from localhost.localdomain ([90.252.80.23]) by smtp.gmail.com with ESMTPSA id bv10-20020a0560001f0a00b00228fa832b7asm130325wrb.52.2022.10.06.12.03.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Oct 2022 12:03:07 -0700 (PDT) To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org Subject: [PATCH v2] libstdc++: basic_filebuf: don't flush more often than necessary. Date: Thu, 6 Oct 2022 20:02:56 +0100 Message-Id: <20221006190255.361385-1-cf.natali@gmail.com> X-Mailer: git-send-email 2.30.2 MIME-Version: 1.0 X-Spam-Status: No, score=-12.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Charles-Francois Natali via Gcc-patches From: =?utf-8?q?Charles-Fran=C3=A7ois_Natali?= Reply-To: Charles-Francois Natali Cc: Charles-Francois Natali Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org Sender: "Gcc-patches" X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1743171948458692248?= X-GMAIL-MSGID: =?utf-8?q?1745966140017840616?= `basic_filebuf::xsputn` would bypass the buffer when passed a chunk of size 1024 and above, seemingly as an optimisation. This can have a significant performance impact if the overhead of a `write` syscall is non-negligible, e.g. on a slow disk, on network filesystems, or simply during IO contention because instead of flushing every `BUFSIZ` (by default), we can flush every 1024 char. The impact is even greater with custom larger buffers, e.g. for network filesystems, because the code could issue `write` for example 1000X more often than necessary with respect to the buffer size. It also introduces a significant discontinuity in performance when writing chunks of size 1024 and above. See this reproducer which writes down a fixed number of chunks to a file open with `O_SYNC` - to replicate high-latency `write` - for varying size of chunks: ``` $ cat test_fstream_flush.cpp int main(int argc, char* argv[]) { assert(argc == 3); const auto* path = argv[1]; const auto chunk_size = std::stoul(argv[2]); const auto fd = open(path, O_CREAT | O_TRUNC | O_WRONLY | O_SYNC | O_CLOEXEC, 0666); assert(fd >= 0); auto filebuf = __gnu_cxx::stdio_filebuf(fd, std::ios_base::out); auto stream = std::ostream(&filebuf); const auto chunk = std::vector(chunk_size); for (auto i = 0; i < 1'000; ++i) { stream.write(chunk.data(), chunk.size()); } return 0; } ``` ``` $ g++ -o /tmp/test_fstream_flush test_fstream_flush.cpp -std=c++17 $ for i in $(seq 1021 1025); do echo -e "\n$i"; time /tmp/test_fstream_flush /tmp/foo $i; done 1021 real 0m0.997s user 0m0.000s sys 0m0.038s 1022 real 0m0.939s user 0m0.005s sys 0m0.032s 1023 real 0m0.954s user 0m0.005s sys 0m0.034s 1024 real 0m7.102s user 0m0.040s sys 0m0.192s 1025 real 0m7.204s user 0m0.025s sys 0m0.209s ``` See the huge drop in performance at the 1024-boundary. An `strace` confirms that from size 1024 we effectively defeat buffering: 1023-sized writes ``` $ strace -P /tmp/foo -e openat,write,writev /tmp/test_fstream_flush /tmp/foo 1023 2>&1 | head -n5 openat(AT_FDCWD, "/tmp/foo", O_WRONLY|O_CREAT|O_TRUNC|O_SYNC|O_CLOEXEC, 0666) = 3 writev(3, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8184}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=1023}], 2) = 9207 writev(3, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8184}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=1023}], 2) = 9207 writev(3, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8184}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=1023}], 2) = 9207 writev(3, [{iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=8184}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=1023}], 2) = 9207 ``` vs 1024-sized writes ``` $ strace -P /tmp/foo -e openat,write,writev /tmp/test_fstream_flush /tmp/foo 1024 2>&1 | head -n5 openat(AT_FDCWD, "/tmp/foo", O_WRONLY|O_CREAT|O_TRUNC|O_SYNC|O_CLOEXEC, 0666) = 3 writev(3, [{iov_base=NULL, iov_len=0}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=1024}], 2) = 1024 writev(3, [{iov_base="", iov_len=0}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=1024}], 2) = 1024 writev(3, [{iov_base="", iov_len=0}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=1024}], 2) = 1024 writev(3, [{iov_base="", iov_len=0}, {iov_base="\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., iov_len=1024}], 2) = 1024 ``` Instead, it makes sense to only bypass the buffer if the amount of data to be written is larger than the buffer capacity. Closes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63746 Signed-off-by: Charles-Francois Natali --- libstdc++-v3/include/bits/fstream.tcc | 9 ++--- .../27_io/basic_filebuf/sputn/char/63746.cc | 38 +++++++++++++++++++ 2 files changed, 41 insertions(+), 6 deletions(-) create mode 100644 libstdc++-v3/testsuite/27_io/basic_filebuf/sputn/char/63746.cc diff --git a/libstdc++-v3/include/bits/fstream.tcc b/libstdc++-v3/include/bits/fstream.tcc index 7ccc887b8..2e9369628 100644 --- a/libstdc++-v3/include/bits/fstream.tcc +++ b/libstdc++-v3/include/bits/fstream.tcc @@ -757,23 +757,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION { streamsize __ret = 0; // Optimization in the always_noconv() case, to be generalized in the - // future: when __n is sufficiently large we write directly instead of - // using the buffer. + // future: when __n is larger than the available capacity we write + // directly instead of using the buffer. const bool __testout = (_M_mode & ios_base::out || _M_mode & ios_base::app); if (__check_facet(_M_codecvt).always_noconv() && __testout && !_M_reading) { - // Measurement would reveal the best choice. - const streamsize __chunk = 1ul << 10; streamsize __bufavail = this->epptr() - this->pptr(); // Don't mistake 'uncommitted' mode buffered with unbuffered. if (!_M_writing && _M_buf_size > 1) __bufavail = _M_buf_size - 1; - const streamsize __limit = std::min(__chunk, __bufavail); - if (__n >= __limit) + if (__n >= __bufavail) { const streamsize __buffill = this->pptr() - this->pbase(); const char* __buf = reinterpret_cast(this->pbase()); diff --git a/libstdc++-v3/testsuite/27_io/basic_filebuf/sputn/char/63746.cc b/libstdc++-v3/testsuite/27_io/basic_filebuf/sputn/char/63746.cc new file mode 100644 index 000000000..baab93407 --- /dev/null +++ b/libstdc++-v3/testsuite/27_io/basic_filebuf/sputn/char/63746.cc @@ -0,0 +1,38 @@ +// { dg-require-fileio "" } + +#include +#include + +class testbuf : public std::filebuf { +public: + char_type* pub_pprt() const + { + return this->pptr(); + } + + char_type* pub_pbase() const + { + return this->pbase(); + } +}; + +void test01() +{ + using namespace std; + + // Leave capacity to avoid flush. + const streamsize chunk_size = BUFSIZ - 1 - 1; + const char data[chunk_size] = {}; + + testbuf a_f; + VERIFY( a_f.open("tmp_63746_sputn", ios_base::out) ); + VERIFY( chunk_size == a_f.sputn(data, chunk_size) ); + VERIFY( (a_f.pub_pprt() - a_f.pub_pbase()) == chunk_size ); + VERIFY( a_f.close() ); +} + +int main() +{ + test01(); + return 0; +}