From patchwork Mon Mar 6 23:45:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?Jannik_Gl=C3=BCckert?=
X-Patchwork-Id: 65181
Date: Tue, 7 Mar 2023 00:45:16 +0100
Subject: [PATCH] libstdc++: use copy_file_range, improve sendfile in filesystem::copy_file
To: libstdc++@gcc.gnu.org
Cc: gcc-patches@gcc.gnu.org
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: =?utf-8?q?Jannik_Gl=C3=BCckert_via_Gcc-patches?=
From: =?utf-8?q?Jannik_Gl=C3=BCckert?=
Reply-To: =?utf-8?q?Jannik_Gl=C3=BCckert?=

The current copy_file implementation is suboptimal. It uses sendfile only
for files smaller than 2 GB, falls back to a userspace copy for anything
larger, and does not use copy_file_range at all. copy_file_range is
becoming increasingly important as more filesystems adopt reflinks.

I am pretty sure I got some of the formatting wrong; feel free to tear it
apart.

I don't know whether sendfile has the same semantics on Solaris as it does
on Linux; if someone could test with a big file, that would be great.
Otherwise, this should not regress. The implementation falls back to
sendfile / userspace copy if copy_file_range is not available for the
target paths.

The copy implementations for sendfile and copy_file_range were moved into
separate functions, and the calling code was simplified to the point where
adding a new implementation is basically a copy-paste of an existing call,
should new interesting syscalls pop up.

Best
Jannik

From 72b7ad044246e496d90b5f241f59bd0b69e214fa Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Jannik=20Gl=C3=BCckert?=
Date: Mon, 6 Mar 2023 23:11:41 +0100
Subject: [PATCH 2/2] libstdc++: use copy_file_range

copy_file_range is a recent-ish syscall for copying files. It is similar
to sendfile but allows filesystem-specific optimizations. Common ones are:
Reflinks: BTRFS, XFS, ZFS (does not implement the syscall yet)
Server-side copy: NFS, SMB

If copy_file_range is not available for the given files, fall back to
sendfile / userspace copy.

libstdc++-v3/ChangeLog:

	* acinclude.m4 (_GLIBCXX_USE_COPY_FILE_RANGE): define
	* config.h.in: Regenerate.
	* configure: Regenerate.
	* src/filesystem/ops-common.h: use copy_file_range in
	std::filesystem::copy_file
---
 libstdc++-v3/acinclude.m4                | 20 ++++++++
 libstdc++-v3/config.h.in                 |  3 ++
 libstdc++-v3/configure                   | 62 ++++++++++++++++++++++++
 libstdc++-v3/src/filesystem/ops-common.h | 34 +++++++++++++
 4 files changed, 119 insertions(+)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index 5136c0571e8..ca09e1d22db 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4581,6 +4581,7 @@ dnl  _GLIBCXX_USE_UTIMENSAT
 dnl  _GLIBCXX_USE_ST_MTIM
 dnl  _GLIBCXX_USE_FCHMOD
 dnl  _GLIBCXX_USE_FCHMODAT
+dnl  _GLIBCXX_USE_COPY_FILE_RANGE
 dnl  _GLIBCXX_USE_SENDFILE
 dnl  HAVE_LINK
 dnl  HAVE_READLINK
@@ -4718,6 +4719,25 @@ dnl
   if test $glibcxx_cv_fchmodat = yes; then
     AC_DEFINE(_GLIBCXX_USE_FCHMODAT, 1, [Define if fchmodat is available in .])
   fi
+dnl
+  AC_CACHE_CHECK([for copy_file_range that can copy files],
+    glibcxx_cv_copy_file_range, [dnl
+    case "${target_os}" in
+      linux*)
+        GCC_TRY_COMPILE_OR_LINK(
+          [#include ],
+          [copy_file_range(1, NULL, 2, NULL, 1, 0);],
+          [glibcxx_cv_copy_file_range=yes],
+          [glibcxx_cv_copy_file_range=no])
+        ;;
+      *)
+        glibcxx_cv_copy_file_range=no
+        ;;
+    esac
+  ])
+  if test $glibcxx_cv_copy_file_range = yes; then
+    AC_DEFINE(_GLIBCXX_USE_COPY_FILE_RANGE, 1, [Define if copy_file_range is available in .])
+  fi
 dnl
   AC_CACHE_CHECK([for sendfile that can copy files],
     glibcxx_cv_sendfile, [dnl
diff --git a/libstdc++-v3/src/filesystem/ops-common.h b/libstdc++-v3/src/filesystem/ops-common.h
index d8afc6a4d64..0491dc8d811 100644
--- a/libstdc++-v3/src/filesystem/ops-common.h
+++ b/libstdc++-v3/src/filesystem/ops-common.h
@@ -49,6 +49,9 @@
 #ifdef NEED_DO_COPY_FILE
 # include
 # include
+# ifdef _GLIBCXX_USE_COPY_FILE_RANGE
+#  include // copy_file_range
+# endif
 # ifdef _GLIBCXX_USE_SENDFILE
 #  include // sendfile
 # endif
@@ -358,6 +361,24 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM
   }
 
 #ifdef NEED_DO_COPY_FILE
+#ifdef _GLIBCXX_USE_COPY_FILE_RANGE
+  bool
+  copy_file_copy_file_range(int fd_in, int fd_out, size_t length) noexcept
+  {
+    size_t bytes_left = length;
+    off_t offset = 0;
+    ssize_t bytes_copied;
+    do {
+      bytes_copied = ::copy_file_range(fd_in, &offset, fd_out, NULL, bytes_left, 0);
+      if (bytes_copied < 0)
+      {
+        return false;
+      }
+      bytes_left -= bytes_copied;
+    } while (bytes_left > 0 && bytes_copied > 0);
+    return true;
+  }
+#endif
 #if defined _GLIBCXX_USE_SENDFILE && ! defined _GLIBCXX_FILESYSTEM_IS_WINDOWS
   bool
   copy_file_sendfile(int fd_in, int fd_out, size_t length) noexcept
@@ -518,6 +539,19 @@ _GLIBCXX_BEGIN_NAMESPACE_FILESYSTEM
 
     bool has_copied = false;
 
+#ifdef _GLIBCXX_USE_COPY_FILE_RANGE
+    if (!has_copied)
+      has_copied = copy_file_copy_file_range(in.fd, out.fd, from_st->st_size);
+    if (!has_copied)
+      {
+        if (errno != EFBIG && errno != EOPNOTSUPP && errno != EOVERFLOW && errno != EXDEV)
+          {
+            ec.assign(errno, std::generic_category());
+            return false;
+          }
+      }
+#endif
+
 #if defined _GLIBCXX_USE_SENDFILE && ! defined _GLIBCXX_FILESYSTEM_IS_WINDOWS
     if (!has_copied)
       has_copied = copy_file_sendfile(in.fd, out.fd, from_st->st_size);
-- 
2.39.2
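
For anyone who wants to try the fallback order outside the library, here is a minimal standalone sketch. It is not part of the patch and not the libstdc++ code. The names copy_read_write, copy_range, copy_with_fallback and self_check are hypothetical. Unlike the patch, it passes null offsets to copy_file_range so both methods share the descriptors' file positions, it skips the sendfile step, and it also falls back on ENOSYS, which is an assumption of this sketch rather than something the patch does.

```cpp
// Standalone sketch, not part of the patch. All function names below are
// hypothetical; only the syscalls themselves are real.
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <unistd.h>

// Userspace fallback: plain read/write loop until EOF on fd_in.
static bool copy_read_write(int fd_in, int fd_out) noexcept
{
  char buf[65536];
  for (;;)
    {
      ssize_t n = ::read(fd_in, buf, sizeof buf);
      if (n == 0)
        return true;                    // EOF: everything was copied
      if (n < 0 && errno == EINTR)
        continue;
      if (n < 0)
        return false;
      for (ssize_t done = 0; done < n; )
        {
          ssize_t w = ::write(fd_out, buf + done, n - done);
          if (w < 0 && errno == EINTR)
            continue;
          if (w <= 0)
            return false;
          done += w;
        }
    }
}

// Kernel-side copy. Null offsets make the kernel use and advance the
// descriptors' own positions, so a later fallback resumes where this
// loop stopped. Returns false with errno set on failure.
static bool copy_range(int fd_in, int fd_out, std::size_t length) noexcept
{
#ifdef __linux__
  while (length > 0)
    {
      ssize_t n = ::copy_file_range(fd_in, nullptr, fd_out, nullptr, length, 0);
      if (n < 0)
        return false;
      if (n == 0)
        break;                          // source ended early
      length -= static_cast<std::size_t>(n);
    }
  return true;
#else
  (void) fd_in; (void) fd_out; (void) length;
  errno = ENOSYS;                       // no copy_file_range on this target
  return false;
#endif
}

bool copy_with_fallback(int fd_in, int fd_out, std::size_t length) noexcept
{
  assert(fd_in >= 0 && fd_out >= 0);
  if (copy_range(fd_in, fd_out, length))
    return true;
  const int err = errno;
  // The first four values are the ones the patch treats as "try the next
  // method". ENOSYS is an extra assumption of this sketch.
  if (err == EFBIG || err == EOPNOTSUPP || err == EOVERFLOW || err == EXDEV
      || err == ENOSYS)
    return copy_read_write(fd_in, fd_out);
  return false;                         // any other error is reported as-is
}

// Round trip through two temporary files; true if the copy matches.
bool self_check()
{
  char src[] = "/tmp/cfr_src_XXXXXX";
  char dst[] = "/tmp/cfr_dst_XXXXXX";
  int in = ::mkstemp(src);
  int out = ::mkstemp(dst);
  if (in < 0 || out < 0)
    return false;
  const char msg[] = "hello, copy_file_range";
  const std::size_t len = sizeof msg - 1;
  char back[sizeof msg] = {};
  bool ok = ::write(in, msg, len) == static_cast<ssize_t>(len)
    && ::lseek(in, 0, SEEK_SET) == 0
    && copy_with_fallback(in, out, len)
    && ::pread(out, back, len, 0) == static_cast<ssize_t>(len)
    && std::memcmp(back, msg, len) == 0;
  ::close(in);
  ::close(out);
  ::unlink(src);
  ::unlink(dst);
  return ok;
}
```

Calling self_check() from a small main is enough to exercise both paths: on a kernel or filesystem that rejects copy_file_range with one of the listed errors, the read/write loop takes over transparently.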