Message ID | 20230817145554.892543-9-leitao@debian.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b82d:0:b0:3f2:4152:657d with SMTP id z13csp1580884vqi; Fri, 18 Aug 2023 11:02:52 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGB+5GBD0I9xsaTxM8MISzvb6IGjVBQI3blZpQdw82Ktfpfop/tcrg062kJom3Lx7Yhe5Fv X-Received: by 2002:a17:902:e80d:b0:1bd:e64c:5c71 with SMTP id u13-20020a170902e80d00b001bde64c5c71mr3930205plg.21.1692381771912; Fri, 18 Aug 2023 11:02:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1692381771; cv=none; d=google.com; s=arc-20160816; b=M7yFxyStguZ8b6LCuk7+UKAmX+/O0QgTQa24B/TgyRXAKb0xS08A56YkPVZ8Jn0av/ Gi0vMNEx43RfA8njoBAaHmF6sMHRjOG/i3R/Flt7yrgEk+ZFMRg6yXuqPJh1ZxPhnwDC tThmB7erulTn8Q7KMNu67MiH+/8VFwjwq9qCf3YPWnwvrL0VYZ6XxtRGReZNtkrMFn7H omT8WCucqz+b+H/O/8+2GTnTmfJtsRCBl1EODhrwkKbtMGitzPAR9FeVHkd8V7ek5si4 jMkh5euVxvWaBautAJ2QHxM9vT7H2Nj1B+B/nChj5VgjsMbGDDsGFVpIZ+V48fWVIzTN 8YIA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=nQ0mVRxVAzaTLmEd/HemwV6+ljHNBK9BmdpK1e4n1Fk=; fh=9AWIeT3iRePFFiWaUks3e8UIAu0DVI8evaQ1LtP9zWM=; b=Frne3DL/rFOqrGS5QbiYyuULC9Pq5wX0UesKqQEj6bSgx+MWm1Cp03+iY3EY96FrwN 2bIeYzif5ZjvNmmiIhuFFJI7OPEkHi+bMrn1IOUsRnW2s7rUlwV5vfM7WwgC9V6D5Rlx moAyl96pcBczS381VMCmSDiS2sfcAzYAZM7mYrMKszl4vjgwW+T0Pi3E9zidT1FddN8i ABly/YjYZwi3xchI/Tr1XmaWpsvqYCpxMOL5kLS8UQyjBAL6FYX55/mFp38a6Yl5xNn0 z/NPabvVfQ/F9X21WNUR9/h/agrKIFneVwkpwoNrrUII6VfxKP9XtYetR3gW3dzRcB4E tnOQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id jy7-20020a17090342c700b001b896086ca1si1738886plb.136.2023.08.18.11.02.28; Fri, 18 Aug 2023 11:02:51 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352495AbjHQO5I (ORCPT <rfc822;274620705z@gmail.com> + 99 others); Thu, 17 Aug 2023 10:57:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57740 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1352474AbjHQO4p (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 17 Aug 2023 10:56:45 -0400 Received: from mail-ej1-f47.google.com (mail-ej1-f47.google.com [209.85.218.47]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 55AA2358D; Thu, 17 Aug 2023 07:56:28 -0700 (PDT) Received: by mail-ej1-f47.google.com with SMTP id a640c23a62f3a-99c3c8adb27so1058627066b.1; Thu, 17 Aug 2023 07:56:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692284187; x=1692888987; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=nQ0mVRxVAzaTLmEd/HemwV6+ljHNBK9BmdpK1e4n1Fk=; b=HhakUhuWD8IuqO5cUkbKW+BqG/HC7D/HtZSJNr27Ku0wGEnd3vu2KNLKr/ywy3oKCH eR8RJD/Nq9XCGNVEIMJcz1gNlkPjv8STlLavmrKS/Vl4cUlth7fKImD/uVTtd43QARK1 Oxm+r/aeAZ6u8wESyJMtdRJ1jd8hVQmxsp3dWodb+m8mthgCQBRRN2L4R0yBXXofy3Ad wIFb2roG7pE7mrsrRZtqmMPtDJrG79WYf4UqTY3YCNkCuRUqxhQBXMzyVS64XXlF65VL 37rvGg7odpOXQulsJjksBgFNQt6iiqr4m3LwbHNJ++pZGs1g2W0FR5ylRgeuvd0cZ2qZ vpcw== X-Gm-Message-State: AOJu0YxryfRWS1k02PbgLaJCUnSUSkcRcGeZtr+gae6bPi05Bsuo4dGC yzTS972Xw70hhyT+CTeYdNM= X-Received: by 2002:a17:906:9be4:b0:99e:46b:6a58 with SMTP id de36-20020a1709069be400b0099e046b6a58mr1629764ejc.37.1692284186466; Thu, 17 Aug 2023 07:56:26 -0700 (PDT) Received: from localhost (fwdproxy-cln-003.fbsv.net. [2a03:2880:31ff:3::face:b00c]) by smtp.gmail.com with ESMTPSA id a17-20020a1709062b1100b0099bd5b72d93sm10207857ejg.43.2023.08.17.07.56.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 17 Aug 2023 07:56:26 -0700 (PDT) From: Breno Leitao <leitao@debian.org> To: sdf@google.com, axboe@kernel.dk, asml.silence@gmail.com, willemdebruijn.kernel@gmail.com, martin.lau@linux.dev Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org, io-uring@vger.kernel.org, kuba@kernel.org, pabeni@redhat.com, krisman@suse.de Subject: [PATCH v3 8/9] io_uring/cmd: BPF hook for getsockopt cmd Date: Thu, 17 Aug 2023 07:55:53 -0700 Message-Id: <20230817145554.892543-9-leitao@debian.org> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230817145554.892543-1-leitao@debian.org> References: <20230817145554.892543-1-leitao@debian.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-1.4 required=5.0 tests=BAYES_00, FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS, RCVD_IN_DNSWL_BLOCKED,RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1774590909003980043 X-GMAIL-MSGID: 1774590909003980043 |
Series |
[v3,1/9] bpf: Leverage sockptr_t in BPF getsockopt hook
|
|
Commit Message
Breno Leitao
Aug. 17, 2023, 2:55 p.m. UTC
Add BPF hook support for getsockopts io_uring command. So, BPF cgroups
programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed
through io_uring.
This implementation follows a similar approach to what
__sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of
kernel pointer.
Signed-off-by: Breno Leitao <leitao@debian.org>
---
io_uring/uring_cmd.c | 18 +++++++++++++-----
1 file changed, 13 insertions(+), 5 deletions(-)
Comments
... > Not really, sock->ops->getsockopt() does not suport sockptr_t, but > __user addresses, differently from setsockopt() > ... > int (*getsockopt)(struct socket *sock, int level, > int optname, char __user *optval, int __user *optlen); > > In order to be able to call sock->ops->getsockopt(), the callback > function will need to accepted sockptr. It is also worth looking at whether 'optlen' can be passed in as a numeric parameter and then returned on success. That would move the user copy into the syscall wrapper. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Mon, Aug 21, 2023 at 01:03:08PM -0400, Gabriel Krisman Bertazi wrote: > Breno Leitao <leitao@debian.org> writes: > > > On Thu, Aug 17, 2023 at 03:08:47PM -0400, Gabriel Krisman Bertazi wrote: > >> Breno Leitao <leitao@debian.org> writes: > >> > >> > Add BPF hook support for getsockopts io_uring command. So, BPF cgroups > >> > programs can run when SOCKET_URING_OP_GETSOCKOPT command is executed > >> > through io_uring. > >> > > >> > This implementation follows a similar approach to what > >> > __sys_getsockopt() does, but, using USER_SOCKPTR() for optval instead of > >> > kernel pointer. > >> > > >> > Signed-off-by: Breno Leitao <leitao@debian.org> > >> > --- > >> > io_uring/uring_cmd.c | 18 +++++++++++++----- > >> > 1 file changed, 13 insertions(+), 5 deletions(-) > >> > > >> > diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c > >> > index a567dd32df00..9e08a14760c3 100644 > >> > --- a/io_uring/uring_cmd.c > >> > +++ b/io_uring/uring_cmd.c > >> > @@ -5,6 +5,8 @@ > >> > #include <linux/io_uring.h> > >> > #include <linux/security.h> > >> > #include <linux/nospec.h> > >> > +#include <linux/compat.h> > >> > +#include <linux/bpf-cgroup.h> > >> > > >> > #include <uapi/linux/io_uring.h> > >> > #include <uapi/asm-generic/ioctls.h> > >> > @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock, > >> > if (err) > >> > return err; > >> > > >> > - if (level == SOL_SOCKET) { > >> > + err = -EOPNOTSUPP; > >> > + if (level == SOL_SOCKET) > >> > err = sk_getsockopt(sock->sk, level, optname, > >> > USER_SOCKPTR(optval), > >> > KERNEL_SOCKPTR(&optlen)); > >> > - if (err) > >> > - return err; > >> > > >> > + if (!(issue_flags & IO_URING_F_COMPAT)) > >> > + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, > >> > + optname, > >> > + USER_SOCKPTR(optval), > >> > + KERNEL_SOCKPTR(&optlen), > >> > + optlen, err); > >> > + > >> > + if (!err) > >> > return optlen; > >> > - } > >> > >> Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to > >> running the hook? > >> Before this patch, it would bail out with EOPNOTSUPP, > >> but now the bpf hook gets called even for level!=SOL_SOCKET, which > >> doesn't fit __sys_getsockopt. Am I misreading the code? > > > > Not really, sock->ops->getsockopt() does not suport sockptr_t, but > > __user addresses, differently from setsockopt() > > > > int (*setsockopt)(struct socket *sock, int level, > > int optname, sockptr_t optval, > > unsigned int optlen); > > int (*getsockopt)(struct socket *sock, int level, > > int optname, char __user *optval, int __user *optlen); > > > > In order to be able to call sock->ops->getsockopt(), the callback > > function will need to accepted sockptr. > > So, it seems you won't support !SOL_SOCKETs here. Then, I think you > shouldn't call the hook for those sockets. My main concern is that we > remain compatible to __sys_getsockopt when invoking the hook. > > I think you should just have the following as the very first thing in > the function (but after the security_ check). > > if (level != SOL_SOCKET) > return -EOPNOTSUPP; Gotcha. I will update. Thanks!
On Mon, Aug 21, 2023 at 01:25:25PM -0700, Martin KaFai Lau wrote: > On 8/17/23 12:08 PM, Gabriel Krisman Bertazi wrote: > > Shouldn't you call sock->ops->getsockopt for level!=SOL_SOCKET prior to > > running the hook? Before this patch, it would bail out with EOPNOTSUPP, > > but now the bpf hook gets called even for level!=SOL_SOCKET, which > > doesn't fit __sys_getsockopt. Am I misreading the code? > I agree it should not call into bpf if the io_uring cannot support non > SOL_SOCKET optnames. Otherwise, the bpf prog will get different optval and > optlen when running in _sys_getsockopt vs io_uring getsockopt (e.g. in > regular _sys_getsockopt(SOL_TCP), bpf expects the optval returned from > tcp_getsockopt). > > I think __sys_getsockopt can also be refactored similar to __sys_setsockopt > in patch 3. Yes, for non SOL_SOCKET it only supports __user *optval and > __user *optlen but may be a WARN_ON_ONCE/BUG_ON(sockpt_is_kernel(optval)) > can be added before calling ops->getsockopt()? Then this details can be > hidden away from the io_uring. Right, I've spent some time thinking about it, and this could be done. This is a draft I have. Is it what you had in mind? -- diff --git a/include/linux/bpf-cgroup.h b/include/linux/bpf-cgroup.h index 5e3419eb267a..e39743f4ce5e 100644 --- a/include/linux/bpf-cgroup.h +++ b/include/linux/bpf-cgroup.h @@ -378,7 +378,7 @@ static inline bool cgroup_bpf_sock_enabled(struct sock *sk, ({ \ int __ret = 0; \ if (cgroup_bpf_enabled(CGROUP_GETSOCKOPT)) \ - get_user(__ret, optlen); \ + copy_from_sockptr(&__ret, optlen, sizeof(int)); \ __ret; \ }) diff --git a/include/net/sock.h b/include/net/sock.h index 2a0324275347..24ea1719fd02 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1855,6 +1855,8 @@ int sock_setsockopt(struct socket *sock, int level, int op, sockptr_t optval, unsigned int optlen); int do_sock_setsockopt(struct socket *sock, bool compat, int level, int optname, sockptr_t optval, int optlen); +int do_sock_getsockopt(struct socket *sock, bool compat, int level, + int optname, sockptr_t optval, sockptr_t optlen); int sk_getsockopt(struct sock *sk, int level, int optname, sockptr_t optval, sockptr_t optlen); diff --git a/net/core/sock.c b/net/core/sock.c index 9370fd50aa2c..2a5f30f14f5c 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1997,14 +1997,6 @@ int sk_getsockopt(struct sock *sk, int level, int optname, return 0; } -int sock_getsockopt(struct socket *sock, int level, int optname, - char __user *optval, int __user *optlen) -{ - return sk_getsockopt(sock->sk, level, optname, - USER_SOCKPTR(optval), - USER_SOCKPTR(optlen)); -} - /* * Initialize an sk_lock. * diff --git a/net/socket.c b/net/socket.c index b5e4398a6b4d..f0d6b6b1f75e 100644 --- a/net/socket.c +++ b/net/socket.c @@ -2290,6 +2290,40 @@ SYSCALL_DEFINE5(setsockopt, int, fd, int, level, int, optname, INDIRECT_CALLABLE_DECLARE(bool tcp_bpf_bypass_getsockopt(int level, int optname)); +int do_sock_getsockopt(struct socket *sock, bool compat, int level, + int optname, sockptr_t optval, sockptr_t optlen) +{ + int max_optlen __maybe_unused; + int err; + + err = security_socket_getsockopt(sock, level, optname); + if (err) + return err; + + if (level == SOL_SOCKET) { + err = sk_getsockopt(sock->sk, level, optname, optval, optlen); + } else if (unlikely(!sock->ops->getsockopt)) { + err = -EOPNOTSUPP; + } else { + if (WARN_ONCE(optval.is_kernel || optlen.is_kernel, + "Invalid argument type")) + return -EOPNOTSUPP; + + err = sock->ops->getsockopt(sock, level, optname, optval.user, + optlen.user); + } + + if (!compat) { + max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen); + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname, + optval, optlen, max_optlen, + err); + } + + return err; +} +EXPORT_SYMBOL(do_sock_getsockopt); + /* * Get a socket option. Because we don't know the option lengths we have * to pass a user mode parameter for the protocols to sort out. @@ -2297,35 +2331,17 @@ INDIRECT_CALLABLE_DECLARE(bool tcp_bpf_bypass_getsockopt(int level, int __sys_getsockopt(int fd, int level, int optname, char __user *optval, int __user *optlen) { - int max_optlen __maybe_unused; int err, fput_needed; + bool compat = in_compat_syscall(); struct socket *sock; sock = sockfd_lookup_light(fd, &err, &fput_needed); if (!sock) return err; - err = security_socket_getsockopt(sock, level, optname); - if (err) - goto out_put; - - if (!in_compat_syscall()) - max_optlen = BPF_CGROUP_GETSOCKOPT_MAX_OPTLEN(optlen); - - if (level == SOL_SOCKET) - err = sock_getsockopt(sock, level, optname, optval, optlen); - else if (unlikely(!sock->ops->getsockopt)) - err = -EOPNOTSUPP; - else - err = sock->ops->getsockopt(sock, level, optname, optval, - optlen); + err = do_sock_getsockopt(sock, compat, level, optname, + USER_SOCKPTR(optval), USER_SOCKPTR(optlen)); - if (!in_compat_syscall()) - err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, optname, - USER_SOCKPTR(optval), - USER_SOCKPTR(optlen), - max_optlen, err); -out_put: fput_light(sock->file, fput_needed); return err; }
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c index a567dd32df00..9e08a14760c3 100644 --- a/io_uring/uring_cmd.c +++ b/io_uring/uring_cmd.c @@ -5,6 +5,8 @@ #include <linux/io_uring.h> #include <linux/security.h> #include <linux/nospec.h> +#include <linux/compat.h> +#include <linux/bpf-cgroup.h> #include <uapi/linux/io_uring.h> #include <uapi/asm-generic/ioctls.h> @@ -184,17 +186,23 @@ static inline int io_uring_cmd_getsockopt(struct socket *sock, if (err) return err; - if (level == SOL_SOCKET) { + err = -EOPNOTSUPP; + if (level == SOL_SOCKET) err = sk_getsockopt(sock->sk, level, optname, USER_SOCKPTR(optval), KERNEL_SOCKPTR(&optlen)); - if (err) - return err; + if (!(issue_flags & IO_URING_F_COMPAT)) + err = BPF_CGROUP_RUN_PROG_GETSOCKOPT(sock->sk, level, + optname, + USER_SOCKPTR(optval), + KERNEL_SOCKPTR(&optlen), + optlen, err); + + if (!err) return optlen; - } - return -EOPNOTSUPP; + return err; } static inline int io_uring_cmd_setsockopt(struct socket *sock,