Message ID | 20230510131527.1244929-1-aleksandr.mikhalitsyn@canonical.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp3616004vqo; Wed, 10 May 2023 06:17:50 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6hPAuhY9JFKPlUeVLRsnGkNF7UNWY/lbM1HbOWrMJXWUEEGI8PndzmuUDrXkePgitltEhB X-Received: by 2002:a05:6a00:2194:b0:643:b653:3aa with SMTP id h20-20020a056a00219400b00643b65303aamr20779829pfi.32.1683724669618; Wed, 10 May 2023 06:17:49 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1683724669; cv=none; d=google.com; s=arc-20160816; b=w8N4Axois7AHuM6EwtfniemmL+MbaZvz17+WSRDTY7mHFS+T1n/HWAPbK5vlMzTdyb hpAvnXWuvPDJHTyImthwucxMxq+xZZ9k6xKs4oII1oM2auhT/Cs54d+CjgvKrZWHs1vi 1yv+VP1VeJeUtOjFYTBLyCx0s6HpW+P1wfD7CL6Swbc6krtEIlDXfgD6Lrh97sfVsEmB cx782AP6smPbM5KEaHSEH0nJMF4rAq5In3zv1pBIOlzgnLrdyin7xnUqlP1h6A74AOit GoZVHboVtwmOxN7R3WVnvIXnVuv1oeIhSF9/5AbcSdHzxkF6CXg1iGsBWor2/13zN3Xo jHLg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=a4+m+7dc0UGwNdAzW/ds7TFtd8vcV7ga23RII0fZY0k=; b=H8KlKxxNBIex2w5ViqNRoZZKukX+4HvfJw2l8U7ezVixlQgIrrBkSGJiJDH56ot/6j /qgOoNaSHqDn6LNzQAp0wmu99nW4ruXBlwtIcQsOHdnjhwtu3WuR4Ne1zRMyVjmkU85u 0GLohRWw/VgcPGpyzQ+5Iu6tEpbajVlQQ79nldwLO/iUzYNo18S5Ie7MgB7aKpvB+4pb eTxo2/DaroRhAnNJlPaPUIsOoyfM6V0JIac5KEJ+WW/iB8y+/3fPyf+GlVJDrpHIV06w VZy6q37l1v6dPE66DWbeEozqakw69xGYSPXt6sxZUBODpIukhZnE0mU7ZZe8Vu3JZSHi iQVg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@canonical.com header.s=20210705 header.b=QnDwxcwb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=canonical.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q24-20020aa79618000000b00628217e3ea6si4821086pfg.316.2023.05.10.06.17.36; Wed, 10 May 2023 06:17:49 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@canonical.com header.s=20210705 header.b=QnDwxcwb; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=canonical.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236672AbjEJNPv (ORCPT <rfc822;jeantsuru.cumc.mandola@gmail.com> + 99 others); Wed, 10 May 2023 09:15:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54382 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229740AbjEJNPt (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 10 May 2023 09:15:49 -0400 Received: from smtp-relay-internal-1.canonical.com (smtp-relay-internal-1.canonical.com [185.125.188.123]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DC09612A for <linux-kernel@vger.kernel.org>; Wed, 10 May 2023 06:15:46 -0700 (PDT) Received: from mail-ed1-f69.google.com (mail-ed1-f69.google.com [209.85.208.69]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id 1CE2C3F4D6 for <linux-kernel@vger.kernel.org>; Wed, 10 May 2023 13:15:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical.com; s=20210705; t=1683724545; bh=a4+m+7dc0UGwNdAzW/ds7TFtd8vcV7ga23RII0fZY0k=; h=From:To:Cc:Subject:Date:Message-Id:MIME-Version; b=QnDwxcwbqo0qcvmbuAkflNeP4iBu/+uu14fg1m+OR/wxY7VXnStVXHlHnR5f+u7gS 19TeDZuXTdBxaZWrQvvUcN8KNaj6cJ76l/aRFlncLW5sqI8Nl7qzd61XLhqHzMIfFX DPvnXQ21AJdTMwgyJxD5PNOtqa9VIo90yVsal+Gi8YboeSxJOo8SgFo5ZU+faE6jHW r8B1+LGY3HX5eBvDlbAqXVRX3hiQlgb0r+nWnTDbjo2jUfOHFzQA24SzMn+1dbSa3c AnFBtdrrs9OaSMH8A75J0PnzaRfLVv2eo0rHk1jSFg7p4/P5cpR+HLr7gEyVdNDoLJ fEQm44H3iMv9w== Received: by mail-ed1-f69.google.com with SMTP id 4fb4d7f45d1cf-50a16ab50e6so6777348a12.0 for <linux-kernel@vger.kernel.org>; Wed, 10 May 2023 06:15:45 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683724544; x=1686316544; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=a4+m+7dc0UGwNdAzW/ds7TFtd8vcV7ga23RII0fZY0k=; b=RLewV8CyO/AxCtZSnWFQhjS630QJyUttB1ZpfOBxUtOdfcHlC50jFZXzR1h6QUMyhh bhX2G6Q/IZJYc0WkP8GRpJPzzloKnddo8ju2GcL9SXiAQQnicdTj+AzJTK8VfMeTVJUf JDbkwpCZB6tnKK/xUNey2QJtvQS3d6bWeDq+bWN3RaZtvPflFXfBXW+nHDh3rU8sVVnl z7hSgb228ueneJF/jXOsqkUk63Jr1ydrPin09cBnrUjT4OBJadH78NjMZLV2XigCkHeA 503iuw1JUcFUy6oJz6c8te2KNXsJ6PIOJo1zZEaItqHJQxokfJRUPXlwYGHzIfOJGb6z ITdA== X-Gm-Message-State: AC+VfDyu27OURfpnk4whVK1MjSPn48KeLlr67rQwYcRW9hInh5KSDSiY 2n6xM9UEVYQvY8GE6m4lc4EZxPhbCL1r+6HpmoB/DJgZbkNSxPqLw/XH4sUtiPS5i08uK/RRk97 UecSoqTEiIIo3utHkbr465yr04zTZRDxlCf2ViqSg4lTkBmWO0Q== X-Received: by 2002:a05:6402:10cc:b0:502:2494:b8fc with SMTP id p12-20020a05640210cc00b005022494b8fcmr12892380edu.7.1683724544609; Wed, 10 May 2023 06:15:44 -0700 (PDT) X-Received: by 2002:a05:6402:10cc:b0:502:2494:b8fc with SMTP id p12-20020a05640210cc00b005022494b8fcmr12892359edu.7.1683724544287; Wed, 10 May 2023 06:15:44 -0700 (PDT) Received: from amikhalitsyn.. (ip5f5bf3d5.dynamic.kabel-deutschland.de. [95.91.243.213]) by smtp.gmail.com with ESMTPSA id e2-20020a50fb82000000b00509e3053b66sm1824750edq.90.2023.05.10.06.15.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 10 May 2023 06:15:43 -0700 (PDT) From: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> To: nhorman@tuxdriver.com Cc: davem@davemloft.net, Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>, Daniel Borkmann <daniel@iogearbox.net>, Christian Brauner <brauner@kernel.org>, Stanislav Fomichev <sdf@google.com>, Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>, Xin Long <lucien.xin@gmail.com>, linux-sctp@vger.kernel.org, linux-kernel@vger.kernel.org, netdev@vger.kernel.org Subject: [PATCH net-next] sctp: add bpf_bypass_getsockopt proto callback Date: Wed, 10 May 2023 15:15:27 +0200 Message-Id: <20230510131527.1244929-1-aleksandr.mikhalitsyn@canonical.com> X-Mailer: git-send-email 2.34.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1765513279488311025?= X-GMAIL-MSGID: =?utf-8?q?1765513279488311025?= |
Series |
[net-next] sctp: add bpf_bypass_getsockopt proto callback
|
|
Commit Message
Aleksandr Mikhalitsyn
May 10, 2023, 1:15 p.m. UTC
Add bpf_bypass_getsockopt proto callback and filter out
SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options
from running eBPF hook on them.
These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT
hook returns an error after success of the original handler
sctp_getsockopt(...), userspace will receive an error from getsockopt
syscall and will be not aware that fd was successfully installed into fdtable.
This patch was born as a result of discussion around a new SCM_PIDFD interface:
https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/
Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks")
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Stanislav Fomichev <sdf@google.com>
Cc: Neil Horman <nhorman@tuxdriver.com>
Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Cc: linux-sctp@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: netdev@vger.kernel.org
Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com>
---
net/sctp/socket.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)
Comments
On Wed, May 10, 2023 at 6:15 AM Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> wrote: > > Add bpf_bypass_getsockopt proto callback and filter out > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > from running eBPF hook on them. > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > hook returns an error after success of the original handler > sctp_getsockopt(...), userspace will receive an error from getsockopt > syscall and will be not aware that fd was successfully installed into fdtable. > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ > > Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") > Cc: Daniel Borkmann <daniel@iogearbox.net> > Cc: Christian Brauner <brauner@kernel.org> > Cc: Stanislav Fomichev <sdf@google.com> > Cc: Neil Horman <nhorman@tuxdriver.com> > Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> > Cc: Xin Long <lucien.xin@gmail.com> > Cc: linux-sctp@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Cc: netdev@vger.kernel.org > Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> Acked-by: Stanislav Fomichev <sdf@google.com> with a small nit below > --- > net/sctp/socket.c | 25 +++++++++++++++++++++++++ > 1 file changed, 25 insertions(+) > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c > index cda8c2874691..a9a0ababea90 100644 > --- a/net/sctp/socket.c > +++ b/net/sctp/socket.c > @@ -8281,6 +8281,29 @@ static int sctp_getsockopt(struct sock *sk, int level, int optname, > return retval; > } > [...] > +bool sctp_bpf_bypass_getsockopt(int level, int optname) static bool ... ? You're not making it indirect-callable, so seems fine to keep private to this compilation unit? > +{ > + /* > + * These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > + * hook returns an error after success of the original handler > + * sctp_getsockopt(...), userspace will receive an error from getsockopt > + * syscall and will be not aware that fd was successfully installed into fdtable. > + * > + * Let's prevent bpf cgroup hook from running on them. > + */ > + if (level == SOL_SCTP) { > + switch (optname) { > + case SCTP_SOCKOPT_PEELOFF: > + case SCTP_SOCKOPT_PEELOFF_FLAGS: > + return true; > + default: > + return false; > + } > + } > + > + return false; > +} > + > static int sctp_hash(struct sock *sk) > { > /* STUB */ > @@ -9650,6 +9673,7 @@ struct proto sctp_prot = { > .shutdown = sctp_shutdown, > .setsockopt = sctp_setsockopt, > .getsockopt = sctp_getsockopt, > + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, > .sendmsg = sctp_sendmsg, > .recvmsg = sctp_recvmsg, > .bind = sctp_bind, > @@ -9705,6 +9729,7 @@ struct proto sctpv6_prot = { > .shutdown = sctp_shutdown, > .setsockopt = sctp_setsockopt, > .getsockopt = sctp_getsockopt, > + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, > .sendmsg = sctp_sendmsg, > .recvmsg = sctp_recvmsg, > .bind = sctp_bind, > -- > 2.34.1 >
On Wed, May 10, 2023 at 03:15:27PM +0200, Alexander Mikhalitsyn wrote: > Add bpf_bypass_getsockopt proto callback and filter out > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > from running eBPF hook on them. > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > hook returns an error after success of the original handler > sctp_getsockopt(...), userspace will receive an error from getsockopt > syscall and will be not aware that fd was successfully installed into fdtable. > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ I read some of the emails in there but I don't get why the fd leak is special here. I mean, I get that it leaks, but masking the error return like this can lead to several other problems in the application as well. For example, SCTP_SOCKOPT_CONNECTX3 will trigger a connect(). If it failed, and the hook returns success, the user app will at least log a wrong "connection successful". If the hook can't be responsible for cleaning up before returning a different value, then maybe we want to extend the list of sockopts in here. AFAICT these would be the 3 most critical sockopts.
On Wed, May 10, 2023 at 4:32 PM Stanislav Fomichev <sdf@google.com> wrote: > > On Wed, May 10, 2023 at 6:15 AM Alexander Mikhalitsyn > <aleksandr.mikhalitsyn@canonical.com> wrote: > > > > Add bpf_bypass_getsockopt proto callback and filter out > > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > > from running eBPF hook on them. > > > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > > hook returns an error after success of the original handler > > sctp_getsockopt(...), userspace will receive an error from getsockopt > > syscall and will be not aware that fd was successfully installed into fdtable. > > > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ > > > > Fixes: 0d01da6afc54 ("bpf: implement getsockopt and setsockopt hooks") > > Cc: Daniel Borkmann <daniel@iogearbox.net> > > Cc: Christian Brauner <brauner@kernel.org> > > Cc: Stanislav Fomichev <sdf@google.com> > > Cc: Neil Horman <nhorman@tuxdriver.com> > > Cc: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> > > Cc: Xin Long <lucien.xin@gmail.com> > > Cc: linux-sctp@vger.kernel.org > > Cc: linux-kernel@vger.kernel.org > > Cc: netdev@vger.kernel.org > > Signed-off-by: Alexander Mikhalitsyn <aleksandr.mikhalitsyn@canonical.com> > > Acked-by: Stanislav Fomichev <sdf@google.com> > > with a small nit below Hi Stanislav! Thanks for your review. I've also added a Suggested-by tag with your name in -v2, because you've given me this idea to use bpf_bypass_getsockopt. Hope that you are comfortable with it. > > > --- > > net/sctp/socket.c | 25 +++++++++++++++++++++++++ > > 1 file changed, 25 insertions(+) > > > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c > > index cda8c2874691..a9a0ababea90 100644 > > --- a/net/sctp/socket.c > > +++ b/net/sctp/socket.c > > @@ -8281,6 +8281,29 @@ static int sctp_getsockopt(struct sock *sk, int level, int optname, > > return retval; > > } > > > > [...] > > > +bool sctp_bpf_bypass_getsockopt(int level, int optname) > > static bool ... ? You're not making it indirect-callable, so seems > fine to keep private to this compilation unit? Sure, my bad. Fixed in v2. Kind regards, Alex > > > +{ > > + /* > > + * These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > > + * hook returns an error after success of the original handler > > + * sctp_getsockopt(...), userspace will receive an error from getsockopt > > + * syscall and will be not aware that fd was successfully installed into fdtable. > > + * > > + * Let's prevent bpf cgroup hook from running on them. > > + */ > > + if (level == SOL_SCTP) { > > + switch (optname) { > > + case SCTP_SOCKOPT_PEELOFF: > > + case SCTP_SOCKOPT_PEELOFF_FLAGS: > > + return true; > > + default: > > + return false; > > + } > > + } > > + > > + return false; > > +} > > + > > static int sctp_hash(struct sock *sk) > > { > > /* STUB */ > > @@ -9650,6 +9673,7 @@ struct proto sctp_prot = { > > .shutdown = sctp_shutdown, > > .setsockopt = sctp_setsockopt, > > .getsockopt = sctp_getsockopt, > > + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, > > .sendmsg = sctp_sendmsg, > > .recvmsg = sctp_recvmsg, > > .bind = sctp_bind, > > @@ -9705,6 +9729,7 @@ struct proto sctpv6_prot = { > > .shutdown = sctp_shutdown, > > .setsockopt = sctp_setsockopt, > > .getsockopt = sctp_getsockopt, > > + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, > > .sendmsg = sctp_sendmsg, > > .recvmsg = sctp_recvmsg, > > .bind = sctp_bind, > > -- > > 2.34.1 > >
On Wed, May 10, 2023 at 4:39 PM Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote: > > On Wed, May 10, 2023 at 03:15:27PM +0200, Alexander Mikhalitsyn wrote: > > Add bpf_bypass_getsockopt proto callback and filter out > > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > > from running eBPF hook on them. > > > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > > hook returns an error after success of the original handler > > sctp_getsockopt(...), userspace will receive an error from getsockopt > > syscall and will be not aware that fd was successfully installed into fdtable. > > > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ > > I read some of the emails in there but I don't get why the fd leak is > special here. I mean, I get that it leaks, but masking the error > return like this can lead to several other problems in the application > as well. > > For example, SCTP_SOCKOPT_CONNECTX3 will trigger a connect(). If it > failed, and the hook returns success, the user app will at least log a > wrong "connection successful". > > If the hook can't be responsible for cleaning up before returning a > different value, then maybe we want to extend the list of sockopts in > here. AFAICT these would be the 3 most critical sockopts. > Dear Marcelo, Thanks for pointing this out. Initially this problem was discovered by Christian Brauner and for SO_PEERPIDFD (a new SOL_SOCKET option that we want to add), after this I decided to check if we do fd_install in any other socket options in the kernel and found that we have 2 cases in SCTP. It was an accidental finding. :) So, this patch isn't specific to fd_install things and probably we should filter out bpf hook from being called for other socket options as well. So, I need to filter out SCTP_SOCKOPT_CONNECTX3 and SCTP_SOCKOPT_PEELOFF* for SCTP, right? Kind regards, Alex
Hi Alexander, kernel test robot noticed the following build warnings: [auto build test WARNING on net-next/main] url: https://github.com/intel-lab-lkp/linux/commits/Alexander-Mikhalitsyn/sctp-add-bpf_bypass_getsockopt-proto-callback/20230510-211646 base: net-next/main patch link: https://lore.kernel.org/r/20230510131527.1244929-1-aleksandr.mikhalitsyn%40canonical.com patch subject: [PATCH net-next] sctp: add bpf_bypass_getsockopt proto callback config: m68k-defconfig (https://download.01.org/0day-ci/archive/20230510/202305102234.u4T0ut0T-lkp@intel.com/config) compiler: m68k-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/8ad9818b4b74026fe549b2aa34ea800ab6c8e66d git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Alexander-Mikhalitsyn/sctp-add-bpf_bypass_getsockopt-proto-callback/20230510-211646 git checkout 8ad9818b4b74026fe549b2aa34ea800ab6c8e66d # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=m68k SHELL=/bin/bash net/sctp/ If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> | Link: https://lore.kernel.org/oe-kbuild-all/202305102234.u4T0ut0T-lkp@intel.com/ All warnings (new ones prefixed by >>): >> net/sctp/socket.c:8284:6: warning: no previous prototype for 'sctp_bpf_bypass_getsockopt' [-Wmissing-prototypes] 8284 | bool sctp_bpf_bypass_getsockopt(int level, int optname) | ^~~~~~~~~~~~~~~~~~~~~~~~~~ vim +/sctp_bpf_bypass_getsockopt +8284 net/sctp/socket.c 8283 > 8284 bool sctp_bpf_bypass_getsockopt(int level, int optname) 8285 { 8286 /* 8287 * These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT 8288 * hook returns an error after success of the original handler 8289 * sctp_getsockopt(...), userspace will receive an error from getsockopt 8290 * syscall and will be not aware that fd was successfully installed into fdtable. 8291 * 8292 * Let's prevent bpf cgroup hook from running on them. 8293 */ 8294 if (level == SOL_SCTP) { 8295 switch (optname) { 8296 case SCTP_SOCKOPT_PEELOFF: 8297 case SCTP_SOCKOPT_PEELOFF_FLAGS: 8298 return true; 8299 default: 8300 return false; 8301 } 8302 } 8303 8304 return false; 8305 } 8306
On Wed, May 10, 2023 at 04:55:37PM +0200, Aleksandr Mikhalitsyn wrote: > On Wed, May 10, 2023 at 4:39 PM Marcelo Ricardo Leitner > <marcelo.leitner@gmail.com> wrote: > > > > On Wed, May 10, 2023 at 03:15:27PM +0200, Alexander Mikhalitsyn wrote: > > > Add bpf_bypass_getsockopt proto callback and filter out > > > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > > > from running eBPF hook on them. > > > > > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > > > hook returns an error after success of the original handler > > > sctp_getsockopt(...), userspace will receive an error from getsockopt > > > syscall and will be not aware that fd was successfully installed into fdtable. > > > > > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > > > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ > > > > I read some of the emails in there but I don't get why the fd leak is > > special here. I mean, I get that it leaks, but masking the error > > return like this can lead to several other problems in the application > > as well. > > > > For example, SCTP_SOCKOPT_CONNECTX3 will trigger a connect(). If it > > failed, and the hook returns success, the user app will at least log a > > wrong "connection successful". > > > > If the hook can't be responsible for cleaning up before returning a > > different value, then maybe we want to extend the list of sockopts in > > here. AFAICT these would be the 3 most critical sockopts. > > > > Dear Marcelo, Hello! > > Thanks for pointing this out. Initially this problem was discovered by > Christian Brauner and for SO_PEERPIDFD (a new SOL_SOCKET option that > we want to add), > after this I decided to check if we do fd_install in any other socket > options in the kernel and found that we have 2 cases in SCTP. It was > an accidental finding. :) > > So, this patch isn't specific to fd_install things and probably we > should filter out bpf hook from being called for other socket options > as well. Understood. > > So, I need to filter out SCTP_SOCKOPT_CONNECTX3 and > SCTP_SOCKOPT_PEELOFF* for SCTP, right? Gotta say, it seems weird that it will filter out some of the most important sockopts. But I'm not acquainted to bpf hooks so I won't question (much? :) ) that. Considering that filtering is needed, then yes, on getsock those are ones I'm seeing that needs filtering. Otherwise they will either trigger leakage or will confuse the application. Should we care about setsock as well? We have SCTP_SOCKOPT_CONNECTX and SCTP_SOCKOPT_CONNECTX_OLD in there, and well, I guess any of those would misbehave if they failed and yet the hook returns success. Thanks, Marcelo
On Wed, May 10, 2023 at 8:18 AM Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> wrote: > > On Wed, May 10, 2023 at 04:55:37PM +0200, Aleksandr Mikhalitsyn wrote: > > On Wed, May 10, 2023 at 4:39 PM Marcelo Ricardo Leitner > > <marcelo.leitner@gmail.com> wrote: > > > > > > On Wed, May 10, 2023 at 03:15:27PM +0200, Alexander Mikhalitsyn wrote: > > > > Add bpf_bypass_getsockopt proto callback and filter out > > > > SCTP_SOCKOPT_PEELOFF and SCTP_SOCKOPT_PEELOFF_FLAGS socket options > > > > from running eBPF hook on them. > > > > > > > > These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT > > > > hook returns an error after success of the original handler > > > > sctp_getsockopt(...), userspace will receive an error from getsockopt > > > > syscall and will be not aware that fd was successfully installed into fdtable. > > > > > > > > This patch was born as a result of discussion around a new SCM_PIDFD interface: > > > > https://lore.kernel.org/all/20230413133355.350571-3-aleksandr.mikhalitsyn@canonical.com/ > > > > > > I read some of the emails in there but I don't get why the fd leak is > > > special here. I mean, I get that it leaks, but masking the error > > > return like this can lead to several other problems in the application > > > as well. > > > > > > For example, SCTP_SOCKOPT_CONNECTX3 will trigger a connect(). If it > > > failed, and the hook returns success, the user app will at least log a > > > wrong "connection successful". > > > > > > If the hook can't be responsible for cleaning up before returning a > > > different value, then maybe we want to extend the list of sockopts in > > > here. AFAICT these would be the 3 most critical sockopts. > > > > > > > Dear Marcelo, > > Hello! > > > > > Thanks for pointing this out. Initially this problem was discovered by > > Christian Brauner and for SO_PEERPIDFD (a new SOL_SOCKET option that > > we want to add), > > after this I decided to check if we do fd_install in any other socket > > options in the kernel and found that we have 2 cases in SCTP. It was > > an accidental finding. :) > > > > So, this patch isn't specific to fd_install things and probably we > > should filter out bpf hook from being called for other socket options > > as well. > > Understood. > > > > > So, I need to filter out SCTP_SOCKOPT_CONNECTX3 and > > SCTP_SOCKOPT_PEELOFF* for SCTP, right? > > Gotta say, it seems weird that it will filter out some of the most > important sockopts. But I'm not acquainted to bpf hooks so I won't > question (much? :) ) that. Thanks for raising this. Alexander, maybe you can respin your v2 to include these as well? > Considering that filtering is needed, then yes, on getsock those are > ones I'm seeing that needs filtering. Otherwise they will either > trigger leakage or will confuse the application. [..] > Should we care about setsock as well? We have SCTP_SOCKOPT_CONNECTX > and SCTP_SOCKOPT_CONNECTX_OLD in there, and well, I guess any of those > would misbehave if they failed and yet the hook returns success. For setsockopt, the bpf program runs before the kernel, so setsockopt shouldn't have those issues we're observing with getsockopt (which runs after the kernel and has an option to ignore kernel value). > Thanks, > Marcelo
diff --git a/net/sctp/socket.c b/net/sctp/socket.c index cda8c2874691..a9a0ababea90 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -8281,6 +8281,29 @@ static int sctp_getsockopt(struct sock *sk, int level, int optname, return retval; } +bool sctp_bpf_bypass_getsockopt(int level, int optname) +{ + /* + * These options do fd_install(), and if BPF_CGROUP_RUN_PROG_GETSOCKOPT + * hook returns an error after success of the original handler + * sctp_getsockopt(...), userspace will receive an error from getsockopt + * syscall and will be not aware that fd was successfully installed into fdtable. + * + * Let's prevent bpf cgroup hook from running on them. + */ + if (level == SOL_SCTP) { + switch (optname) { + case SCTP_SOCKOPT_PEELOFF: + case SCTP_SOCKOPT_PEELOFF_FLAGS: + return true; + default: + return false; + } + } + + return false; +} + static int sctp_hash(struct sock *sk) { /* STUB */ @@ -9650,6 +9673,7 @@ struct proto sctp_prot = { .shutdown = sctp_shutdown, .setsockopt = sctp_setsockopt, .getsockopt = sctp_getsockopt, + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, .sendmsg = sctp_sendmsg, .recvmsg = sctp_recvmsg, .bind = sctp_bind, @@ -9705,6 +9729,7 @@ struct proto sctpv6_prot = { .shutdown = sctp_shutdown, .setsockopt = sctp_setsockopt, .getsockopt = sctp_getsockopt, + .bpf_bypass_getsockopt = sctp_bpf_bypass_getsockopt, .sendmsg = sctp_sendmsg, .recvmsg = sctp_recvmsg, .bind = sctp_bind,