From patchwork Wed Nov 1 18:01:31 2023
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 160628
Date: Wed, 01 Nov 2023 18:01:31 +0000
In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
References: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
Message-ID: <20231101-rust-binder-v1-1-08ba9197f637@google.com>
Subject: [PATCH RFC 01/20] rust_binder: define a Rust binder driver
From: Alice Ryhl
To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos, Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas, Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho
Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg, Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

From: Wedson Almeida Filho

Define the Rust binder driver, and set up the helpers for making C types accessible from Rust.

Signed-off-by: Wedson Almeida Filho
Co-developed-by: Alice Ryhl
Signed-off-by: Alice Ryhl
---
drivers/android/Kconfig | 11 +++++++++++ drivers/android/Makefile | 1 + drivers/android/rust_binder.rs | 21 +++++++++++++++++++++ include/uapi/linux/android/binder.h | 30 ++++++++++++++++-------------- rust/bindings/bindings_helper.h | 1 + 5 files changed, 50 insertions(+), 14 deletions(-) diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig index 07aa8ae0a058..fcfd25c9a016 100644 --- a/drivers/android/Kconfig +++ b/drivers/android/Kconfig @@ -13,6 +13,17 @@ config ANDROID_BINDER_IPC Android process, using Binder to identify, invoke and pass arguments between said processes. +config ANDROID_BINDER_IPC_RUST + bool "Android Binder IPC Driver in Rust" + depends on MMU && RUST + help + Binder is used in Android for both communication between processes, + and remote method invocation. + + This means one Android process can call a method/routine in another + Android process, using Binder to identify, invoke and pass arguments + between said processes. + config ANDROID_BINDERFS bool "Android Binderfs filesystem" depends on ANDROID_BINDER_IPC diff --git a/drivers/android/Makefile b/drivers/android/Makefile index c9d3d0c99c25..6348f75832ca 100644 --- a/drivers/android/Makefile +++ b/drivers/android/Makefile @@ -4,3 +4,4 @@ ccflags-y += -I$(src) # needed for trace events obj-$(CONFIG_ANDROID_BINDERFS) += binderfs.o obj-$(CONFIG_ANDROID_BINDER_IPC) += binder.o binder_alloc.o obj-$(CONFIG_ANDROID_BINDER_IPC_SELFTEST) += binder_alloc_selftest.o +obj-$(CONFIG_ANDROID_BINDER_IPC_RUST) += rust_binder.o diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs new file mode 100644 index 000000000000..4b3d6676a9cf --- /dev/null +++ b/drivers/android/rust_binder.rs @@ -0,0 +1,21 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Binder -- the Android IPC mechanism. + +use kernel::prelude::*; + +module!
{ + type: BinderModule, + name: "rust_binder", + author: "Wedson Almeida Filho, Alice Ryhl", + description: "Android Binder", + license: "GPL", +} + +struct BinderModule {} + +impl kernel::Module for BinderModule { + fn init(_module: &'static kernel::ThisModule) -> Result { + Ok(Self {}) + } +} diff --git a/include/uapi/linux/android/binder.h b/include/uapi/linux/android/binder.h index 5f636b5afcd7..d44a8118b2ed 100644 --- a/include/uapi/linux/android/binder.h +++ b/include/uapi/linux/android/binder.h @@ -251,20 +251,22 @@ struct binder_extended_error { __s32 param; }; -#define BINDER_WRITE_READ _IOWR('b', 1, struct binder_write_read) -#define BINDER_SET_IDLE_TIMEOUT _IOW('b', 3, __s64) -#define BINDER_SET_MAX_THREADS _IOW('b', 5, __u32) -#define BINDER_SET_IDLE_PRIORITY _IOW('b', 6, __s32) -#define BINDER_SET_CONTEXT_MGR _IOW('b', 7, __s32) -#define BINDER_THREAD_EXIT _IOW('b', 8, __s32) -#define BINDER_VERSION _IOWR('b', 9, struct binder_version) -#define BINDER_GET_NODE_DEBUG_INFO _IOWR('b', 11, struct binder_node_debug_info) -#define BINDER_GET_NODE_INFO_FOR_REF _IOWR('b', 12, struct binder_node_info_for_ref) -#define BINDER_SET_CONTEXT_MGR_EXT _IOW('b', 13, struct flat_binder_object) -#define BINDER_FREEZE _IOW('b', 14, struct binder_freeze_info) -#define BINDER_GET_FROZEN_INFO _IOWR('b', 15, struct binder_frozen_status_info) -#define BINDER_ENABLE_ONEWAY_SPAM_DETECTION _IOW('b', 16, __u32) -#define BINDER_GET_EXTENDED_ERROR _IOWR('b', 17, struct binder_extended_error) +enum { + BINDER_WRITE_READ = _IOWR('b', 1, struct binder_write_read), + BINDER_SET_IDLE_TIMEOUT = _IOW('b', 3, __s64), + BINDER_SET_MAX_THREADS = _IOW('b', 5, __u32), + BINDER_SET_IDLE_PRIORITY = _IOW('b', 6, __s32), + BINDER_SET_CONTEXT_MGR = _IOW('b', 7, __s32), + BINDER_THREAD_EXIT = _IOW('b', 8, __s32), + BINDER_VERSION = _IOWR('b', 9, struct binder_version), + BINDER_GET_NODE_DEBUG_INFO = _IOWR('b', 11, struct binder_node_debug_info), + BINDER_GET_NODE_INFO_FOR_REF = _IOWR('b', 12, struct binder_node_info_for_ref), + BINDER_SET_CONTEXT_MGR_EXT = _IOW('b', 13, struct flat_binder_object), + BINDER_FREEZE = _IOW('b', 14, struct binder_freeze_info), + BINDER_GET_FROZEN_INFO = _IOWR('b', 15, struct binder_frozen_status_info), + BINDER_ENABLE_ONEWAY_SPAM_DETECTION = _IOW('b', 16, __u32), + BINDER_GET_EXTENDED_ERROR = _IOWR('b', 17, struct binder_extended_error), +}; /* * NOTE: Two special error codes you should check for when calling diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h index 14f84aeef62d..00a66666f00a 100644 --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -21,6 +21,7 @@ #include #include #include +#include /* `bindgen` gets confused at certain things. 
*/ const size_t BINDINGS_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN;
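A note on the binder.h change above: bindgen does not expand function-like macros such as _IOWR(), so ioctl numbers defined via #define would not show up in the generated `bindings` crate, which appears to be why this patch redefines them as enumerators. Once include/uapi/linux/android/binder.h is pulled into bindings_helper.h, the ioctl numbers become plain Rust constants. The following is a minimal sketch (not part of the patch; the function name is hypothetical, and it assumes the kernel crate's prelude and generated bindings) of how a handler can dispatch on them; patch 02's Process::read_write does essentially this:

    use kernel::bindings;
    use kernel::prelude::*;

    /// Hypothetical ioctl dispatcher: `cmd` is the raw ioctl number from userspace.
    fn handle_ioctl(cmd: u32) -> Result<i32> {
        match cmd {
            // Constant generated by bindgen from
            // `BINDER_VERSION = _IOWR('b', 9, struct binder_version)`.
            bindings::BINDER_VERSION => Ok(0),
            // Everything else is rejected until later patches implement it.
            _ => Err(EINVAL),
        }
    }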
From patchwork Wed Nov 1 18:01:32 2023
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 160632
Date: Wed, 01 Nov 2023 18:01:32 +0000
In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
References: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
Message-ID: <20231101-rust-binder-v1-2-08ba9197f637@google.com>
Subject: [PATCH RFC 02/20] rust_binder: add binderfs support to Rust binder
From: Alice Ryhl
To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos, Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas, Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho
Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg, Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

Add support for accessing the Rust binder driver via binderfs. The actual binderfs implementation is done entirely in C, and the `rust_binderfs.c` file is a modified version of `binderfs.c` that is adjusted to call into the Rust binder driver rather than the C driver.

We have left the binderfs filesystem component in C. Rewriting it in Rust would be a large amount of work and would require a lot of bindings to the file system interfaces. Binderfs has not historically had the same challenges with security and complexity as the rest of binder, so rewriting binderfs seems to have lower value than rewriting the rest of the driver.

We also add code on the Rust side for binderfs to call into. Most of this is left as stub implementations, with the exception of closing the file descriptor and the BINDER_VERSION ioctl.

Co-developed-by: Wedson Almeida Filho
Signed-off-by: Wedson Almeida Filho
Signed-off-by: Alice Ryhl
---
drivers/android/Kconfig | 24 ++ drivers/android/Makefile | 1 + drivers/android/context.rs | 144 +++++++ drivers/android/defs.rs | 39 ++ drivers/android/process.rs | 251 ++++++++++++ drivers/android/rust_binder.rs | 196 ++++++++- drivers/android/rust_binderfs.c | 866 ++++++++++++++++++++++++++++++++++++++++ include/linux/rust_binder.h | 16 + include/uapi/linux/magic.h | 1 + rust/bindings/bindings_helper.h | 2 + rust/kernel/lib.rs | 7 + scripts/Makefile.build | 2 +- 12 files changed, 1547 insertions(+), 2 deletions(-) diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig index fcfd25c9a016..82ed6ddabe1a 100644 --- a/drivers/android/Kconfig +++ b/drivers/android/Kconfig @@ -36,6 +36,18 @@ config ANDROID_BINDERFS It can be used to dynamically allocate new binder IPC devices via ioctls. +config ANDROID_BINDERFS_RUST + bool "Android Binderfs filesystem in Rust" + depends on ANDROID_BINDER_IPC_RUST + default n + help + Binderfs is a pseudo-filesystem for the Android Binder IPC driver + which can be mounted per-ipc namespace allowing to run multiple + instances of Android. + Each binderfs mount initially only contains a binder-control device. + It can be used to dynamically allocate new binder IPC devices via + ioctls.
+ config ANDROID_BINDER_DEVICES string "Android Binder devices" depends on ANDROID_BINDER_IPC @@ -48,6 +60,18 @@ config ANDROID_BINDER_DEVICES created. Each binder device has its own context manager, and is therefore logically separated from the other devices. +config ANDROID_BINDER_DEVICES_RUST + string "Android Binder devices in Rust" + depends on ANDROID_BINDER_IPC_RUST + default "binder,hwbinder,vndbinder" + help + Default value for the binder.devices parameter. + + The binder.devices parameter is a comma-separated list of strings + that specifies the names of the binder device nodes that will be + created. Each binder device has its own context manager, and is + therefore logically separated from the other devices. + config ANDROID_BINDER_IPC_SELFTEST bool "Android Binder IPC Driver Selftest" depends on ANDROID_BINDER_IPC diff --git a/drivers/android/Makefile b/drivers/android/Makefile index 6348f75832ca..5c819011aa77 100644 --- a/drivers/android/Makefile +++ b/drivers/android/Makefile @@ -5,3 +5,4 @@ obj-$(CONFIG_ANDROID_BINDERFS) += binderfs.o obj-$(CONFIG_ANDROID_BINDER_IPC) += binder.o binder_alloc.o obj-$(CONFIG_ANDROID_BINDER_IPC_SELFTEST) += binder_alloc_selftest.o obj-$(CONFIG_ANDROID_BINDER_IPC_RUST) += rust_binder.o +obj-$(CONFIG_ANDROID_BINDERFS_RUST) += rust_binderfs.o diff --git a/drivers/android/context.rs b/drivers/android/context.rs new file mode 100644 index 000000000000..630cb575d3ac --- /dev/null +++ b/drivers/android/context.rs @@ -0,0 +1,144 @@ +// SPDX-License-Identifier: GPL-2.0 + +use kernel::{ + list::{HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks}, + prelude::*, + str::{CStr, CString}, + sync::{Arc, Mutex}, +}; + +use crate::process::Process; + +// This module defines the global variable containing the list of contexts. Since the +// `kernel::sync` bindings currently don't support mutexes in globals, we use a temporary +// workaround. +// +// TODO: Once `kernel::sync` has support for mutexes in globals, remove this module. +mod context_global { + use super::ContextList; + use core::cell::UnsafeCell; + use core::mem::MaybeUninit; + use kernel::init::PinInit; + use kernel::list::List; + use kernel::sync::lock::mutex::{Mutex, MutexBackend}; + use kernel::sync::lock::Guard; + + /// A temporary wrapper used to define a mutex in a global. + pub(crate) struct Contexts { + inner: UnsafeCell>>, + } + + impl Contexts { + /// Called when the module is initialized. + pub(crate) fn init(&self) { + // SAFETY: This is only called during initialization of the binder module, so we know + // that the global is currently uninitialized and that nobody else is using it yet. + unsafe { + let ptr = self.inner.get() as *mut Mutex; + let init = kernel::new_mutex!(ContextList { list: List::new() }, "ContextList"); + match init.__pinned_init(ptr) { + Ok(()) => {} + Err(e) => match e {}, + } + } + } + + pub(crate) fn lock(&self) -> Guard<'_, ContextList, MutexBackend> { + // SAFETY: The `init` method is called during initialization of the binder module, so the + // mutex is always initialized when this method is called. 
+ unsafe { + let ptr = self.inner.get() as *const Mutex; + (*ptr).lock() + } + } + } + + unsafe impl Send for Contexts {} + unsafe impl Sync for Contexts {} + + pub(crate) static CONTEXTS: Contexts = Contexts { + inner: UnsafeCell::new(MaybeUninit::uninit()), + }; +} + +pub(crate) use self::context_global::CONTEXTS; + +pub(crate) struct ContextList { + list: List, +} + +/// This struct keeps track of the processes using this context, and which process is the context +/// manager. +struct Manager { + all_procs: List, +} + +/// There is one context per binder file (/dev/binder, /dev/hwbinder, etc) +#[pin_data] +pub(crate) struct Context { + #[pin] + manager: Mutex, + pub(crate) name: CString, + #[pin] + links: ListLinks, +} + +kernel::list::impl_has_list_links! { + impl HasListLinks<0> for Context { self.links } +} +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<0> for Context { untracked; } +} +kernel::list::impl_list_item! { + impl ListItem<0> for Context { + using ListLinks; + } +} + +impl Context { + pub(crate) fn new(name: &CStr) -> Result> { + let name = CString::try_from(name)?; + let list_ctx = ListArc::pin_init(pin_init!(Context { + name, + links <- ListLinks::new(), + manager <- kernel::new_mutex!(Manager { + all_procs: List::new(), + }, "Context::manager"), + }))?; + + let ctx = list_ctx.clone_arc(); + CONTEXTS.lock().list.push_back(list_ctx); + + Ok(ctx) + } + + /// Called when the file for this context is unlinked. + /// + /// No-op if called twice. + pub(crate) fn deregister(&self) { + // SAFETY: We never add the context to any other linked list than this one, so it is either + // in this list, or not in any list. + unsafe { + CONTEXTS.lock().list.remove(self); + } + } + + pub(crate) fn register_process(self: &Arc, proc: ListArc) { + if !Arc::ptr_eq(self, &proc.ctx) { + pr_err!("Context::register_process called on the wrong context."); + return; + } + self.manager.lock().all_procs.push_back(proc); + } + + pub(crate) fn deregister_process(self: &Arc, proc: &Process) { + if !Arc::ptr_eq(self, &proc.ctx) { + pr_err!("Context::deregister_process called on the wrong context."); + return; + } + // SAFETY: We just checked that this is the right list. + unsafe { + self.manager.lock().all_procs.remove(proc); + } + } +} diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs new file mode 100644 index 000000000000..8fdcb856ccad --- /dev/null +++ b/drivers/android/defs.rs @@ -0,0 +1,39 @@ +// SPDX-License-Identifier: GPL-2.0 + +use core::ops::{Deref, DerefMut}; +use kernel::{ + bindings, + io_buffer::{ReadableFromBytes, WritableToBytes}, +}; + +macro_rules! decl_wrapper { + ($newname:ident, $wrapped:ty) => { + #[derive(Copy, Clone, Default)] + #[repr(transparent)] + pub(crate) struct $newname($wrapped); + // SAFETY: This macro is only used with types where this is ok. 
+ unsafe impl ReadableFromBytes for $newname {} + unsafe impl WritableToBytes for $newname {} + impl Deref for $newname { + type Target = $wrapped; + fn deref(&self) -> &Self::Target { + &self.0 + } + } + impl DerefMut for $newname { + fn deref_mut(&mut self) -> &mut Self::Target { + &mut self.0 + } + } + }; +} + +decl_wrapper!(BinderVersion, bindings::binder_version); + +impl BinderVersion { + pub(crate) fn current() -> Self { + Self(bindings::binder_version { + protocol_version: bindings::BINDER_CURRENT_PROTOCOL_VERSION as _, + }) + } +} diff --git a/drivers/android/process.rs b/drivers/android/process.rs new file mode 100644 index 000000000000..2f16e4cedbf1 --- /dev/null +++ b/drivers/android/process.rs @@ -0,0 +1,251 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! This module defines the `Process` type, which represents a process using a particular binder +//! context. +//! +//! The `Process` object keeps track of all of the resources that this process owns in the binder +//! context. +//! +//! There is one `Process` object for each binder fd that a process has opened, so processes using +//! several binder contexts have several `Process` objects. This ensures that the contexts are +//! fully separated. + +use kernel::{ + bindings, + cred::Credential, + file::{File, PollTable}, + io_buffer::IoBufferWriter, + list::{HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks}, + mm, + prelude::*, + sync::{Arc, ArcBorrow, SpinLock}, + task::Task, + types::ARef, + user_ptr::{UserSlicePtr, UserSlicePtrReader}, + workqueue::{self, Work}, +}; + +use crate::{context::Context, defs::*}; + +const PROC_DEFER_FLUSH: u8 = 1; +const PROC_DEFER_RELEASE: u8 = 2; + +/// The fields of `Process` protected by the spinlock. +pub(crate) struct ProcessInner { + is_dead: bool, + + /// Bitmap of deferred work to do. + defer_work: u8, +} + +impl ProcessInner { + fn new() -> Self { + Self { + is_dead: false, + defer_work: 0, + } + } +} + +/// A process using binder. +/// +/// Strictly speaking, there can be multiple of these per process. There is one for each binder fd +/// that a process has opened, so processes using several binder contexts have several `Process` +/// objects. This ensures that the contexts are fully separated. +#[pin_data] +pub(crate) struct Process { + pub(crate) ctx: Arc, + + // The task leader (process). + pub(crate) task: ARef, + + // Credential associated with file when `Process` is created. + pub(crate) cred: ARef, + + #[pin] + pub(crate) inner: SpinLock, + + // Work node for deferred work item. + #[pin] + defer_work: Work, + + // Links for process list in Context. + #[pin] + links: ListLinks, +} + +kernel::impl_has_work! { + impl HasWork for Process { self.defer_work } +} + +kernel::list::impl_has_list_links! { + impl HasListLinks<0> for Process { self.links } +} +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<0> for Process { untracked; } +} +kernel::list::impl_list_item! 
{ + impl ListItem<0> for Process { + using ListLinks; + } +} + +impl workqueue::WorkItem for Process { + type Pointer = Arc; + + fn run(me: Arc) { + let defer; + { + let mut inner = me.inner.lock(); + defer = inner.defer_work; + inner.defer_work = 0; + } + + if defer & PROC_DEFER_FLUSH != 0 { + me.deferred_flush(); + } + if defer & PROC_DEFER_RELEASE != 0 { + me.deferred_release(); + } + } +} + +impl Process { + fn new(ctx: Arc, cred: ARef) -> Result> { + let list_process = ListArc::pin_init(pin_init!(Process { + ctx, + cred, + inner <- kernel::new_spinlock!(ProcessInner::new(), "Process::inner"), + task: kernel::current!().group_leader().into(), + defer_work <- kernel::new_work!("Process::defer_work"), + links <- ListLinks::new(), + }))?; + + let process = list_process.clone_arc(); + process.ctx.register_process(list_process); + + Ok(process) + } + + fn version(&self, data: UserSlicePtr) -> Result { + data.writer().write(&BinderVersion::current()) + } + + fn deferred_flush(&self) { + // NOOP for now. + } + + fn deferred_release(self: Arc) { + self.inner.lock().is_dead = true; + + self.ctx.deregister_process(&self); + } + + pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result { + let should_schedule; + { + let mut inner = this.inner.lock(); + should_schedule = inner.defer_work == 0; + inner.defer_work |= PROC_DEFER_FLUSH; + } + + if should_schedule { + // Ignore failures to schedule to the workqueue. Those just mean that we're already + // scheduled for execution. + let _ = workqueue::system().enqueue(Arc::from(this)); + } + Ok(()) + } +} + +/// The ioctl handler. +impl Process { + fn write( + _this: ArcBorrow<'_, Process>, + _file: &File, + _cmd: u32, + _reader: &mut UserSlicePtrReader, + ) -> Result { + Err(EINVAL) + } + + fn read_write( + this: ArcBorrow<'_, Process>, + _file: &File, + cmd: u32, + data: UserSlicePtr, + ) -> Result { + match cmd { + bindings::BINDER_VERSION => this.version(data)?, + _ => return Err(EINVAL), + } + Ok(0) + } +} + +/// The file operations supported by `Process`. +impl Process { + pub(crate) fn open(ctx: ArcBorrow<'_, Context>, file: &File) -> Result> { + Self::new(ctx.into(), ARef::from(file.cred())) + } + + pub(crate) fn release(this: Arc, _file: &File) { + let should_schedule; + { + let mut inner = this.inner.lock(); + should_schedule = inner.defer_work == 0; + inner.defer_work |= PROC_DEFER_RELEASE; + } + + if should_schedule { + // Ignore failures to schedule to the workqueue. Those just mean that we're already + // scheduled for execution. 
+ let _ = workqueue::system().enqueue(this); + } + } + + pub(crate) fn ioctl( + this: ArcBorrow<'_, Process>, + file: &File, + cmd: u32, + arg: *mut core::ffi::c_void, + ) -> Result { + use kernel::ioctl::{_IOC_DIR, _IOC_SIZE}; + use kernel::uapi::{_IOC_READ, _IOC_WRITE}; + + let user_slice = UserSlicePtr::new(arg, _IOC_SIZE(cmd)); + + const _IOC_READ_WRITE: u32 = _IOC_READ | _IOC_WRITE; + + match _IOC_DIR(cmd) { + _IOC_WRITE => Self::write(this, file, cmd, &mut user_slice.reader()), + _IOC_READ_WRITE => Self::read_write(this, file, cmd, user_slice), + _ => Err(EINVAL), + } + } + + pub(crate) fn compat_ioctl( + this: ArcBorrow<'_, Process>, + file: &File, + cmd: u32, + arg: *mut core::ffi::c_void, + ) -> Result { + Self::ioctl(this, file, cmd, arg) + } + + pub(crate) fn mmap( + _this: ArcBorrow<'_, Process>, + _file: &File, + _vma: &mut mm::virt::Area, + ) -> Result { + Err(EINVAL) + } + + pub(crate) fn poll( + _this: ArcBorrow<'_, Process>, + _file: &File, + _table: &mut PollTable, + ) -> Result { + Err(EINVAL) + } +} diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs index 4b3d6676a9cf..6de2f40846fb 100644 --- a/drivers/android/rust_binder.rs +++ b/drivers/android/rust_binder.rs @@ -2,7 +2,19 @@ //! Binder -- the Android IPC mechanism. -use kernel::prelude::*; +use kernel::{ + bindings::{self, seq_file}, + file::{File, PollTable}, + prelude::*, + sync::Arc, + types::ForeignOwnable, +}; + +use crate::{context::Context, process::Process}; + +mod context; +mod defs; +mod process; module! { type: BinderModule, @@ -16,6 +28,188 @@ struct BinderModule {} impl kernel::Module for BinderModule { fn init(_module: &'static kernel::ThisModule) -> Result { + crate::context::CONTEXTS.init(); + + // SAFETY: The module is being loaded, so we can initialize binderfs. + #[cfg(CONFIG_ANDROID_BINDERFS_RUST)] + unsafe { + kernel::error::to_result(bindings::init_rust_binderfs())?; + } + Ok(Self {}) } } + +/// Makes the inner type Sync. +#[repr(transparent)] +pub struct AssertSync(T); +// SAFETY: Used only to insert `file_operations` into a global, which is safe. +unsafe impl Sync for AssertSync {} + +/// File operations that rust_binderfs.c can use. +#[no_mangle] +#[used] +pub static rust_binder_fops: AssertSync = { + // SAFETY: All zeroes is safe for the `file_operations` type. + let zeroed_ops = unsafe { core::mem::MaybeUninit::zeroed().assume_init() }; + + let ops = kernel::bindings::file_operations { + owner: THIS_MODULE.as_ptr(), + poll: Some(rust_binder_poll), + unlocked_ioctl: Some(rust_binder_unlocked_ioctl), + compat_ioctl: Some(rust_binder_compat_ioctl), + mmap: Some(rust_binder_mmap), + open: Some(rust_binder_open), + release: Some(rust_binder_release), + mmap_supported_flags: 0, + flush: Some(rust_binder_flush), + ..zeroed_ops + }; + AssertSync(ops) +}; + +#[no_mangle] +unsafe extern "C" fn rust_binder_new_device( + name: *const core::ffi::c_char, +) -> *mut core::ffi::c_void { + // SAFETY: The caller will always provide a valid c string here. + let name = unsafe { kernel::str::CStr::from_char_ptr(name) }; + match Context::new(name) { + Ok(ctx) => Arc::into_foreign(ctx).cast_mut(), + Err(_err) => core::ptr::null_mut(), + } +} + +#[no_mangle] +unsafe extern "C" fn rust_binder_remove_device(device: *mut core::ffi::c_void) { + if !device.is_null() { + // SAFETY: The caller ensures that the `device` pointer came from a previous call to + // `rust_binder_new_device`. 
+ let ctx = unsafe { Arc::::from_foreign(device) }; + ctx.deregister(); + drop(ctx); + } +} + +unsafe extern "C" fn rust_binder_open( + inode: *mut bindings::inode, + file_ptr: *mut bindings::file, +) -> core::ffi::c_int { + // SAFETY: The `rust_binderfs.c` file ensures that `i_private` is set to the return value of a + // successful call to `rust_binder_new_device`. + let ctx = unsafe { Arc::::borrow((*inode).i_private) }; + + // SAFETY: The caller provides a valid file pointer to a new `struct file`. + let file = unsafe { File::from_ptr(file_ptr) }; + let process = match Process::open(ctx, file) { + Ok(process) => process, + Err(err) => return err.to_errno(), + }; + // SAFETY: This file is associated with Rust binder, so we own the `private_data` field. + unsafe { + (*file_ptr).private_data = process.into_foreign().cast_mut(); + } + 0 +} + +unsafe extern "C" fn rust_binder_release( + _inode: *mut bindings::inode, + file: *mut bindings::file, +) -> core::ffi::c_int { + // SAFETY: We previously set `private_data` in `rust_binder_open`. + let process = unsafe { Arc::::from_foreign((*file).private_data) }; + // SAFETY: The caller ensures that the file is valid. + let file = unsafe { File::from_ptr(file) }; + Process::release(process, file); + 0 +} + +unsafe extern "C" fn rust_binder_compat_ioctl( + file: *mut bindings::file, + cmd: core::ffi::c_uint, + arg: core::ffi::c_ulong, +) -> core::ffi::c_long { + // SAFETY: We previously set `private_data` in `rust_binder_open`. + let f = unsafe { Arc::::borrow((*file).private_data) }; + // SAFETY: The caller ensures that the file is valid. + match Process::compat_ioctl(f, unsafe { File::from_ptr(file) }, cmd as _, arg as _) { + Ok(ret) => ret.into(), + Err(err) => err.to_errno().into(), + } +} + +unsafe extern "C" fn rust_binder_unlocked_ioctl( + file: *mut bindings::file, + cmd: core::ffi::c_uint, + arg: core::ffi::c_ulong, +) -> core::ffi::c_long { + // SAFETY: We previously set `private_data` in `rust_binder_open`. + let f = unsafe { Arc::::borrow((*file).private_data) }; + // SAFETY: The caller ensures that the file is valid. + match Process::ioctl(f, unsafe { File::from_ptr(file) }, cmd as _, arg as _) { + Ok(ret) => ret.into(), + Err(err) => err.to_errno().into(), + } +} + +unsafe extern "C" fn rust_binder_mmap( + file: *mut bindings::file, + vma: *mut bindings::vm_area_struct, +) -> core::ffi::c_int { + // SAFETY: We previously set `private_data` in `rust_binder_open`. + let f = unsafe { Arc::::borrow((*file).private_data) }; + // SAFETY: The caller ensures that the vma is valid. + let area = unsafe { kernel::mm::virt::Area::from_ptr_mut(vma) }; + // SAFETY: The caller ensures that the file is valid. + match Process::mmap(f, unsafe { File::from_ptr(file) }, area) { + Ok(()) => 0, + Err(err) => err.to_errno(), + } +} + +unsafe extern "C" fn rust_binder_poll( + file: *mut bindings::file, + wait: *mut bindings::poll_table_struct, +) -> bindings::__poll_t { + // SAFETY: We previously set `private_data` in `rust_binder_open`. + let f = unsafe { Arc::::borrow((*file).private_data) }; + // SAFETY: The caller ensures that the file is valid. + let fileref = unsafe { File::from_ptr(file) }; + // SAFETY: The caller ensures that the `PollTable` is valid. 
+ match Process::poll(f, fileref, unsafe { PollTable::from_ptr(wait) }) { + Ok(v) => v, + Err(_) => bindings::POLLERR, + } +} + +unsafe extern "C" fn rust_binder_flush( + file: *mut bindings::file, + _id: bindings::fl_owner_t, +) -> core::ffi::c_int { + // SAFETY: We previously set `private_data` in `rust_binder_open`. + let f = unsafe { Arc::::borrow((*file).private_data) }; + match Process::flush(f) { + Ok(()) => 0, + Err(err) => err.to_errno(), + } +} + +#[no_mangle] +unsafe extern "C" fn rust_binder_stats_show(_: *mut seq_file) -> core::ffi::c_int { + 0 +} + +#[no_mangle] +unsafe extern "C" fn rust_binder_state_show(_: *mut seq_file) -> core::ffi::c_int { + 0 +} + +#[no_mangle] +unsafe extern "C" fn rust_binder_transactions_show(_: *mut seq_file) -> core::ffi::c_int { + 0 +} + +#[no_mangle] +unsafe extern "C" fn rust_binder_transaction_log_show(_: *mut seq_file) -> core::ffi::c_int { + 0 +} diff --git a/drivers/android/rust_binderfs.c b/drivers/android/rust_binderfs.c new file mode 100644 index 000000000000..2c011e26752c --- /dev/null +++ b/drivers/android/rust_binderfs.c @@ -0,0 +1,866 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "binder_internal.h" + +#define FIRST_INODE 1 +#define SECOND_INODE 2 +#define INODE_OFFSET 3 +#define BINDERFS_MAX_MINOR (1U << MINORBITS) +/* Ensure that the initial ipc namespace always has devices available. */ +#define BINDERFS_MAX_MINOR_CAPPED (BINDERFS_MAX_MINOR - 4) + +/* === DEFINED IN RUST === */ +extern int rust_binder_stats_show(struct seq_file *m, void *unused); +DEFINE_SHOW_ATTRIBUTE(rust_binder_stats); + +extern int rust_binder_state_show(struct seq_file *m, void *unused); +DEFINE_SHOW_ATTRIBUTE(rust_binder_state); + +extern int rust_binder_transactions_show(struct seq_file *m, void *unused); +DEFINE_SHOW_ATTRIBUTE(rust_binder_transactions); + +extern int rust_binder_transaction_log_show(struct seq_file *m, void *unused); +DEFINE_SHOW_ATTRIBUTE(rust_binder_transaction_log); + +extern const struct file_operations rust_binder_fops; +extern rust_binder_device rust_binder_new_device(char *name); +extern void rust_binder_remove_device(rust_binder_device device); +/* === END DEFINED IN RUST === */ + +char *rust_binder_devices_param = CONFIG_ANDROID_BINDER_DEVICES_RUST; +module_param_named(rust_devices, rust_binder_devices_param, charp, 0444); + +static dev_t binderfs_dev; +static DEFINE_MUTEX(binderfs_minors_mutex); +static DEFINE_IDA(binderfs_minors); + +enum binderfs_param { + Opt_max, + Opt_stats_mode, +}; + +enum binderfs_stats_mode { + binderfs_stats_mode_unset, + binderfs_stats_mode_global, +}; + +struct binder_features { + bool oneway_spam_detection; + bool extended_error; +}; + +static const struct constant_table binderfs_param_stats[] = { + { "global", binderfs_stats_mode_global }, + {} +}; + +static const struct fs_parameter_spec binderfs_fs_parameters[] = { + fsparam_u32("max", Opt_max), + fsparam_enum("stats", Opt_stats_mode, binderfs_param_stats), + {} +}; + +static struct binder_features binder_features = { + .oneway_spam_detection = true, + .extended_error = true, +}; + +static inline struct binderfs_info *BINDERFS_SB(const struct super_block *sb) +{ + return sb->s_fs_info; +} + +bool 
is_rust_binderfs_device(const struct inode *inode) +{ + if (inode->i_sb->s_magic == RUST_BINDERFS_SUPER_MAGIC) + return true; + + return false; +} + +/** + * binderfs_binder_device_create - allocate inode from super block of a + * binderfs mount + * @ref_inode: inode from wich the super block will be taken + * @userp: buffer to copy information about new device for userspace to + * @req: struct binderfs_device as copied from userspace + * + * This function allocates a new binder_device and reserves a new minor + * number for it. + * Minor numbers are limited and tracked globally in binderfs_minors. The + * function will stash a struct binder_device for the specific binder + * device in i_private of the inode. + * It will go on to allocate a new inode from the super block of the + * filesystem mount, stash a struct binder_device in its i_private field + * and attach a dentry to that inode. + * + * Return: 0 on success, negative errno on failure + */ +static int binderfs_binder_device_create(struct inode *ref_inode, + struct binderfs_device __user *userp, + struct binderfs_device *req) +{ + int minor, ret; + struct dentry *dentry, *root; + rust_binder_device device = NULL; + char *name = NULL; + size_t name_len; + struct inode *inode = NULL; + struct super_block *sb = ref_inode->i_sb; + struct binderfs_info *info = sb->s_fs_info; +#if defined(CONFIG_IPC_NS) + bool use_reserve = (info->ipc_ns == &init_ipc_ns); +#else + bool use_reserve = true; +#endif + + /* Reserve new minor number for the new device. */ + mutex_lock(&binderfs_minors_mutex); + if (++info->device_count <= info->mount_opts.max) + minor = ida_alloc_max(&binderfs_minors, + use_reserve ? BINDERFS_MAX_MINOR : + BINDERFS_MAX_MINOR_CAPPED, + GFP_KERNEL); + else + minor = -ENOSPC; + if (minor < 0) { + --info->device_count; + mutex_unlock(&binderfs_minors_mutex); + return minor; + } + mutex_unlock(&binderfs_minors_mutex); + + ret = -ENOMEM; + req->name[BINDERFS_MAX_NAME] = '\0'; /* NUL-terminate */ + name_len = strlen(req->name); + /* Make sure to include terminating NUL byte */ + name = kmemdup(req->name, name_len + 1, GFP_KERNEL); + if (!name) + goto err; + + device = rust_binder_new_device(name); + if (!device) + goto err; + + inode = new_inode(sb); + if (!inode) + goto err; + + inode->i_ino = minor + INODE_OFFSET; + simple_inode_init_ts(inode); + init_special_inode(inode, S_IFCHR | 0600, + MKDEV(MAJOR(binderfs_dev), minor)); + inode->i_fop = &rust_binder_fops; + inode->i_uid = info->root_uid; + inode->i_gid = info->root_gid; + + req->major = MAJOR(binderfs_dev); + req->minor = minor; + + if (userp && copy_to_user(userp, req, sizeof(*req))) { + ret = -EFAULT; + goto err; + } + + root = sb->s_root; + inode_lock(d_inode(root)); + + /* look it up */ + dentry = lookup_one_len(name, root, name_len); + if (IS_ERR(dentry)) { + inode_unlock(d_inode(root)); + ret = PTR_ERR(dentry); + goto err; + } + + if (d_really_is_positive(dentry)) { + /* already exists */ + dput(dentry); + inode_unlock(d_inode(root)); + ret = -EEXIST; + goto err; + } + + inode->i_private = device; + d_instantiate(dentry, inode); + fsnotify_create(root->d_inode, dentry); + inode_unlock(d_inode(root)); + + return 0; + +err: + kfree(name); + rust_binder_remove_device(device); + mutex_lock(&binderfs_minors_mutex); + --info->device_count; + ida_free(&binderfs_minors, minor); + mutex_unlock(&binderfs_minors_mutex); + iput(inode); + + return ret; +} + +/** + * binder_ctl_ioctl - handle binder device node allocation requests + * + * The request handler for the binder-control 
device. All requests operate on + * the binderfs mount the binder-control device resides in: + * - BINDER_CTL_ADD + * Allocate a new binder device. + * + * Return: %0 on success, negative errno on failure. + */ +static long binder_ctl_ioctl(struct file *file, unsigned int cmd, + unsigned long arg) +{ + int ret = -EINVAL; + struct inode *inode = file_inode(file); + struct binderfs_device __user *device = (struct binderfs_device __user *)arg; + struct binderfs_device device_req; + + switch (cmd) { + case BINDER_CTL_ADD: + ret = copy_from_user(&device_req, device, sizeof(device_req)); + if (ret) { + ret = -EFAULT; + break; + } + + ret = binderfs_binder_device_create(inode, device, &device_req); + break; + default: + break; + } + + return ret; +} + +static void binderfs_evict_inode(struct inode *inode) +{ + rust_binder_device device = inode->i_private; + struct binderfs_info *info = BINDERFS_SB(inode->i_sb); + int minor = inode->i_ino - INODE_OFFSET; + + clear_inode(inode); + + if (!S_ISCHR(inode->i_mode) || !device) + return; + + mutex_lock(&binderfs_minors_mutex); + --info->device_count; + ida_free(&binderfs_minors, minor); + mutex_unlock(&binderfs_minors_mutex); + + rust_binder_remove_device(device); +} + +static int binderfs_fs_context_parse_param(struct fs_context *fc, + struct fs_parameter *param) +{ + int opt; + struct binderfs_mount_opts *ctx = fc->fs_private; + struct fs_parse_result result; + + opt = fs_parse(fc, binderfs_fs_parameters, param, &result); + if (opt < 0) + return opt; + + switch (opt) { + case Opt_max: + if (result.uint_32 > BINDERFS_MAX_MINOR) + return invalfc(fc, "Bad value for '%s'", param->key); + + ctx->max = result.uint_32; + break; + case Opt_stats_mode: + if (!capable(CAP_SYS_ADMIN)) + return -EPERM; + + ctx->stats_mode = result.uint_32; + break; + default: + return invalfc(fc, "Unsupported parameter '%s'", param->key); + } + + return 0; +} + +static int binderfs_fs_context_reconfigure(struct fs_context *fc) +{ + struct binderfs_mount_opts *ctx = fc->fs_private; + struct binderfs_info *info = BINDERFS_SB(fc->root->d_sb); + + if (info->mount_opts.stats_mode != ctx->stats_mode) + return invalfc(fc, "Binderfs stats mode cannot be changed during a remount"); + + info->mount_opts.stats_mode = ctx->stats_mode; + info->mount_opts.max = ctx->max; + return 0; +} + +static int binderfs_show_options(struct seq_file *seq, struct dentry *root) +{ + struct binderfs_info *info = BINDERFS_SB(root->d_sb); + + if (info->mount_opts.max <= BINDERFS_MAX_MINOR) + seq_printf(seq, ",max=%d", info->mount_opts.max); + + switch (info->mount_opts.stats_mode) { + case binderfs_stats_mode_unset: + break; + case binderfs_stats_mode_global: + seq_printf(seq, ",stats=global"); + break; + } + + return 0; +} + +static const struct super_operations binderfs_super_ops = { + .evict_inode = binderfs_evict_inode, + .show_options = binderfs_show_options, + .statfs = simple_statfs, +}; + +static inline bool is_binderfs_control_device(const struct dentry *dentry) +{ + struct binderfs_info *info = dentry->d_sb->s_fs_info; + + return info->control_dentry == dentry; +} + +static int binderfs_rename(struct mnt_idmap *idmap, + struct inode *old_dir, struct dentry *old_dentry, + struct inode *new_dir, struct dentry *new_dentry, + unsigned int flags) +{ + if (is_binderfs_control_device(old_dentry) || + is_binderfs_control_device(new_dentry)) + return -EPERM; + + return simple_rename(idmap, old_dir, old_dentry, new_dir, + new_dentry, flags); +} + +static int binderfs_unlink(struct inode *dir, struct dentry 
*dentry) +{ + if (is_binderfs_control_device(dentry)) + return -EPERM; + + return simple_unlink(dir, dentry); +} + +static const struct file_operations binder_ctl_fops = { + .owner = THIS_MODULE, + .open = nonseekable_open, + .unlocked_ioctl = binder_ctl_ioctl, + .compat_ioctl = binder_ctl_ioctl, + .llseek = noop_llseek, +}; + +/** + * binderfs_binder_ctl_create - create a new binder-control device + * @sb: super block of the binderfs mount + * + * This function creates a new binder-control device node in the binderfs mount + * referred to by @sb. + * + * Return: 0 on success, negative errno on failure + */ +static int binderfs_binder_ctl_create(struct super_block *sb) +{ + int minor, ret; + struct dentry *dentry; + struct binder_device *device; + struct inode *inode = NULL; + struct dentry *root = sb->s_root; + struct binderfs_info *info = sb->s_fs_info; +#if defined(CONFIG_IPC_NS) + bool use_reserve = (info->ipc_ns == &init_ipc_ns); +#else + bool use_reserve = true; +#endif + + device = kzalloc(sizeof(*device), GFP_KERNEL); + if (!device) + return -ENOMEM; + + /* If we have already created a binder-control node, return. */ + if (info->control_dentry) { + ret = 0; + goto out; + } + + ret = -ENOMEM; + inode = new_inode(sb); + if (!inode) + goto out; + + /* Reserve a new minor number for the new device. */ + mutex_lock(&binderfs_minors_mutex); + minor = ida_alloc_max(&binderfs_minors, + use_reserve ? BINDERFS_MAX_MINOR : + BINDERFS_MAX_MINOR_CAPPED, + GFP_KERNEL); + mutex_unlock(&binderfs_minors_mutex); + if (minor < 0) { + ret = minor; + goto out; + } + + inode->i_ino = SECOND_INODE; + simple_inode_init_ts(inode); + init_special_inode(inode, S_IFCHR | 0600, + MKDEV(MAJOR(binderfs_dev), minor)); + inode->i_fop = &binder_ctl_fops; + inode->i_uid = info->root_uid; + inode->i_gid = info->root_gid; + + refcount_set(&device->ref, 1); + device->binderfs_inode = inode; + device->miscdev.minor = minor; + + dentry = d_alloc_name(root, "binder-control"); + if (!dentry) + goto out; + + inode->i_private = device; + info->control_dentry = dentry; + d_add(dentry, inode); + + return 0; + +out: + kfree(device); + iput(inode); + + return ret; +} + +static const struct inode_operations binderfs_dir_inode_operations = { + .lookup = simple_lookup, + .rename = binderfs_rename, + .unlink = binderfs_unlink, +}; + +static struct inode *binderfs_make_inode(struct super_block *sb, int mode) +{ + struct inode *ret; + + ret = new_inode(sb); + if (ret) { + ret->i_ino = iunique(sb, BINDERFS_MAX_MINOR + INODE_OFFSET); + ret->i_mode = mode; + simple_inode_init_ts(ret); + } + return ret; +} + +static struct dentry *binderfs_create_dentry(struct dentry *parent, + const char *name) +{ + struct dentry *dentry; + + dentry = lookup_one_len(name, parent, strlen(name)); + if (IS_ERR(dentry)) + return dentry; + + /* Return error if the file/dir already exists. 
*/ + if (d_really_is_positive(dentry)) { + dput(dentry); + return ERR_PTR(-EEXIST); + } + + return dentry; +} + +void rust_binderfs_remove_file(struct dentry *dentry) +{ + struct inode *parent_inode; + + parent_inode = d_inode(dentry->d_parent); + inode_lock(parent_inode); + if (simple_positive(dentry)) { + dget(dentry); + simple_unlink(parent_inode, dentry); + d_delete(dentry); + dput(dentry); + } + inode_unlock(parent_inode); +} + +struct dentry *rust_binderfs_create_file(struct dentry *parent, const char *name, + const struct file_operations *fops, + void *data) +{ + struct dentry *dentry; + struct inode *new_inode, *parent_inode; + struct super_block *sb; + + parent_inode = d_inode(parent); + inode_lock(parent_inode); + + dentry = binderfs_create_dentry(parent, name); + if (IS_ERR(dentry)) + goto out; + + sb = parent_inode->i_sb; + new_inode = binderfs_make_inode(sb, S_IFREG | 0444); + if (!new_inode) { + dput(dentry); + dentry = ERR_PTR(-ENOMEM); + goto out; + } + + new_inode->i_fop = fops; + new_inode->i_private = data; + d_instantiate(dentry, new_inode); + fsnotify_create(parent_inode, dentry); + +out: + inode_unlock(parent_inode); + return dentry; +} + +static struct dentry *binderfs_create_dir(struct dentry *parent, + const char *name) +{ + struct dentry *dentry; + struct inode *new_inode, *parent_inode; + struct super_block *sb; + + parent_inode = d_inode(parent); + inode_lock(parent_inode); + + dentry = binderfs_create_dentry(parent, name); + if (IS_ERR(dentry)) + goto out; + + sb = parent_inode->i_sb; + new_inode = binderfs_make_inode(sb, S_IFDIR | 0755); + if (!new_inode) { + dput(dentry); + dentry = ERR_PTR(-ENOMEM); + goto out; + } + + new_inode->i_fop = &simple_dir_operations; + new_inode->i_op = &simple_dir_inode_operations; + + set_nlink(new_inode, 2); + d_instantiate(dentry, new_inode); + inc_nlink(parent_inode); + fsnotify_mkdir(parent_inode, dentry); + +out: + inode_unlock(parent_inode); + return dentry; +} + +static int binder_features_show(struct seq_file *m, void *unused) +{ + bool *feature = m->private; + + seq_printf(m, "%d\n", *feature); + + return 0; +} +DEFINE_SHOW_ATTRIBUTE(binder_features); + +static int init_binder_features(struct super_block *sb) +{ + struct dentry *dentry, *dir; + + dir = binderfs_create_dir(sb->s_root, "features"); + if (IS_ERR(dir)) + return PTR_ERR(dir); + + dentry = rust_binderfs_create_file(dir, "oneway_spam_detection", + &binder_features_fops, + &binder_features.oneway_spam_detection); + if (IS_ERR(dentry)) + return PTR_ERR(dentry); + + dentry = rust_binderfs_create_file(dir, "extended_error", + &binder_features_fops, + &binder_features.extended_error); + if (IS_ERR(dentry)) + return PTR_ERR(dentry); + + return 0; +} + +static int init_binder_logs(struct super_block *sb) +{ + struct dentry *binder_logs_root_dir, *dentry, *proc_log_dir; + struct binderfs_info *info; + int ret = 0; + + binder_logs_root_dir = binderfs_create_dir(sb->s_root, + "binder_logs"); + if (IS_ERR(binder_logs_root_dir)) { + ret = PTR_ERR(binder_logs_root_dir); + goto out; + } + + dentry = rust_binderfs_create_file(binder_logs_root_dir, "stats", + &rust_binder_stats_fops, NULL); + if (IS_ERR(dentry)) { + ret = PTR_ERR(dentry); + goto out; + } + + dentry = rust_binderfs_create_file(binder_logs_root_dir, "state", + &rust_binder_state_fops, NULL); + if (IS_ERR(dentry)) { + ret = PTR_ERR(dentry); + goto out; + } + + dentry = rust_binderfs_create_file(binder_logs_root_dir, "transactions", + &rust_binder_transactions_fops, NULL); + if (IS_ERR(dentry)) { + ret = 
PTR_ERR(dentry); + goto out; + } + + dentry = rust_binderfs_create_file(binder_logs_root_dir, + "transaction_log", + &rust_binder_transaction_log_fops, + NULL); + if (IS_ERR(dentry)) { + ret = PTR_ERR(dentry); + goto out; + } + + dentry = rust_binderfs_create_file(binder_logs_root_dir, + "failed_transaction_log", + &rust_binder_transaction_log_fops, + NULL); + if (IS_ERR(dentry)) { + ret = PTR_ERR(dentry); + goto out; + } + + proc_log_dir = binderfs_create_dir(binder_logs_root_dir, "proc"); + if (IS_ERR(proc_log_dir)) { + ret = PTR_ERR(proc_log_dir); + goto out; + } + info = sb->s_fs_info; + info->proc_log_dir = proc_log_dir; + +out: + return ret; +} + +static int binderfs_fill_super(struct super_block *sb, struct fs_context *fc) +{ + int ret; + struct binderfs_info *info; + struct binderfs_mount_opts *ctx = fc->fs_private; + struct inode *inode = NULL; + struct binderfs_device device_info = {}; + const char *name; + size_t len; + + sb->s_blocksize = PAGE_SIZE; + sb->s_blocksize_bits = PAGE_SHIFT; + + /* + * The binderfs filesystem can be mounted by userns root in a + * non-initial userns. By default such mounts have the SB_I_NODEV flag + * set in s_iflags to prevent security issues where userns root can + * just create random device nodes via mknod() since it owns the + * filesystem mount. But binderfs does not allow to create any files + * including devices nodes. The only way to create binder devices nodes + * is through the binder-control device which userns root is explicitly + * allowed to do. So removing the SB_I_NODEV flag from s_iflags is both + * necessary and safe. + */ + sb->s_iflags &= ~SB_I_NODEV; + sb->s_iflags |= SB_I_NOEXEC; + sb->s_magic = RUST_BINDERFS_SUPER_MAGIC; + sb->s_op = &binderfs_super_ops; + sb->s_time_gran = 1; + + sb->s_fs_info = kzalloc(sizeof(struct binderfs_info), GFP_KERNEL); + if (!sb->s_fs_info) + return -ENOMEM; + info = sb->s_fs_info; + + info->ipc_ns = get_ipc_ns(current->nsproxy->ipc_ns); + + info->root_gid = make_kgid(sb->s_user_ns, 0); + if (!gid_valid(info->root_gid)) + info->root_gid = GLOBAL_ROOT_GID; + info->root_uid = make_kuid(sb->s_user_ns, 0); + if (!uid_valid(info->root_uid)) + info->root_uid = GLOBAL_ROOT_UID; + info->mount_opts.max = ctx->max; + info->mount_opts.stats_mode = ctx->stats_mode; + + inode = new_inode(sb); + if (!inode) + return -ENOMEM; + + inode->i_ino = FIRST_INODE; + inode->i_fop = &simple_dir_operations; + inode->i_mode = S_IFDIR | 0755; + simple_inode_init_ts(inode); + inode->i_op = &binderfs_dir_inode_operations; + set_nlink(inode, 2); + + sb->s_root = d_make_root(inode); + if (!sb->s_root) + return -ENOMEM; + + ret = binderfs_binder_ctl_create(sb); + if (ret) + return ret; + + name = rust_binder_devices_param; + for (len = strcspn(name, ","); len > 0; len = strcspn(name, ",")) { + strscpy(device_info.name, name, len + 1); + ret = binderfs_binder_device_create(inode, NULL, &device_info); + if (ret) + return ret; + name += len; + if (*name == ',') + name++; + } + + ret = init_binder_features(sb); + if (ret) + return ret; + + if (info->mount_opts.stats_mode == binderfs_stats_mode_global) + return init_binder_logs(sb); + + return 0; +} + +static int binderfs_fs_context_get_tree(struct fs_context *fc) +{ + return get_tree_nodev(fc, binderfs_fill_super); +} + +static void binderfs_fs_context_free(struct fs_context *fc) +{ + struct binderfs_mount_opts *ctx = fc->fs_private; + + kfree(ctx); +} + +static const struct fs_context_operations binderfs_fs_context_ops = { + .free = binderfs_fs_context_free, + .get_tree = 
binderfs_fs_context_get_tree, + .parse_param = binderfs_fs_context_parse_param, + .reconfigure = binderfs_fs_context_reconfigure, +}; + +static int binderfs_init_fs_context(struct fs_context *fc) +{ + struct binderfs_mount_opts *ctx; + + ctx = kzalloc(sizeof(struct binderfs_mount_opts), GFP_KERNEL); + if (!ctx) + return -ENOMEM; + + ctx->max = BINDERFS_MAX_MINOR; + ctx->stats_mode = binderfs_stats_mode_unset; + + fc->fs_private = ctx; + fc->ops = &binderfs_fs_context_ops; + + return 0; +} + +static void binderfs_kill_super(struct super_block *sb) +{ + struct binderfs_info *info = sb->s_fs_info; + + /* + * During inode eviction struct binderfs_info is needed. + * So first wipe the super_block then free struct binderfs_info. + */ + kill_litter_super(sb); + + if (info && info->ipc_ns) + put_ipc_ns(info->ipc_ns); + + kfree(info); +} + +static struct file_system_type binder_fs_type = { + .name = "binder", + .init_fs_context = binderfs_init_fs_context, + .parameters = binderfs_fs_parameters, + .kill_sb = binderfs_kill_super, + .fs_flags = FS_USERNS_MOUNT, +}; + +int init_rust_binderfs(void) +{ + int ret; + const char *name; + size_t len; + + /* Verify that the default binderfs device names are valid. */ + name = rust_binder_devices_param; + for (len = strcspn(name, ","); len > 0; len = strcspn(name, ",")) { + if (len > BINDERFS_MAX_NAME) + return -E2BIG; + name += len; + if (*name == ',') + name++; + } + + /* Allocate new major number for binderfs. */ + ret = alloc_chrdev_region(&binderfs_dev, 0, BINDERFS_MAX_MINOR, + "rust_binder"); + if (ret) + return ret; + + ret = register_filesystem(&binder_fs_type); + if (ret) { + unregister_chrdev_region(binderfs_dev, BINDERFS_MAX_MINOR); + return ret; + } + + return ret; +} diff --git a/include/linux/rust_binder.h b/include/linux/rust_binder.h new file mode 100644 index 000000000000..1e44a0a5f6a1 --- /dev/null +++ b/include/linux/rust_binder.h @@ -0,0 +1,16 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_RUST_BINDER_H +#define _LINUX_RUST_BINDER_H + +#include + +/* + * This typedef is used for Rust binder driver instances. The driver object is + * completely opaque from C and can only be accessed via calls into Rust, so we + * use a typedef. + */ +typedef void *rust_binder_device; + +int init_rust_binderfs(void); + +#endif diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h index 6325d1d0e90f..e5a20c1498af 100644 --- a/include/uapi/linux/magic.h +++ b/include/uapi/linux/magic.h @@ -82,6 +82,7 @@ #define BINFMTFS_MAGIC 0x42494e4d #define DEVPTS_SUPER_MAGIC 0x1cd1 #define BINDERFS_SUPER_MAGIC 0x6c6f6f70 +#define RUST_BINDERFS_SUPER_MAGIC 0x6c6f6f71 #define FUTEXFS_SUPER_MAGIC 0xBAD1DEA #define PIPEFS_MAGIC 0x50495045 #define PROC_SUPER_MAGIC 0x9fa0 diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h index 00a66666f00a..ffeea312f2fd 100644 --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -17,11 +17,13 @@ #include #include #include +#include #include #include #include #include #include +#include /* `bindgen` gets confused at certain things. */ const size_t BINDINGS_ARCH_SLAB_MINALIGN = ARCH_SLAB_MINALIGN; diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs index 435d4c2ac5fc..f4d58da9202e 100644 --- a/rust/kernel/lib.rs +++ b/rust/kernel/lib.rs @@ -99,6 +99,13 @@ impl ThisModule { pub const unsafe fn from_ptr(ptr: *mut bindings::module) -> ThisModule { ThisModule(ptr) } + + /// Access the raw pointer for this module. + /// + /// It is up to the user to use it correctly. 
+ pub const fn as_ptr(&self) -> *mut bindings::module { + self.0 + } } #[cfg(not(any(testlib, test)))] diff --git a/scripts/Makefile.build b/scripts/Makefile.build index da37bfa97211..f78d2e75a795 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -262,7 +262,7 @@ $(obj)/%.lst: $(src)/%.c FORCE # Compile Rust sources (.rs) # --------------------------------------------------------------------------- -rust_allowed_features := new_uninit,offset_of +rust_allowed_features := const_maybe_uninit_zeroed,new_uninit,offset_of # `--out-dir` is required to avoid temporaries being created by `rustc` in the # current working directory, which may be not accessible in the out-of-tree From patchwork Wed Nov 1 18:01:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160625 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp604623vqx; Wed, 1 Nov 2023 11:02:51 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFFkuquaM5VwYvdZILjGcpf7ewj5SRqOM+Vy+YJaf7uOfP+Uo9iHxzf0d/L884EOLAw6DVE X-Received: by 2002:a17:90b:3107:b0:27d:348:94a8 with SMTP id gc7-20020a17090b310700b0027d034894a8mr15218361pjb.6.1698861771337; Wed, 01 Nov 2023 11:02:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861771; cv=none; d=google.com; s=arc-20160816; b=EAh4zRl+VGmZhuk4+7JG6XVBgR24J5aYM0bHN2wokOANLRq3TANPvN0b1m5+fr/BEz XbWr3R1ZodQy5dDo8MFs44jpE/rMxtQAIpLkW6RSj0gDbaAuJn+Nqj++tNfu9nz1EVmg e9Dp3foC5U7qYHi6BJsV/Edvf2dadLiTVJneVuVpP36MQ/RrkumXYotL0AaJUoJJB1xP eUdM3AC66q1qklSYqrJV9eB6A2Z4TFQFFGJx/ImArzyMRAUU7BzVDZrQC6hZxFDKUzKH 29/qPJBrcBoc+ffGPSC8jHcjiTWunf2URy/22a70sxMt4pY3Dmg034oU7oWoBXjJ9Zul hjbA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=i/6XZfo1xt29cN1naYzGKiL8MPrRh+1U9L/srRftA7U=; fh=cHdDrFPTfwdP0/Ip9jHI/T24Yd8xIIOhbocUOLU1mtg=; b=XjqsT6BpVQuCkhMkjHNbr80Vx8IiITQysXCtzzIutDVaxOfcdkQxMSUrAY8ER7fenO OMJxeqyk++G1dxgb3IcrGkE8Kq+NPU9EYpacuPI3jpPTfogb9f5OALlDxhrHR+S2nyYa Gmbn3m9Cl1XClWW8oZYnfB5zaar7G8vVAs2m7IlNmGtT9HyoAVVNk9z8L29huKiV1aSY 06fJnMn5VqLOVNMFXKMc5vioCbB3itnE4EbF9oK86Q05ds5F3JTOVrUIrukcLm8bxSi6 zNdQ64mRNAX0SnD8Fo5jPWl0vVWljiwHd2E611gYphOAHDF43/lafXWQl8YVMO86ViQ9 AUPw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=3YrywKWG; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from snail.vger.email (snail.vger.email. 
Date: Wed, 01 Nov 2023 18:01:33 +0000 In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com> References: <20231101-rust-binder-v1-0-08ba9197f637@google.com> Message-ID:
<20231101-rust-binder-v1-3-08ba9197f637@google.com> Subject: [PATCH RFC 03/20] rust_binder: add threading support From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:02:46 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385680708231223 X-GMAIL-MSGID: 1781385680708231223 The binder driver needs to keep track of the threads that a process uses with the driver for several reasons: 1. When replying to a transaction, it is assumed that you are replying to the "currently active transaction" on the thread you made the syscall from. The syscall does not provide any way to specify which transaction you are replying to. 2. When a thread is sleeping while waiting for incoming transactions, the driver needs to keep track of where it can deliver a transaction to. 3. The BINDER_GET_EXTENDED_ERROR ioctl gives you the last error triggered by a syscall on the same thread, so it needs to keep track of this value for each thread. 4. For binder servers, the driver keeps track of whether a process has enough threads in its transaction thread pool. Note that not all of the above items are implemented yet. Some of them will appear in later patches. In this patch, we add the structures to keep track of the threads and implement item 3 and 4 in the above list. Co-developed-by: Wedson Almeida Filho Signed-off-by: Wedson Almeida Filho Co-developed-by: Matt Gilbride Signed-off-by: Matt Gilbride Signed-off-by: Alice Ryhl --- drivers/android/defs.rs | 36 ++++++- drivers/android/error.rs | 52 +++++++++++ drivers/android/process.rs | 108 +++++++++++++++++++-- drivers/android/rust_binder.rs | 2 + drivers/android/thread.rs | 206 +++++++++++++++++++++++++++++++++++++++++ scripts/Makefile.build | 2 +- 6 files changed, 396 insertions(+), 10 deletions(-) diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index 8fdcb856ccad..86173add2616 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -2,24 +2,50 @@ use core::ops::{Deref, DerefMut}; use kernel::{ - bindings, + bindings::{self, *}, io_buffer::{ReadableFromBytes, WritableToBytes}, }; +macro_rules! pub_no_prefix { + ($prefix:ident, $($newname:ident),+) => { + $(pub(crate) const $newname: u32 = kernel::macros::concat_idents!($prefix, $newname);)+ + }; +} + +pub_no_prefix!( + binder_driver_return_protocol_, + BR_DEAD_REPLY, + BR_FAILED_REPLY, + BR_NOOP, + BR_SPAWN_LOOPER, + BR_TRANSACTION_COMPLETE, + BR_OK +); + +pub_no_prefix!( + binder_driver_command_protocol_, + BC_ENTER_LOOPER, + BC_EXIT_LOOPER, + BC_REGISTER_LOOPER +); + macro_rules! 
decl_wrapper { ($newname:ident, $wrapped:ty) => { #[derive(Copy, Clone, Default)] #[repr(transparent)] pub(crate) struct $newname($wrapped); + // SAFETY: This macro is only used with types where this is ok. unsafe impl ReadableFromBytes for $newname {} unsafe impl WritableToBytes for $newname {} + impl Deref for $newname { type Target = $wrapped; fn deref(&self) -> &Self::Target { &self.0 } } + impl DerefMut for $newname { fn deref_mut(&mut self) -> &mut Self::Target { &mut self.0 @@ -28,7 +54,9 @@ fn deref_mut(&mut self) -> &mut Self::Target { }; } +decl_wrapper!(BinderWriteRead, bindings::binder_write_read); decl_wrapper!(BinderVersion, bindings::binder_version); +decl_wrapper!(ExtendedError, bindings::binder_extended_error); impl BinderVersion { pub(crate) fn current() -> Self { @@ -37,3 +65,9 @@ pub(crate) fn current() -> Self { }) } } + +impl ExtendedError { + pub(crate) fn new(id: u32, command: u32, param: i32) -> Self { + Self(bindings::binder_extended_error { id, command, param }) + } +} diff --git a/drivers/android/error.rs b/drivers/android/error.rs new file mode 100644 index 000000000000..41fc4347ab55 --- /dev/null +++ b/drivers/android/error.rs @@ -0,0 +1,52 @@ +// SPDX-License-Identifier: GPL-2.0 + +use kernel::prelude::*; + +use crate::defs::*; + +/// An error that will be returned to userspace via the `BINDER_WRITE_READ` ioctl rather than via +/// errno. +pub(crate) struct BinderError { + pub(crate) reply: u32, + source: Option, +} + +/// Convert an errno into a `BinderError` and store the errno used to construct it. The errno +/// should be stored as the thread's extended error when given to userspace. +impl From for BinderError { + fn from(source: Error) -> Self { + Self { + reply: BR_FAILED_REPLY, + source: Some(source), + } + } +} + +impl From for BinderError { + fn from(_: core::alloc::AllocError) -> Self { + Self { + reply: BR_FAILED_REPLY, + source: Some(ENOMEM), + } + } +} + +impl core::fmt::Debug for BinderError { + fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result { + match self.reply { + BR_FAILED_REPLY => match self.source.as_ref() { + Some(source) => f + .debug_struct("BR_FAILED_REPLY") + .field("source", source) + .finish(), + None => f.pad("BR_FAILED_REPLY"), + }, + BR_DEAD_REPLY => f.pad("BR_DEAD_REPLY"), + BR_TRANSACTION_COMPLETE => f.pad("BR_TRANSACTION_COMPLETE"), + _ => f + .debug_struct("BinderError") + .field("reply", &self.reply) + .finish(), + } + } +} diff --git a/drivers/android/process.rs b/drivers/android/process.rs index 2f16e4cedbf1..47d074dd8465 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -13,11 +13,12 @@ use kernel::{ bindings, cred::Credential, - file::{File, PollTable}, - io_buffer::IoBufferWriter, + file::{self, File, PollTable}, + io_buffer::{IoBufferReader, IoBufferWriter}, list::{HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks}, mm, prelude::*, + rbtree::RBTree, sync::{Arc, ArcBorrow, SpinLock}, task::Task, types::ARef, @@ -25,7 +26,9 @@ workqueue::{self, Work}, }; -use crate::{context::Context, defs::*}; +use crate::{context::Context, defs::*, thread::Thread}; + +use core::mem::take; const PROC_DEFER_FLUSH: u8 = 1; const PROC_DEFER_RELEASE: u8 = 2; @@ -33,6 +36,14 @@ /// The fields of `Process` protected by the spinlock. pub(crate) struct ProcessInner { is_dead: bool, + threads: RBTree>, + + /// The number of requested threads that haven't registered yet. + requested_thread_count: u32, + /// The maximum number of threads used by the process thread pool. 
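The decl_wrapper! macro above wraps each bindgen-generated UAPI struct in a #[repr(transparent)] newtype, so the value keeps the exact C layout (and can still be treated as plain bytes when copied to and from the userspace buffers) while gaining Rust methods. A rough standalone sketch of that pattern, using a hypothetical bindgen-style struct and core::ops::Deref instead of the kernel's ReadableFromBytes/WritableToBytes marker traits (illustrative only, not part of the patch):

    // Hypothetical bindgen-style UAPI struct; the real one comes from bindings.
    #[repr(C)]
    #[derive(Copy, Clone, Default)]
    #[allow(non_camel_case_types)]
    struct binder_version {
        protocol_version: i32,
    }

    // Same in-memory layout as the wrapped struct, so it can still be copied
    // byte-for-byte, but methods can now be attached to it.
    #[repr(transparent)]
    #[derive(Copy, Clone, Default)]
    struct BinderVersion(binder_version);

    impl core::ops::Deref for BinderVersion {
        type Target = binder_version;
        fn deref(&self) -> &Self::Target {
            &self.0
        }
    }

    impl BinderVersion {
        fn current() -> Self {
            // 8 is used here only as a placeholder protocol number.
            Self(binder_version { protocol_version: 8 })
        }
    }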
+ max_threads: u32, + /// The number of threads the started and registered with the thread pool. + started_thread_count: u32, /// Bitmap of deferred work to do. defer_work: u8, @@ -42,9 +53,23 @@ impl ProcessInner { fn new() -> Self { Self { is_dead: false, + threads: RBTree::new(), + requested_thread_count: 0, + max_threads: 0, + started_thread_count: 0, defer_work: 0, } } + + fn register_thread(&mut self) -> bool { + if self.requested_thread_count == 0 { + return false; + } + + self.requested_thread_count -= 1; + self.started_thread_count += 1; + true + } } /// A process using binder. @@ -127,10 +152,56 @@ fn new(ctx: Arc, cred: ARef) -> Result> { Ok(process) } + fn get_thread(self: ArcBorrow<'_, Self>, id: i32) -> Result> { + { + let inner = self.inner.lock(); + if let Some(thread) = inner.threads.get(&id) { + return Ok(thread.clone()); + } + } + + // Allocate a new `Thread` without holding any locks. + let ta = Thread::new(id, self.into())?; + let node = RBTree::try_allocate_node(id, ta.clone())?; + + let mut inner = self.inner.lock(); + + // Recheck. It's possible the thread was created while we were not holding the lock. + if let Some(thread) = inner.threads.get(&id) { + return Ok(thread.clone()); + } + + inner.threads.insert(node); + Ok(ta) + } + fn version(&self, data: UserSlicePtr) -> Result { data.writer().write(&BinderVersion::current()) } + pub(crate) fn register_thread(&self) -> bool { + self.inner.lock().register_thread() + } + + fn remove_thread(&self, thread: Arc) { + self.inner.lock().threads.remove(&thread.id); + thread.release(); + } + + fn set_max_threads(&self, max: u32) { + self.inner.lock().max_threads = max; + } + + pub(crate) fn needs_thread(&self) -> bool { + let mut inner = self.inner.lock(); + let ret = + inner.requested_thread_count == 0 && inner.started_thread_count < inner.max_threads; + if ret { + inner.requested_thread_count += 1 + } + ret + } + fn deferred_flush(&self) { // NOOP for now. } @@ -139,6 +210,17 @@ fn deferred_release(self: Arc) { self.inner.lock().is_dead = true; self.ctx.deregister_process(&self); + + // Move the threads out of `inner` so that we can iterate over them without holding the + // lock. + let mut inner = self.inner.lock(); + let threads = take(&mut inner.threads); + drop(inner); + + // Release all threads. + for thread in threads.values() { + thread.release(); + } } pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result { @@ -161,22 +243,32 @@ pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result { /// The ioctl handler. 
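The get_thread() helper above follows a check/allocate/re-check pattern: look up the thread under the spinlock, allocate the new Thread and tree node with no locks held (the allocation can block), then take the lock again and re-check before inserting, because another task may have created the same thread in the meantime. A condensed sketch of the same pattern with std types standing in for the kernel's SpinLock and RBTree (illustrative only, not the kernel API):

    use std::collections::BTreeMap;
    use std::sync::{Arc, Mutex};

    struct Thread {
        id: i32,
    }

    struct Process {
        threads: Mutex<BTreeMap<i32, Arc<Thread>>>,
    }

    impl Process {
        fn get_thread(&self, id: i32) -> Arc<Thread> {
            // Fast path: the thread may already exist.
            if let Some(t) = self.threads.lock().unwrap().get(&id) {
                return t.clone();
            }
            // Allocate without holding the lock.
            let new = Arc::new(Thread { id });
            // Re-check under the lock: another task may have raced us while
            // the lock was dropped, in which case the existing entry wins.
            let mut map = self.threads.lock().unwrap();
            map.entry(id).or_insert(new).clone()
        }
    }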
impl Process { fn write( - _this: ArcBorrow<'_, Process>, + this: ArcBorrow<'_, Process>, _file: &File, - _cmd: u32, - _reader: &mut UserSlicePtrReader, + cmd: u32, + reader: &mut UserSlicePtrReader, ) -> Result { - Err(EINVAL) + let thread = this.get_thread(kernel::current!().pid())?; + match cmd { + bindings::BINDER_SET_MAX_THREADS => this.set_max_threads(reader.read()?), + bindings::BINDER_THREAD_EXIT => this.remove_thread(thread), + _ => return Err(EINVAL), + } + Ok(0) } fn read_write( this: ArcBorrow<'_, Process>, - _file: &File, + file: &File, cmd: u32, data: UserSlicePtr, ) -> Result { + let thread = this.get_thread(kernel::current!().pid())?; + let blocking = (file.flags() & file::flags::O_NONBLOCK) == 0; match cmd { + bindings::BINDER_WRITE_READ => thread.write_read(data, blocking)?, bindings::BINDER_VERSION => this.version(data)?, + bindings::BINDER_GET_EXTENDED_ERROR => thread.get_extended_error(data)?, _ => return Err(EINVAL), } Ok(0) diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs index 6de2f40846fb..64fd24ea8be1 100644 --- a/drivers/android/rust_binder.rs +++ b/drivers/android/rust_binder.rs @@ -14,7 +14,9 @@ mod context; mod defs; +mod error; mod process; +mod thread; module! { type: BinderModule, diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs new file mode 100644 index 000000000000..593c8e4f184e --- /dev/null +++ b/drivers/android/thread.rs @@ -0,0 +1,206 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! This module defines the `Thread` type, which represents a userspace thread that is using +//! binder. +//! +//! The `Process` object stores all of the threads in an rb tree. + +use kernel::{ + bindings, + io_buffer::{IoBufferReader, IoBufferWriter}, + prelude::*, + sync::{Arc, SpinLock}, + user_ptr::UserSlicePtr, +}; + +use crate::{defs::*, process::Process}; + +use core::mem::size_of; + +/// The fields of `Thread` protected by the spinlock. +struct InnerThread { + /// Determines the looper state of the thread. It is a bit-wise combination of the constants + /// prefixed with `LOOPER_`. + looper_flags: u32, + + /// Determines if thread is dead. + is_dead: bool, + + /// Extended error information for this thread. + extended_error: ExtendedError, +} + +const LOOPER_REGISTERED: u32 = 0x01; +const LOOPER_ENTERED: u32 = 0x02; +const LOOPER_EXITED: u32 = 0x04; +const LOOPER_INVALID: u32 = 0x08; + +impl InnerThread { + fn new() -> Self { + use core::sync::atomic::{AtomicU32, Ordering}; + + fn next_err_id() -> u32 { + static EE_ID: AtomicU32 = AtomicU32::new(0); + EE_ID.fetch_add(1, Ordering::Relaxed) + } + + Self { + looper_flags: 0, + is_dead: false, + extended_error: ExtendedError::new(next_err_id(), BR_OK, 0), + } + } + + fn looper_enter(&mut self) { + self.looper_flags |= LOOPER_ENTERED; + if self.looper_flags & LOOPER_REGISTERED != 0 { + self.looper_flags |= LOOPER_INVALID; + } + } + + fn looper_register(&mut self, valid: bool) { + self.looper_flags |= LOOPER_REGISTERED; + if !valid || self.looper_flags & LOOPER_ENTERED != 0 { + self.looper_flags |= LOOPER_INVALID; + } + } + + fn looper_exit(&mut self) { + self.looper_flags |= LOOPER_EXITED; + } + + /// Determines whether the thread is part of a pool, i.e., if it is a looper. + fn is_looper(&self) -> bool { + self.looper_flags & (LOOPER_ENTERED | LOOPER_REGISTERED) != 0 + } +} + +/// This represents a thread that's used with binder. 
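The pool-size accounting above works as a two-sided handshake: needs_thread() decides whether the driver should ask userspace to spawn another looper (bumping requested_thread_count), and register_thread() is the BC_REGISTER_LOOPER side that turns a pending request into a started thread. A standalone sketch of that accounting, with plain Rust in place of the locked ProcessInner (illustrative only, not part of the patch):

    struct PoolState {
        requested: u32, // spawn requests sent, not yet answered by BC_REGISTER_LOOPER
        started: u32,   // threads that registered with the pool
        max: u32,       // set via BINDER_SET_MAX_THREADS
    }

    impl PoolState {
        /// Should the driver ask userspace to spawn another looper thread?
        fn needs_thread(&mut self) -> bool {
            let ask = self.requested == 0 && self.started < self.max;
            if ask {
                self.requested += 1;
            }
            ask
        }

        /// Userspace answered with BC_REGISTER_LOOPER.
        fn register_thread(&mut self) -> bool {
            if self.requested == 0 {
                return false; // a registration nobody asked for is invalid
            }
            self.requested -= 1;
            self.started += 1;
            true
        }
    }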
+#[pin_data] +pub(crate) struct Thread { + pub(crate) id: i32, + pub(crate) process: Arc, + #[pin] + inner: SpinLock, +} + +impl Thread { + pub(crate) fn new(id: i32, process: Arc) -> Result> { + Arc::pin_init(pin_init!(Thread { + id, + process, + inner <- kernel::new_spinlock!(InnerThread::new(), "Thread::inner"), + })) + } + + pub(crate) fn get_extended_error(&self, data: UserSlicePtr) -> Result { + let mut writer = data.writer(); + let ee = self.inner.lock().extended_error; + writer.write(&ee)?; + Ok(()) + } + + fn write(self: &Arc, req: &mut BinderWriteRead) -> Result { + let write_start = req.write_buffer.wrapping_add(req.write_consumed); + let write_len = req.write_size - req.write_consumed; + let mut reader = UserSlicePtr::new(write_start as _, write_len as _).reader(); + + while reader.len() >= size_of::() { + let before = reader.len(); + let cmd = reader.read::()?; + match cmd { + BC_REGISTER_LOOPER => { + let valid = self.process.register_thread(); + self.inner.lock().looper_register(valid); + } + BC_ENTER_LOOPER => self.inner.lock().looper_enter(), + BC_EXIT_LOOPER => self.inner.lock().looper_exit(), + + // Fail if given an unknown error code. + // BC_ATTEMPT_ACQUIRE and BC_ACQUIRE_RESULT are no longer supported. + _ => return Err(EINVAL), + } + // Update the number of write bytes consumed. + req.write_consumed += (before - reader.len()) as u64; + } + + Ok(()) + } + + fn read(self: &Arc, req: &mut BinderWriteRead, _wait: bool) -> Result { + let read_start = req.read_buffer.wrapping_add(req.read_consumed); + let read_len = req.read_size - req.read_consumed; + let mut writer = UserSlicePtr::new(read_start as _, read_len as _).writer(); + let in_pool = self.inner.lock().is_looper(); + + // Reserve some room at the beginning of the read buffer so that we can send a + // BR_SPAWN_LOOPER if we need to. + let mut has_noop_placeholder = false; + if req.read_consumed == 0 { + if let Err(err) = writer.write(&BR_NOOP) { + pr_warn!("Failure when writing BR_NOOP at beginning of buffer."); + return Err(err); + } + has_noop_placeholder = true; + } + + // Loop doing work while there is room in the buffer. + #[allow(clippy::never_loop)] + while writer.len() >= size_of::() + 4 { + // There is enough space in the output buffer to process another work item. + // + // However, we have not yet added work items to the driver, so we immediately break + // from the loop. + break; + } + + req.read_consumed += read_len - writer.len() as u64; + + // Write BR_SPAWN_LOOPER if the process needs more threads for its pool. + if has_noop_placeholder && in_pool && self.process.needs_thread() { + let mut writer = UserSlicePtr::new(req.read_buffer as _, req.read_size as _).writer(); + writer.write(&BR_SPAWN_LOOPER)?; + } + Ok(()) + } + + pub(crate) fn write_read(self: &Arc, data: UserSlicePtr, wait: bool) -> Result { + let (mut reader, mut writer) = data.reader_writer(); + let mut req = reader.read::()?; + + // Go through the write buffer. + if req.write_size > 0 { + if let Err(err) = self.write(&mut req) { + pr_warn!( + "Write failure {:?} in pid:{}", + err, + self.process.task.pid_in_current_ns() + ); + req.read_consumed = 0; + writer.write(&req)?; + return Err(err); + } + } + + // Go through the work queue. 
+ let mut ret = Ok(()); + if req.read_size > 0 { + ret = self.read(&mut req, wait); + if ret.is_err() && ret != Err(EINTR) { + pr_warn!( + "Read failure {:?} in pid:{}", + ret, + self.process.task.pid_in_current_ns() + ); + } + } + + // Write the request back so that the consumed fields are visible to the caller. + writer.write(&req)?; + ret + } + + pub(crate) fn release(self: &Arc) { + self.inner.lock().is_dead = true; + } +} diff --git a/scripts/Makefile.build b/scripts/Makefile.build index f78d2e75a795..b388f3d75d49 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -262,7 +262,7 @@ $(obj)/%.lst: $(src)/%.c FORCE # Compile Rust sources (.rs) # --------------------------------------------------------------------------- -rust_allowed_features := const_maybe_uninit_zeroed,new_uninit,offset_of +rust_allowed_features := const_maybe_uninit_zeroed,new_uninit,offset_of,allocator_api # `--out-dir` is required to avoid temporaries being created by `rustc` in the # current working directory, which may be not accessible in the out-of-tree From patchwork Wed Nov 1 18:01:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160626 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp604926vqx; Wed, 1 Nov 2023 11:03:15 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE5FyXge9C2kHTuplBy1SUwiw+becleFe1wVYWNkrT213Tc+mXS6aJzy2wiiYIMEQhF3hkb X-Received: by 2002:a05:6a00:15ca:b0:6bd:66ce:21d4 with SMTP id o10-20020a056a0015ca00b006bd66ce21d4mr16186622pfu.23.1698861795313; Wed, 01 Nov 2023 11:03:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861795; cv=none; d=google.com; s=arc-20160816; b=n62urCT2zIgRha1TyfLx6sWSJKLv4ory1gkBM4doC621lbkyaJz9XHxS8zweeD0ZNT HRe4KZoapClsRNUiJODe1sZGFjbt09LadFVuBypt/UUiOTugoyQPDGr++WxlLbwA+4v9 GjBMr9QB8oH/hpSeYN6JoxFvqDRp6omfWaE0sBU1hJzacLhEOsYc2/D+AbOrV7acSyKU fwTp9SoTNuOqmgWU5+Zik09IIXTM2A2wP6GMMCcnsaVNzlnwlLj0V8yFw8Tyvv157e7e c+U9El+kWX9W0orl92IL5FHoocpWYHbnJHXUa7kSGOdE7nOSCCyLHObQEbHACLherRp5 oPpw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=jcR4e9c2oSdeBUrhisVRNJ8oBwDdg+8SErGLxmSxrCU=; fh=cHdDrFPTfwdP0/Ip9jHI/T24Yd8xIIOhbocUOLU1mtg=; b=JZwjGwqjiMKkaw2c+qeqY8pO9KmaAAIjCIdReh4Yqydg9fWCPItZoEYRayUvXpP6GO b57EmetqQRIdyTqf71C0/Z9WkHqm17rN1kEEHhFdpubnqd/hE0YWR5N0MtLkRtZOgnFO jg0/7P4Ml3F+TV8NR+VNjOXQG486tQRi/9yPuT6CcdQ3fNJNBhlKzSAf6CtgIcXucAm+ MtVVh4QtYVVO9uYOH3OO1QgoA5lnhI2l1rd6W7YAQWkh4g9wONuZI8uWp6vyLxe/7MX8 xO1l/HitTiWaN4Aelp0xt5i7cRWXeejw+SKHK2m5iyjuOtPqD56N78WZ8AuaaI1uttpH TbJg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=A5HgAVwo; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from snail.vger.email (snail.vger.email. 
Date: Wed, 01 Nov 2023 18:01:34 +0000 In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com> References: <20231101-rust-binder-v1-0-08ba9197f637@google.com> X-Mailer: b4
0.13-dev-26615 Message-ID: <20231101-rust-binder-v1-4-08ba9197f637@google.com> Subject: [PATCH RFC 04/20] rust_binder: add work lists From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:02:54 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385705549181020 X-GMAIL-MSGID: 1781385705549181020 The binder driver uses linked lists of work items to store events that need to be delivered to userspace. There are work lists on both the process and threads. Work items are expected to implement the `DeliverToRead` trait, whose name signifies that this type is something that can be delivered to userspace via the read part of the `BINDER_WRITE_READ` ioctl. The trait defines what happens when a work item is executed, when it is cancelled, how the thread should be notified (`wake_up_interruptible_sync` or `wake_up_interruptible`?), and how it can be enqueued to a linked list. For each type that implements the trait, Rust will generate a vtable for the type. Pointers to the `dyn DeliverToRead` type will be fat pointers where the metadata of the pointer is a pointer to the vtable. We introduce the concept of a "ready thread". This is a thread that is currently waiting for work items inside the `get_work` method. The process will keep track of them and deliver new work items to one of the ready threads directly. When there are no ready threads, work items are stored in the process work list. The work lists added in this patch are not used yet, so the `push_work` methods are marked with `#[allow(dead_code)]` to silence the warnings about unused methods. A user is added in the next patch of this patch set. Co-developed-by: Wedson Almeida Filho Signed-off-by: Wedson Almeida Filho Signed-off-by: Alice Ryhl --- drivers/android/error.rs | 9 ++ drivers/android/process.rs | 126 ++++++++++++++++-- drivers/android/rust_binder.rs | 87 ++++++++++++- drivers/android/thread.rs | 284 +++++++++++++++++++++++++++++++++++++++-- scripts/Makefile.build | 2 +- 5 files changed, 488 insertions(+), 20 deletions(-) diff --git a/drivers/android/error.rs b/drivers/android/error.rs index 41fc4347ab55..a31b696efafc 100644 --- a/drivers/android/error.rs +++ b/drivers/android/error.rs @@ -11,6 +11,15 @@ pub(crate) struct BinderError { source: Option, } +impl BinderError { + pub(crate) fn new_dead() -> Self { + Self { + reply: BR_DEAD_REPLY, + source: None, + } + } +} + /// Convert an errno into a `BinderError` and store the errno used to construct it. The errno /// should be stored as the thread's extended error when given to userspace. 
impl From for BinderError { diff --git a/drivers/android/process.rs b/drivers/android/process.rs index 47d074dd8465..22662c7d388a 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -15,18 +15,24 @@ cred::Credential, file::{self, File, PollTable}, io_buffer::{IoBufferReader, IoBufferWriter}, - list::{HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks}, + list::{HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks}, mm, prelude::*, rbtree::RBTree, - sync::{Arc, ArcBorrow, SpinLock}, + sync::{lock::Guard, Arc, ArcBorrow, SpinLock}, task::Task, - types::ARef, + types::{ARef, Either}, user_ptr::{UserSlicePtr, UserSlicePtrReader}, workqueue::{self, Work}, }; -use crate::{context::Context, defs::*, thread::Thread}; +use crate::{ + context::Context, + defs::*, + error::BinderError, + thread::{PushWorkRes, Thread}, + DLArc, DTRWrap, DeliverToRead, +}; use core::mem::take; @@ -35,8 +41,10 @@ /// The fields of `Process` protected by the spinlock. pub(crate) struct ProcessInner { - is_dead: bool, + pub(crate) is_dead: bool, threads: RBTree>, + ready_threads: List, + work: List>, /// The number of requested threads that haven't registered yet. requested_thread_count: u32, @@ -54,6 +62,8 @@ fn new() -> Self { Self { is_dead: false, threads: RBTree::new(), + ready_threads: List::new(), + work: List::new(), requested_thread_count: 0, max_threads: 0, started_thread_count: 0, @@ -61,6 +71,37 @@ fn new() -> Self { } } + /// Schedule the work item for execution on this process. + /// + /// If any threads are ready for work, then the work item is given directly to that thread and + /// it is woken up. Otherwise, it is pushed to the process work list. + /// + /// This call can fail only if the process is dead. In this case, the work item is returned to + /// the caller so that the caller can drop it after releasing the inner process lock. This is + /// necessary since the destructor of `Transaction` will take locks that can't necessarily be + /// taken while holding the inner process lock. + #[allow(dead_code)] + pub(crate) fn push_work( + &mut self, + work: DLArc, + ) -> Result<(), (BinderError, DLArc)> { + // Try to find a ready thread to which to push the work. + if let Some(thread) = self.ready_threads.pop_front() { + // Push to thread while holding state lock. This prevents the thread from giving up + // (for example, because of a signal) when we're about to deliver work. + match thread.push_work(work) { + PushWorkRes::Ok => Ok(()), + PushWorkRes::FailedDead(work) => Err((BinderError::new_dead(), work)), + } + } else if self.is_dead { + Err((BinderError::new_dead(), work)) + } else { + // There are no ready threads. Push work to process queue. + self.work.push_back(work); + Ok(()) + } + } + fn register_thread(&mut self) -> bool { if self.requested_thread_count == 0 { return false; @@ -152,6 +193,31 @@ fn new(ctx: Arc, cred: ARef) -> Result> { Ok(process) } + /// Attempts to fetch a work item from the process queue. + pub(crate) fn get_work(&self) -> Option> { + self.inner.lock().work.pop_front() + } + + /// Attempts to fetch a work item from the process queue. If none is available, it registers the + /// given thread as ready to receive work directly. + /// + /// This must only be called when the thread is not participating in a transaction chain; when + /// it is, work will always be delivered directly to the thread (and not through the process + /// queue). 
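ProcessInner::push_work() above prefers direct delivery: if a thread is already parked in get_work() it is popped off ready_threads, handed the item, and woken; only when no thread is waiting does the item go onto the process-wide list, and a dead process hands the item back so the caller can drop it outside the spinlock. A much-simplified sketch of that three-way decision, with std types and an mpsc sender standing in for a parked thread (illustrative only, no wake-up semantics):

    use std::collections::VecDeque;
    use std::sync::mpsc::Sender;

    struct ProcessInner {
        // Senders for threads currently parked in get_work().
        ready_threads: VecDeque<Sender<String>>,
        // Process-wide queue used when no thread is ready.
        work: VecDeque<String>,
        is_dead: bool,
    }

    impl ProcessInner {
        /// On failure the item is handed back, mirroring how the kernel code
        /// returns the work item so it can be dropped after the lock is released.
        fn push_work(&mut self, item: String) -> Result<(), String> {
            if let Some(thread) = self.ready_threads.pop_front() {
                // A thread is already waiting: deliver directly and wake it.
                return thread.send(item).map_err(|e| e.0);
            }
            if self.is_dead {
                return Err(item);
            }
            // Nobody is waiting: leave the item on the process-wide list.
            self.work.push_back(item);
            Ok(())
        }
    }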
+ pub(crate) fn get_work_or_register<'a>( + &'a self, + thread: &'a Arc, + ) -> Either, Registration<'a>> { + let mut inner = self.inner.lock(); + // Try to get work from the process queue. + if let Some(work) = inner.work.pop_front() { + return Either::Left(work); + } + + // Register the thread as ready. + Either::Right(Registration::new(self, thread, &mut inner)) + } + fn get_thread(self: ArcBorrow<'_, Self>, id: i32) -> Result> { { let inner = self.inner.lock(); @@ -194,8 +260,9 @@ fn set_max_threads(&self, max: u32) { pub(crate) fn needs_thread(&self) -> bool { let mut inner = self.inner.lock(); - let ret = - inner.requested_thread_count == 0 && inner.started_thread_count < inner.max_threads; + let ret = inner.requested_thread_count == 0 + && inner.ready_threads.is_empty() + && inner.started_thread_count < inner.max_threads; if ret { inner.requested_thread_count += 1 } @@ -203,7 +270,10 @@ pub(crate) fn needs_thread(&self) -> bool { } fn deferred_flush(&self) { - // NOOP for now. + let inner = self.inner.lock(); + for thread in inner.threads.values() { + thread.exit_looper(); + } } fn deferred_release(self: Arc) { @@ -211,6 +281,11 @@ fn deferred_release(self: Arc) { self.ctx.deregister_process(&self); + // Cancel all pending work items. + while let Some(work) = self.get_work() { + work.into_arc().cancel(); + } + // Move the threads out of `inner` so that we can iterate over them without holding the // lock. let mut inner = self.inner.lock(); @@ -341,3 +416,38 @@ pub(crate) fn poll( Err(EINVAL) } } + +/// Represents that a thread has registered with the `ready_threads` list of its process. +/// +/// The destructor of this type will unregister the thread from the list of ready threads. +pub(crate) struct Registration<'a> { + process: &'a Process, + thread: &'a Arc, +} + +impl<'a> Registration<'a> { + fn new( + process: &'a Process, + thread: &'a Arc, + guard: &mut Guard<'_, ProcessInner, kernel::sync::lock::spinlock::SpinLockBackend>, + ) -> Self { + assert!(core::ptr::eq(process, &*thread.process)); + // INVARIANT: We are pushing this thread to the right `ready_threads` list. + if let Ok(list_arc) = ListArc::try_from_arc(thread.clone()) { + guard.ready_threads.push_front(list_arc); + } else { + pr_warn!("Same thread registered with `ready_threads` twice."); + } + Self { process, thread } + } +} + +impl Drop for Registration<'_> { + fn drop(&mut self) { + let mut inner = self.process.inner.lock(); + // SAFETY: The thread has the invariant that we never push it to any other linked list than + // the `ready_threads` list of its parent process. Therefore, the thread is either in that + // list, or in no list. + unsafe { inner.ready_threads.remove(self.thread) }; + } +} diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs index 64fd24ea8be1..55d475737cef 100644 --- a/drivers/android/rust_binder.rs +++ b/drivers/android/rust_binder.rs @@ -5,12 +5,16 @@ use kernel::{ bindings::{self, seq_file}, file::{File, PollTable}, + list::{ + HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks, ListLinksSelfPtr, TryNewListArc, + }, prelude::*, sync::Arc, types::ForeignOwnable, + user_ptr::UserSlicePtrWriter, }; -use crate::{context::Context, process::Process}; +use crate::{context::Context, process::Process, thread::Thread}; mod context; mod defs; @@ -26,6 +30,87 @@ license: "GPL", } +/// Specifies how a type should be delivered to the read part of a BINDER_WRITE_READ ioctl. 
+/// +/// When a value is pushed to the todo list for a process or thread, it is stored as a trait object +/// with the type `Arc`. Trait objects are a Rust feature that lets you +/// implement dynamic dispatch over many different types. This lets us store many different types +/// in the todo list. +trait DeliverToRead: ListArcSafe + Send + Sync { + /// Performs work. Returns true if remaining work items in the queue should be processed + /// immediately, or false if it should return to caller before processing additional work + /// items. + fn do_work(self: DArc, thread: &Thread, writer: &mut UserSlicePtrWriter) -> Result; + + /// Cancels the given work item. This is called instead of [`DeliverToRead::do_work`] when work + /// won't be delivered. + fn cancel(self: DArc) {} + + /// Should we use `wake_up_interruptible_sync` or `wake_up_interruptible` when scheduling this + /// work item? + /// + /// Generally only set to true for non-oneway transactions. + fn should_sync_wakeup(&self) -> bool; + + /// Get the debug name of this type. + fn debug_name(&self) -> &'static str { + core::any::type_name::() + } +} + +// Wrapper around a `DeliverToRead` with linked list links. +#[pin_data] +struct DTRWrap { + #[pin] + links: ListLinksSelfPtr>, + #[pin] + wrapped: T, +} +kernel::list::impl_has_list_links_self_ptr! { + impl HasSelfPtr> for DTRWrap { self.links } +} +kernel::list::impl_list_arc_safe! { + impl{T: ListArcSafe + ?Sized} ListArcSafe<0> for DTRWrap { + tracked_by wrapped: T; + } +} +kernel::list::impl_list_item! { + impl ListItem<0> for DTRWrap { + using ListLinksSelfPtr; + } +} + +impl core::ops::Deref for DTRWrap { + type Target = T; + fn deref(&self) -> &T { + &self.wrapped + } +} + +impl core::ops::Receiver for DTRWrap {} + +type DArc = kernel::sync::Arc>; +type DLArc = kernel::list::ListArc>; + +impl DTRWrap { + #[allow(dead_code)] + fn arc_try_new(val: T) -> Result, alloc::alloc::AllocError> { + ListArc::pin_init(pin_init!(Self { + links <- ListLinksSelfPtr::new(), + wrapped: val, + })) + .map_err(|_| alloc::alloc::AllocError) + } + + #[allow(dead_code)] + fn arc_pin_init(init: impl PinInit) -> Result, kernel::error::Error> { + ListArc::pin_init(pin_init!(Self { + links <- ListLinksSelfPtr::new(), + wrapped <- init, + })) + } +} + struct BinderModule {} impl kernel::Module for BinderModule { diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index 593c8e4f184e..a12c271a4e8f 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -8,24 +8,51 @@ use kernel::{ bindings, io_buffer::{IoBufferReader, IoBufferWriter}, + list::{ + AtomicListArcTracker, HasListLinks, List, ListArcSafe, ListItem, ListLinks, TryNewListArc, + }, prelude::*, - sync::{Arc, SpinLock}, + sync::{Arc, CondVar, SpinLock}, + types::Either, user_ptr::UserSlicePtr, }; -use crate::{defs::*, process::Process}; +use crate::{defs::*, process::Process, DLArc, DTRWrap, DeliverToRead}; use core::mem::size_of; +pub(crate) enum PushWorkRes { + Ok, + FailedDead(DLArc), +} + +impl PushWorkRes { + fn is_ok(&self) -> bool { + match self { + PushWorkRes::Ok => true, + PushWorkRes::FailedDead(_) => false, + } + } +} + /// The fields of `Thread` protected by the spinlock. struct InnerThread { /// Determines the looper state of the thread. It is a bit-wise combination of the constants /// prefixed with `LOOPER_`. looper_flags: u32, + /// Determines whether the looper should return. + looper_need_return: bool, + /// Determines if thread is dead. 
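As the commit message notes, work items are stored as Arc<dyn DeliverToRead>: a fat pointer whose metadata points at a per-type vtable, which is what lets a single todo list mix many concrete work-item types. A minimal, self-contained illustration of that mechanism in plain Rust, with hypothetical work-item types and a Vec<u8> standing in for the userspace read buffer (illustrative only, no kernel types):

    use std::sync::Arc;

    trait DeliverToRead {
        /// Write this work item into the read buffer; return true to keep
        /// processing further items.
        fn do_work(self: Arc<Self>, out: &mut Vec<u8>) -> bool;

        fn debug_name(&self) -> &'static str {
            std::any::type_name::<Self>()
        }
    }

    struct TransactionComplete;
    struct SpawnLooper;

    impl DeliverToRead for TransactionComplete {
        fn do_work(self: Arc<Self>, out: &mut Vec<u8>) -> bool {
            out.extend_from_slice(b"BR_TRANSACTION_COMPLETE ");
            true
        }
    }

    impl DeliverToRead for SpawnLooper {
        fn do_work(self: Arc<Self>, out: &mut Vec<u8>) -> bool {
            out.extend_from_slice(b"BR_SPAWN_LOOPER ");
            true
        }
    }

    fn main() {
        // One list holds several concrete types behind the same trait object.
        let todo: Vec<Arc<dyn DeliverToRead>> =
            vec![Arc::new(TransactionComplete), Arc::new(SpawnLooper)];
        let mut read_buffer = Vec::new();
        for item in todo {
            println!("delivering {}", item.debug_name());
            item.do_work(&mut read_buffer);
        }
    }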
is_dead: bool, + /// Determines whether the work list below should be processed. When set to false, `work_list` + /// is treated as if it were empty. + process_work_list: bool, + /// List of work items to deliver to userspace. + work_list: List>, + /// Extended error information for this thread. extended_error: ExtendedError, } @@ -34,6 +61,8 @@ struct InnerThread { const LOOPER_ENTERED: u32 = 0x02; const LOOPER_EXITED: u32 = 0x04; const LOOPER_INVALID: u32 = 0x08; +const LOOPER_WAITING: u32 = 0x10; +const LOOPER_WAITING_PROC: u32 = 0x20; impl InnerThread { fn new() -> Self { @@ -46,11 +75,42 @@ fn next_err_id() -> u32 { Self { looper_flags: 0, + looper_need_return: false, is_dead: false, + process_work_list: false, + work_list: List::new(), extended_error: ExtendedError::new(next_err_id(), BR_OK, 0), } } + fn pop_work(&mut self) -> Option> { + if !self.process_work_list { + return None; + } + + let ret = self.work_list.pop_front(); + self.process_work_list = !self.work_list.is_empty(); + ret + } + + #[allow(dead_code)] + fn push_work(&mut self, work: DLArc) -> PushWorkRes { + if self.is_dead { + PushWorkRes::FailedDead(work) + } else { + self.work_list.push_back(work); + self.process_work_list = true; + PushWorkRes::Ok + } + } + + /// Used to push work items that do not need to be processed immediately and can wait until the + /// thread gets another work item. + #[allow(dead_code)] + fn push_work_deferred(&mut self, work: DLArc) { + self.work_list.push_back(work); + } + fn looper_enter(&mut self) { self.looper_flags |= LOOPER_ENTERED; if self.looper_flags & LOOPER_REGISTERED != 0 { @@ -73,6 +133,14 @@ fn looper_exit(&mut self) { fn is_looper(&self) -> bool { self.looper_flags & (LOOPER_ENTERED | LOOPER_REGISTERED) != 0 } + + /// Determines whether the thread should attempt to fetch work items from the process queue. + /// This is case when the thread is not part of a transaction stack and it is registered as a + /// looper. Also, if there is local work, we want to return to userspace before we deliver any + /// remote work. + fn should_use_process_work_queue(&self) -> bool { + !self.process_work_list && self.is_looper() + } } /// This represents a thread that's used with binder. @@ -82,6 +150,29 @@ pub(crate) struct Thread { pub(crate) process: Arc, #[pin] inner: SpinLock, + #[pin] + work_condvar: CondVar, + /// Used to insert this thread into the process' `ready_threads` list. + /// + /// INVARIANT: May never be used for any other list than the `self.process.ready_threads`. + #[pin] + links: ListLinks, + #[pin] + links_track: AtomicListArcTracker, +} + +kernel::list::impl_has_list_links! { + impl HasListLinks<0> for Thread { self.links } +} +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<0> for Thread { + tracked_by links_track: AtomicListArcTracker; + } +} +kernel::list::impl_list_item! { + impl ListItem<0> for Thread { + using ListLinks; + } } impl Thread { @@ -90,6 +181,9 @@ pub(crate) fn new(id: i32, process: Arc) -> Result> { id, process, inner <- kernel::new_spinlock!(InnerThread::new(), "Thread::inner"), + work_condvar <- kernel::new_condvar!("Thread::work_condvar"), + links <- ListLinks::new(), + links_track <- AtomicListArcTracker::new(), })) } @@ -100,6 +194,123 @@ pub(crate) fn get_extended_error(&self, data: UserSlicePtr) -> Result { Ok(()) } + /// Attempts to fetch a work item from the thread-local queue. 
The behaviour if the queue is + /// empty depends on `wait`: if it is true, the function waits for some work to be queued (or a + /// signal); otherwise it returns indicating that none is available. + fn get_work_local(self: &Arc, wait: bool) -> Result>> { + { + let mut inner = self.inner.lock(); + if inner.looper_need_return { + return Ok(inner.pop_work()); + } + } + + // Try once if the caller does not want to wait. + if !wait { + return self.inner.lock().pop_work().ok_or(EAGAIN).map(Some); + } + + // Loop waiting only on the local queue (i.e., not registering with the process queue). + let mut inner = self.inner.lock(); + loop { + if let Some(work) = inner.pop_work() { + return Ok(Some(work)); + } + + inner.looper_flags |= LOOPER_WAITING; + let signal_pending = self.work_condvar.wait(&mut inner); + inner.looper_flags &= !LOOPER_WAITING; + + if signal_pending { + return Err(EINTR); + } + if inner.looper_need_return { + return Ok(None); + } + } + } + + /// Attempts to fetch a work item from the thread-local queue, falling back to the process-wide + /// queue if none is available locally. + /// + /// This must only be called when the thread is not participating in a transaction chain. If it + /// is, the local version (`get_work_local`) should be used instead. + fn get_work(self: &Arc, wait: bool) -> Result>> { + // Try to get work from the thread's work queue, using only a local lock. + { + let mut inner = self.inner.lock(); + if let Some(work) = inner.pop_work() { + return Ok(Some(work)); + } + if inner.looper_need_return { + drop(inner); + return Ok(self.process.get_work()); + } + } + + // If the caller doesn't want to wait, try to grab work from the process queue. + // + // We know nothing will have been queued directly to the thread queue because it is not in + // a transaction and it is not in the process' ready list. + if !wait { + return self.process.get_work().ok_or(EAGAIN).map(Some); + } + + // Get work from the process queue. If none is available, atomically register as ready. + let reg = match self.process.get_work_or_register(self) { + Either::Left(work) => return Ok(Some(work)), + Either::Right(reg) => reg, + }; + + let mut inner = self.inner.lock(); + loop { + if let Some(work) = inner.pop_work() { + return Ok(Some(work)); + } + + inner.looper_flags |= LOOPER_WAITING | LOOPER_WAITING_PROC; + let signal_pending = self.work_condvar.wait(&mut inner); + inner.looper_flags &= !(LOOPER_WAITING | LOOPER_WAITING_PROC); + + if signal_pending || inner.looper_need_return { + // We need to return now. We need to pull the thread off the list of ready threads + // (by dropping `reg`), then check the state again after it's off the list to + // ensure that something was not queued in the meantime. If something has been + // queued, we just return it (instead of the error). + drop(inner); + drop(reg); + + let res = match self.inner.lock().pop_work() { + Some(work) => Ok(Some(work)), + None if signal_pending => Err(EINTR), + None => Ok(None), + }; + return res; + } + } + } + + /// Push the provided work item to be delivered to user space via this thread. + /// + /// Returns whether the item was successfully pushed. This can only fail if the work item is + /// already in a work list. 
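Both get_work_local() and get_work() above follow the classic condition-variable consumer loop: pop under the lock, and if nothing is there, mark the thread as waiting and sleep on work_condvar until work arrives, a signal is pending, or looper_need_return sends the thread back to userspace. A reduced sketch of that loop using std::sync::Condvar (which, unlike the kernel CondVar, does not report pending signals) in place of the kernel types (illustrative only):

    use std::collections::VecDeque;
    use std::sync::{Condvar, Mutex};

    struct Inner {
        work: VecDeque<u32>,
        need_return: bool, // set by exit_looper() to kick the thread back to userspace
    }

    struct ThreadQueue {
        inner: Mutex<Inner>,
        work_condvar: Condvar,
    }

    impl ThreadQueue {
        /// Block until a work item is available or the thread must return.
        fn get_work_local(&self) -> Option<u32> {
            let mut inner = self.inner.lock().unwrap();
            loop {
                if let Some(work) = inner.work.pop_front() {
                    return Some(work);
                }
                if inner.need_return {
                    return None;
                }
                // Releases the lock while sleeping, reacquires it on wake-up.
                inner = self.work_condvar.wait(inner).unwrap();
            }
        }

        fn push_work(&self, work: u32) {
            self.inner.lock().unwrap().work.push_back(work);
            self.work_condvar.notify_one();
        }
    }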
+ #[allow(dead_code)] + pub(crate) fn push_work(&self, work: DLArc) -> PushWorkRes { + let sync = work.should_sync_wakeup(); + + let res = self.inner.lock().push_work(work); + + if res.is_ok() { + if sync { + self.work_condvar.notify_sync(); + } else { + self.work_condvar.notify_one(); + } + } + + res + } + fn write(self: &Arc, req: &mut BinderWriteRead) -> Result { let write_start = req.write_buffer.wrapping_add(req.write_consumed); let write_len = req.write_size - req.write_consumed; @@ -127,11 +338,19 @@ fn write(self: &Arc, req: &mut BinderWriteRead) -> Result { Ok(()) } - fn read(self: &Arc, req: &mut BinderWriteRead, _wait: bool) -> Result { + fn read(self: &Arc, req: &mut BinderWriteRead, wait: bool) -> Result { let read_start = req.read_buffer.wrapping_add(req.read_consumed); let read_len = req.read_size - req.read_consumed; let mut writer = UserSlicePtr::new(read_start as _, read_len as _).writer(); - let in_pool = self.inner.lock().is_looper(); + let (in_pool, use_proc_queue) = { + let inner = self.inner.lock(); + (inner.is_looper(), inner.should_use_process_work_queue()) + }; + let getter = if use_proc_queue { + Self::get_work + } else { + Self::get_work_local + }; // Reserve some room at the beginning of the read buffer so that we can send a // BR_SPAWN_LOOPER if we need to. @@ -145,13 +364,35 @@ fn read(self: &Arc, req: &mut BinderWriteRead, _wait: bool) -> Result { } // Loop doing work while there is room in the buffer. - #[allow(clippy::never_loop)] + let initial_len = writer.len(); while writer.len() >= size_of::() + 4 { - // There is enough space in the output buffer to process another work item. - // - // However, we have not yet added work items to the driver, so we immediately break - // from the loop. - break; + match getter(self, wait && initial_len == writer.len()) { + Ok(Some(work)) => { + let work_ty = work.debug_name(); + match work.into_arc().do_work(self, &mut writer) { + Ok(true) => {} + Ok(false) => break, + Err(err) => { + pr_warn!("Failure inside do_work of type {}.", work_ty); + return Err(err); + } + } + } + Ok(None) => { + break; + } + Err(err) => { + // Propagate the error if we haven't written anything else. + if err != EINTR && err != EAGAIN { + pr_warn!("Failure in work getter: {:?}", err); + } + if initial_len == writer.len() { + return Err(err); + } else { + break; + } + } + } } req.read_consumed += read_len - writer.len() as u64; @@ -178,6 +419,7 @@ pub(crate) fn write_read(self: &Arc, data: UserSlicePtr, wait: bool) -> Re ); req.read_consumed = 0; writer.write(&req)?; + self.inner.lock().looper_need_return = false; return Err(err); } } @@ -197,10 +439,32 @@ pub(crate) fn write_read(self: &Arc, data: UserSlicePtr, wait: bool) -> Re // Write the request back so that the consumed fields are visible to the caller. writer.write(&req)?; + + self.inner.lock().looper_need_return = false; + ret } + /// Make the call to `get_work` or `get_work_local` return immediately, if any. + pub(crate) fn exit_looper(&self) { + let mut inner = self.inner.lock(); + let should_notify = inner.looper_flags & LOOPER_WAITING != 0; + if should_notify { + inner.looper_need_return = true; + } + drop(inner); + + if should_notify { + self.work_condvar.notify_one(); + } + } + pub(crate) fn release(self: &Arc) { self.inner.lock().is_dead = true; + + // Cancel all pending work items. 
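One detail of the read loop above is its error policy: if the work getter fails (for example with EINTR or EAGAIN) after at least one item has already been written, the error is swallowed and the partially filled buffer is returned to userspace, which simply comes back for more later; only a failure with nothing written yet is propagated as an errno. A tiny sketch of that "return partial results" pattern, generic over a hypothetical item getter (illustrative only):

    fn fill_read_buffer(
        buf: &mut Vec<u8>,
        mut next_item: impl FnMut() -> Result<Option<u8>, i32>,
    ) -> Result<(), i32> {
        let initial_len = buf.len();
        loop {
            match next_item() {
                Ok(Some(byte)) => buf.push(byte),
                Ok(None) => break, // no more work: return what we have
                Err(errno) => {
                    // Only fail the whole call if nothing was produced yet;
                    // otherwise hand userspace the partial buffer.
                    if buf.len() == initial_len {
                        return Err(errno);
                    }
                    break;
                }
            }
        }
        Ok(())
    }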
+ while let Ok(Some(work)) = self.get_work_local(false) { + work.into_arc().cancel(); + } } } diff --git a/scripts/Makefile.build b/scripts/Makefile.build index b388f3d75d49..29108cd3377c 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -262,7 +262,7 @@ $(obj)/%.lst: $(src)/%.c FORCE # Compile Rust sources (.rs) # --------------------------------------------------------------------------- -rust_allowed_features := const_maybe_uninit_zeroed,new_uninit,offset_of,allocator_api +rust_allowed_features := const_maybe_uninit_zeroed,new_uninit,offset_of,allocator_api,receiver_trait # `--out-dir` is required to avoid temporaries being created by `rustc` in the # current working directory, which may be not accessible in the out-of-tree From patchwork Wed Nov 1 18:01:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160627 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp604982vqx; Wed, 1 Nov 2023 11:03:20 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEi5d6ivTXlR6XM8AUAyEOM45rNL8oBrxJtI4rMdXbzzPbuJBZkQbS6Spzlq6RVnJPkOLUq X-Received: by 2002:a05:6a00:391a:b0:68f:f38d:f76c with SMTP id fh26-20020a056a00391a00b0068ff38df76cmr15276696pfb.6.1698861799740; Wed, 01 Nov 2023 11:03:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861799; cv=none; d=google.com; s=arc-20160816; b=mwuqQ+sWbrY3OGSPmQkxQX8Uuwj2dgutszwOJMNB7d3B6Ria0ncTqptdBmv7wXmD8w qPDJdwqHY4qd7DVExL3IrKzpsCx6hP9z5iJ1U8lIqRYvuSPtRg+7HhaLpfLZXSuPi3Y5 0zANxFz9Xg1DEkqARmypik8fi5q1pRmDbaTkQ63D+rqaYNScrDMznky8kGC/SGAhJYUV 6szwS33bCWqj9NVHfqj2f9jkGXgoW9Cf8LbC/inLsuUazBKj8hGWQKgkz80oMhdDRpN4 noUAioLzFqrdFLrrq4eJ4x8fpI1LYDpQiNDt6R7GEmDaG5m/RO8usdckNh8A+OlcUgiF 09dA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=Hlkj7rWrD6B3SboM8qOndsSoFecqv93kKlDZWfC7+Co=; fh=cHdDrFPTfwdP0/Ip9jHI/T24Yd8xIIOhbocUOLU1mtg=; b=EpNtaGbiENCqHZXUApqimxW8PzKc+oBrZJZPQa968NudVvaPVyryvKQDDFpXiT3QTN hjTsIGaaubsj2tdeKMtLnLU6aZZHf13a+4h5ReODSPe4mBX6HJT6T7PpWhotSZuEVzM8 Lj2IQBauCenkooECGYpOkOUWuPwuPXjT0Dg40oSbHLZFDyMx8mJe7Qmxb/o6dfKjetrX khPNK5qiU7scXONE0vPj9NdGB+xzfY3qgsy5cGKD0wHkis6JAixShYTIIh0LwN3uF2Yf riKU4aESb7mmQxI4n7okdmZrYLLW1ix61D9FmfU3Rs9FmXnNPbv31YBmzyzvtDnuWfIu jADA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=EKATHzTD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from snail.vger.email (snail.vger.email. 
[23.128.96.37]) by mx.google.com with ESMTPS id ca5-20020a056a02068500b00565e7a3342dsi437195pgb.256.2023.11.01.11.03.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Nov 2023 11:03:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=EKATHzTD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 345D0818ABC1; Wed, 1 Nov 2023 11:02:57 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344844AbjKASCq (ORCPT + 34 others); Wed, 1 Nov 2023 14:02:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56848 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344758AbjKASCh (ORCPT ); Wed, 1 Nov 2023 14:02:37 -0400 Received: from mail-ej1-x64a.google.com (mail-ej1-x64a.google.com [IPv6:2a00:1450:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D663511B for ; Wed, 1 Nov 2023 11:02:29 -0700 (PDT) Received: by mail-ej1-x64a.google.com with SMTP id a640c23a62f3a-9c7f0a33afbso3999566b.3 for ; Wed, 01 Nov 2023 11:02:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698861748; x=1699466548; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Hlkj7rWrD6B3SboM8qOndsSoFecqv93kKlDZWfC7+Co=; b=EKATHzTDHijaKPL+lc+y0nsq/Veiv+tpti1DA96nLQd0ot0dXeCiwBXqA/OO/wuvvr /vWVXuQYS7t0Wf2evb5+jsRzHtikV+spWQ4tAmvsZpTZfal8N/LzNumXt5WL3EVd436I CZJk+ChiPOpLjfC84MmelVtfaGYq3ogSyG3NIzUVU7NvqdfsObcfM6XC46Erurby8hGa iIF6PKAsUH1t8P2Jj6qUpja2NZYNhFD7FmSjwjRkh9mqgzKiLvsWiK5Bl5W9huXtRAw/ xUtGrgJMvF90YY2iKqQlzxg/nLmHHITmS12bnnVGeWgdenCwOLo8wDf9NJ0TWZ6ZPhAm JFVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698861748; x=1699466548; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Hlkj7rWrD6B3SboM8qOndsSoFecqv93kKlDZWfC7+Co=; b=N8IYOL12E1hf64jlJsQ5GnwpFFnZ5MjIFzDy93B5kkd1wPhQrlMnEQmeswf5gudVsX ctz0TsT6JObx0+ulcbxqwdA9ZF4weyAZqrDZAw1dwcOVUEuUeOcCd1vKygmxC8S7l27A CxAn2YogCRP0O9+EVr23A58AhmHC7hyLn0mLYeu0ttdgiHxmOr8cNuXhbW3vwyqaXePk 9suARGc/qUV6RBel70Sjl1zudzdpZvb/JQE8rJRvtvdmf8cQYufNPknzAKw3tfn3OBHj T++VJMX51jj33BbD+Be9RTvAuoidAX9cOxu56m6M36yh+T2VK6JhGT/O0+RkD7Ira07a t8Xw== X-Gm-Message-State: AOJu0YwxKM5FV5P868KViXSNsy9ITd7+Zma1NJ0v3wl8yGjAbmCgQ/Pc qfDmqcE6IXwgbb/tPBSn0NqRXwBMIbo5ibo= X-Received: from aliceryhl.c.googlers.com ([fda3:e722:ac3:cc00:31:98fb:c0a8:6c8]) (user=aliceryhl job=sendgmr) by 2002:a17:906:b2cc:b0:9bd:ca2b:40fb with SMTP id cf12-20020a170906b2cc00b009bdca2b40fbmr29009ejb.1.1698861748309; Wed, 01 Nov 2023 11:02:28 -0700 (PDT) Date: Wed, 01 Nov 2023 18:01:35 +0000 In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com> Mime-Version: 1.0 References: <20231101-rust-binder-v1-0-08ba9197f637@google.com> X-Mailer: b4 0.13-dev-26615 Message-ID: 
<20231101-rust-binder-v1-5-08ba9197f637@google.com> Subject: [PATCH RFC 05/20] rust_binder: add nodes and context managers From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:02:57 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385710599358746 X-GMAIL-MSGID: 1781385710599358746 An important concept for the binder driver is a "node", which is a type of object defined by the binder driver that serves as a "binder server". Whenever you send a transaction, the recipient will be a node. Binder nodes can exist in many processes. The driver keeps track of this using two fields in `Process`. * The `nodes` rbtree. This structure stores all nodes that this process is the primary owner of. The `process` field of the `Node` struct will point at the process that has it in its `nodes` rbtree. * The `node_refs` collection. This keeps track of the nodes from other processes that this process holds a reference to. A process can only send transactions to nodes in this collection. From userspace, we also make a distinction between local nodes owned by the process itself, and proxy nodes that are owned by a different process. Generally, a process will refer to local nodes using the address of the corresponding userspace object, and it will refer to proxy nodes using a 32-bit id that the kernel assigns to the node. The 32-bit ids are local to each process and assigned consecutively (the same node can have a different 32-bit id in each external process that has a reference to it). Additionally, a node can also be stored in the context as the "context manager". There will only be one context manager for each context (that is, for each file in `/dev/binderfs`). The context manager implicitly has the id 0 in every other process, which means that all processes are able to access it by default. In a later patch, we will add the ability to send nodes from one process to another as part of a transaction. When this happens, the node is added to the `node_refs` collection of the target process, and the process will be able to start using it from then on. Except for the context manager node, sending nodes in this way is the *only* way for a process to obtain a reference to a node defined by another process. Generally, userspace processes are expected to send their nodes to the context manager process so that the context manager can pass it on to clients that want to connect to it. Binder nodes are reference counted through the kernel. This generally happens in the following manner: 1. 
Process A owns a binder node, which it stores in an allocation in userspace. This allocation is reference counted. 2. The kernel owns a `Node` object that holds a reference count to the userspace object in process A. Changes to this reference count are communicated to process A using the commands BR_ACQUIRE, BR_RELEASE, BR_INCREFS, and BR_DECREFS. 3. Other parts of the kernel own a `NodeRef` object that holds a reference count to the `Node` object. Destroying a `NodeRef` will decrement the refcount of the associated `Node` in the appropriate way. 4. Process B owns a proxy node, which is a userspace object. Using a 32-bit id, this proxy node refers to a `NodeRef` object in the kernel. When the proxy node is destroyed, userspace will use the commands BC_ACQUIRE, BC_RELEASE, BC_INCREFS, and BC_DECREFS to tell the kernel to modify the refcount on the `NodeRef` object. Via the above chain, process B can own a refcount that keeps a node in process A alive. There can also be other things than processes than own a `NodeRef`. For example, the context holds a `NodeRef` to the context manager node. This keeps the node alive, even if there are no other processes with a reference to it. In a later patch, we will see other instances of this - for example, a transaction's allocation will also own a `NodeRef` to any nodes embedded in it so that they don't go away while the process is handling the transaction. There is a potential race condition where the kernel sends BR_ACQUIRE immediately followed by BR_RELEASE. If these are delivered to two different userspace threads, then userspace might see them in reverse order, which could make the refcount drop to zero when it shouldn't. To prevent this from happening, userspace will respond to BR_ACQUIRE commands with a BC_ACQUIRE_DONE after incrementing the refcount. The kernel will postpone BR_RELEASE commands until after userspace has responded with BC_ACQUIRE_DONE, which ensures that this race cannot happen. Co-developed-by: Wedson Almeida Filho Signed-off-by: Wedson Almeida Filho Signed-off-by: Alice Ryhl --- drivers/android/context.rs | 44 ++++- drivers/android/defs.rs | 17 +- drivers/android/node.rs | 377 +++++++++++++++++++++++++++++++++++++++++ drivers/android/process.rs | 343 ++++++++++++++++++++++++++++++++++++- drivers/android/rust_binder.rs | 2 +- drivers/android/thread.rs | 13 +- rust/helpers.c | 6 + rust/kernel/security.rs | 8 + 8 files changed, 799 insertions(+), 11 deletions(-) diff --git a/drivers/android/context.rs b/drivers/android/context.rs index 630cb575d3ac..b5de9d98a6b0 100644 --- a/drivers/android/context.rs +++ b/drivers/android/context.rs @@ -3,11 +3,13 @@ use kernel::{ list::{HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks}, prelude::*, + security, str::{CStr, CString}, sync::{Arc, Mutex}, + task::Kuid, }; -use crate::process::Process; +use crate::{error::BinderError, node::NodeRef, process::Process}; // This module defines the global variable containing the list of contexts. Since the // `kernel::sync` bindings currently don't support mutexes in globals, we use a temporary @@ -70,6 +72,8 @@ pub(crate) struct ContextList { /// This struct keeps track of the processes using this context, and which process is the context /// manager. 
struct Manager { + node: Option, + uid: Option, all_procs: List, } @@ -103,6 +107,8 @@ pub(crate) fn new(name: &CStr) -> Result> { links <- ListLinks::new(), manager <- kernel::new_mutex!(Manager { all_procs: List::new(), + node: None, + uid: None, }, "Context::manager"), }))?; @@ -141,4 +147,40 @@ pub(crate) fn deregister_process(self: &Arc, proc: &Process) { self.manager.lock().all_procs.remove(proc); } } + + pub(crate) fn set_manager_node(&self, node_ref: NodeRef) -> Result { + let mut manager = self.manager.lock(); + if manager.node.is_some() { + pr_warn!("BINDER_SET_CONTEXT_MGR already set"); + return Err(EBUSY); + } + security::binder_set_context_mgr(&node_ref.node.owner.cred)?; + + // If the context manager has been set before, ensure that we use the same euid. + let caller_uid = Kuid::current_euid(); + if let Some(ref uid) = manager.uid { + if *uid != caller_uid { + return Err(EPERM); + } + } + + manager.node = Some(node_ref); + manager.uid = Some(caller_uid); + Ok(()) + } + + pub(crate) fn unset_manager_node(&self) { + let node_ref = self.manager.lock().node.take(); + drop(node_ref); + } + + pub(crate) fn get_manager_node(&self, strong: bool) -> Result { + self.manager + .lock() + .node + .as_ref() + .ok_or_else(BinderError::new_dead)? + .clone(strong) + .map_err(BinderError::from) + } } diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index 86173add2616..8a83df975e61 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -19,14 +19,24 @@ macro_rules! pub_no_prefix { BR_NOOP, BR_SPAWN_LOOPER, BR_TRANSACTION_COMPLETE, - BR_OK + BR_OK, + BR_INCREFS, + BR_ACQUIRE, + BR_RELEASE, + BR_DECREFS ); pub_no_prefix!( binder_driver_command_protocol_, BC_ENTER_LOOPER, BC_EXIT_LOOPER, - BC_REGISTER_LOOPER + BC_REGISTER_LOOPER, + BC_INCREFS, + BC_ACQUIRE, + BC_RELEASE, + BC_DECREFS, + BC_INCREFS_DONE, + BC_ACQUIRE_DONE ); macro_rules! decl_wrapper { @@ -54,6 +64,9 @@ fn deref_mut(&mut self) -> &mut Self::Target { }; } +decl_wrapper!(BinderNodeDebugInfo, bindings::binder_node_debug_info); +decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref); +decl_wrapper!(FlatBinderObject, bindings::flat_binder_object); decl_wrapper!(BinderWriteRead, bindings::binder_write_read); decl_wrapper!(BinderVersion, bindings::binder_version); decl_wrapper!(ExtendedError, bindings::binder_extended_error); diff --git a/drivers/android/node.rs b/drivers/android/node.rs new file mode 100644 index 000000000000..0ca4b72b8710 --- /dev/null +++ b/drivers/android/node.rs @@ -0,0 +1,377 @@ +// SPDX-License-Identifier: GPL-2.0 + +use kernel::{ + io_buffer::IoBufferWriter, + list::{AtomicListArcTracker, ListArcSafe, TryNewListArc}, + prelude::*, + sync::lock::{spinlock::SpinLockBackend, Guard}, + sync::{Arc, LockedBy}, + user_ptr::UserSlicePtrWriter, +}; + +use crate::{ + defs::*, + process::{Process, ProcessInner}, + thread::Thread, + DArc, DeliverToRead, +}; + +struct CountState { + /// The reference count. + count: usize, + /// Whether the process that owns this node thinks that we hold a refcount on it. (Note that + /// even if count is greater than one, we only increment it once in the owning process.) + has_count: bool, +} + +impl CountState { + fn new() -> Self { + Self { + count: 0, + has_count: false, + } + } +} + +struct NodeInner { + strong: CountState, + weak: CountState, + /// The number of active BR_INCREFS or BR_ACQUIRE operations. 
(should be maximum two) + /// + /// If this is non-zero, then we postpone any BR_RELEASE or BR_DECREFS notifications until the + /// active operations have ended. This avoids the situation an increment and decrement get + /// reordered from userspace's perspective. + active_inc_refs: u8, +} + +#[pin_data] +pub(crate) struct Node { + pub(crate) global_id: u64, + ptr: usize, + cookie: usize, + #[allow(dead_code)] + pub(crate) flags: u32, + pub(crate) owner: Arc, + inner: LockedBy, + #[pin] + links_track: AtomicListArcTracker, +} + +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<0> for Node { + tracked_by links_track: AtomicListArcTracker; + } +} + +impl Node { + pub(crate) fn new( + ptr: usize, + cookie: usize, + flags: u32, + owner: Arc, + ) -> impl PinInit { + use core::sync::atomic::{AtomicU64, Ordering}; + static NEXT_ID: AtomicU64 = AtomicU64::new(1); + + pin_init!(Self { + global_id: NEXT_ID.fetch_add(1, Ordering::Relaxed), + inner: LockedBy::new( + &owner.inner, + NodeInner { + strong: CountState::new(), + weak: CountState::new(), + active_inc_refs: 0, + }, + ), + ptr, + cookie, + flags, + owner, + links_track <- AtomicListArcTracker::new(), + }) + } + + pub(crate) fn get_id(&self) -> (usize, usize) { + (self.ptr, self.cookie) + } + + pub(crate) fn inc_ref_done_locked( + &self, + _strong: bool, + owner_inner: &mut ProcessInner, + ) -> bool { + let inner = self.inner.access_mut(owner_inner); + if inner.active_inc_refs == 0 { + pr_err!("inc_ref_done called when no active inc_refs"); + return false; + } + + inner.active_inc_refs -= 1; + if inner.active_inc_refs == 0 { + // Having active inc_refs can inhibit dropping of ref-counts. Calculate whether we + // would send a refcount decrement, and if so, tell the caller to schedule us. + let strong = inner.strong.count > 0; + let has_strong = inner.strong.has_count; + let weak = strong || inner.weak.count > 0; + let has_weak = inner.weak.has_count; + + let should_drop_weak = !weak && has_weak; + let should_drop_strong = !strong && has_strong; + + // If we want to drop the ref-count again, tell the caller to schedule a work node for + // that. + should_drop_weak || should_drop_strong + } else { + false + } + } + + pub(crate) fn update_refcount_locked( + &self, + inc: bool, + strong: bool, + count: usize, + owner_inner: &mut ProcessInner, + ) -> bool { + let is_dead = owner_inner.is_dead; + let inner = self.inner.access_mut(owner_inner); + + // Get a reference to the state we'll update. + let state = if strong { + &mut inner.strong + } else { + &mut inner.weak + }; + + // Update the count and determine whether we need to push work. 
+ if inc { + state.count += count; + !is_dead && !state.has_count + } else { + if state.count < count { + pr_err!("Failure: refcount underflow!"); + return false; + } + state.count -= count; + !is_dead && state.count == 0 && state.has_count + } + } + + pub(crate) fn update_refcount(self: &DArc, inc: bool, count: usize, strong: bool) { + self.owner + .inner + .lock() + .update_node_refcount(self, inc, strong, count, None); + } + + pub(crate) fn populate_counts( + &self, + out: &mut BinderNodeInfoForRef, + guard: &Guard<'_, ProcessInner, SpinLockBackend>, + ) { + let inner = self.inner.access(guard); + out.strong_count = inner.strong.count as _; + out.weak_count = inner.weak.count as _; + } + + pub(crate) fn populate_debug_info( + &self, + out: &mut BinderNodeDebugInfo, + guard: &Guard<'_, ProcessInner, SpinLockBackend>, + ) { + out.ptr = self.ptr as _; + out.cookie = self.cookie as _; + let inner = self.inner.access(guard); + if inner.strong.has_count { + out.has_strong_ref = 1; + } + if inner.weak.has_count { + out.has_weak_ref = 1; + } + } + + pub(crate) fn force_has_count(&self, guard: &mut Guard<'_, ProcessInner, SpinLockBackend>) { + let inner = self.inner.access_mut(guard); + inner.strong.has_count = true; + inner.weak.has_count = true; + } + + fn write(&self, writer: &mut UserSlicePtrWriter, code: u32) -> Result { + writer.write(&code)?; + writer.write(&self.ptr)?; + writer.write(&self.cookie)?; + Ok(()) + } +} + +impl DeliverToRead for Node { + fn do_work( + self: DArc, + _thread: &Thread, + writer: &mut UserSlicePtrWriter, + ) -> Result { + let mut owner_inner = self.owner.inner.lock(); + let inner = self.inner.access_mut(&mut owner_inner); + let strong = inner.strong.count > 0; + let has_strong = inner.strong.has_count; + let weak = strong || inner.weak.count > 0; + let has_weak = inner.weak.has_count; + + if weak && !has_weak { + inner.weak.has_count = true; + inner.active_inc_refs += 1; + } + + if strong && !has_strong { + inner.strong.has_count = true; + inner.active_inc_refs += 1; + } + + let no_active_inc_refs = inner.active_inc_refs == 0; + let should_drop_weak = no_active_inc_refs && (!weak && has_weak); + let should_drop_strong = no_active_inc_refs && (!strong && has_strong); + if should_drop_weak { + inner.weak.has_count = false; + } + if should_drop_strong { + inner.strong.has_count = false; + } + if no_active_inc_refs && !weak { + // Remove the node if there are no references to it. + owner_inner.remove_node(self.ptr); + } + drop(owner_inner); + + if weak && !has_weak { + self.write(writer, BR_INCREFS)?; + } + if strong && !has_strong { + self.write(writer, BR_ACQUIRE)?; + } + if should_drop_strong { + self.write(writer, BR_RELEASE)?; + } + if should_drop_weak { + self.write(writer, BR_DECREFS)?; + } + + Ok(true) + } + + fn should_sync_wakeup(&self) -> bool { + false + } +} + +/// Represents something that holds one or more ref-counts to a `Node`. +/// +/// Whenever process A holds a refcount to a node owned by a different process B, then process A +/// will store a `NodeRef` that refers to the `Node` in process B. When process A releases the +/// refcount, we destroy the NodeRef, which decrements the ref-count in process A. +/// +/// This type is also used for some other cases. For example, a transaction allocation holds a +/// refcount on the target node, and this is implemented by storing a `NodeRef` in the allocation +/// so that the destructor of the allocation will drop a refcount of the `Node`. 
+pub(crate) struct NodeRef { + pub(crate) node: DArc, + /// How many times does this NodeRef hold a refcount on the Node? + strong_node_count: usize, + weak_node_count: usize, + /// How many times does userspace hold a refcount on this NodeRef? + strong_count: usize, + weak_count: usize, +} + +impl NodeRef { + pub(crate) fn new(node: DArc, strong_count: usize, weak_count: usize) -> Self { + Self { + node, + strong_node_count: strong_count, + weak_node_count: weak_count, + strong_count, + weak_count, + } + } + + pub(crate) fn absorb(&mut self, mut other: Self) { + assert!( + Arc::ptr_eq(&self.node, &other.node), + "absorb called with differing nodes" + ); + self.strong_node_count += other.strong_node_count; + self.weak_node_count += other.weak_node_count; + self.strong_count += other.strong_count; + self.weak_count += other.weak_count; + other.strong_count = 0; + other.weak_count = 0; + other.strong_node_count = 0; + other.weak_node_count = 0; + } + + pub(crate) fn clone(&self, strong: bool) -> Result { + if strong && self.strong_count == 0 { + return Err(EINVAL); + } + Ok(self + .node + .owner + .inner + .lock() + .new_node_ref(self.node.clone(), strong, None)) + } + + /// Updates (increments or decrements) the number of references held against the node. If the + /// count being updated transitions from 0 to 1 or from 1 to 0, the node is notified by having + /// its `update_refcount` function called. + /// + /// Returns whether `self` should be removed (when both counts are zero). + pub(crate) fn update(&mut self, inc: bool, strong: bool) -> bool { + if strong && self.strong_count == 0 { + return false; + } + let (count, node_count, other_count) = if strong { + ( + &mut self.strong_count, + &mut self.strong_node_count, + self.weak_count, + ) + } else { + ( + &mut self.weak_count, + &mut self.weak_node_count, + self.strong_count, + ) + }; + if inc { + if *count == 0 { + *node_count = 1; + self.node.update_refcount(true, 1, strong); + } + *count += 1; + } else { + *count -= 1; + if *count == 0 { + self.node.update_refcount(false, *node_count, strong); + *node_count = 0; + return other_count == 0; + } + } + false + } +} + +impl Drop for NodeRef { + // This destructor is called conditionally from `Allocation::drop`. That branch is often + // mispredicted. Inlining this method call reduces the cost of those branch mispredictions. + #[inline(always)] + fn drop(&mut self) { + if self.strong_node_count > 0 { + self.node + .update_refcount(false, self.strong_node_count, true); + } + if self.weak_node_count > 0 { + self.node + .update_refcount(false, self.weak_node_count, false); + } + } +} diff --git a/drivers/android/process.rs b/drivers/android/process.rs index 22662c7d388a..2d8aa29776a1 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -19,7 +19,7 @@ mm, prelude::*, rbtree::RBTree, - sync::{lock::Guard, Arc, ArcBorrow, SpinLock}, + sync::{lock::Guard, Arc, ArcBorrow, Mutex, SpinLock}, task::Task, types::{ARef, Either}, user_ptr::{UserSlicePtr, UserSlicePtrReader}, @@ -30,8 +30,9 @@ context::Context, defs::*, error::BinderError, + node::{Node, NodeRef}, thread::{PushWorkRes, Thread}, - DLArc, DTRWrap, DeliverToRead, + DArc, DLArc, DTRWrap, DeliverToRead, }; use core::mem::take; @@ -41,9 +42,11 @@ /// The fields of `Process` protected by the spinlock. pub(crate) struct ProcessInner { + is_manager: bool, pub(crate) is_dead: bool, threads: RBTree>, ready_threads: List, + nodes: RBTree>, work: List>, /// The number of requested threads that haven't registered yet. 
@@ -60,9 +63,11 @@ pub(crate) struct ProcessInner { impl ProcessInner { fn new() -> Self { Self { + is_manager: false, is_dead: false, threads: RBTree::new(), ready_threads: List::new(), + nodes: RBTree::new(), work: List::new(), requested_thread_count: 0, max_threads: 0, @@ -80,7 +85,6 @@ fn new() -> Self { /// the caller so that the caller can drop it after releasing the inner process lock. This is /// necessary since the destructor of `Transaction` will take locks that can't necessarily be /// taken while holding the inner process lock. - #[allow(dead_code)] pub(crate) fn push_work( &mut self, work: DLArc, @@ -102,6 +106,81 @@ pub(crate) fn push_work( } } + pub(crate) fn remove_node(&mut self, ptr: usize) { + self.nodes.remove(&ptr); + } + + /// Updates the reference count on the given node. + pub(crate) fn update_node_refcount( + &mut self, + node: &DArc, + inc: bool, + strong: bool, + count: usize, + othread: Option<&Thread>, + ) { + let push = node.update_refcount_locked(inc, strong, count, self); + + // If we decided that we need to push work, push either to the process or to a thread if + // one is specified. + if push { + // It's not a problem if creating the ListArc fails, because that just means that + // it is already queued to a worklist. + if let Some(node) = ListArc::try_from_arc_or_drop(node.clone()) { + if let Some(thread) = othread { + thread.push_work_deferred(node); + } else { + let _ = self.push_work(node); + // Nothing to do: `push_work` may fail if the process is dead, but that's ok as in + // that case, it doesn't care about the notification. + } + } + } + } + + pub(crate) fn new_node_ref( + &mut self, + node: DArc, + strong: bool, + thread: Option<&Thread>, + ) -> NodeRef { + self.update_node_refcount(&node, true, strong, 1, thread); + let strong_count = if strong { 1 } else { 0 }; + NodeRef::new(node, strong_count, 1 - strong_count) + } + + /// Returns an existing node with the given pointer and cookie, if one exists. + /// + /// Returns an error if a node with the given pointer but a different cookie exists. + fn get_existing_node(&self, ptr: usize, cookie: usize) -> Result>> { + match self.nodes.get(&ptr) { + None => Ok(None), + Some(node) => { + let (_, node_cookie) = node.get_id(); + if node_cookie == cookie { + Ok(Some(node.clone())) + } else { + Err(EINVAL) + } + } + } + } + + /// Returns a reference to an existing node with the given pointer and cookie. It requires a + /// mutable reference because it needs to increment the ref count on the node, which may + /// require pushing work to the work queue (to notify userspace of 0 to 1 transitions). + fn get_existing_node_ref( + &mut self, + ptr: usize, + cookie: usize, + strong: bool, + thread: Option<&Thread>, + ) -> Result> { + Ok(self + .get_existing_node(ptr, cookie)? + .map(|node| self.new_node_ref(node, strong, thread))) + } + fn register_thread(&mut self) -> bool { if self.requested_thread_count == 0 { return false; @@ -113,6 +192,30 @@ fn register_thread(&mut self) -> bool { } } +struct NodeRefInfo { + node_ref: NodeRef, +} + +impl NodeRefInfo { + fn new(node_ref: NodeRef) -> Self { + Self { node_ref } + } +} + +struct ProcessNodeRefs { + by_handle: RBTree, + by_global_id: RBTree, +} + +impl ProcessNodeRefs { + fn new() -> Self { + Self { + by_handle: RBTree::new(), + by_global_id: RBTree::new(), + } + } +} + /// A process using binder. /// /// Strictly speaking, there can be multiple of these per process. 
There is one for each binder fd @@ -131,6 +234,11 @@ pub(crate) struct Process { #[pin] pub(crate) inner: SpinLock, + // Node references are in a different lock to avoid recursive acquisition when + // incrementing/decrementing a node in another process. + #[pin] + node_refs: Mutex, + // Work node for deferred work item. #[pin] defer_work: Work, @@ -182,6 +290,7 @@ fn new(ctx: Arc, cred: ARef) -> Result> { ctx, cred, inner <- kernel::new_spinlock!(ProcessInner::new(), "Process::inner"), + node_refs <- kernel::new_mutex!(ProcessNodeRefs::new(), "Process::node_refs"), task: kernel::current!().group_leader().into(), defer_work <- kernel::new_work!("Process::defer_work"), links <- ListLinks::new(), @@ -241,6 +350,167 @@ fn get_thread(self: ArcBorrow<'_, Self>, id: i32) -> Result> { Ok(ta) } + fn set_as_manager( + self: ArcBorrow<'_, Self>, + info: Option, + thread: &Thread, + ) -> Result { + let (ptr, cookie, flags) = if let Some(obj) = info { + ( + // SAFETY: The object type for this ioctl is implicitly `BINDER_TYPE_BINDER`, so it + // is safe to access the `binder` field. + unsafe { obj.__bindgen_anon_1.binder }, + obj.cookie, + obj.flags, + ) + } else { + (0, 0, 0) + }; + let node_ref = self.get_node(ptr as _, cookie as _, flags as _, true, Some(thread))?; + let node = node_ref.node.clone(); + self.ctx.set_manager_node(node_ref)?; + self.inner.lock().is_manager = true; + + // Force the state of the node to prevent the delivery of acquire/increfs. + let mut owner_inner = node.owner.inner.lock(); + node.force_has_count(&mut owner_inner); + Ok(()) + } + + pub(crate) fn get_node( + self: ArcBorrow<'_, Self>, + ptr: usize, + cookie: usize, + flags: u32, + strong: bool, + thread: Option<&Thread>, + ) -> Result { + // Try to find an existing node. + { + let mut inner = self.inner.lock(); + if let Some(node) = inner.get_existing_node_ref(ptr, cookie, strong, thread)? { + return Ok(node); + } + } + + // Allocate the node before reacquiring the lock. + let node = DTRWrap::arc_pin_init(Node::new(ptr, cookie, flags, self.into()))?.into_arc(); + let rbnode = RBTree::try_allocate_node(ptr, node.clone())?; + let mut inner = self.inner.lock(); + if let Some(node) = inner.get_existing_node_ref(ptr, cookie, strong, thread)? { + return Ok(node); + } + + inner.nodes.insert(rbnode); + Ok(inner.new_node_ref(node, strong, thread)) + } + + pub(crate) fn insert_or_update_handle( + &self, + node_ref: NodeRef, + is_mananger: bool, + ) -> Result { + { + let mut refs = self.node_refs.lock(); + + // Do a lookup before inserting. + if let Some(handle_ref) = refs.by_global_id.get(&node_ref.node.global_id) { + let handle = *handle_ref; + let info = refs.by_handle.get_mut(&handle).unwrap(); + info.node_ref.absorb(node_ref); + return Ok(handle); + } + } + + // Reserve memory for tree nodes. + let reserve1 = RBTree::try_reserve_node()?; + let reserve2 = RBTree::try_reserve_node()?; + + let mut refs = self.node_refs.lock(); + + // Do a lookup again as node may have been inserted before the lock was reacquired. + if let Some(handle_ref) = refs.by_global_id.get(&node_ref.node.global_id) { + let handle = *handle_ref; + let info = refs.by_handle.get_mut(&handle).unwrap(); + info.node_ref.absorb(node_ref); + return Ok(handle); + } + + // Find id. + let mut target: u32 = if is_mananger { 0 } else { 1 }; + for handle in refs.by_handle.keys() { + if *handle > target { + break; + } + if *handle == target { + target = target.checked_add(1).ok_or(ENOMEM)?; + } + } + + // Ensure the process is still alive while we insert a new reference. 
+ let inner = self.inner.lock(); + if inner.is_dead { + return Err(ESRCH); + } + refs.by_global_id + .insert(reserve1.into_node(node_ref.node.global_id, target)); + refs.by_handle + .insert(reserve2.into_node(target, NodeRefInfo::new(node_ref))); + Ok(target) + } + + pub(crate) fn get_node_from_handle(&self, handle: u32, strong: bool) -> Result { + self.node_refs + .lock() + .by_handle + .get(&handle) + .ok_or(ENOENT)? + .node_ref + .clone(strong) + } + + pub(crate) fn update_ref(&self, handle: u32, inc: bool, strong: bool) -> Result { + if inc && handle == 0 { + if let Ok(node_ref) = self.ctx.get_manager_node(strong) { + if core::ptr::eq(self, &*node_ref.node.owner) { + return Err(EINVAL); + } + let _ = self.insert_or_update_handle(node_ref, true); + return Ok(()); + } + } + + // To preserve original binder behaviour, we only fail requests where the manager tries to + // increment references on itself. + let mut refs = self.node_refs.lock(); + if let Some(info) = refs.by_handle.get_mut(&handle) { + if info.node_ref.update(inc, strong) { + // Remove reference from process tables. + let id = info.node_ref.node.global_id; + refs.by_handle.remove(&handle); + refs.by_global_id.remove(&id); + } + } + Ok(()) + } + + pub(crate) fn inc_ref_done(&self, reader: &mut UserSlicePtrReader, strong: bool) -> Result { + let ptr = reader.read::()?; + let cookie = reader.read::()?; + let mut inner = self.inner.lock(); + if let Ok(Some(node)) = inner.get_existing_node(ptr, cookie) { + if node.inc_ref_done_locked(strong, &mut inner) { + // It's not a problem if creating the ListArc fails, because that just means that + // it is already queued to a worklist. + if let Some(node) = ListArc::try_from_arc_or_drop(node) { + // This only fails if the process is dead. + let _ = inner.push_work(node); + } + } + } + Ok(()) + } + fn version(&self, data: UserSlicePtr) -> Result { data.writer().write(&BinderVersion::current()) } @@ -258,6 +528,57 @@ fn set_max_threads(&self, max: u32) { self.inner.lock().max_threads = max; } + fn get_node_debug_info(&self, data: UserSlicePtr) -> Result { + let (mut reader, mut writer) = data.reader_writer(); + + // Read the starting point. + let ptr = reader.read::()?.ptr as usize; + let mut out = BinderNodeDebugInfo::default(); + + { + let inner = self.inner.lock(); + for (node_ptr, node) in &inner.nodes { + if *node_ptr > ptr { + node.populate_debug_info(&mut out, &inner); + break; + } + } + } + + writer.write(&out) + } + + fn get_node_info_from_ref(&self, data: UserSlicePtr) -> Result { + let (mut reader, mut writer) = data.reader_writer(); + let mut out = reader.read::()?; + + if out.strong_count != 0 + || out.weak_count != 0 + || out.reserved1 != 0 + || out.reserved2 != 0 + || out.reserved3 != 0 + { + return Err(EINVAL); + } + + // Only the context manager is allowed to use this ioctl. + if !self.inner.lock().is_manager { + return Err(EPERM); + } + + let node_ref = self + .get_node_from_handle(out.handle, true) + .or(Err(EINVAL))?; + // Get the counts from the node. + { + let owner_inner = node_ref.node.owner.inner.lock(); + node_ref.node.populate_counts(&mut out, &owner_inner); + } + + // Write the result back. 
+ writer.write(&out) + } + pub(crate) fn needs_thread(&self) -> bool { let mut inner = self.inner.lock(); let ret = inner.requested_thread_count == 0 @@ -277,7 +598,15 @@ fn deferred_flush(&self) { } fn deferred_release(self: Arc) { - self.inner.lock().is_dead = true; + let is_manager = { + let mut inner = self.inner.lock(); + inner.is_dead = true; + inner.is_manager + }; + + if is_manager { + self.ctx.unset_manager_node(); + } self.ctx.deregister_process(&self); @@ -327,6 +656,10 @@ fn write( match cmd { bindings::BINDER_SET_MAX_THREADS => this.set_max_threads(reader.read()?), bindings::BINDER_THREAD_EXIT => this.remove_thread(thread), + bindings::BINDER_SET_CONTEXT_MGR => this.set_as_manager(None, &thread)?, + bindings::BINDER_SET_CONTEXT_MGR_EXT => { + this.set_as_manager(Some(reader.read()?), &thread)? + } _ => return Err(EINVAL), } Ok(0) @@ -342,6 +675,8 @@ fn read_write( let blocking = (file.flags() & file::flags::O_NONBLOCK) == 0; match cmd { bindings::BINDER_WRITE_READ => thread.write_read(data, blocking)?, + bindings::BINDER_GET_NODE_DEBUG_INFO => this.get_node_debug_info(data)?, + bindings::BINDER_GET_NODE_INFO_FOR_REF => this.get_node_info_from_ref(data)?, bindings::BINDER_VERSION => this.version(data)?, bindings::BINDER_GET_EXTENDED_ERROR => thread.get_extended_error(data)?, _ => return Err(EINVAL), diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs index 55d475737cef..2ef37cc2c556 100644 --- a/drivers/android/rust_binder.rs +++ b/drivers/android/rust_binder.rs @@ -19,6 +19,7 @@ mod context; mod defs; mod error; +mod node; mod process; mod thread; @@ -102,7 +103,6 @@ fn arc_try_new(val: T) -> Result, alloc::alloc::AllocError> { .map_err(|_| alloc::alloc::AllocError) } - #[allow(dead_code)] fn arc_pin_init(init: impl PinInit) -> Result, kernel::error::Error> { ListArc::pin_init(pin_init!(Self { links <- ListLinksSelfPtr::new(), diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index a12c271a4e8f..f7d62fc380e5 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -93,7 +93,6 @@ fn pop_work(&mut self) -> Option> { ret } - #[allow(dead_code)] fn push_work(&mut self, work: DLArc) -> PushWorkRes { if self.is_dead { PushWorkRes::FailedDead(work) @@ -106,7 +105,6 @@ fn push_work(&mut self, work: DLArc) -> PushWorkRes { /// Used to push work items that do not need to be processed immediately and can wait until the /// thread gets another work item. 
- #[allow(dead_code)] fn push_work_deferred(&mut self, work: DLArc) { self.work_list.push_back(work); } @@ -294,7 +292,6 @@ fn get_work(self: &Arc, wait: bool) -> Result) -> PushWorkRes { let sync = work.should_sync_wakeup(); @@ -311,6 +308,10 @@ pub(crate) fn push_work(&self, work: DLArc) -> PushWorkRes { res } + pub(crate) fn push_work_deferred(&self, work: DLArc) { + self.inner.lock().push_work_deferred(work); + } + fn write(self: &Arc, req: &mut BinderWriteRead) -> Result { let write_start = req.write_buffer.wrapping_add(req.write_consumed); let write_len = req.write_size - req.write_consumed; @@ -320,6 +321,12 @@ fn write(self: &Arc, req: &mut BinderWriteRead) -> Result { let before = reader.len(); let cmd = reader.read::()?; match cmd { + BC_INCREFS => self.process.update_ref(reader.read()?, true, false)?, + BC_ACQUIRE => self.process.update_ref(reader.read()?, true, true)?, + BC_RELEASE => self.process.update_ref(reader.read()?, false, true)?, + BC_DECREFS => self.process.update_ref(reader.read()?, false, false)?, + BC_INCREFS_DONE => self.process.inc_ref_done(&mut reader, false)?, + BC_ACQUIRE_DONE => self.process.inc_ref_done(&mut reader, true)?, BC_REGISTER_LOOPER => { let valid = self.process.register_thread(); self.inner.lock().looper_register(valid); diff --git a/rust/helpers.c b/rust/helpers.c index 2b436a7199e9..adb94ace2334 100644 --- a/rust/helpers.c +++ b/rust/helpers.c @@ -329,6 +329,12 @@ void rust_helper_security_release_secctx(char *secdata, u32 seclen) security_release_secctx(secdata, seclen); } EXPORT_SYMBOL_GPL(rust_helper_security_release_secctx); + +int rust_helper_security_binder_set_context_mgr(const struct cred *mgr) +{ + return security_binder_set_context_mgr(mgr); +} +EXPORT_SYMBOL_GPL(rust_helper_security_binder_set_context_mgr); #endif /* diff --git a/rust/kernel/security.rs b/rust/kernel/security.rs index 69c10ed89a57..f94c3c37560d 100644 --- a/rust/kernel/security.rs +++ b/rust/kernel/security.rs @@ -6,9 +6,17 @@ use crate::{ bindings, + cred::Credential, error::{to_result, Result}, }; +/// Calls the security modules to determine if the given task can become the manager of a binder +/// context. +pub fn binder_set_context_mgr(mgr: &Credential) -> Result { + // SAFETY: `mrg.0` is valid because the shared reference guarantees a nonzero refcount. + to_result(unsafe { bindings::security_binder_set_context_mgr(mgr.0.get()) }) +} + /// A security context string. /// /// The struct has the invariant that it always contains a valid security context. 
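
As a closing note on the refcounting scheme in this patch: the effect of `active_inc_refs` is easiest to see in isolation. The following is a minimal, self-contained sketch, not driver code; in the real implementation the state lives in `NodeInner` under the owning process' spinlock, and the `NodeCounts` and `Notify` names below are made up purely for illustration. It shows why a decrement that lands while a BR_ACQUIRE is still unacknowledged must not emit BR_RELEASE until the matching BC_ACQUIRE_DONE arrives.

// Hypothetical, simplified model of the per-node count state (`CountState`
// plus `active_inc_refs`). Locking, weak counts and error handling are
// omitted; this only illustrates the ordering guarantee described above.
#[derive(Default)]
struct NodeCounts {
    count: usize,        // refs held against this node by other processes
    has_count: bool,     // whether the owner was told to hold a ref (BR_ACQUIRE sent)
    active_inc_refs: u8, // BR_ACQUIRE sent but not yet acked via BC_ACQUIRE_DONE
}

#[derive(Debug, PartialEq)]
enum Notify {
    Acquire, // deliver BR_ACQUIRE to the owning process
    Release, // deliver BR_RELEASE to the owning process
    None,    // nothing to deliver (yet)
}

impl NodeCounts {
    fn inc(&mut self) -> Notify {
        self.count += 1;
        if !self.has_count {
            self.has_count = true;
            self.active_inc_refs += 1;
            Notify::Acquire
        } else {
            Notify::None
        }
    }

    fn dec(&mut self) -> Notify {
        self.count -= 1;
        if self.count == 0 && self.has_count && self.active_inc_refs == 0 {
            self.has_count = false;
            Notify::Release
        } else {
            // Either refs remain, or an unacknowledged BR_ACQUIRE is in
            // flight: postpone BR_RELEASE until BC_ACQUIRE_DONE arrives.
            Notify::None
        }
    }

    fn acquire_done(&mut self) -> Notify {
        self.active_inc_refs -= 1;
        if self.active_inc_refs == 0 && self.count == 0 && self.has_count {
            self.has_count = false;
            Notify::Release
        } else {
            Notify::None
        }
    }
}

fn main() {
    let mut n = NodeCounts::default();
    assert_eq!(n.inc(), Notify::Acquire);          // first ref: BR_ACQUIRE goes out
    assert_eq!(n.dec(), Notify::None);             // drops to zero, but the ack is pending
    assert_eq!(n.acquire_done(), Notify::Release); // BC_ACQUIRE_DONE unblocks BR_RELEASE
}

This mirrors what `inc_ref_done_locked` and `do_work` do in node.rs: the release notification is computed only once no increfs operation is outstanding, so userspace can never observe the release before the corresponding acquire has been acknowledged.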
From patchwork Wed Nov 1 18:01:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160629 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp605150vqx; Wed, 1 Nov 2023 11:03:32 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHarlq1hBQuoppK6WwEwaXMD9Sp9p7GVispavI7KotZncE/KN5u2qH0gAg+aNz+yxJZQaTi X-Received: by 2002:a05:6358:903:b0:168:e737:6b25 with SMTP id r3-20020a056358090300b00168e7376b25mr14066748rwi.20.1698861811945; Wed, 01 Nov 2023 11:03:31 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861811; cv=none; d=google.com; s=arc-20160816; b=0ci4iEdyikpDnQX9rg7o4e9pY7CcLq6EjOgDb3h2kNSWYIQK8EOpCq6+XDkSEM56T3 m/eaoZqcr00LNO1aJVVwUrRNfh1kMJLHI5pNMIQmSUFt74tzZ8jPe3qVSXmjWa8Tp97G 90AYK4qJQkpjw3DUZn/bZ+mGqTOoBnFC9K86rpPTTjgzxvK8QCNHEEKjwh3H8CEfHMgr wswG/Y/mjpIHELjv3pYB7BKKyvTvkttlFdv2/ye4nOk2RSnDlS4EN3pKLL/WT+HQb0XO Z4zzFlni4fZMDvlvKHtI2yrKPEUQhRZoce2vi4qFgkKE/eg3/zl7c4gFVZ0dQMfsJBSB GwRg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=Cp/lYPgceSFtl5mqvD/Fgl0DfL4AQ8R2jryIb+qcHAY=; fh=cHdDrFPTfwdP0/Ip9jHI/T24Yd8xIIOhbocUOLU1mtg=; b=TOYe4HIGA6y0k8rPAS28N/y356AKgahIxEA9lVjjPQ2jqQVP/r2ShwJbjTafy1mB/J PZymvBBgAzyMJ0Z1/BqpzFMQRrCb5rGRTKpQzpQZFIE5b0uiw+LCWuvWUk1wNXo/DkM3 m96ajfB5ywiyT2nxS5/f0zB4xWbJqNjyu7byXDs8b/cvJDU+PnNYYuC3LN3pQ76SXvzh hatj3a4UYjrIxsV3v6s+QNhB8mlk9+OdpbHodx6WgppbFjsGSTvnLRDcpF3RChOZ4QJM LTwO4uW9wuxKuHkXOIAbNsOcpPDQRnh/epnX4U7r17tcxsgJIgWgXnPUfklzjlXHYetv aVVQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=vBMedMD2; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from snail.vger.email (snail.vger.email. 
[2620:137:e000::3:7]) by mx.google.com with ESMTPS id cb22-20020a056a02071600b005b8f61fcba6si391052pgb.452.2023.11.01.11.03.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Nov 2023 11:03:31 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) client-ip=2620:137:e000::3:7; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=vBMedMD2; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:7 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id B39DC818ABC0; Wed, 1 Nov 2023 11:03:17 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344867AbjKASDG (ORCPT + 34 others); Wed, 1 Nov 2023 14:03:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39312 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1344838AbjKASDB (ORCPT ); Wed, 1 Nov 2023 14:03:01 -0400 Received: from mail-ed1-x54a.google.com (mail-ed1-x54a.google.com [IPv6:2a00:1450:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B8E7123 for ; Wed, 1 Nov 2023 11:02:32 -0700 (PDT) Received: by mail-ed1-x54a.google.com with SMTP id 4fb4d7f45d1cf-53fa5cd4480so38820a12.0 for ; Wed, 01 Nov 2023 11:02:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698861751; x=1699466551; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Cp/lYPgceSFtl5mqvD/Fgl0DfL4AQ8R2jryIb+qcHAY=; b=vBMedMD2MPVD8k4enKCq6Sr+6/1vpW7Hswg8g48Dg8DBOVLjjBhnnfoJRL6UxXZHfu R1bN7r96R44QRljgfYMNJP48ooo2zD67e2Nkx2pppztEAMuL5Ekl23+DfkqPRV5SuvV4 UHUMVGjKW/Lc8IKh3u5IZ0rARYPyyXuWBPdcmL+KD9g1Y27ONF2kiVJrCvsMhN4qHTTu FGH3FlKvXh81xgIjFL0mUXX8i4FZXMASDujKMHV5HFF+E0TaxpMMxodrW1yppnxuk0qg r/yTqZSuQY3nBEz70Vu/qcgfzdkxfoecmbUmvMC/e421AKaURiOZFAKFYpUVEb6/dsU6 VWBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698861751; x=1699466551; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Cp/lYPgceSFtl5mqvD/Fgl0DfL4AQ8R2jryIb+qcHAY=; b=NkFUTPqSghTHxuvrrcaoQTm/QztwzsR2PuyyXPw6ejmxDY5jq/FWtSRSzlxhUfrFlu eedJ8oZn6ijA25ccANd6aSZMWHPwYKo3IFMdlnZ48rdbwnEGF5z8K1rO+m42RY+V0u80 WBz+VbfCuWpHfx8FPkVDu9T2nkH34uyJ5FrFKEh9nb2H8iBx7iAe6EsM2ak5bzzHUSqX MBOzKjKq06YHjvPNreaZi/BAW+ElwkY7oXMbNsff1Yag6NmDBBvhHyWKO4TAlSXMN5PY 5czpZdgZutvQjHLSvvrpVvMzPc5Qgu0eycBI4HZeBZwqFYK/k8f6YMIMPR3fdo91ExIU nUaQ== X-Gm-Message-State: AOJu0Yz0C29Aovg2lS0nOn0Dfj1iS3rtwie7Khe3biiVK7ldgTC9jd6O RQUVNmkU98Gp/Oi9k4Z7RY4uxr+2/7/gd4k= X-Received: from aliceryhl.c.googlers.com ([fda3:e722:ac3:cc00:31:98fb:c0a8:6c8]) (user=aliceryhl job=sendgmr) by 2002:a05:6402:e84:b0:53d:bc68:51fc with SMTP id h4-20020a0564020e8400b0053dbc6851fcmr151990eda.2.1698861750910; Wed, 01 Nov 2023 11:02:30 -0700 (PDT) Date: Wed, 01 Nov 2023 18:01:36 +0000 In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com> Mime-Version: 1.0 References: <20231101-rust-binder-v1-0-08ba9197f637@google.com> X-Mailer: b4 
0.13-dev-26615 Message-ID: <20231101-rust-binder-v1-6-08ba9197f637@google.com> Subject: [PATCH RFC 06/20] rust_binder: add oneway transactions From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:03:17 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385723244201830 X-GMAIL-MSGID: 1781385723244201830 Add support for sending oneway transactions using the binder driver. To receive transactions, the process must first use mmap to create a memory region for holding the contents of incoming transactions. The driver will manage the resulting memory using two files: `allocation.rs` and `range_alloc.rs`. The `allocation.rs` file is responsible for actually managing the mmap'ed region of memory and has methods for writing to it. The `range_alloc.rs` file contains a data structure for tracking where in the mmap we are storing different things. It doesn't actually touch the mmap itself. Basically, it's a data structure that stores a set of non-overlapping intervals (the allocations) and it is able to find the smallest offset where the next X bytes are free and allocate that region. Other than that, this patch introduces a `Transaction` struct that stores the information related to a transaction, and adds the necessary infrastructure to send and receive them. This uses the work lists introduces in a previous patch to deliver incoming transactions. There are several different possible implementations of the range allocator, and we have implemented several of them. The simplest possible implementation is to use a linked list to store the allocations and free regions sorted by address. Another possibility is to store the same thing using a red-black tree. The red-black tree is preferable to the linked list because its accesses are logarithmic rather than linear. This RFC implements the range allocator using a red-black tree. We have also looked into replacing the red-black tree with an XArray. However, this is challenging because it doesn't have a good way to look up the smallest free region whose size is at least some lower bound. You can use `xa_find`, but there could be many free regions of the same size, which makes it a challenge to maintain this information correctly. We also run into issues with having to allocate while holding a lock. Finally, the XArray is not optimized for this use-case: all of the indices are going to have gaps between them. 
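
As an illustration of the lookup that the free tree makes cheap, here is a minimal userspace sketch of finding the smallest free region whose size is at least the requested size by keying free regions on `(size, offset)`. This is an assumption-laden stand-in, not the driver code: the `FreeList` name and the std `BTreeMap` take the place of the kernel `RBTree` with its preallocated node reservations, and freeing, coalescing of neighbouring regions, the oneway quota and per-allocation metadata are all omitted.

// Hypothetical best-fit lookup over free regions keyed by (size, offset).
// The first key >= (size, 0) is the smallest region that can hold `size`
// bytes, mirroring `cursor_lower_bound(&(size, 0))` in range_alloc.rs.
use std::collections::BTreeMap;

struct FreeList {
    by_size: BTreeMap<(usize, usize), ()>, // free regions as (size, offset)
}

impl FreeList {
    fn new(total: usize) -> Self {
        let mut by_size = BTreeMap::new();
        by_size.insert((total, 0), ());
        Self { by_size }
    }

    /// Reserve `size` bytes from the smallest free region that can hold them.
    /// Returns the offset of the reservation, or None if nothing fits.
    fn reserve(&mut self, size: usize) -> Option<usize> {
        // Lower-bound lookup on the ordered (size, offset) keys.
        let &(free_size, offset) = self.by_size.range((size, 0)..).next()?.0;
        self.by_size.remove(&(free_size, offset));
        // Keep any unused tail of the region on the free list.
        if free_size > size {
            self.by_size.insert((free_size - size, offset + size), ());
        }
        Some(offset)
    }
}

fn main() {
    let mut free = FreeList::new(4096);
    assert_eq!(free.reserve(128), Some(0));
    assert_eq!(free.reserve(512), Some(128));
    assert_eq!(free.reserve(8192), None); // nothing large enough remains
}

The same `(size, offset)` keying is what `find_best_match` in the new range_alloc.rs relies on via `cursor_lower_bound`.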
Co-developed-by: Wedson Almeida Filho Signed-off-by: Wedson Almeida Filho Co-developed-by: Matt Gilbride Signed-off-by: Matt Gilbride Signed-off-by: Alice Ryhl --- drivers/android/allocation.rs | 140 +++++++++++++++++ drivers/android/defs.rs | 39 +++++ drivers/android/error.rs | 10 ++ drivers/android/node.rs | 1 - drivers/android/process.rs | 171 ++++++++++++++++++-- drivers/android/range_alloc.rs | 344 +++++++++++++++++++++++++++++++++++++++++ drivers/android/rust_binder.rs | 54 +++++++ drivers/android/thread.rs | 208 +++++++++++++++++++++++-- drivers/android/transaction.rs | 163 +++++++++++++++++++ rust/helpers.c | 7 + rust/kernel/security.rs | 7 + 11 files changed, 1123 insertions(+), 21 deletions(-) diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs new file mode 100644 index 000000000000..1ab0f254fded --- /dev/null +++ b/drivers/android/allocation.rs @@ -0,0 +1,140 @@ +// SPDX-License-Identifier: GPL-2.0 +use core::mem::size_of_val; + +use kernel::{bindings, pages::Pages, prelude::*, sync::Arc, user_ptr::UserSlicePtrReader}; + +use crate::{node::NodeRef, process::Process}; + +#[derive(Default)] +pub(crate) struct AllocationInfo { + /// The target node of the transaction this allocation is associated to. + /// Not set for replies. + pub(crate) target_node: Option, + /// Zero the data in the buffer on free. + pub(crate) clear_on_free: bool, +} + +/// Represents an allocation that the kernel is currently using. +/// +/// When allocations are idle, the range allocator holds the data related to them. +pub(crate) struct Allocation { + pub(crate) offset: usize, + size: usize, + pub(crate) ptr: usize, + pages: Arc>>, + pub(crate) process: Arc, + allocation_info: Option, + free_on_drop: bool, +} + +impl Allocation { + pub(crate) fn new( + process: Arc, + offset: usize, + size: usize, + ptr: usize, + pages: Arc>>, + ) -> Self { + Self { + process, + offset, + size, + ptr, + pages, + allocation_info: None, + free_on_drop: true, + } + } + + fn iterate(&self, mut offset: usize, mut size: usize, mut cb: T) -> Result + where + T: FnMut(&Pages<0>, usize, usize) -> Result, + { + // Check that the request is within the buffer. + if offset.checked_add(size).ok_or(EINVAL)? > self.size { + return Err(EINVAL); + } + offset += self.offset; + let mut page_index = offset >> bindings::PAGE_SHIFT; + offset &= (1 << bindings::PAGE_SHIFT) - 1; + while size > 0 { + let available = core::cmp::min(size, (1 << bindings::PAGE_SHIFT) - offset); + cb(&self.pages[page_index], offset, available)?; + size -= available; + page_index += 1; + offset = 0; + } + Ok(()) + } + + pub(crate) fn copy_into( + &self, + reader: &mut UserSlicePtrReader, + offset: usize, + size: usize, + ) -> Result { + self.iterate(offset, size, |page, offset, to_copy| { + page.copy_into_page(reader, offset, to_copy) + }) + } + + pub(crate) fn write(&self, offset: usize, obj: &T) -> Result { + let mut obj_offset = 0; + self.iterate(offset, size_of_val(obj), |page, offset, to_copy| { + // SAFETY: The sum of `offset` and `to_copy` is bounded by the size of T. + let obj_ptr = unsafe { (obj as *const T as *const u8).add(obj_offset) }; + // SAFETY: We have a reference to the object, so the pointer is valid. 
+ unsafe { page.write(obj_ptr, offset, to_copy) }?; + obj_offset += to_copy; + Ok(()) + }) + } + + pub(crate) fn fill_zero(&self) -> Result { + self.iterate(0, self.size, |page, offset, len| { + page.fill_zero(offset, len) + }) + } + + pub(crate) fn keep_alive(mut self) { + self.process + .buffer_make_freeable(self.offset, self.allocation_info.take()); + self.free_on_drop = false; + } + + pub(crate) fn set_info(&mut self, info: AllocationInfo) { + self.allocation_info = Some(info); + } + + pub(crate) fn get_or_init_info(&mut self) -> &mut AllocationInfo { + self.allocation_info.get_or_insert_with(Default::default) + } + + pub(crate) fn set_info_clear_on_drop(&mut self) { + self.get_or_init_info().clear_on_free = true; + } + + pub(crate) fn set_info_target_node(&mut self, target_node: NodeRef) { + self.get_or_init_info().target_node = Some(target_node); + } +} + +impl Drop for Allocation { + fn drop(&mut self) { + if !self.free_on_drop { + return; + } + + if let Some(mut info) = self.allocation_info.take() { + info.target_node = None; + + if info.clear_on_free { + if let Err(e) = self.fill_zero() { + pr_warn!("Failed to clear data on free: {:?}", e); + } + } + } + + self.process.buffer_raw_free(self.ptr); + } +} diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index 8a83df975e61..d0fc00fa5a57 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -14,6 +14,9 @@ macro_rules! pub_no_prefix { pub_no_prefix!( binder_driver_return_protocol_, + BR_TRANSACTION, + BR_TRANSACTION_SEC_CTX, + BR_REPLY, BR_DEAD_REPLY, BR_FAILED_REPLY, BR_NOOP, @@ -28,6 +31,9 @@ macro_rules! pub_no_prefix { pub_no_prefix!( binder_driver_command_protocol_, + BC_TRANSACTION, + BC_TRANSACTION_SG, + BC_FREE_BUFFER, BC_ENTER_LOOPER, BC_EXIT_LOOPER, BC_REGISTER_LOOPER, @@ -39,6 +45,10 @@ macro_rules! pub_no_prefix { BC_ACQUIRE_DONE ); +pub(crate) const FLAT_BINDER_FLAG_TXN_SECURITY_CTX: u32 = + kernel::bindings::FLAT_BINDER_FLAG_TXN_SECURITY_CTX; +pub_no_prefix!(transaction_flags_, TF_ONE_WAY, TF_CLEAR_BUF); + macro_rules! decl_wrapper { ($newname:ident, $wrapped:ty) => { #[derive(Copy, Clone, Default)] @@ -67,6 +77,15 @@ fn deref_mut(&mut self) -> &mut Self::Target { decl_wrapper!(BinderNodeDebugInfo, bindings::binder_node_debug_info); decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref); decl_wrapper!(FlatBinderObject, bindings::flat_binder_object); +decl_wrapper!(BinderTransactionData, bindings::binder_transaction_data); +decl_wrapper!( + BinderTransactionDataSecctx, + bindings::binder_transaction_data_secctx +); +decl_wrapper!( + BinderTransactionDataSg, + bindings::binder_transaction_data_sg +); decl_wrapper!(BinderWriteRead, bindings::binder_write_read); decl_wrapper!(BinderVersion, bindings::binder_version); decl_wrapper!(ExtendedError, bindings::binder_extended_error); @@ -79,6 +98,26 @@ pub(crate) fn current() -> Self { } } +impl BinderTransactionData { + pub(crate) fn with_buffers_size(self, buffers_size: u64) -> BinderTransactionDataSg { + BinderTransactionDataSg(bindings::binder_transaction_data_sg { + transaction_data: self.0, + buffers_size, + }) + } +} + +impl BinderTransactionDataSecctx { + /// View the inner data as wrapped in `BinderTransactionData`. + pub(crate) fn tr_data(&mut self) -> &mut BinderTransactionData { + // SAFETY: Transparent wrapper is safe to transmute. 
+ unsafe { + &mut *(&mut self.transaction_data as *mut bindings::binder_transaction_data + as *mut BinderTransactionData) + } + } +} + impl ExtendedError { pub(crate) fn new(id: u32, command: u32, param: i32) -> Self { Self(bindings::binder_extended_error { id, command, param }) diff --git a/drivers/android/error.rs b/drivers/android/error.rs index a31b696efafc..430b0994affa 100644 --- a/drivers/android/error.rs +++ b/drivers/android/error.rs @@ -4,6 +4,8 @@ use crate::defs::*; +pub(crate) type BinderResult = core::result::Result; + /// An error that will be returned to userspace via the `BINDER_WRITE_READ` ioctl rather than via /// errno. pub(crate) struct BinderError { @@ -18,6 +20,14 @@ pub(crate) fn new_dead() -> Self { source: None, } } + + pub(crate) fn is_dead(&self) -> bool { + self.reply == BR_DEAD_REPLY + } + + pub(crate) fn as_errno(&self) -> core::ffi::c_int { + self.source.unwrap_or(EINVAL).to_errno() + } } /// Convert an errno into a `BinderError` and store the errno used to construct it. The errno diff --git a/drivers/android/node.rs b/drivers/android/node.rs index 0ca4b72b8710..c6c3d81e705d 100644 --- a/drivers/android/node.rs +++ b/drivers/android/node.rs @@ -49,7 +49,6 @@ pub(crate) struct Node { pub(crate) global_id: u64, ptr: usize, cookie: usize, - #[allow(dead_code)] pub(crate) flags: u32, pub(crate) owner: Arc, inner: LockedBy, diff --git a/drivers/android/process.rs b/drivers/android/process.rs index 2d8aa29776a1..26dd9309fbee 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -17,6 +17,7 @@ io_buffer::{IoBufferReader, IoBufferWriter}, list::{HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks}, mm, + pages::Pages, prelude::*, rbtree::RBTree, sync::{lock::Guard, Arc, ArcBorrow, Mutex, SpinLock}, @@ -27,16 +28,35 @@ }; use crate::{ + allocation::{Allocation, AllocationInfo}, context::Context, defs::*, - error::BinderError, + error::{BinderError, BinderResult}, node::{Node, NodeRef}, + range_alloc::{self, RangeAllocator}, thread::{PushWorkRes, Thread}, DArc, DLArc, DTRWrap, DeliverToRead, }; use core::mem::take; +struct Mapping { + address: usize, + alloc: RangeAllocator, + pages: Arc>>, +} + +impl Mapping { + fn new(address: usize, size: usize, pages: Arc>>) -> Result { + let alloc = RangeAllocator::new(size)?; + Ok(Self { + address, + alloc, + pages, + }) + } +} + const PROC_DEFER_FLUSH: u8 = 1; const PROC_DEFER_RELEASE: u8 = 2; @@ -47,6 +67,7 @@ pub(crate) struct ProcessInner { threads: RBTree>, ready_threads: List, nodes: RBTree>, + mapping: Option, work: List>, /// The number of requested threads that haven't registered yet. @@ -67,6 +88,7 @@ fn new() -> Self { is_dead: false, threads: RBTree::new(), ready_threads: List::new(), + mapping: None, nodes: RBTree::new(), work: List::new(), requested_thread_count: 0, @@ -459,6 +481,15 @@ pub(crate) fn insert_or_update_handle( Ok(target) } + pub(crate) fn get_transaction_node(&self, handle: u32) -> BinderResult { + // When handle is zero, try to get the context manager. + if handle == 0 { + Ok(self.ctx.get_manager_node(true)?) + } else { + Ok(self.get_node_from_handle(handle, true)?) 
+ } + } + pub(crate) fn get_node_from_handle(&self, handle: u32, strong: bool) -> Result { self.node_refs .lock() @@ -511,6 +542,97 @@ pub(crate) fn inc_ref_done(&self, reader: &mut UserSlicePtrReader, strong: bool) Ok(()) } + pub(crate) fn buffer_alloc( + self: &Arc, + size: usize, + is_oneway: bool, + ) -> BinderResult { + let alloc = range_alloc::ReserveNewBox::try_new()?; + let mut inner = self.inner.lock(); + let mapping = inner.mapping.as_mut().ok_or_else(BinderError::new_dead)?; + let offset = mapping.alloc.reserve_new(size, is_oneway, alloc)?; + Ok(Allocation::new( + self.clone(), + offset, + size, + mapping.address + offset, + mapping.pages.clone(), + )) + } + + pub(crate) fn buffer_get(self: &Arc, ptr: usize) -> Option { + let mut inner = self.inner.lock(); + let mapping = inner.mapping.as_mut()?; + let offset = ptr.checked_sub(mapping.address)?; + let (size, odata) = mapping.alloc.reserve_existing(offset).ok()?; + let mut alloc = Allocation::new(self.clone(), offset, size, ptr, mapping.pages.clone()); + if let Some(data) = odata { + alloc.set_info(data); + } + Some(alloc) + } + + pub(crate) fn buffer_raw_free(&self, ptr: usize) { + let mut inner = self.inner.lock(); + if let Some(ref mut mapping) = &mut inner.mapping { + if ptr < mapping.address + || mapping + .alloc + .reservation_abort(ptr - mapping.address) + .is_err() + { + pr_warn!( + "Pointer {:x} failed to free, base = {:x}\n", + ptr, + mapping.address + ); + } + } + } + + pub(crate) fn buffer_make_freeable(&self, offset: usize, data: Option) { + let mut inner = self.inner.lock(); + if let Some(ref mut mapping) = &mut inner.mapping { + if mapping.alloc.reservation_commit(offset, data).is_err() { + pr_warn!("Offset {} failed to be marked freeable\n", offset); + } + } + } + + fn create_mapping(&self, vma: &mut mm::virt::Area) -> Result { + use kernel::bindings::PAGE_SIZE; + let size = core::cmp::min(vma.end() - vma.start(), bindings::SZ_4M as usize); + let page_count = size / PAGE_SIZE; + + // Allocate and map all pages. + // + // N.B. If we fail halfway through mapping these pages, the kernel will unmap them. + let mut pages = Vec::new(); + pages.try_reserve_exact(page_count)?; + let mut address = vma.start(); + for _ in 0..page_count { + let page = Pages::<0>::new()?; + vma.insert_page(address, &page)?; + pages.try_push(page)?; + address += PAGE_SIZE; + } + + let ref_pages = Arc::try_new(pages)?; + let mapping = Mapping::new(vma.start(), size, ref_pages)?; + + // Save pages for later. + let mut inner = self.inner.lock(); + match &inner.mapping { + None => inner.mapping = Some(mapping), + Some(_) => { + drop(inner); + drop(mapping); + return Err(EBUSY); + } + } + Ok(()) + } + fn version(&self, data: UserSlicePtr) -> Result { data.writer().write(&BinderVersion::current()) } @@ -610,11 +732,6 @@ fn deferred_release(self: Arc) { self.ctx.deregister_process(&self); - // Cancel all pending work items. - while let Some(work) = self.get_work() { - work.into_arc().cancel(); - } - // Move the threads out of `inner` so that we can iterate over them without holding the // lock. let mut inner = self.inner.lock(); @@ -625,6 +742,26 @@ fn deferred_release(self: Arc) { for thread in threads.values() { thread.release(); } + + // Cancel all pending work items. + while let Some(work) = self.get_work() { + work.into_arc().cancel(); + } + + // Free any resources kept alive by allocated buffers. 
+ let omapping = self.inner.lock().mapping.take(); + if let Some(mut mapping) = omapping { + let address = mapping.address; + let pages = mapping.pages.clone(); + mapping.alloc.take_for_each(|offset, size, odata| { + let ptr = offset + address; + let mut alloc = Allocation::new(self.clone(), offset, size, ptr, pages.clone()); + if let Some(data) = odata { + alloc.set_info(data); + } + drop(alloc) + }); + } } pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result { @@ -736,11 +873,27 @@ pub(crate) fn compat_ioctl( } pub(crate) fn mmap( - _this: ArcBorrow<'_, Process>, + this: ArcBorrow<'_, Process>, _file: &File, - _vma: &mut mm::virt::Area, + vma: &mut mm::virt::Area, ) -> Result { - Err(EINVAL) + // We don't allow mmap to be used in a different process. + if !core::ptr::eq(kernel::current!().group_leader(), &*this.task) { + return Err(EINVAL); + } + if vma.start() == 0 { + return Err(EINVAL); + } + let mut flags = vma.flags(); + use mm::virt::flags::*; + if flags & WRITE != 0 { + return Err(EPERM); + } + flags |= DONTCOPY | MIXEDMAP; + flags &= !MAYWRITE; + vma.set_flags(flags); + // TODO: Set ops. We need to learn when the user unmaps so that we can stop using it. + this.create_mapping(vma) } pub(crate) fn poll( diff --git a/drivers/android/range_alloc.rs b/drivers/android/range_alloc.rs new file mode 100644 index 000000000000..e757129613cf --- /dev/null +++ b/drivers/android/range_alloc.rs @@ -0,0 +1,344 @@ +// SPDX-License-Identifier: GPL-2.0 + +use kernel::{ + prelude::*, + rbtree::{RBTree, RBTreeNode, RBTreeNodeReservation}, +}; + +/// Keeps track of allocations in a process' mmap. +/// +/// Each process has an mmap where the data for incoming transactions will be placed. This struct +/// keeps track of allocations made in the mmap. For each allocation, we store a descriptor that +/// has metadata related to the allocation. We also keep track of available free space. +pub(crate) struct RangeAllocator { + tree: RBTree>, + free_tree: RBTree, + free_oneway_space: usize, +} + +impl RangeAllocator { + pub(crate) fn new(size: usize) -> Result { + let mut tree = RBTree::new(); + tree.try_create_and_insert(0, Descriptor::new(0, size))?; + let mut free_tree = RBTree::new(); + free_tree.try_create_and_insert((size, 0), ())?; + Ok(Self { + free_oneway_space: size / 2, + tree, + free_tree, + }) + } + + fn find_best_match(&mut self, size: usize) -> Option<&mut Descriptor> { + let free_cursor = self.free_tree.cursor_lower_bound(&(size, 0))?; + let ((_, offset), _) = free_cursor.current(); + self.tree.get_mut(offset) + } + + /// Try to reserve a new buffer, using the provided allocation if necessary. + pub(crate) fn reserve_new( + &mut self, + size: usize, + is_oneway: bool, + alloc: ReserveNewBox, + ) -> Result { + // Compute new value of free_oneway_space, which is set only on success. 
+ let new_oneway_space = if is_oneway { + match self.free_oneway_space.checked_sub(size) { + Some(new_oneway_space) => new_oneway_space, + None => return Err(ENOSPC), + } + } else { + self.free_oneway_space + }; + + let (found_size, found_off, tree_node, free_tree_node) = match self.find_best_match(size) { + None => { + pr_warn!("ENOSPC from range_alloc.reserve_new - size: {}", size); + return Err(ENOSPC); + } + Some(desc) => { + let found_size = desc.size; + let found_offset = desc.offset; + + // In case we need to break up the descriptor + let new_desc = Descriptor::new(found_offset + size, found_size - size); + let (tree_node, free_tree_node, desc_node_res) = alloc.initialize(new_desc); + + desc.state = Some(DescriptorState::new(is_oneway, desc_node_res)); + desc.size = size; + + (found_size, found_offset, tree_node, free_tree_node) + } + }; + self.free_oneway_space = new_oneway_space; + self.free_tree.remove(&(found_size, found_off)); + + if found_size != size { + self.tree.insert(tree_node); + self.free_tree.insert(free_tree_node); + } + + Ok(found_off) + } + + pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result { + let mut cursor = self.tree.cursor_lower_bound(&offset).ok_or_else(|| { + pr_warn!( + "EINVAL from range_alloc.reservation_abort - offset: {}", + offset + ); + EINVAL + })?; + + let (_, desc) = cursor.current_mut(); + + if desc.offset != offset { + pr_warn!( + "EINVAL from range_alloc.reservation_abort - offset: {}", + offset + ); + return Err(EINVAL); + } + + let reservation = desc.try_change_state(|state| match state { + Some(DescriptorState::Reserved(reservation)) => (None, Ok(reservation)), + None => { + pr_warn!( + "EINVAL from range_alloc.reservation_abort - offset: {}", + offset + ); + (None, Err(EINVAL)) + } + allocated => { + pr_warn!( + "EPERM from range_alloc.reservation_abort - offset: {}", + offset + ); + (allocated, Err(EPERM)) + } + })?; + + let mut size = desc.size; + let mut offset = desc.offset; + let free_oneway_space_add = if reservation.is_oneway { size } else { 0 }; + + self.free_oneway_space += free_oneway_space_add; + + // Merge next into current if next is free + let remove_next = match cursor.peek_next() { + Some((_, next)) if next.state.is_none() => { + self.free_tree.remove(&(next.size, next.offset)); + size += next.size; + true + } + _ => false, + }; + + if remove_next { + let (_, desc) = cursor.current_mut(); + desc.size = size; + cursor.remove_next(); + } + + // Merge current into prev if prev is free + match cursor.peek_prev_mut() { + Some((_, prev)) if prev.state.is_none() => { + // merge previous with current, remove current + self.free_tree.remove(&(prev.size, prev.offset)); + offset = prev.offset; + size += prev.size; + prev.size = size; + cursor.remove_current(); + } + _ => {} + }; + + self.free_tree + .insert(reservation.free_res.into_node((size, offset), ())); + + Ok(()) + } + + pub(crate) fn reservation_commit(&mut self, offset: usize, data: Option) -> Result { + let desc = self.tree.get_mut(&offset).ok_or_else(|| { + pr_warn!( + "ENOENT from range_alloc.reservation_commit - offset: {}", + offset + ); + ENOENT + })?; + + desc.try_change_state(|state| match state { + Some(DescriptorState::Reserved(reservation)) => ( + Some(DescriptorState::Allocated(reservation.allocate(data))), + Ok(()), + ), + other => { + pr_warn!( + "ENOENT from range_alloc.reservation_commit - offset: {}", + offset + ); + (other, Err(ENOENT)) + } + }) + } + + /// Takes an entry at the given offset from [`DescriptorState::Allocated`] to + /// 
[`DescriptorState::Reserved`]. + /// + /// Returns the size of the existing entry and the data associated with it. + pub(crate) fn reserve_existing(&mut self, offset: usize) -> Result<(usize, Option)> { + let desc = self.tree.get_mut(&offset).ok_or_else(|| { + pr_warn!( + "ENOENT from range_alloc.reserve_existing - offset: {}", + offset + ); + ENOENT + })?; + + let data = desc.try_change_state(|state| match state { + Some(DescriptorState::Allocated(allocation)) => { + let (reservation, data) = allocation.deallocate(); + (Some(DescriptorState::Reserved(reservation)), Ok(data)) + } + other => { + pr_warn!( + "ENOENT from range_alloc.reserve_existing - offset: {}", + offset + ); + (other, Err(ENOENT)) + } + })?; + + Ok((desc.size, data)) + } + + /// Call the provided callback at every allocated region. + /// + /// This destroys the range allocator. Used only during shutdown. + pub(crate) fn take_for_each)>(&mut self, callback: F) { + for (_, desc) in self.tree.iter_mut() { + if let Some(DescriptorState::Allocated(allocation)) = &mut desc.state { + callback(desc.offset, desc.size, allocation.take()); + } + } + } +} + +struct Descriptor { + size: usize, + offset: usize, + state: Option>, +} + +impl Descriptor { + fn new(offset: usize, size: usize) -> Self { + Self { + size, + offset, + state: None, + } + } + + fn try_change_state(&mut self, f: F) -> Result + where + F: FnOnce(Option>) -> (Option>, Result), + { + let (new_state, result) = f(self.state.take()); + self.state = new_state; + result + } +} + +enum DescriptorState { + Reserved(Reservation), + Allocated(Allocation), +} + +impl DescriptorState { + fn new(is_oneway: bool, free_res: FreeNodeRes) -> Self { + DescriptorState::Reserved(Reservation { + is_oneway, + free_res, + }) + } +} + +struct Reservation { + is_oneway: bool, + free_res: FreeNodeRes, +} + +impl Reservation { + fn allocate(self, data: Option) -> Allocation { + Allocation { + data, + is_oneway: self.is_oneway, + free_res: self.free_res, + } + } +} + +struct Allocation { + is_oneway: bool, + free_res: FreeNodeRes, + data: Option, +} + +impl Allocation { + fn deallocate(self) -> (Reservation, Option) { + ( + Reservation { + is_oneway: self.is_oneway, + free_res: self.free_res, + }, + self.data, + ) + } + + fn take(&mut self) -> Option { + self.data.take() + } +} + +// (Descriptor.size, Descriptor.offset) +type FreeKey = (usize, usize); +type FreeNodeRes = RBTreeNodeReservation; + +/// An allocation for use by `reserve_new`. 
+pub(crate) struct ReserveNewBox { + tree_node_res: RBTreeNodeReservation>, + free_tree_node_res: FreeNodeRes, + desc_node_res: FreeNodeRes, +} + +impl ReserveNewBox { + pub(crate) fn try_new() -> Result { + let tree_node_res = RBTree::try_reserve_node()?; + let free_tree_node_res = RBTree::try_reserve_node()?; + let desc_node_res = RBTree::try_reserve_node()?; + Ok(Self { + tree_node_res, + free_tree_node_res, + desc_node_res, + }) + } + + fn initialize( + self, + desc: Descriptor, + ) -> ( + RBTreeNode>, + RBTreeNode, + FreeNodeRes, + ) { + let size = desc.size; + let offset = desc.offset; + ( + self.tree_node_res.into_node(offset, desc), + self.free_tree_node_res.into_node((size, offset), ()), + self.desc_node_res, + ) + } +} diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs index 2ef37cc2c556..218c2001e8cb 100644 --- a/drivers/android/rust_binder.rs +++ b/drivers/android/rust_binder.rs @@ -5,6 +5,7 @@ use kernel::{ bindings::{self, seq_file}, file::{File, PollTable}, + io_buffer::IoBufferWriter, list::{ HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks, ListLinksSelfPtr, TryNewListArc, }, @@ -16,12 +17,17 @@ use crate::{context::Context, process::Process, thread::Thread}; +use core::sync::atomic::{AtomicBool, Ordering}; + +mod allocation; mod context; mod defs; mod error; mod node; mod process; +mod range_alloc; mod thread; +mod transaction; module! { type: BinderModule, @@ -111,6 +117,54 @@ fn arc_pin_init(init: impl PinInit) -> Result, kernel::error::Error> } } +struct DeliverCode { + code: u32, + skip: AtomicBool, +} + +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<0> for DeliverCode { untracked; } +} + +impl DeliverCode { + fn new(code: u32) -> Self { + Self { + code, + skip: AtomicBool::new(false), + } + } + + /// Disable this DeliverCode and make it do nothing. + /// + /// This is used instead of removing it from the work list, since `LinkedList::remove` is + /// unsafe, whereas this method is not. 
+ fn skip(&self) { + self.skip.store(true, Ordering::Relaxed); + } +} + +impl DeliverToRead for DeliverCode { + fn do_work( + self: DArc, + _thread: &Thread, + writer: &mut UserSlicePtrWriter, + ) -> Result { + if !self.skip.load(Ordering::Relaxed) { + writer.write(&self.code)?; + } + Ok(true) + } + + fn should_sync_wakeup(&self) -> bool { + false + } +} + +const fn ptr_align(value: usize) -> usize { + let size = core::mem::size_of::() - 1; + (value + size) & !size +} + struct BinderModule {} impl kernel::Module for BinderModule { diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index f7d62fc380e5..f34de7ad6e6f 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -9,17 +9,25 @@ bindings, io_buffer::{IoBufferReader, IoBufferWriter}, list::{ - AtomicListArcTracker, HasListLinks, List, ListArcSafe, ListItem, ListLinks, TryNewListArc, + AtomicListArcTracker, HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks, + TryNewListArc, }, prelude::*, + security, sync::{Arc, CondVar, SpinLock}, types::Either, - user_ptr::UserSlicePtr, + user_ptr::{UserSlicePtr, UserSlicePtrWriter}, }; -use crate::{defs::*, process::Process, DLArc, DTRWrap, DeliverToRead}; +use crate::{ + allocation::Allocation, defs::*, error::BinderResult, process::Process, ptr_align, + transaction::Transaction, DArc, DLArc, DTRWrap, DeliverCode, DeliverToRead, +}; -use core::mem::size_of; +use core::{ + mem::size_of, + sync::atomic::{AtomicU32, Ordering}, +}; pub(crate) enum PushWorkRes { Ok, @@ -47,6 +55,10 @@ struct InnerThread { /// Determines if thread is dead. is_dead: bool, + /// Work item used to deliver error codes to the current thread. Stored here so that it can be + /// reused. + return_work: DArc, + /// Determines whether the work list below should be processed. When set to false, `work_list` /// is treated as if it were empty. process_work_list: bool, @@ -65,22 +77,21 @@ struct InnerThread { const LOOPER_WAITING_PROC: u32 = 0x20; impl InnerThread { - fn new() -> Self { - use core::sync::atomic::{AtomicU32, Ordering}; - + fn new() -> Result { fn next_err_id() -> u32 { static EE_ID: AtomicU32 = AtomicU32::new(0); EE_ID.fetch_add(1, Ordering::Relaxed) } - Self { + Ok(Self { looper_flags: 0, looper_need_return: false, is_dead: false, process_work_list: false, + return_work: ThreadError::try_new()?, work_list: List::new(), extended_error: ExtendedError::new(next_err_id(), BR_OK, 0), - } + }) } fn pop_work(&mut self) -> Option> { @@ -103,6 +114,15 @@ fn push_work(&mut self, work: DLArc) -> PushWorkRes { } } + fn push_return_work(&mut self, reply: u32) { + if let Ok(work) = ListArc::try_from_arc(self.return_work.clone()) { + work.set_error_code(reply); + self.push_work(work); + } else { + pr_warn!("Thread return work is already in use."); + } + } + /// Used to push work items that do not need to be processed immediately and can wait until the /// thread gets another work item. 
fn push_work_deferred(&mut self, work: DLArc) { @@ -175,10 +195,12 @@ impl ListItem<0> for Thread { impl Thread { pub(crate) fn new(id: i32, process: Arc) -> Result> { + let inner = InnerThread::new()?; + Arc::pin_init(pin_init!(Thread { id, process, - inner <- kernel::new_spinlock!(InnerThread::new(), "Thread::inner"), + inner <- kernel::new_spinlock!(inner, "Thread::inner"), work_condvar <- kernel::new_condvar!("Thread::work_condvar"), links <- ListLinks::new(), links_track <- AtomicListArcTracker::new(), @@ -312,15 +334,131 @@ pub(crate) fn push_work_deferred(&self, work: DLArc) { self.inner.lock().push_work_deferred(work); } + pub(crate) fn copy_transaction_data( + &self, + to_process: Arc, + tr: &BinderTransactionDataSg, + txn_security_ctx_offset: Option<&mut usize>, + ) -> BinderResult { + let trd = &tr.transaction_data; + let is_oneway = trd.flags & TF_ONE_WAY != 0; + let mut secctx = if let Some(offset) = txn_security_ctx_offset { + let secid = self.process.cred.get_secid(); + let ctx = match security::SecurityCtx::from_secid(secid) { + Ok(ctx) => ctx, + Err(err) => { + pr_warn!("Failed to get security ctx for id {}: {:?}", secid, err); + return Err(err.into()); + } + }; + Some((offset, ctx)) + } else { + None + }; + + let data_size = trd.data_size.try_into().map_err(|_| EINVAL)?; + let adata_size = ptr_align(data_size); + let asecctx_size = secctx + .as_ref() + .map(|(_, ctx)| ptr_align(ctx.len())) + .unwrap_or(0); + + // This guarantees that at least `sizeof(usize)` bytes will be allocated. + let len = usize::max( + adata_size.checked_add(asecctx_size).ok_or(ENOMEM)?, + size_of::(), + ); + let secctx_off = adata_size; + let alloc = match to_process.buffer_alloc(len, is_oneway) { + Ok(alloc) => alloc, + Err(err) => { + pr_warn!( + "Failed to allocate buffer. 
len:{}, is_oneway:{}", + len, + is_oneway + ); + return Err(err); + } + }; + + let mut buffer_reader = + unsafe { UserSlicePtr::new(trd.data.ptr.buffer as _, data_size) }.reader(); + + alloc.copy_into(&mut buffer_reader, 0, data_size)?; + + if let Some((off_out, secctx)) = secctx.as_mut() { + if let Err(err) = alloc.write(secctx_off, secctx.as_bytes()) { + pr_warn!("Failed to write security context: {:?}", err); + return Err(err.into()); + } + **off_out = secctx_off; + } + Ok(alloc) + } + + fn transaction(self: &Arc, tr: &BinderTransactionDataSg, inner: T) + where + T: FnOnce(&Arc, &BinderTransactionDataSg) -> BinderResult, + { + if let Err(err) = inner(self, tr) { + if err.reply != BR_TRANSACTION_COMPLETE { + let mut ee = self.inner.lock().extended_error; + ee.command = err.reply; + ee.param = err.as_errno(); + pr_warn!( + "Transaction failed: {:?} my_pid:{}", + err, + self.process.task.pid_in_current_ns() + ); + } + + self.inner.lock().push_return_work(err.reply); + } + } + + fn oneway_transaction_inner(self: &Arc, tr: &BinderTransactionDataSg) -> BinderResult { + let handle = unsafe { tr.transaction_data.target.handle }; + let node_ref = self.process.get_transaction_node(handle)?; + security::binder_transaction(&self.process.cred, &node_ref.node.owner.cred)?; + let list_completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?; + let transaction = Transaction::new(node_ref, self, tr)?; + let completion = list_completion.clone_arc(); + self.inner.lock().push_work(list_completion); + match transaction.submit() { + Ok(()) => Ok(()), + Err(err) => { + completion.skip(); + Err(err) + } + } + } + fn write(self: &Arc, req: &mut BinderWriteRead) -> Result { let write_start = req.write_buffer.wrapping_add(req.write_consumed); let write_len = req.write_size - req.write_consumed; let mut reader = UserSlicePtr::new(write_start as _, write_len as _).reader(); - while reader.len() >= size_of::() { + while reader.len() >= size_of::() && self.inner.lock().return_work.is_unused() { let before = reader.len(); let cmd = reader.read::()?; match cmd { + BC_TRANSACTION => { + let tr = reader.read::()?.with_buffers_size(0); + if tr.transaction_data.flags & TF_ONE_WAY != 0 { + self.transaction(&tr, Self::oneway_transaction_inner); + } else { + return Err(EINVAL); + } + } + BC_TRANSACTION_SG => { + let tr = reader.read::()?; + if tr.transaction_data.flags & TF_ONE_WAY != 0 { + self.transaction(&tr, Self::oneway_transaction_inner); + } else { + return Err(EINVAL); + } + } + BC_FREE_BUFFER => drop(self.process.buffer_get(reader.read()?)), BC_INCREFS => self.process.update_ref(reader.read()?, true, false)?, BC_ACQUIRE => self.process.update_ref(reader.read()?, true, true)?, BC_RELEASE => self.process.update_ref(reader.read()?, false, true)?, @@ -475,3 +613,51 @@ pub(crate) fn release(self: &Arc) { } } } + +#[pin_data] +struct ThreadError { + error_code: AtomicU32, + #[pin] + links_track: AtomicListArcTracker, +} + +impl ThreadError { + fn try_new() -> Result> { + DTRWrap::arc_pin_init(pin_init!(Self { + error_code: AtomicU32::new(BR_OK), + links_track <- AtomicListArcTracker::new(), + })) + .map(ListArc::into_arc) + } + + fn set_error_code(&self, code: u32) { + self.error_code.store(code, Ordering::Relaxed); + } + + fn is_unused(&self) -> bool { + self.error_code.load(Ordering::Relaxed) == BR_OK + } +} + +impl DeliverToRead for ThreadError { + fn do_work( + self: DArc, + _thread: &Thread, + writer: &mut UserSlicePtrWriter, + ) -> Result { + let code = self.error_code.load(Ordering::Relaxed); + 
self.error_code.store(BR_OK, Ordering::Relaxed); + writer.write(&code)?; + Ok(true) + } + + fn should_sync_wakeup(&self) -> bool { + false + } +} + +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<0> for ThreadError { + tracked_by links_track: AtomicListArcTracker; + } +} diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs new file mode 100644 index 000000000000..8b4274ddc415 --- /dev/null +++ b/drivers/android/transaction.rs @@ -0,0 +1,163 @@ +// SPDX-License-Identifier: GPL-2.0 + +use kernel::{ + io_buffer::IoBufferWriter, + list::ListArcSafe, + prelude::*, + sync::{Arc, SpinLock}, + task::Kuid, + user_ptr::UserSlicePtrWriter, +}; + +use crate::{ + allocation::Allocation, + defs::*, + error::BinderResult, + node::{Node, NodeRef}, + process::Process, + ptr_align, + thread::Thread, + DArc, DLArc, DTRWrap, DeliverToRead, +}; + +#[pin_data] +pub(crate) struct Transaction { + target_node: Option>, + pub(crate) from: Arc, + to: Arc, + #[pin] + allocation: SpinLock>, + code: u32, + pub(crate) flags: u32, + data_size: usize, + data_address: usize, + sender_euid: Kuid, + txn_security_ctx_off: Option, +} + +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<0> for Transaction { untracked; } +} + +impl Transaction { + pub(crate) fn new( + node_ref: NodeRef, + from: &Arc, + tr: &BinderTransactionDataSg, + ) -> BinderResult> { + let trd = &tr.transaction_data; + let txn_security_ctx = node_ref.node.flags & FLAT_BINDER_FLAG_TXN_SECURITY_CTX != 0; + let mut txn_security_ctx_off = if txn_security_ctx { Some(0) } else { None }; + let to = node_ref.node.owner.clone(); + let mut alloc = + match from.copy_transaction_data(to.clone(), tr, txn_security_ctx_off.as_mut()) { + Ok(alloc) => alloc, + Err(err) => { + if !err.is_dead() { + pr_warn!("Failure in copy_transaction_data: {:?}", err); + } + return Err(err); + } + }; + if trd.flags & TF_ONE_WAY == 0 { + pr_warn!("Non-oneway transactions are not yet supported."); + return Err(EINVAL.into()); + } + if trd.flags & TF_CLEAR_BUF != 0 { + alloc.set_info_clear_on_drop(); + } + let target_node = node_ref.node.clone(); + alloc.set_info_target_node(node_ref); + let data_address = alloc.ptr; + + Ok(DTRWrap::arc_pin_init(pin_init!(Transaction { + target_node: Some(target_node), + sender_euid: from.process.cred.euid(), + from: from.clone(), + to, + code: trd.code, + flags: trd.flags, + data_size: trd.data_size as _, + data_address, + allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"), + txn_security_ctx_off, + }))?) + } + + /// Submits the transaction to a work queue. + pub(crate) fn submit(self: DLArc) -> BinderResult { + let process = self.to.clone(); + let mut process_inner = process.inner.lock(); + match process_inner.push_work(self) { + Ok(()) => Ok(()), + Err((err, work)) => { + // Drop work after releasing process lock. 
+ drop(process_inner); + drop(work); + Err(err) + } + } + } +} + +impl DeliverToRead for Transaction { + fn do_work( + self: DArc, + _thread: &Thread, + writer: &mut UserSlicePtrWriter, + ) -> Result { + let mut tr_sec = BinderTransactionDataSecctx::default(); + let tr = tr_sec.tr_data(); + if let Some(target_node) = &self.target_node { + let (ptr, cookie) = target_node.get_id(); + tr.target.ptr = ptr as _; + tr.cookie = cookie as _; + }; + tr.code = self.code; + tr.flags = self.flags; + tr.data_size = self.data_size as _; + tr.data.ptr.buffer = self.data_address as _; + tr.offsets_size = 0; + if tr.offsets_size > 0 { + tr.data.ptr.offsets = (self.data_address + ptr_align(self.data_size)) as _; + } + tr.sender_euid = self.sender_euid.into_uid_in_current_ns(); + tr.sender_pid = 0; + if self.target_node.is_some() && self.flags & TF_ONE_WAY == 0 { + // Not a reply and not one-way. + tr.sender_pid = self.from.process.task.pid_in_current_ns(); + } + let code = if self.target_node.is_none() { + BR_REPLY + } else if self.txn_security_ctx_off.is_some() { + BR_TRANSACTION_SEC_CTX + } else { + BR_TRANSACTION + }; + + // Write the transaction code and data to the user buffer. + writer.write(&code)?; + if let Some(off) = self.txn_security_ctx_off { + tr_sec.secctx = (self.data_address + off) as u64; + writer.write(&tr_sec)?; + } else { + writer.write(&*tr)?; + } + + // It is now the user's responsibility to clear the allocation. + let alloc = self.allocation.lock().take(); + if let Some(alloc) = alloc { + alloc.keep_alive(); + } + + Ok(false) + } + + fn cancel(self: DArc) { + drop(self.allocation.lock().take()); + } + + fn should_sync_wakeup(&self) -> bool { + self.flags & TF_ONE_WAY == 0 + } +} diff --git a/rust/helpers.c b/rust/helpers.c index adb94ace2334..e70255f3774f 100644 --- a/rust/helpers.c +++ b/rust/helpers.c @@ -335,6 +335,13 @@ int rust_helper_security_binder_set_context_mgr(const struct cred *mgr) return security_binder_set_context_mgr(mgr); } EXPORT_SYMBOL_GPL(rust_helper_security_binder_set_context_mgr); + +int rust_helper_security_binder_transaction(const struct cred *from, + const struct cred *to) +{ + return security_binder_transaction(from, to); +} +EXPORT_SYMBOL_GPL(rust_helper_security_binder_transaction); #endif /* diff --git a/rust/kernel/security.rs b/rust/kernel/security.rs index f94c3c37560d..9e3e4cf08ecb 100644 --- a/rust/kernel/security.rs +++ b/rust/kernel/security.rs @@ -17,6 +17,13 @@ pub fn binder_set_context_mgr(mgr: &Credential) -> Result { to_result(unsafe { bindings::security_binder_set_context_mgr(mgr.0.get()) }) } +/// Calls the security modules to determine if binder transactions are allowed from task `from` to +/// task `to`. +pub fn binder_transaction(from: &Credential, to: &Credential) -> Result { + // SAFETY: `from` and `to` are valid because the shared references guarantee nonzero refcounts. + to_result(unsafe { bindings::security_binder_transaction(from.0.get(), to.0.get()) }) +} + /// A security context string. /// /// The struct has the invariant that it always contains a valid security context. 
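The `security_binder_transaction()` hook added above is consumed on the transaction path introduced in drivers/android/thread.rs earlier in this patch. As a rough orientation, the flow reduces to the following sketch; it assumes the crate-internal items from this series (Thread, Transaction, BinderResult, BinderTransactionDataSg) and omits the BR_TRANSACTION_COMPLETE bookkeeping and error logging that the real thread.rs code performs, so it is an illustration rather than driver code:

    use kernel::{security, sync::Arc};

    // Condensed sketch of the oneway transaction path (not verbatim driver code).
    fn start_oneway(thread: &Arc<Thread>, tr: &BinderTransactionDataSg) -> BinderResult {
        // Handle-based transactions carry the target in the `handle` union field,
        // hence the unsafe union read.
        let handle = unsafe { tr.transaction_data.target.handle };
        // Handle 0 refers to the context manager; other handles are looked up in
        // the sender's ref table.
        let node_ref = thread.process.get_transaction_node(handle)?;
        // Let the LSMs veto the transaction based on the sender's credentials and
        // the credentials of the process that owns the target node.
        security::binder_transaction(&thread.process.cred, &node_ref.node.owner.cred)?;
        // Copy the payload into the target process' mmap and queue the work item.
        let transaction = Transaction::new(node_ref, thread, tr)?;
        transaction.submit()
    }

Note that the hook is handed the node owner's credentials (`node_ref.node.owner.cred`), i.e. the process that will receive the transaction, which matches the `from`/`to` signature added to rust/kernel/security.rs above.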
From patchwork Wed Nov 1 18:01:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 160643
Date: Wed, 01 Nov 2023 18:01:37 +0000
In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
References: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
X-Mailer: b4 0.13-dev-26615
Message-ID: <20231101-rust-binder-v1-7-08ba9197f637@google.com>
Subject: [PATCH RFC 07/20] rust_binder: add epoll support
From: Alice Ryhl
To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos, Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas, Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho
Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg, Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

From: Wedson Almeida Filho

This adds epoll integration, allowing you to get an epoll notification when an incoming transaction arrives.

Signed-off-by: Wedson Almeida Filho
Co-developed-by: Alice Ryhl
Signed-off-by: Alice Ryhl
---
 drivers/android/process.rs | 21 +++++++++++++++++----
 drivers/android/thread.rs | 39 ++++++++++++++++++++++++++++++++++++---
 2 files changed, 53 insertions(+), 7 deletions(-)

diff --git a/drivers/android/process.rs b/drivers/android/process.rs index 26dd9309fbee..2e8b0fc07756 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -122,8 +122,16 @@ pub(crate) fn push_work( } else if self.is_dead { Err((BinderError::new_dead(), work)) } else { + let sync = work.should_sync_wakeup(); + // There are no ready threads. Push work to process queue. self.work.push_back(work); + + // Wake up polling threads, if any.
+ for thread in self.threads.values() { + thread.notify_if_poll_ready(sync); + } + Ok(()) } } @@ -897,11 +905,16 @@ pub(crate) fn mmap( } pub(crate) fn poll( - _this: ArcBorrow<'_, Process>, - _file: &File, - _table: &mut PollTable, + this: ArcBorrow<'_, Process>, + file: &File, + table: &mut PollTable, ) -> Result { - Err(EINVAL) + let thread = this.get_thread(kernel::current!().pid())?; + let (from_proc, mut mask) = thread.poll(file, table); + if mask == 0 && from_proc && !this.inner.lock().work.is_empty() { + mask |= bindings::POLLIN; + } + Ok(mask) } } diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index f34de7ad6e6f..159beebbd23e 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -7,6 +7,7 @@ use kernel::{ bindings, + file::{File, PollCondVar, PollTable}, io_buffer::{IoBufferReader, IoBufferWriter}, list::{ AtomicListArcTracker, HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks, @@ -14,7 +15,7 @@ }, prelude::*, security, - sync::{Arc, CondVar, SpinLock}, + sync::{Arc, SpinLock}, types::Either, user_ptr::{UserSlicePtr, UserSlicePtrWriter}, }; @@ -75,6 +76,7 @@ struct InnerThread { const LOOPER_INVALID: u32 = 0x08; const LOOPER_WAITING: u32 = 0x10; const LOOPER_WAITING_PROC: u32 = 0x20; +const LOOPER_POLL: u32 = 0x40; impl InnerThread { fn new() -> Result { @@ -159,6 +161,15 @@ fn is_looper(&self) -> bool { fn should_use_process_work_queue(&self) -> bool { !self.process_work_list && self.is_looper() } + + fn poll(&mut self) -> u32 { + self.looper_flags |= LOOPER_POLL; + if self.process_work_list || self.looper_need_return { + bindings::POLLIN + } else { + 0 + } + } } /// This represents a thread that's used with binder. @@ -169,7 +180,7 @@ pub(crate) struct Thread { #[pin] inner: SpinLock, #[pin] - work_condvar: CondVar, + work_condvar: PollCondVar, /// Used to insert this thread into the process' `ready_threads` list. /// /// INVARIANT: May never be used for any other list than the `self.process.ready_threads`. @@ -201,7 +212,7 @@ pub(crate) fn new(id: i32, process: Arc) -> Result> { id, process, inner <- kernel::new_spinlock!(inner, "Thread::inner"), - work_condvar <- kernel::new_condvar!("Thread::work_condvar"), + work_condvar <- kernel::new_poll_condvar!("Thread::work_condvar"), links <- ListLinks::new(), links_track <- AtomicListArcTracker::new(), })) @@ -590,6 +601,12 @@ pub(crate) fn write_read(self: &Arc, data: UserSlicePtr, wait: bool) -> Re ret } + pub(crate) fn poll(&self, file: &File, table: &mut PollTable) -> (bool, u32) { + table.register_wait(file, &self.work_condvar); + let mut inner = self.inner.lock(); + (inner.should_use_process_work_queue(), inner.poll()) + } + /// Make the call to `get_work` or `get_work_local` return immediately, if any. pub(crate) fn exit_looper(&self) { let mut inner = self.inner.lock(); @@ -604,6 +621,22 @@ pub(crate) fn exit_looper(&self) { } } + pub(crate) fn notify_if_poll_ready(&self, sync: bool) { + // Determine if we need to notify. This requires the lock. + let inner = self.inner.lock(); + let notify = inner.looper_flags & LOOPER_POLL != 0 && inner.should_use_process_work_queue(); + drop(inner); + + // Now that the lock is no longer held, notify the waiters if we have to. 
+ if notify { + if sync { + self.work_condvar.notify_sync(); + } else { + self.work_condvar.notify_one(); + } + } + } + pub(crate) fn release(self: &Arc) { self.inner.lock().is_dead = true;
From patchwork Wed Nov 1 18:01:38 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 160630
Date: Wed, 01 Nov 2023 18:01:38 +0000
In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
References: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
X-Mailer: b4 0.13-dev-26615
Message-ID: <20231101-rust-binder-v1-8-08ba9197f637@google.com>
Subject: [PATCH RFC 08/20] rust_binder: add non-oneway transactions
From: Alice Ryhl
To: Greg Kroah-Hartman, Arve Hjønnevåg, Todd Kjos, Martijn Coenen, Joel Fernandes, Christian Brauner, Carlos Llamas, Suren Baghdasaryan, Miguel Ojeda, Alex Gaynor, Wedson Almeida Filho
Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng, Gary Guo, Björn Roy Baron, Benno Lossin, Andreas Hindborg, Matt Gilbride, Jeffrey Vander Stoep, Matthew Maurer, Alice Ryhl

From: Wedson Almeida Filho

Make it possible to send transactions that are not oneway, that is, transactions that the recipient is expected to reply to. Generally, binder tries to look like a normal function call, where the call blocks until the function returns. This is implemented by allowing the recipient to reply to an incoming transaction, and by having the sender sleep until the reply arrives.

For each thread, binder keeps track of the current transaction. Furthermore, if you send a transaction from a thread that already has a current transaction, binder turns the new transaction into a "sub-transaction". This mimics a call stack of ordinary function calls. If you use sub-transactions to send calls A->B->A with A and B being two different processes, then binder ensures that the incoming sub-transaction is executed on the thread in A that sent the original message to B (and that thread in A is not used for any other incoming transactions). This feature is often referred to as "deadlock avoidance" because it avoids the case where A's thread pool has run out of threads, which would prevent the incoming sub-transaction from being processed.

Signed-off-by: Wedson Almeida Filho
Co-developed-by: Alice Ryhl
Signed-off-by: Alice Ryhl
---
 drivers/android/defs.rs | 2 +
 drivers/android/thread.rs | 218 ++++++++++++++++++++++++++++++++++++++++++++-
 drivers/android/transaction.rs | 132 ++++++++++++++++++++++---
 3 files changed, 336 insertions(+), 16 deletions(-)

diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index d0fc00fa5a57..32178e8c5596 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -33,6 +33,8 @@ macro_rules! pub_no_prefix { binder_driver_command_protocol_, BC_TRANSACTION, BC_TRANSACTION_SG, + BC_REPLY, + BC_REPLY_SG, BC_FREE_BUFFER, BC_ENTER_LOOPER, BC_EXIT_LOOPER, diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index 159beebbd23e..b583297cea91 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -56,6 +56,10 @@ struct InnerThread { /// Determines if thread is dead. is_dead: bool, + /// Work item used to deliver error codes to the thread that started a transaction. Stored here + /// so that it can be reused.
+ reply_work: DArc, + /// Work item used to deliver error codes to the current thread. Stored here so that it can be /// reused. return_work: DArc, @@ -65,6 +69,7 @@ struct InnerThread { process_work_list: bool, /// List of work items to deliver to userspace. work_list: List>, + current_transaction: Option>, /// Extended error information for this thread. extended_error: ExtendedError, @@ -90,8 +95,10 @@ fn next_err_id() -> u32 { looper_need_return: false, is_dead: false, process_work_list: false, + reply_work: ThreadError::try_new()?, return_work: ThreadError::try_new()?, work_list: List::new(), + current_transaction: None, extended_error: ExtendedError::new(next_err_id(), BR_OK, 0), }) } @@ -116,6 +123,15 @@ fn push_work(&mut self, work: DLArc) -> PushWorkRes { } } + fn push_reply_work(&mut self, code: u32) { + if let Ok(work) = ListArc::try_from_arc(self.reply_work.clone()) { + work.set_error_code(code); + self.push_work(work); + } else { + pr_warn!("Thread reply work is already in use."); + } + } + fn push_return_work(&mut self, reply: u32) { if let Ok(work) = ListArc::try_from_arc(self.return_work.clone()) { work.set_error_code(reply); @@ -131,6 +147,36 @@ fn push_work_deferred(&mut self, work: DLArc) { self.work_list.push_back(work); } + /// Fetches the transaction this thread can reply to. If the thread has a pending transaction + /// (that it could respond to) but it has also issued a transaction, it must first wait for the + /// previously-issued transaction to complete. + /// + /// The `thread` parameter should be the thread containing this `ThreadInner`. + fn pop_transaction_to_reply(&mut self, thread: &Thread) -> Result> { + let transaction = self.current_transaction.take().ok_or(EINVAL)?; + if core::ptr::eq(thread, transaction.from.as_ref()) { + self.current_transaction = Some(transaction); + return Err(EINVAL); + } + // Find a new current transaction for this thread. + self.current_transaction = transaction.find_from(thread); + Ok(transaction) + } + + fn pop_transaction_replied(&mut self, transaction: &DArc) -> bool { + match self.current_transaction.take() { + None => false, + Some(old) => { + if !Arc::ptr_eq(transaction, &old) { + self.current_transaction = Some(old); + return false; + } + self.current_transaction = old.clone_next(); + true + } + } + } + fn looper_enter(&mut self) { self.looper_flags |= LOOPER_ENTERED; if self.looper_flags & LOOPER_REGISTERED != 0 { @@ -159,7 +205,7 @@ fn is_looper(&self) -> bool { /// looper. Also, if there is local work, we want to return to userspace before we deliver any /// remote work. fn should_use_process_work_queue(&self) -> bool { - !self.process_work_list && self.is_looper() + self.current_transaction.is_none() && !self.process_work_list && self.is_looper() } fn poll(&mut self) -> u32 { @@ -225,6 +271,10 @@ pub(crate) fn get_extended_error(&self, data: UserSlicePtr) -> Result { Ok(()) } + pub(crate) fn set_current_transaction(&self, transaction: DArc) { + self.inner.lock().current_transaction = Some(transaction); + } + /// Attempts to fetch a work item from the thread-local queue. The behaviour if the queue is /// empty depends on `wait`: if it is true, the function waits for some work to be queued (or a /// signal); otherwise it returns indicating that none is available. 
@@ -407,6 +457,89 @@ pub(crate) fn copy_transaction_data( Ok(alloc) } + fn unwind_transaction_stack(self: &Arc) { + let mut thread = self.clone(); + while let Ok(transaction) = { + let mut inner = thread.inner.lock(); + inner.pop_transaction_to_reply(thread.as_ref()) + } { + let reply = Either::Right(BR_DEAD_REPLY); + if !transaction.from.deliver_single_reply(reply, &transaction) { + break; + } + + thread = transaction.from.clone(); + } + } + + pub(crate) fn deliver_reply( + &self, + reply: Either, u32>, + transaction: &DArc, + ) { + if self.deliver_single_reply(reply, transaction) { + transaction.from.unwind_transaction_stack(); + } + } + + /// Delivers a reply to the thread that started a transaction. The reply can either be a + /// reply-transaction or an error code to be delivered instead. + /// + /// Returns whether the thread is dead. If it is, the caller is expected to unwind the + /// transaction stack by completing transactions for threads that are dead. + fn deliver_single_reply( + &self, + reply: Either, u32>, + transaction: &DArc, + ) -> bool { + { + let mut inner = self.inner.lock(); + if !inner.pop_transaction_replied(transaction) { + return false; + } + + if inner.is_dead { + return true; + } + + match reply { + Either::Left(work) => { + inner.push_work(work); + } + Either::Right(code) => inner.push_reply_work(code), + } + } + + // Notify the thread now that we've released the inner lock. + self.work_condvar.notify_sync(); + false + } + + /// Determines if the given transaction is the current transaction for this thread. + fn is_current_transaction(&self, transaction: &DArc) -> bool { + let inner = self.inner.lock(); + match &inner.current_transaction { + None => false, + Some(current) => Arc::ptr_eq(current, transaction), + } + } + + /// Determines the current top of the transaction stack. It fails if the top is in another + /// thread (i.e., this thread belongs to a stack but it has called another thread). The top is + /// [`None`] if the thread is not currently participating in a transaction stack. + fn top_of_transaction_stack(&self) -> Result>> { + let inner = self.inner.lock(); + if let Some(cur) = &inner.current_transaction { + if core::ptr::eq(self, cur.from.as_ref()) { + pr_warn!("got new transaction with bad transaction stack"); + return Err(EINVAL); + } + Ok(Some(cur.clone())) + } else { + Ok(None) + } + } + fn transaction(self: &Arc, tr: &BinderTransactionDataSg, inner: T) where T: FnOnce(&Arc, &BinderTransactionDataSg) -> BinderResult, @@ -427,12 +560,79 @@ fn transaction(self: &Arc, tr: &BinderTransactionDataSg, inner: T) } } + fn transaction_inner(self: &Arc, tr: &BinderTransactionDataSg) -> BinderResult { + let handle = unsafe { tr.transaction_data.target.handle }; + let node_ref = self.process.get_transaction_node(handle)?; + security::binder_transaction(&self.process.cred, &node_ref.node.owner.cred)?; + // TODO: We need to ensure that there isn't a pending transaction in the work queue. How + // could this happen? + let top = self.top_of_transaction_stack()?; + let list_completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?; + let completion = list_completion.clone_arc(); + let transaction = Transaction::new(node_ref, top, self, tr)?; + + // Check that the transaction stack hasn't changed while the lock was released, then update + // it with the new transaction. 
+ { + let mut inner = self.inner.lock(); + if !transaction.is_stacked_on(&inner.current_transaction) { + pr_warn!("Transaction stack changed during transaction!"); + return Err(EINVAL.into()); + } + inner.current_transaction = Some(transaction.clone_arc()); + // We push the completion as a deferred work so that we wait for the reply before returning + // to userland. + inner.push_work_deferred(list_completion); + } + + if let Err(e) = transaction.submit() { + completion.skip(); + // Define `transaction` first to drop it after `inner`. + let transaction; + let mut inner = self.inner.lock(); + transaction = inner.current_transaction.take().unwrap(); + inner.current_transaction = transaction.clone_next(); + Err(e) + } else { + Ok(()) + } + } + + fn reply_inner(self: &Arc, tr: &BinderTransactionDataSg) -> BinderResult { + let orig = self.inner.lock().pop_transaction_to_reply(self)?; + if !orig.from.is_current_transaction(&orig) { + return Err(EINVAL.into()); + } + + // We need to complete the transaction even if we cannot complete building the reply. + (|| -> BinderResult<_> { + let completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?; + let process = orig.from.process.clone(); + let reply = Transaction::new_reply(self, process, tr)?; + self.inner.lock().push_work(completion); + orig.from.deliver_reply(Either::Left(reply), &orig); + Ok(()) + })() + .map_err(|mut err| { + // At this point we only return `BR_TRANSACTION_COMPLETE` to the caller, and we must let + // the sender know that the transaction has completed (with an error in this case). + pr_warn!( + "Failure {:?} during reply - delivering BR_FAILED_REPLY to sender.", + err + ); + let reply = Either::Right(BR_FAILED_REPLY); + orig.from.deliver_reply(reply, &orig); + err.reply = BR_TRANSACTION_COMPLETE; + err + }) + } + fn oneway_transaction_inner(self: &Arc, tr: &BinderTransactionDataSg) -> BinderResult { let handle = unsafe { tr.transaction_data.target.handle }; let node_ref = self.process.get_transaction_node(handle)?; security::binder_transaction(&self.process.cred, &node_ref.node.owner.cred)?; let list_completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?; - let transaction = Transaction::new(node_ref, self, tr)?; + let transaction = Transaction::new(node_ref, None, self, tr)?; let completion = list_completion.clone_arc(); self.inner.lock().push_work(list_completion); match transaction.submit() { @@ -458,7 +658,7 @@ fn write(self: &Arc, req: &mut BinderWriteRead) -> Result { if tr.transaction_data.flags & TF_ONE_WAY != 0 { self.transaction(&tr, Self::oneway_transaction_inner); } else { - return Err(EINVAL); + self.transaction(&tr, Self::transaction_inner); } } BC_TRANSACTION_SG => { @@ -466,9 +666,17 @@ fn write(self: &Arc, req: &mut BinderWriteRead) -> Result { if tr.transaction_data.flags & TF_ONE_WAY != 0 { self.transaction(&tr, Self::oneway_transaction_inner); } else { - return Err(EINVAL); + self.transaction(&tr, Self::transaction_inner); } } + BC_REPLY => { + let tr = reader.read::()?.with_buffers_size(0); + self.transaction(&tr, Self::reply_inner) + } + BC_REPLY_SG => { + let tr = reader.read::()?; + self.transaction(&tr, Self::reply_inner) + } BC_FREE_BUFFER => drop(self.process.buffer_get(reader.read()?)), BC_INCREFS => self.process.update_ref(reader.read()?, true, false)?, BC_ACQUIRE => self.process.update_ref(reader.read()?, true, true)?, @@ -644,6 +852,8 @@ pub(crate) fn release(self: &Arc) { while let Ok(Some(work)) = self.get_work_local(false) { work.into_arc().cancel(); 
} + + self.unwind_transaction_stack(); } } diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs index 8b4274ddc415..a6525a4253ea 100644 --- a/drivers/android/transaction.rs +++ b/drivers/android/transaction.rs @@ -6,23 +6,25 @@ prelude::*, sync::{Arc, SpinLock}, task::Kuid, + types::{Either, ScopeGuard}, user_ptr::UserSlicePtrWriter, }; use crate::{ allocation::Allocation, defs::*, - error::BinderResult, + error::{BinderError, BinderResult}, node::{Node, NodeRef}, process::Process, ptr_align, - thread::Thread, + thread::{PushWorkRes, Thread}, DArc, DLArc, DTRWrap, DeliverToRead, }; #[pin_data] pub(crate) struct Transaction { target_node: Option>, + stack_next: Option>, pub(crate) from: Arc, to: Arc, #[pin] @@ -42,6 +44,7 @@ pub(crate) struct Transaction { impl Transaction { pub(crate) fn new( node_ref: NodeRef, + stack_next: Option>, from: &Arc, tr: &BinderTransactionDataSg, ) -> BinderResult> { @@ -59,8 +62,8 @@ pub(crate) fn new( return Err(err); } }; - if trd.flags & TF_ONE_WAY == 0 { - pr_warn!("Non-oneway transactions are not yet supported."); + if trd.flags & TF_ONE_WAY != 0 && stack_next.is_some() { + pr_warn!("Oneway transaction should not be in a transaction stack."); return Err(EINVAL.into()); } if trd.flags & TF_CLEAR_BUF != 0 { @@ -72,6 +75,7 @@ pub(crate) fn new( Ok(DTRWrap::arc_pin_init(pin_init!(Transaction { target_node: Some(target_node), + stack_next, sender_euid: from.process.cred.euid(), from: from.clone(), to, @@ -84,15 +88,100 @@ pub(crate) fn new( }))?) } - /// Submits the transaction to a work queue. + pub(crate) fn new_reply( + from: &Arc, + to: Arc, + tr: &BinderTransactionDataSg, + ) -> BinderResult> { + let trd = &tr.transaction_data; + let mut alloc = match from.copy_transaction_data(to.clone(), tr, None) { + Ok(alloc) => alloc, + Err(err) => { + pr_warn!("Failure in copy_transaction_data: {:?}", err); + return Err(err); + } + }; + if trd.flags & TF_CLEAR_BUF != 0 { + alloc.set_info_clear_on_drop(); + } + Ok(DTRWrap::arc_pin_init(pin_init!(Transaction { + target_node: None, + stack_next: None, + sender_euid: from.process.task.euid(), + from: from.clone(), + to, + code: trd.code, + flags: trd.flags, + data_size: trd.data_size as _, + data_address: alloc.ptr, + allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"), + txn_security_ctx_off: None, + }))?) + } + + /// Determines if the transaction is stacked on top of the given transaction. + pub(crate) fn is_stacked_on(&self, onext: &Option>) -> bool { + match (&self.stack_next, onext) { + (None, None) => true, + (Some(stack_next), Some(next)) => Arc::ptr_eq(stack_next, next), + _ => false, + } + } + + /// Returns a pointer to the next transaction on the transaction stack, if there is one. + pub(crate) fn clone_next(&self) -> Option> { + Some(self.stack_next.as_ref()?.clone()) + } + + /// Searches in the transaction stack for a thread that belongs to the target process. This is + /// useful when finding a target for a new transaction: if the node belongs to a process that + /// is already part of the transaction stack, we reuse the thread. + fn find_target_thread(&self) -> Option> { + let mut it = &self.stack_next; + while let Some(transaction) = it { + if Arc::ptr_eq(&transaction.from.process, &self.to) { + return Some(transaction.from.clone()); + } + it = &transaction.stack_next; + } + None + } + + /// Searches in the transaction stack for a transaction originating at the given thread. 
+ pub(crate) fn find_from(&self, thread: &Thread) -> Option> { + let mut it = &self.stack_next; + while let Some(transaction) = it { + if core::ptr::eq(thread, transaction.from.as_ref()) { + return Some(transaction.clone()); + } + + it = &transaction.stack_next; + } + None + } + + /// Submits the transaction to a work queue. Uses a thread if there is one in the transaction + /// stack, otherwise uses the destination process. + /// + /// Not used for replies. pub(crate) fn submit(self: DLArc) -> BinderResult { let process = self.to.clone(); let mut process_inner = process.inner.lock(); - match process_inner.push_work(self) { + + let res = if let Some(thread) = self.find_target_thread() { + match thread.push_work(self) { + PushWorkRes::Ok => Ok(()), + PushWorkRes::FailedDead(me) => Err((BinderError::new_dead(), me)), + } + } else { + process_inner.push_work(self) + }; + drop(process_inner); + + match res { Ok(()) => Ok(()), Err((err, work)) => { // Drop work after releasing process lock. - drop(process_inner); drop(work); Err(err) } @@ -101,11 +190,14 @@ pub(crate) fn submit(self: DLArc) -> BinderResult { } impl DeliverToRead for Transaction { - fn do_work( - self: DArc, - _thread: &Thread, - writer: &mut UserSlicePtrWriter, - ) -> Result { + fn do_work(self: DArc, thread: &Thread, writer: &mut UserSlicePtrWriter) -> Result { + let send_failed_reply = ScopeGuard::new(|| { + if self.target_node.is_some() && self.flags & TF_ONE_WAY == 0 { + let reply = Either::Right(BR_FAILED_REPLY); + self.from.deliver_reply(reply, &self); + } + }); + let mut tr_sec = BinderTransactionDataSecctx::default(); let tr = tr_sec.tr_data(); if let Some(target_node) = &self.target_node { @@ -144,17 +236,33 @@ fn do_work( writer.write(&*tr)?; } + // Dismiss the completion of transaction with a failure. No failure paths are allowed from + // here on out. + send_failed_reply.dismiss(); + // It is now the user's responsibility to clear the allocation. let alloc = self.allocation.lock().take(); if let Some(alloc) = alloc { alloc.keep_alive(); } + // When this is not a reply and not a oneway transaction, update `current_transaction`. If + // it's a reply, `current_transaction` has already been updated appropriately. + if self.target_node.is_some() && tr_sec.transaction_data.flags & TF_ONE_WAY == 0 { + thread.set_current_transaction(self); + } + Ok(false) } fn cancel(self: DArc) { drop(self.allocation.lock().take()); + + // If this is not a reply or oneway transaction, then send a dead reply. 
+ if self.target_node.is_some() && self.flags & TF_ONE_WAY == 0 { + let reply = Either::Right(BR_DEAD_REPLY); + self.from.deliver_reply(reply, &self); + } } fn should_sync_wakeup(&self) -> bool { From patchwork Wed Nov 1 18:01:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160636 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp605562vqx; Wed, 1 Nov 2023 11:04:05 -0700 (PDT) X-Google-Smtp-Source: AGHT+IEpqkATq/8tTTg6ii1bZFAFbo37N9dXJcd0CBuPv8AlLlpWUlyG2LFOxe+PBraAZfDk+rJ4 X-Received: by 2002:a17:903:22c7:b0:1cc:3f10:4175 with SMTP id y7-20020a17090322c700b001cc3f104175mr13124391plg.28.1698861844840; Wed, 01 Nov 2023 11:04:04 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861844; cv=none; d=google.com; s=arc-20160816; b=E0B3VoJAkegOMTmHY2Qy06+yFeak6Abw5920kK5sK4vfFr4tQ34XyMppuq7Re5NGYX qSx+6h1qgjr0HphPqze45uxPa2ssj9KhARXbiYPWRkD2+kb8PvZJ1KmIvOIIPeGLZ1i7 7OJw06PYJ4K3Mbw7tCFrI5yT6Xn5Py8VOnP9OHW9/5+cvv8XyTib34nbv/uaBwHYnQ1B NxJNVufJavukKJ7VUORa9Y7DhbDC04qWWxFZ7QqDTIzVAjERqsVQYRWBWnacy0mVEGw+ FEF99t3zdDWrvVqyWQ2I+xWkQmpw19btqsz3sq4XHA+p6HEuD0pzIaGf59PWCFW060jI FFKg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=dDLIbYfXHuasXnwFeCXtVR7UAQw9kXfm6H29EVGLaPI=; fh=cHdDrFPTfwdP0/Ip9jHI/T24Yd8xIIOhbocUOLU1mtg=; b=UmjUzIrcUmcluNqhr/80kEeLTOf9yXGYrjmj0Dbt0Y1v68QgmPnCI816xGkgfUmSVN YRrFkx5w7GOkFsFH/Ag0mNdA81N0UHon2h5OLRsV3vnpt3O1SkmxsAW8FdrlRedhrCSn x0pVcFJb+oAEO2zzc/7JIUO1J86mx1CHg/ARR1dtD+aDMHBn+GGrQaNT3yzcvSm9w4dS QVzO5qSCvp5OozQx1HbJg+583G5DJ9nXRjdIjxEXK0vME+QOHIHOUMA3+6yr2K30scdm ovZ+/88jRWKq6ECzuq22dRxoYxl+6EMZF4NC6e6GoeGi3W4to5/kbR454Q2t/cyJalOq 4NwQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=2bAw7LBQ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from agentk.vger.email (agentk.vger.email. 
[2620:137:e000::3:2]) by mx.google.com with ESMTPS; Wed, 01 Nov 2023 11:04:04 -0700 (PDT) Date: Wed, 01 Nov 2023 18:01:39 +0000 In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com> Mime-Version: 1.0 References: <20231101-rust-binder-v1-0-08ba9197f637@google.com> X-Mailer: b4 
0.13-dev-26615 Message-ID: <20231101-rust-binder-v1-9-08ba9197f637@google.com> Subject: [PATCH RFC 09/20] rust_binder: serialize oneway transactions From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-8.4 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE, URIBL_BLOCKED,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:03:44 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385757443972277 X-GMAIL-MSGID: 1781385757443972277 The binder driver guarantees that oneway transactions sent to the same node are serialized, that is, userspace will not be given the next one until it has finished processing the previous oneway transaction. This is done to avoid the case where two oneway transactions arrive in opposite order from the order in which they were sent. (E.g., they could be delivered to two different threads, which could appear as-if they were sent in opposite order.) To fix that, we store pending oneway transactions in a separate list in the node, and don't deliver the next oneway transaction until userspace signals that it has finished processing the previous oneway transaction by calling the BC_FREE_BUFFER ioctl. Signed-off-by: Alice Ryhl --- drivers/android/allocation.rs | 19 +++++++++- drivers/android/node.rs | 79 ++++++++++++++++++++++++++++++++++++++++-- drivers/android/process.rs | 25 ++++++++++--- drivers/android/transaction.rs | 26 ++++++++++++-- 4 files changed, 138 insertions(+), 11 deletions(-) diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs index 1ab0f254fded..0fdef5425918 100644 --- a/drivers/android/allocation.rs +++ b/drivers/android/allocation.rs @@ -3,13 +3,22 @@ use kernel::{bindings, pages::Pages, prelude::*, sync::Arc, user_ptr::UserSlicePtrReader}; -use crate::{node::NodeRef, process::Process}; +use crate::{ + node::{Node, NodeRef}, + process::Process, + DArc, +}; #[derive(Default)] pub(crate) struct AllocationInfo { /// The target node of the transaction this allocation is associated to. /// Not set for replies. pub(crate) target_node: Option, + /// When this allocation is dropped, call `pending_oneway_finished` on the node. + /// + /// This is used to serialize oneway transaction on the same node. Binder guarantees that + /// oneway transactions to the same node are delivered sequentially in the order they are sent. + pub(crate) oneway_node: Option>, /// Zero the data in the buffer on free. 
pub(crate) clear_on_free: bool, } @@ -110,6 +119,10 @@ pub(crate) fn get_or_init_info(&mut self) -> &mut AllocationInfo { self.allocation_info.get_or_insert_with(Default::default) } + pub(crate) fn set_info_oneway_node(&mut self, oneway_node: DArc) { + self.get_or_init_info().oneway_node = Some(oneway_node); + } + pub(crate) fn set_info_clear_on_drop(&mut self) { self.get_or_init_info().clear_on_free = true; } @@ -126,6 +139,10 @@ fn drop(&mut self) { } if let Some(mut info) = self.allocation_info.take() { + if let Some(oneway_node) = info.oneway_node.as_ref() { + oneway_node.pending_oneway_finished(); + } + info.target_node = None; if info.clear_on_free { diff --git a/drivers/android/node.rs b/drivers/android/node.rs index c6c3d81e705d..b8a08b16c06d 100644 --- a/drivers/android/node.rs +++ b/drivers/android/node.rs @@ -2,7 +2,9 @@ use kernel::{ io_buffer::IoBufferWriter, - list::{AtomicListArcTracker, ListArcSafe, TryNewListArc}, + list::{ + AtomicListArcTracker, HasListLinks, List, ListArcSafe, ListItem, ListLinks, TryNewListArc, + }, prelude::*, sync::lock::{spinlock::SpinLockBackend, Guard}, sync::{Arc, LockedBy}, @@ -11,9 +13,11 @@ use crate::{ defs::*, + error::BinderError, process::{Process, ProcessInner}, thread::Thread, - DArc, DeliverToRead, + transaction::Transaction, + DArc, DLArc, DTRWrap, DeliverToRead, }; struct CountState { @@ -36,6 +40,8 @@ fn new() -> Self { struct NodeInner { strong: CountState, weak: CountState, + oneway_todo: List>, + has_pending_oneway_todo: bool, /// The number of active BR_INCREFS or BR_ACQUIRE operations. (should be maximum two) /// /// If this is non-zero, then we postpone any BR_RELEASE or BR_DECREFS notifications until the @@ -62,6 +68,16 @@ impl ListArcSafe<0> for Node { } } +// These make `oneway_todo` work. +kernel::list::impl_has_list_links! { + impl HasListLinks<0> for DTRWrap { self.links.inner } +} +kernel::list::impl_list_item! { + impl ListItem<0> for DTRWrap { + using ListLinks; + } +} + impl Node { pub(crate) fn new( ptr: usize, @@ -79,6 +95,8 @@ pub(crate) fn new( NodeInner { strong: CountState::new(), weak: CountState::new(), + oneway_todo: List::new(), + has_pending_oneway_todo: false, active_inc_refs: 0, }, ), @@ -201,6 +219,63 @@ fn write(&self, writer: &mut UserSlicePtrWriter, code: u32) -> Result { writer.write(&self.cookie)?; Ok(()) } + + pub(crate) fn submit_oneway( + &self, + transaction: DLArc, + guard: &mut Guard<'_, ProcessInner, SpinLockBackend>, + ) -> Result<(), (BinderError, DLArc)> { + if guard.is_dead { + return Err((BinderError::new_dead(), transaction)); + } + + let inner = self.inner.access_mut(guard); + if inner.has_pending_oneway_todo { + inner.oneway_todo.push_back(transaction); + } else { + inner.has_pending_oneway_todo = true; + guard.push_work(transaction)?; + } + Ok(()) + } + + pub(crate) fn release(&self, guard: &mut Guard<'_, ProcessInner, SpinLockBackend>) { + // Move every pending oneshot message to the process todolist. The process + // will cancel it later. + // + // New items can't be pushed after this call, since `submit_oneway` fails when the process + // is dead, which is set before `Node::release` is called. + // + // TODO: Give our linked list implementation the ability to move everything in one go. + while let Some(work) = self.inner.access_mut(guard).oneway_todo.pop_front() { + guard.push_work_for_release(work); + } + } + + pub(crate) fn pending_oneway_finished(&self) { + let mut guard = self.owner.inner.lock(); + if guard.is_dead { + // Cleanup will happen in `Process::deferred_release`. 
+ return; + } + + let inner = self.inner.access_mut(&mut guard); + + let transaction = inner.oneway_todo.pop_front(); + inner.has_pending_oneway_todo = transaction.is_some(); + if let Some(transaction) = transaction { + match guard.push_work(transaction) { + Ok(()) => {} + Err((_err, work)) => { + // Process is dead. + // This shouldn't happen due to the `is_dead` check, but if it does, just drop + // the transaction and return. + drop(guard); + drop(work); + } + } + } + } } impl DeliverToRead for Node { diff --git a/drivers/android/process.rs b/drivers/android/process.rs index 2e8b0fc07756..d4e50c7f9a88 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -136,6 +136,11 @@ pub(crate) fn push_work( } } + /// Push work to be cancelled. Only used during process teardown. + pub(crate) fn push_work_for_release(&mut self, work: DLArc) { + self.work.push_back(work); + } + pub(crate) fn remove_node(&mut self, ptr: usize) { self.nodes.remove(&ptr); } @@ -740,6 +745,21 @@ fn deferred_release(self: Arc) { self.ctx.deregister_process(&self); + // Move oneway_todo into the process todolist. + { + let mut inner = self.inner.lock(); + let nodes = take(&mut inner.nodes); + for node in nodes.values() { + node.release(&mut inner); + } + inner.nodes = nodes; + } + + // Cancel all pending work items. + while let Some(work) = self.get_work() { + work.into_arc().cancel(); + } + // Move the threads out of `inner` so that we can iterate over them without holding the // lock. let mut inner = self.inner.lock(); @@ -751,11 +771,6 @@ fn deferred_release(self: Arc) { thread.release(); } - // Cancel all pending work items. - while let Some(work) = self.get_work() { - work.into_arc().cancel(); - } - // Free any resources kept alive by allocated buffers. let omapping = self.inner.lock().mapping.take(); if let Some(mut mapping) = omapping { diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs index a6525a4253ea..a4ffe0a3878c 100644 --- a/drivers/android/transaction.rs +++ b/drivers/android/transaction.rs @@ -62,9 +62,12 @@ pub(crate) fn new( return Err(err); } }; - if trd.flags & TF_ONE_WAY != 0 && stack_next.is_some() { - pr_warn!("Oneway transaction should not be in a transaction stack."); - return Err(EINVAL.into()); + if trd.flags & TF_ONE_WAY != 0 { + if stack_next.is_some() { + pr_warn!("Oneway transaction should not be in a transaction stack."); + return Err(EINVAL.into()); + } + alloc.set_info_oneway_node(node_ref.node.clone()); } if trd.flags & TF_CLEAR_BUF != 0 { alloc.set_info_clear_on_drop(); @@ -165,9 +168,26 @@ pub(crate) fn find_from(&self, thread: &Thread) -> Option> { /// /// Not used for replies. pub(crate) fn submit(self: DLArc) -> BinderResult { + let oneway = self.flags & TF_ONE_WAY != 0; let process = self.to.clone(); let mut process_inner = process.inner.lock(); + if oneway { + if let Some(target_node) = self.target_node.clone() { + match target_node.submit_oneway(self, &mut process_inner) { + Ok(()) => return Ok(()), + Err((err, work)) => { + drop(process_inner); + // Drop work after releasing process lock. 
+ drop(work); + return Err(err); + } + } + } else { + pr_err!("Failed to submit oneway transaction to node."); + } + } + let res = if let Some(thread) = self.find_target_thread() { match thread.push_work(self) { PushWorkRes::Ok => Ok(()), From patchwork Wed Nov 1 18:01:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160639 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp605674vqx; Wed, 1 Nov 2023 11:04:13 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFkB3hsnvL2LwTaQB9suWklQYOjLze268dcNsDgXLxZiwd/k5aStDYMadgo9QwWfE6zJvzL X-Received: by 2002:a17:90a:8507:b0:27d:30d5:c0f8 with SMTP id l7-20020a17090a850700b0027d30d5c0f8mr15305707pjn.43.1698861853426; Wed, 01 Nov 2023 11:04:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861853; cv=none; d=google.com; s=arc-20160816; b=J8JSRd1H/hUovOaYADpNHuviRZMxkI3W7ez+12ZyByjAkLeCqlRJZil4aA257JniVL 7uIwdmsyeC8WPXchVBgPCpxCuhzQiyITcWion5c5RfnmF8b1B8UCgXmklp0amASjIp5m rQ3eNNIFS6jKrg95eymMuCYgP0z4cRcW9Gf//DDwxdvk1OuYxjrL583d4j3CS0X1nWS2 KrqzSoXKDtZOY2CKN0//0Q7k3leYGqIwGttAqpzlimvtEhO58Kcc6yMJUC5f7w0JYG5b Rph0FZh4yoq7uTFBqhMJDq3XB09h7MArU6tOh15qefslgLlx6fzMmciIMslARrqg+FaX gDdg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=azhVe2rhV1gDlFmhFwnbqw9ViDwuarkzu6YKI2xsrC4=; fh=cHdDrFPTfwdP0/Ip9jHI/T24Yd8xIIOhbocUOLU1mtg=; b=aX5+aq9wviE79FXzoxIxbOUlC46zvz0PlQAtZMzaF8ssAVC5ElV8Rk++IO5Cmw+dq/ 3A+0Df4GVJA9OVCCaDTGk3g0NO0wqpr8shjs++ucCQ5/gu1MjVMXvkWiGMxmoAFPEJ4+ bv4upwc6fX6KgoGBTazPh4lxwBayYeWhp35SGVyp9UxGxAIAkjM3WryooW8j1ggUGejl w42cJu5CtO/3XX7HsnNxJLgXsWIoGRx3fSUUgIaJLOHFj2/ttFHbC1/WVMxXm+C6ndzU wFcYwXnORut2Kc6RKC9vlzJDqnffoncIxZ61MIeyQMEmsX57V4Kwt8DnoULHhuQbJMt+ 7UHA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=Zk2+K0Z8; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::3:2 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from agentk.vger.email (agentk.vger.email. 
[2620:137:e000::3:2]) by mx.google.com with ESMTPS; Wed, 01 Nov 2023 11:04:13 -0700 (PDT) Date: Wed, 01 Nov 2023 18:01:40 +0000 In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com> Mime-Version: 1.0 References: <20231101-rust-binder-v1-0-08ba9197f637@google.com> X-Mailer: b4 
0.13-dev-26615 Message-ID: <20231101-rust-binder-v1-10-08ba9197f637@google.com> Subject: [PATCH RFC 10/20] rust_binder: add death notifications From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-8.4 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE, USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:03:53 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385766979901305 X-GMAIL-MSGID: 1781385766979901305 From: Wedson Almeida Filho This adds death notifications that let one process be notified when another process dies. A process can request to be notified when a process dies using `BC_REQUEST_DEATH_NOTIFICATION`. This will make the driver send a `BR_DEAD_BINDER` to userspace when the process dies (or immediately if it is already dead). Userspace is supposed to respond with `BC_DEAD_BINDER_DONE` once it has processed the notification. Userspace can unregister from death notifications using the `BC_CLEAR_DEATH_NOTIFICATION` command. In this case, the kernel will respond with `BR_CLEAR_DEATH_NOTIFICATION_DONE` once the notification has been removed. Note that if the remote process dies before the kernel has responded with `BR_CLEAR_DEATH_NOTIFICATION_DONE`, then the kernel will still send a `BR_DEAD_BINDER`, which userspace must be able to process. In this case, the kernel will wait for the `BC_DEAD_BINDER_DONE` command before it sends `BR_CLEAR_DEATH_NOTIFICATION_DONE`. Note that even if the kernel sends a `BR_DEAD_BINDER`, this does not remove the death notification. Userspace must still remove it manually using `BC_CLEAR_DEATH_NOTIFICATION`. If a process uses `BC_RELEASE` to destroy its last refcount on a node that has an active death registration, then the death registration is immediately deleted. However, userspace is not supposed to delete a node reference without first deregistering death notifications, so this codepath is not executed under normal circumstances. Signed-off-by: Wedson Almeida Filho Co-developed-by: Alice Ryhl Signed-off-by: Alice Ryhl --- drivers/android/defs.rs | 10 +- drivers/android/node.rs | 258 ++++++++++++++++++++++++++++++++++++++++- drivers/android/process.rs | 193 +++++++++++++++++++++++++++--- drivers/android/rust_binder.rs | 7 ++ drivers/android/thread.rs | 22 +++- 5 files changed, 471 insertions(+), 19 deletions(-) diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index 32178e8c5596..753f7e86c92d 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -23,10 +23,13 @@ macro_rules! 
pub_no_prefix { BR_SPAWN_LOOPER, BR_TRANSACTION_COMPLETE, BR_OK, + BR_ERROR, BR_INCREFS, BR_ACQUIRE, BR_RELEASE, - BR_DECREFS + BR_DECREFS, + BR_DEAD_BINDER, + BR_CLEAR_DEATH_NOTIFICATION_DONE ); pub_no_prefix!( @@ -44,7 +47,10 @@ macro_rules! pub_no_prefix { BC_RELEASE, BC_DECREFS, BC_INCREFS_DONE, - BC_ACQUIRE_DONE + BC_ACQUIRE_DONE, + BC_REQUEST_DEATH_NOTIFICATION, + BC_CLEAR_DEATH_NOTIFICATION, + BC_DEAD_BINDER_DONE ); pub(crate) const FLAT_BINDER_FLAG_TXN_SECURITY_CTX: u32 = diff --git a/drivers/android/node.rs b/drivers/android/node.rs index b8a08b16c06d..7ed494bf9f7c 100644 --- a/drivers/android/node.rs +++ b/drivers/android/node.rs @@ -3,11 +3,12 @@ use kernel::{ io_buffer::IoBufferWriter, list::{ - AtomicListArcTracker, HasListLinks, List, ListArcSafe, ListItem, ListLinks, TryNewListArc, + AtomicListArcTracker, HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks, + TryNewListArc, }, prelude::*, sync::lock::{spinlock::SpinLockBackend, Guard}, - sync::{Arc, LockedBy}, + sync::{Arc, LockedBy, SpinLock}, user_ptr::UserSlicePtrWriter, }; @@ -40,6 +41,7 @@ fn new() -> Self { struct NodeInner { strong: CountState, weak: CountState, + death_list: List, 1>, oneway_todo: List>, has_pending_oneway_todo: bool, /// The number of active BR_INCREFS or BR_ACQUIRE operations. (should be maximum two) @@ -95,6 +97,7 @@ pub(crate) fn new( NodeInner { strong: CountState::new(), weak: CountState::new(), + death_list: List::new(), oneway_todo: List::new(), has_pending_oneway_todo: false, active_inc_refs: 0, @@ -112,6 +115,25 @@ pub(crate) fn get_id(&self) -> (usize, usize) { (self.ptr, self.cookie) } + pub(crate) fn next_death( + &self, + guard: &mut Guard<'_, ProcessInner, SpinLockBackend>, + ) -> Option> { + self.inner + .access_mut(guard) + .death_list + .pop_front() + .map(|larc| larc.into_arc()) + } + + pub(crate) fn add_death( + &self, + death: ListArc, 1>, + guard: &mut Guard<'_, ProcessInner, SpinLockBackend>, + ) { + self.inner.access_mut(guard).death_list.push_back(death); + } + pub(crate) fn inc_ref_done_locked( &self, _strong: bool, @@ -449,3 +471,235 @@ fn drop(&mut self) { } } } + +struct NodeDeathInner { + dead: bool, + cleared: bool, + notification_done: bool, + /// Indicates whether the normal flow was interrupted by removing the handle. In this case, we + /// need behave as if the death notification didn't exist (i.e., we don't deliver anything to + /// the user. + aborted: bool, +} + +/// Used to deliver notifications when a process dies. +/// +/// A process can request to be notified when a process dies using `BC_REQUEST_DEATH_NOTIFICATION`. +/// This will make the driver send a `BR_DEAD_BINDER` to userspace when the process dies (or +/// immediately if it is already dead). Userspace is supposed to respond with `BC_DEAD_BINDER_DONE` +/// once it has processed the notification. +/// +/// Userspace can unregister from death notifications using the `BC_CLEAR_DEATH_NOTIFICATION` +/// command. In this case, the kernel will respond with `BR_CLEAR_DEATH_NOTIFICATION_DONE` once the +/// notification has been removed. Note that if the remote process dies before the kernel has +/// responded with `BR_CLEAR_DEATH_NOTIFICATION_DONE`, then the kernel will still send a +/// `BR_DEAD_BINDER`, which userspace must be able to process. In this case, the kernel will wait +/// for the `BC_DEAD_BINDER_DONE` command before it sends `BR_CLEAR_DEATH_NOTIFICATION_DONE`. +/// +/// Note that even if the kernel sends a `BR_DEAD_BINDER`, this does not remove the death +/// notification. 
Userspace must still remove it manually using `BC_CLEAR_DEATH_NOTIFICATION`. +/// +/// If a process uses `BC_RELEASE` to destroy its last refcount on a node that has an active death +/// registration, then the death registration is immediately deleted (we implement this using the +/// `aborted` field). However, userspace is not supposed to delete a `NodeRef` without first +/// deregistering death notifications, so this codepath is not executed under normal circumstances. +#[pin_data] +pub(crate) struct NodeDeath { + node: DArc, + process: Arc, + pub(crate) cookie: usize, + #[pin] + links_track: AtomicListArcTracker<0>, + /// Used by the owner `Node` to store a list of registered death notifications. + /// + /// # Invariants + /// + /// Only ever used with the `death_list` list of `self.node`. + #[pin] + death_links: ListLinks<1>, + /// Used by the process to keep track of the death notifications for which we have sent a + /// `BR_DEAD_BINDER` but not yet received a `BC_DEAD_BINDER_DONE`. + /// + /// # Invariants + /// + /// Only ever used with the `delivered_deaths` list of `self.process`. + #[pin] + delivered_links: ListLinks<2>, + #[pin] + delivered_links_track: AtomicListArcTracker<2>, + #[pin] + inner: SpinLock, +} + +impl NodeDeath { + /// Constructs a new node death notification object. + pub(crate) fn new( + node: DArc, + process: Arc, + cookie: usize, + ) -> impl PinInit> { + DTRWrap::new(pin_init!( + Self { + node, + process, + cookie, + links_track <- AtomicListArcTracker::new(), + death_links <- ListLinks::new(), + delivered_links <- ListLinks::new(), + delivered_links_track <- AtomicListArcTracker::new(), + inner <- kernel::new_spinlock!(NodeDeathInner { + dead: false, + cleared: false, + notification_done: false, + aborted: false, + }, "NodeDeath::inner"), + } + )) + } + + /// Sets the cleared flag to `true`. + /// + /// It removes `self` from the node's death notification list if needed. + /// + /// Returns whether it needs to be queued. + pub(crate) fn set_cleared(self: &DArc, abort: bool) -> bool { + let (needs_removal, needs_queueing) = { + // Update state and determine if we need to queue a work item. We only need to do it + // when the node is not dead or if the user already completed the death notification. + let mut inner = self.inner.lock(); + if abort { + inner.aborted = true; + } + if inner.cleared { + // Already cleared. + return false; + } + inner.cleared = true; + (!inner.dead, !inner.dead || inner.notification_done) + }; + + // Remove death notification from node. + if needs_removal { + let mut owner_inner = self.node.owner.inner.lock(); + let node_inner = self.node.inner.access_mut(&mut owner_inner); + // SAFETY: A `NodeDeath` is never inserted into the death list of any node other than + // its owner, so it is either in this death list or in no death list. + unsafe { node_inner.death_list.remove(self) }; + } + needs_queueing + } + + /// Sets the 'notification done' flag to `true`. + pub(crate) fn set_notification_done(self: DArc, thread: &Thread) { + let needs_queueing = { + let mut inner = self.inner.lock(); + inner.notification_done = true; + inner.cleared + }; + if needs_queueing { + if let Some(death) = ListArc::try_from_arc_or_drop(self) { + let _ = thread.push_work_if_looper(death); + } + } + } + + /// Sets the 'dead' flag to `true` and queues work item if needed. 
+ pub(crate) fn set_dead(self: DArc) { + let needs_queueing = { + let mut inner = self.inner.lock(); + if inner.cleared { + false + } else { + inner.dead = true; + true + } + }; + if needs_queueing { + // Push the death notification to the target process. There is nothing else to do if + // it's already dead. + if let Some(death) = ListArc::try_from_arc_or_drop(self) { + let process = death.process.clone(); + let _ = process.push_work(death); + } + } + } +} + +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<0> for NodeDeath { + tracked_by links_track: AtomicListArcTracker; + } +} + +kernel::list::impl_has_list_links! { + impl HasListLinks<1> for DTRWrap { self.wrapped.death_links } +} +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<1> for DTRWrap { untracked; } +} +kernel::list::impl_list_item! { + impl ListItem<1> for DTRWrap { + using ListLinks; + } +} + +kernel::list::impl_has_list_links! { + impl HasListLinks<2> for DTRWrap { self.wrapped.delivered_links } +} +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<2> for DTRWrap { + tracked_by wrapped: NodeDeath; + } +} +kernel::list::impl_list_arc_safe! { + impl ListArcSafe<2> for NodeDeath { + tracked_by delivered_links_track: AtomicListArcTracker<2>; + } +} +kernel::list::impl_list_item! { + impl ListItem<2> for DTRWrap { + using ListLinks; + } +} + +impl DeliverToRead for NodeDeath { + fn do_work( + self: DArc, + _thread: &Thread, + writer: &mut UserSlicePtrWriter, + ) -> Result { + let done = { + let inner = self.inner.lock(); + if inner.aborted { + return Ok(true); + } + inner.cleared && (!inner.dead || inner.notification_done) + }; + + let cookie = self.cookie; + let cmd = if done { + BR_CLEAR_DEATH_NOTIFICATION_DONE + } else { + let process = self.process.clone(); + let mut process_inner = process.inner.lock(); + let inner = self.inner.lock(); + if inner.aborted { + return Ok(true); + } + // We're still holding the inner lock, so it cannot be aborted while we insert it into + // the delivered list. + process_inner.death_delivered(self.clone()); + BR_DEAD_BINDER + }; + + writer.write(&cmd)?; + writer.write(&cookie)?; + // Mimic the original code: we stop processing work items when we get to a death + // notification. + Ok(cmd != BR_DEAD_BINDER) + } + + fn should_sync_wakeup(&self) -> bool { + false + } +} diff --git a/drivers/android/process.rs b/drivers/android/process.rs index d4e50c7f9a88..0b79fa59ffa5 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -20,7 +20,7 @@ pages::Pages, prelude::*, rbtree::RBTree, - sync::{lock::Guard, Arc, ArcBorrow, Mutex, SpinLock}, + sync::{lock::Guard, Arc, ArcBorrow, Mutex, SpinLock, UniqueArc}, task::Task, types::{ARef, Either}, user_ptr::{UserSlicePtr, UserSlicePtrReader}, @@ -32,7 +32,7 @@ context::Context, defs::*, error::{BinderError, BinderResult}, - node::{Node, NodeRef}, + node::{Node, NodeDeath, NodeRef}, range_alloc::{self, RangeAllocator}, thread::{PushWorkRes, Thread}, DArc, DLArc, DTRWrap, DeliverToRead, @@ -69,6 +69,7 @@ pub(crate) struct ProcessInner { nodes: RBTree>, mapping: Option, work: List>, + delivered_deaths: List, 2>, /// The number of requested threads that haven't registered yet. 
requested_thread_count: u32, @@ -91,6 +92,7 @@ fn new() -> Self { mapping: None, nodes: RBTree::new(), work: List::new(), + delivered_deaths: List::new(), requested_thread_count: 0, max_threads: 0, started_thread_count: 0, @@ -225,15 +227,40 @@ fn register_thread(&mut self) -> bool { self.started_thread_count += 1; true } + + /// Finds a delivered death notification with the given cookie, removes it from the thread's + /// delivered list, and returns it. + fn pull_delivered_death(&mut self, cookie: usize) -> Option> { + let mut cursor_opt = self.delivered_deaths.cursor_front(); + while let Some(cursor) = cursor_opt { + if cursor.current().cookie == cookie { + return Some(cursor.remove().into_arc()); + } + cursor_opt = cursor.next(); + } + None + } + + pub(crate) fn death_delivered(&mut self, death: DArc) { + if let Some(death) = ListArc::try_from_arc_or_drop(death) { + self.delivered_deaths.push_back(death); + } else { + pr_warn!("Notification added to `delivered_deaths` twice."); + } + } } struct NodeRefInfo { node_ref: NodeRef, + death: Option>, } impl NodeRefInfo { fn new(node_ref: NodeRef) -> Self { - Self { node_ref } + Self { + node_ref, + death: None, + } } } @@ -385,6 +412,18 @@ fn get_thread(self: ArcBorrow<'_, Self>, id: i32) -> Result> { Ok(ta) } + pub(crate) fn push_work(&self, work: DLArc) -> BinderResult { + // If push_work fails, drop the work item outside the lock. + let res = self.inner.lock().push_work(work); + match res { + Ok(()) => Ok(()), + Err((err, work)) => { + drop(work); + Err(err) + } + } + } + fn set_as_manager( self: ArcBorrow<'_, Self>, info: Option, @@ -513,6 +552,14 @@ pub(crate) fn get_node_from_handle(&self, handle: u32, strong: bool) -> Result) { + let mut inner = self.inner.lock(); + // SAFETY: By the invariant on the `delivered_links` field, this is the right linked list. + let removed = unsafe { inner.delivered_deaths.remove(death) }; + drop(inner); + drop(removed); + } + pub(crate) fn update_ref(&self, handle: u32, inc: bool, strong: bool) -> Result { if inc && handle == 0 { if let Ok(node_ref) = self.ctx.get_manager_node(strong) { @@ -529,6 +576,12 @@ pub(crate) fn update_ref(&self, handle: u32, inc: bool, strong: bool) -> Result let mut refs = self.node_refs.lock(); if let Some(info) = refs.by_handle.get_mut(&handle) { if info.node_ref.update(inc, strong) { + // Clean up death if there is one attached to this node reference. + if let Some(death) = info.death.take() { + death.set_cleared(true); + self.remove_from_delivered_deaths(&death); + } + // Remove reference from process tables. let id = info.node_ref.node.global_id; refs.by_handle.remove(&handle); @@ -725,6 +778,87 @@ pub(crate) fn needs_thread(&self) -> bool { ret } + pub(crate) fn request_death( + self: &Arc, + reader: &mut UserSlicePtrReader, + thread: &Thread, + ) -> Result { + let handle: u32 = reader.read()?; + let cookie: usize = reader.read()?; + + // TODO: First two should result in error, but not the others. + + // TODO: Do we care about the context manager dying? + + // Queue BR_ERROR if we can't allocate memory for the death notification. + let death = UniqueArc::try_new_uninit().map_err(|err| { + thread.push_return_work(BR_ERROR); + err + })?; + let mut refs = self.node_refs.lock(); + let info = refs.by_handle.get_mut(&handle).ok_or(EINVAL)?; + + // Nothing to do if there is already a death notification request for this handle. 
+ if info.death.is_some() { + return Ok(()); + } + + let death = { + let death_init = NodeDeath::new(info.node_ref.node.clone(), self.clone(), cookie); + match death.pin_init_with(death_init) { + Ok(death) => death, + // error is infallible + Err(err) => match err {}, + } + }; + + // Register the death notification. + { + let mut owner_inner = info.node_ref.node.owner.inner.lock(); + if owner_inner.is_dead { + let death = ListArc::from_pin_unique(death); + info.death = Some(death.clone_arc()); + drop(owner_inner); + let _ = self.push_work(death); + } else { + let death = ListArc::from_pin_unique(death); + info.death = Some(death.clone_arc()); + info.node_ref.node.add_death(death, &mut owner_inner); + } + } + Ok(()) + } + + pub(crate) fn clear_death(&self, reader: &mut UserSlicePtrReader, thread: &Thread) -> Result { + let handle: u32 = reader.read()?; + let cookie: usize = reader.read()?; + + let mut refs = self.node_refs.lock(); + let info = refs.by_handle.get_mut(&handle).ok_or(EINVAL)?; + + let death = info.death.take().ok_or(EINVAL)?; + if death.cookie != cookie { + info.death = Some(death); + return Err(EINVAL); + } + + // Update state and determine if we need to queue a work item. We only need to do it when + // the node is not dead or if the user already completed the death notification. + if death.set_cleared(false) { + if let Some(death) = ListArc::try_from_arc_or_drop(death) { + let _ = thread.push_work_if_looper(death); + } + } + + Ok(()) + } + + pub(crate) fn dead_binder_done(&self, cookie: usize, thread: &Thread) { + if let Some(death) = self.inner.lock().pull_delivered_death(cookie) { + death.set_notification_done(thread); + } + } + fn deferred_flush(&self) { let inner = self.inner.lock(); for thread in inner.threads.values() { @@ -760,17 +894,6 @@ fn deferred_release(self: Arc) { work.into_arc().cancel(); } - // Move the threads out of `inner` so that we can iterate over them without holding the - // lock. - let mut inner = self.inner.lock(); - let threads = take(&mut inner.threads); - drop(inner); - - // Release all threads. - for thread in threads.values() { - thread.release(); - } - // Free any resources kept alive by allocated buffers. let omapping = self.inner.lock().mapping.take(); if let Some(mut mapping) = omapping { @@ -785,6 +908,48 @@ fn deferred_release(self: Arc) { drop(alloc) }); } + + // Drop all references. We do this dance with `swap` to avoid destroying the references + // while holding the lock. + let mut refs = self.node_refs.lock(); + let mut node_refs = take(&mut refs.by_handle); + drop(refs); + + // Remove all death notifications from the nodes (that belong to a different process). + for info in node_refs.values_mut() { + let death = if let Some(existing) = info.death.take() { + existing + } else { + continue; + }; + death.set_cleared(false); + } + + // Do similar dance for the state lock. + let mut inner = self.inner.lock(); + let threads = take(&mut inner.threads); + let nodes = take(&mut inner.nodes); + drop(inner); + + // Release all threads. + for thread in threads.values() { + thread.release(); + } + + // Deliver death notifications. 
+ for node in nodes.values() { + loop { + let death = { + let mut inner = self.inner.lock(); + if let Some(death) = node.next_death(&mut inner) { + death + } else { + break; + } + }; + death.set_dead(); + } + } } pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result { diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs index 218c2001e8cb..04477ff7e5a0 100644 --- a/drivers/android/rust_binder.rs +++ b/drivers/android/rust_binder.rs @@ -100,6 +100,13 @@ impl core::ops::Receiver for DTRWrap {} type DLArc = kernel::list::ListArc>; impl DTRWrap { + fn new(val: impl PinInit) -> impl PinInit { + pin_init!(Self { + links <- ListLinksSelfPtr::new(), + wrapped <- val, + }) + } + #[allow(dead_code)] fn arc_try_new(val: T) -> Result, alloc::alloc::AllocError> { ListArc::pin_init(pin_init!(Self { diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index b583297cea91..b70a5e3c064b 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -391,10 +391,27 @@ pub(crate) fn push_work(&self, work: DLArc) -> PushWorkRes { res } + /// Attempts to push to given work item to the thread if it's a looper thread (i.e., if it's + /// part of a thread pool) and is alive. Otherwise, push the work item to the process instead. + pub(crate) fn push_work_if_looper(&self, work: DLArc) -> BinderResult { + let mut inner = self.inner.lock(); + if inner.is_looper() && !inner.is_dead { + inner.push_work(work); + Ok(()) + } else { + drop(inner); + self.process.push_work(work) + } + } + pub(crate) fn push_work_deferred(&self, work: DLArc) { self.inner.lock().push_work_deferred(work); } + pub(crate) fn push_return_work(&self, reply: u32) { + self.inner.lock().push_return_work(reply); + } + pub(crate) fn copy_transaction_data( &self, to_process: Arc, @@ -556,7 +573,7 @@ fn transaction(self: &Arc, tr: &BinderTransactionDataSg, inner: T) ); } - self.inner.lock().push_return_work(err.reply); + self.push_return_work(err.reply); } } @@ -684,6 +701,9 @@ fn write(self: &Arc, req: &mut BinderWriteRead) -> Result { BC_DECREFS => self.process.update_ref(reader.read()?, false, false)?, BC_INCREFS_DONE => self.process.inc_ref_done(&mut reader, false)?, BC_ACQUIRE_DONE => self.process.inc_ref_done(&mut reader, true)?, + BC_REQUEST_DEATH_NOTIFICATION => self.process.request_death(&mut reader, self)?, + BC_CLEAR_DEATH_NOTIFICATION => self.process.clear_death(&mut reader, self)?, + BC_DEAD_BINDER_DONE => self.process.dead_binder_done(reader.read()?, self), BC_REGISTER_LOOPER => { let valid = self.process.register_thread(); self.inner.lock().looper_register(valid); From patchwork Wed Nov 1 18:01:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160634 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp605505vqx; Wed, 1 Nov 2023 11:04:01 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGDG41qlfp3INDAps9Nx56RIkFXFKpOIt3cf790rr6PpvZsn5pmCOyDxYETpOT84mstDNH5 X-Received: by 2002:a05:6a20:3c89:b0:13e:fb5e:b460 with SMTP id b9-20020a056a203c8900b0013efb5eb460mr5071380pzj.0.1698861840707; Wed, 01 Nov 2023 11:04:00 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861840; cv=none; d=google.com; s=arc-20160816; b=kBLBRQt3Yo8Cu2lVnQolymfNzNatM6+vgVyzm73kHo8j4ZtqYXM0/K8lw4QuhZyVJg TY66DtIj7+lE764yS610pQW3xd9BV4gWssqB9Pc5AoEbUfHJo/81arkLbnyG800VWxoV esnbb1GvmqvCEw1uwleEN9xoi/kh3YoFA16A3EWUSwZ0guAj6vxXLw/bGIkFERULe8Zi 
d=1e100.net; s=20230601; t=1698861763; x=1699466563; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=kukThpDw1OlU2Qi41kPUcezUwH/dCviJIoImwIYtSdk=; b=Ndyhm8YSp/CsR/BSCZq7UNpESXsrDriqlnZMv10ye4rilfPb2EMz3WsMMbeOp5o+N5 7aQuB7vEVVqJeekEBvNFEaVGccAzBkMqBMDWuC7by+GadybxddM5FCkrY9O2kWAdFDR6 DgEEFEBZKVDVc6ENHTAfZHGv7W1KpgtBTO8JWFFgOcjRPVkHhHOTPLPH5w3h/Fwmdgvb CFVjnKT3zu4A9Dy/Zx0t0b8beHnDoC6ZRCoWpV9LfNkVxaAB1z0rJQNxSdwfdpin2b0h ReqniXs8nTC4p5p+rGyzaF+GIAz3wQ942D5sIdpyxfyPQs3JJU8IRuTJxXa5IKFTrRJ8 WW6w== X-Gm-Message-State: AOJu0Yx1s130B+wq386YeRBS2jIXkIVVb1ZWNWGBymVQk1cLKa5RZnPa T39flGC/xDGy6zFSrObfx0tFyO4IaYwec7A= X-Received: from aliceryhl.c.googlers.com ([fda3:e722:ac3:cc00:31:98fb:c0a8:6c8]) (user=aliceryhl job=sendgmr) by 2002:a17:906:f247:b0:9c7:1cbb:3a71 with SMTP id gy7-20020a170906f24700b009c71cbb3a71mr30909ejb.1.1698861763349; Wed, 01 Nov 2023 11:02:43 -0700 (PDT) Date: Wed, 01 Nov 2023 18:01:41 +0000 In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com> Mime-Version: 1.0 References: <20231101-rust-binder-v1-0-08ba9197f637@google.com> X-Mailer: b4 0.13-dev-26615 Message-ID: <20231101-rust-binder-v1-11-08ba9197f637@google.com> Subject: [PATCH RFC 11/20] rust_binder: send nodes in transactions From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:03:37 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385753374601110 X-GMAIL-MSGID: 1781385753374601110 To send a transaction to any process other than the context manager, someone must first send you the binder node. Usually, you get it from the context manager. The transaction allocation now contains a list of offsets of objects in the transaction that must be translated before they are passed to the target process. In this patch, we only support translation of binder nodes, but future patches will extend this to other object types. 
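As a rough, self-contained sketch of the translation step described above (this is not the driver's code: the enum, the HashMap, and the names below are hypothetical simplifications of flat_binder_object and the real handle table, and the offsets are plain indices rather than byte offsets into the payload), the idea is to walk the offset table and rewrite every object that names a node in the sender's address space into a handle that is valid in the receiver:

// Sketch only: a deliberately simplified model of the translation pass.
use std::collections::HashMap;

/// An object embedded in a transaction payload.
#[derive(Clone, Copy, Debug, PartialEq)]
enum FlatObject {
    /// A node address, only meaningful inside the sending process.
    Binder { ptr: usize },
    /// A handle, only meaningful inside the receiving process.
    Handle { handle: u32 },
}

/// The receiving process: maps sender node addresses to local handles,
/// standing in for what `insert_or_update_handle` does in the patch.
#[derive(Default)]
struct Receiver {
    handles: HashMap<usize, u32>,
    next_handle: u32,
}

impl Receiver {
    /// Return the existing handle for `ptr`, or allocate a new one.
    fn handle_for(&mut self, ptr: usize) -> u32 {
        if let Some(&h) = self.handles.get(&ptr) {
            return h;
        }
        self.next_handle += 1;
        self.handles.insert(ptr, self.next_handle);
        self.next_handle
    }
}

/// Walk the offset table and rewrite every Binder object into a Handle that
/// is valid in the receiver. Objects not listed in `offsets` are left alone.
fn translate(objects: &mut [FlatObject], offsets: &[usize], rx: &mut Receiver) {
    for &off in offsets {
        if let FlatObject::Binder { ptr } = objects[off] {
            objects[off] = FlatObject::Handle {
                handle: rx.handle_for(ptr),
            };
        }
    }
}

fn main() {
    // A payload carrying two binder nodes and one pre-existing handle.
    let mut objects = [
        FlatObject::Binder { ptr: 0x1000 },
        FlatObject::Handle { handle: 7 },
        FlatObject::Binder { ptr: 0x2000 },
    ];
    let offsets = [0, 2];
    let mut rx = Receiver::default();
    translate(&mut objects, &offsets, &mut rx);
    assert_eq!(objects[0], FlatObject::Handle { handle: 1 });
    assert_eq!(objects[2], FlatObject::Handle { handle: 2 });
}

In the patch itself the offset range is recorded in the Allocation (via set_info_offsets), translation happens while the transaction data is copied, and the same offset walk is reused when the buffer is dropped (cleanup_object) to release the references again.
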
Co-developed-by: Wedson Almeida Filho Signed-off-by: Wedson Almeida Filho Signed-off-by: Alice Ryhl --- drivers/android/allocation.rs | 266 ++++++++++++++++++++++++++++++++++++++++- drivers/android/defs.rs | 44 +++++-- drivers/android/process.rs | 8 ++ drivers/android/thread.rs | 118 +++++++++++++++++- drivers/android/transaction.rs | 5 +- rust/helpers.c | 7 ++ rust/kernel/security.rs | 7 ++ 7 files changed, 436 insertions(+), 19 deletions(-) diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs index 0fdef5425918..32bc268956f2 100644 --- a/drivers/android/allocation.rs +++ b/drivers/android/allocation.rs @@ -1,9 +1,18 @@ // SPDX-License-Identifier: GPL-2.0 -use core::mem::size_of_val; +use core::mem::{size_of, size_of_val, MaybeUninit}; +use core::ops::Range; -use kernel::{bindings, pages::Pages, prelude::*, sync::Arc, user_ptr::UserSlicePtrReader}; +use kernel::{ + bindings, + io_buffer::{IoBufferReader, ReadableFromBytes, WritableToBytes}, + pages::Pages, + prelude::*, + sync::Arc, + user_ptr::UserSlicePtrReader, +}; use crate::{ + defs::*, node::{Node, NodeRef}, process::Process, DArc, @@ -11,6 +20,8 @@ #[derive(Default)] pub(crate) struct AllocationInfo { + /// Range within the allocation where we can find the offsets to the object descriptors. + pub(crate) offsets: Option>, /// The target node of the transaction this allocation is associated to. /// Not set for replies. pub(crate) target_node: Option, @@ -87,6 +98,21 @@ pub(crate) fn copy_into( }) } + pub(crate) fn read(&self, offset: usize) -> Result { + let mut out = MaybeUninit::::uninit(); + let mut out_offset = 0; + self.iterate(offset, size_of::(), |page, offset, to_copy| { + // SAFETY: The sum of `offset` and `to_copy` is bounded by the size of T. + let obj_ptr = unsafe { (out.as_mut_ptr() as *mut u8).add(out_offset) }; + // SAFETY: The pointer points is in-bounds of the `out` variable, so it is valid. + unsafe { page.read(obj_ptr, offset, to_copy) }?; + out_offset += to_copy; + Ok(()) + })?; + // SAFETY: We just initialised the data. + Ok(unsafe { out.assume_init() }) + } + pub(crate) fn write(&self, offset: usize, obj: &T) -> Result { let mut obj_offset = 0; self.iterate(offset, size_of_val(obj), |page, offset, to_copy| { @@ -119,6 +145,10 @@ pub(crate) fn get_or_init_info(&mut self) -> &mut AllocationInfo { self.allocation_info.get_or_insert_with(Default::default) } + pub(crate) fn set_info_offsets(&mut self, offsets: Range) { + self.get_or_init_info().offsets = Some(offsets); + } + pub(crate) fn set_info_oneway_node(&mut self, oneway_node: DArc) { self.get_or_init_info().oneway_node = Some(oneway_node); } @@ -145,6 +175,15 @@ fn drop(&mut self) { info.target_node = None; + if let Some(offsets) = info.offsets.clone() { + let view = AllocationView::new(self, offsets.start); + for i in offsets.step_by(size_of::()) { + if view.cleanup_object(i).is_err() { + pr_warn!("Error cleaning up object at offset {}\n", i) + } + } + } + if info.clear_on_free { if let Err(e) = self.fill_zero() { pr_warn!("Failed to clear data on free: {:?}", e); @@ -155,3 +194,226 @@ fn drop(&mut self) { self.process.buffer_raw_free(self.ptr); } } + +/// A view into the beginning of an allocation. +/// +/// All attempts to read or write outside of the view will fail. To intentionally access outside of +/// this view, use the `alloc` field of this struct directly. 
+pub(crate) struct AllocationView<'a> { + pub(crate) alloc: &'a mut Allocation, + limit: usize, +} + +impl<'a> AllocationView<'a> { + pub(crate) fn new(alloc: &'a mut Allocation, limit: usize) -> Self { + AllocationView { alloc, limit } + } + + pub(crate) fn read(&self, offset: usize) -> Result { + if offset.checked_add(size_of::()).ok_or(EINVAL)? > self.limit { + return Err(EINVAL); + } + self.alloc.read(offset) + } + + pub(crate) fn write(&self, offset: usize, obj: &T) -> Result { + if offset.checked_add(size_of::()).ok_or(EINVAL)? > self.limit { + return Err(EINVAL); + } + self.alloc.write(offset, obj) + } + + pub(crate) fn transfer_binder_object( + &self, + offset: usize, + obj: &bindings::flat_binder_object, + strong: bool, + node_ref: NodeRef, + ) -> Result { + if Arc::ptr_eq(&node_ref.node.owner, &self.alloc.process) { + // The receiving process is the owner of the node, so send it a binder object (instead + // of a handle). + let (ptr, cookie) = node_ref.node.get_id(); + let mut newobj = FlatBinderObject::default(); + newobj.hdr.type_ = if strong { + BINDER_TYPE_BINDER + } else { + BINDER_TYPE_WEAK_BINDER + }; + newobj.flags = obj.flags; + newobj.__bindgen_anon_1.binder = ptr as _; + newobj.cookie = cookie as _; + self.write(offset, &newobj)?; + // Increment the user ref count on the node. It will be decremented as part of the + // destruction of the buffer, when we see a binder or weak-binder object. + node_ref.node.update_refcount(true, 1, strong); + } else { + // The receiving process is different from the owner, so we need to insert a handle to + // the binder object. + let handle = self + .alloc + .process + .insert_or_update_handle(node_ref, false)?; + let mut newobj = FlatBinderObject::default(); + newobj.hdr.type_ = if strong { + BINDER_TYPE_HANDLE + } else { + BINDER_TYPE_WEAK_HANDLE + }; + newobj.flags = obj.flags; + newobj.__bindgen_anon_1.handle = handle; + if self.write(offset, &newobj).is_err() { + // Decrement ref count on the handle we just created. + let _ = self.alloc.process.update_ref(handle, false, strong); + return Err(EINVAL); + } + } + Ok(()) + } + + fn cleanup_object(&self, index_offset: usize) -> Result { + let offset = self.alloc.read(index_offset)?; + let header = self.read::(offset)?; + match header.type_ { + BINDER_TYPE_WEAK_BINDER | BINDER_TYPE_BINDER => { + let obj = self.read::(offset)?; + let strong = header.type_ == BINDER_TYPE_BINDER; + // SAFETY: The type is `BINDER_TYPE_{WEAK_}BINDER`, so the `binder` field is + // populated. + let ptr = unsafe { obj.__bindgen_anon_1.binder } as usize; + let cookie = obj.cookie as usize; + self.alloc.process.update_node(ptr, cookie, strong); + Ok(()) + } + BINDER_TYPE_WEAK_HANDLE | BINDER_TYPE_HANDLE => { + let obj = self.read::(offset)?; + let strong = header.type_ == BINDER_TYPE_HANDLE; + // SAFETY: The type is `BINDER_TYPE_{WEAK_}HANDLE`, so the `handle` field is + // populated. + let handle = unsafe { obj.__bindgen_anon_1.handle } as _; + self.alloc.process.update_ref(handle, false, strong) + } + _ => Ok(()), + } + } +} + +/// A binder object as it is serialized. +/// +/// # Invariants +/// +/// All bytes must be initialized, and the value of `self.hdr.type_` must be one of the allowed +/// types. +#[repr(C)] +pub(crate) union BinderObject { + hdr: bindings::binder_object_header, + fbo: bindings::flat_binder_object, + fdo: bindings::binder_fd_object, + bbo: bindings::binder_buffer_object, + fdao: bindings::binder_fd_array_object, +} + +/// A view into a `BinderObject` that can be used in a match statement. 
+pub(crate) enum BinderObjectRef<'a> { + Binder(&'a mut bindings::flat_binder_object), + Handle(&'a mut bindings::flat_binder_object), + Fd(&'a mut bindings::binder_fd_object), + Ptr(&'a mut bindings::binder_buffer_object), + Fda(&'a mut bindings::binder_fd_array_object), +} + +impl BinderObject { + pub(crate) fn read_from(reader: &mut UserSlicePtrReader) -> Result { + let object = Self::read_from_inner(|slice| { + let read_len = usize::min(slice.len(), reader.len()); + // SAFETY: The length we pass to `read_raw` is at most the length of the slice. + unsafe { + reader + .clone_reader() + .read_raw(slice.as_mut_ptr(), read_len)?; + } + Ok(()) + })?; + + // If we used a object type smaller than the largest object size, then we've read more + // bytes than we needed to. However, we used `.clone_reader()` to avoid advancing the + // original reader. Now, we call `skip` so that the caller's reader is advanced by the + // right amount. + // + // The `skip` call fails if the reader doesn't have `size` bytes available. This could + // happen if the type header corresponds to an object type that is larger than the rest of + // the reader. + // + // Any extra bytes beyond the size of the object are inaccessible after this call, so + // reading them again from the `reader` later does not result in TOCTOU bugs. + reader.skip(object.size())?; + + Ok(object) + } + + /// Use the provided reader closure to construct a `BinderObject`. + /// + /// The closure should write the bytes for the object into the provided slice. + pub(crate) fn read_from_inner(reader: R) -> Result + where + R: FnOnce(&mut [u8; size_of::()]) -> Result<()>, + { + let mut obj = MaybeUninit::::zeroed(); + + // SAFETY: The lengths of `BinderObject` and `[u8; size_of::()]` are equal, + // and the byte array has an alignment requirement of one, so the pointer cast is okay. + // Additionally, `obj` was initialized to zeros, so the byte array will not be + // uninitialized. + (reader)(unsafe { &mut *obj.as_mut_ptr().cast() })?; + + // SAFETY: The entire object is initialized, so accessing this field is safe. + let type_ = unsafe { obj.assume_init_ref().hdr.type_ }; + if Self::type_to_size(type_).is_none() { + // The value of `obj.hdr_type_` was invalid. + return Err(EINVAL); + } + + // SAFETY: All bytes are initialized (since we zeroed them at the start) and we checked + // that `self.hdr.type_` is one of the allowed types, so the type invariants are satisfied. + unsafe { Ok(obj.assume_init()) } + } + + pub(crate) fn as_ref(&mut self) -> BinderObjectRef<'_> { + use BinderObjectRef::*; + // SAFETY: The constructor ensures that all bytes of `self` are initialized, and all + // variants of this union accept all initialized bit patterns. + unsafe { + match self.hdr.type_ { + BINDER_TYPE_WEAK_BINDER | BINDER_TYPE_BINDER => Binder(&mut self.fbo), + BINDER_TYPE_WEAK_HANDLE | BINDER_TYPE_HANDLE => Handle(&mut self.fbo), + BINDER_TYPE_FD => Fd(&mut self.fdo), + BINDER_TYPE_PTR => Ptr(&mut self.bbo), + BINDER_TYPE_FDA => Fda(&mut self.fdao), + // SAFETY: By the type invariant, the value of `self.hdr.type_` cannot have any + // other value than the ones checked above. + _ => core::hint::unreachable_unchecked(), + } + } + } + + pub(crate) fn size(&self) -> usize { + // SAFETY: The entire object is initialized, so accessing this field is safe. + let type_ = unsafe { self.hdr.type_ }; + + // SAFETY: The type invariants guarantee that the type field is correct. 
+ unsafe { Self::type_to_size(type_).unwrap_unchecked() } + } + + fn type_to_size(type_: u32) -> Option { + match type_ { + BINDER_TYPE_WEAK_BINDER => Some(size_of::()), + BINDER_TYPE_BINDER => Some(size_of::()), + BINDER_TYPE_WEAK_HANDLE => Some(size_of::()), + BINDER_TYPE_HANDLE => Some(size_of::()), + BINDER_TYPE_FD => Some(size_of::()), + BINDER_TYPE_PTR => Some(size_of::()), + BINDER_TYPE_FDA => Some(size_of::()), + _ => None, + } + } +} diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index 753f7e86c92d..68f32a779a3c 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 +use core::mem::MaybeUninit; use core::ops::{Deref, DerefMut}; use kernel::{ bindings::{self, *}, @@ -57,11 +58,18 @@ macro_rules! pub_no_prefix { kernel::bindings::FLAT_BINDER_FLAG_TXN_SECURITY_CTX; pub_no_prefix!(transaction_flags_, TF_ONE_WAY, TF_CLEAR_BUF); +pub(crate) use bindings::{ + BINDER_TYPE_BINDER, BINDER_TYPE_FD, BINDER_TYPE_FDA, BINDER_TYPE_HANDLE, BINDER_TYPE_PTR, + BINDER_TYPE_WEAK_BINDER, BINDER_TYPE_WEAK_HANDLE, +}; + macro_rules! decl_wrapper { ($newname:ident, $wrapped:ty) => { - #[derive(Copy, Clone, Default)] + // Define a wrapper around the C type. Use `MaybeUninit` to enforce that the value of + // padding bytes must be preserved. + #[derive(Copy, Clone)] #[repr(transparent)] - pub(crate) struct $newname($wrapped); + pub(crate) struct $newname(MaybeUninit<$wrapped>); // SAFETY: This macro is only used with types where this is ok. unsafe impl ReadableFromBytes for $newname {} @@ -70,13 +78,24 @@ unsafe impl WritableToBytes for $newname {} impl Deref for $newname { type Target = $wrapped; fn deref(&self) -> &Self::Target { - &self.0 + // SAFETY: We use `MaybeUninit` only to preserve padding. The value must still + // always be valid. + unsafe { self.0.assume_init_ref() } } } impl DerefMut for $newname { fn deref_mut(&mut self) -> &mut Self::Target { - &mut self.0 + // SAFETY: We use `MaybeUninit` only to preserve padding. The value must still + // always be valid. + unsafe { self.0.assume_init_mut() } + } + } + + impl Default for $newname { + fn default() -> Self { + // Create a new value of this type where all bytes (including padding) are zeroed. 
+ Self(MaybeUninit::zeroed()) } } }; @@ -85,6 +104,7 @@ fn deref_mut(&mut self) -> &mut Self::Target { decl_wrapper!(BinderNodeDebugInfo, bindings::binder_node_debug_info); decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref); decl_wrapper!(FlatBinderObject, bindings::flat_binder_object); +decl_wrapper!(BinderObjectHeader, bindings::binder_object_header); decl_wrapper!(BinderTransactionData, bindings::binder_transaction_data); decl_wrapper!( BinderTransactionDataSecctx, @@ -100,18 +120,18 @@ fn deref_mut(&mut self) -> &mut Self::Target { impl BinderVersion { pub(crate) fn current() -> Self { - Self(bindings::binder_version { + Self(MaybeUninit::new(bindings::binder_version { protocol_version: bindings::BINDER_CURRENT_PROTOCOL_VERSION as _, - }) + })) } } impl BinderTransactionData { pub(crate) fn with_buffers_size(self, buffers_size: u64) -> BinderTransactionDataSg { - BinderTransactionDataSg(bindings::binder_transaction_data_sg { - transaction_data: self.0, + BinderTransactionDataSg(MaybeUninit::new(bindings::binder_transaction_data_sg { + transaction_data: *self, buffers_size, - }) + })) } } @@ -128,6 +148,10 @@ pub(crate) fn tr_data(&mut self) -> &mut BinderTransactionData { impl ExtendedError { pub(crate) fn new(id: u32, command: u32, param: i32) -> Self { - Self(bindings::binder_extended_error { id, command, param }) + Self(MaybeUninit::new(bindings::binder_extended_error { + id, + command, + param, + })) } } diff --git a/drivers/android/process.rs b/drivers/android/process.rs index 0b79fa59ffa5..944297b7403c 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -591,6 +591,14 @@ pub(crate) fn update_ref(&self, handle: u32, inc: bool, strong: bool) -> Result Ok(()) } + /// Decrements the refcount of the given node, if one exists. + pub(crate) fn update_node(&self, ptr: usize, cookie: usize, strong: bool) { + let mut inner = self.inner.lock(); + if let Ok(Some(node)) = inner.get_existing_node(ptr, cookie) { + inner.update_node_refcount(&node, false, strong, 1, None); + } + } + pub(crate) fn inc_ref_done(&self, reader: &mut UserSlicePtrReader, strong: bool) -> Result { let ptr = reader.read::()?; let cookie = reader.read::()?; diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index b70a5e3c064b..a9afc7b706c6 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -21,8 +21,13 @@ }; use crate::{ - allocation::Allocation, defs::*, error::BinderResult, process::Process, ptr_align, - transaction::Transaction, DArc, DLArc, DTRWrap, DeliverCode, DeliverToRead, + allocation::{Allocation, AllocationView, BinderObject, BinderObjectRef}, + defs::*, + error::BinderResult, + process::Process, + ptr_align, + transaction::Transaction, + DArc, DLArc, DTRWrap, DeliverCode, DeliverToRead, }; use core::{ @@ -412,6 +417,54 @@ pub(crate) fn push_return_work(&self, reply: u32) { self.inner.lock().push_return_work(reply); } + fn translate_object( + &self, + offset: usize, + object: BinderObjectRef<'_>, + view: &mut AllocationView<'_>, + ) -> BinderResult { + match object { + BinderObjectRef::Binder(obj) => { + let strong = obj.hdr.type_ == BINDER_TYPE_BINDER; + // SAFETY: `binder` is a `binder_uintptr_t`; any bit pattern is a valid + // representation. 
+ let ptr = unsafe { obj.__bindgen_anon_1.binder } as _; + let cookie = obj.cookie as _; + let flags = obj.flags as _; + let node = self.process.as_arc_borrow().get_node( + ptr, + cookie, + flags, + strong, + Some(self), + )?; + security::binder_transfer_binder(&self.process.cred, &view.alloc.process.cred)?; + view.transfer_binder_object(offset, obj, strong, node)?; + } + BinderObjectRef::Handle(obj) => { + let strong = obj.hdr.type_ == BINDER_TYPE_HANDLE; + // SAFETY: `handle` is a `u32`; any bit pattern is a valid representation. + let handle = unsafe { obj.__bindgen_anon_1.handle } as _; + let node = self.process.get_node_from_handle(handle, strong)?; + security::binder_transfer_binder(&self.process.cred, &view.alloc.process.cred)?; + view.transfer_binder_object(offset, obj, strong, node)?; + } + BinderObjectRef::Fd(_obj) => { + pr_warn!("Using unsupported binder object type fd."); + return Err(EINVAL.into()); + } + BinderObjectRef::Ptr(_obj) => { + pr_warn!("Using unsupported binder object type ptr."); + return Err(EINVAL.into()); + } + BinderObjectRef::Fda(_obj) => { + pr_warn!("Using unsupported binder object type fda."); + return Err(EINVAL.into()); + } + } + Ok(()) + } + pub(crate) fn copy_transaction_data( &self, to_process: Arc, @@ -436,6 +489,8 @@ pub(crate) fn copy_transaction_data( let data_size = trd.data_size.try_into().map_err(|_| EINVAL)?; let adata_size = ptr_align(data_size); + let offsets_size = trd.offsets_size.try_into().map_err(|_| EINVAL)?; + let aoffsets_size = ptr_align(offsets_size); let asecctx_size = secctx .as_ref() .map(|(_, ctx)| ptr_align(ctx.len())) @@ -443,11 +498,14 @@ pub(crate) fn copy_transaction_data( // This guarantees that at least `sizeof(usize)` bytes will be allocated. let len = usize::max( - adata_size.checked_add(asecctx_size).ok_or(ENOMEM)?, + adata_size + .checked_add(aoffsets_size) + .and_then(|sum| sum.checked_add(asecctx_size)) + .ok_or(ENOMEM)?, size_of::(), ); - let secctx_off = adata_size; - let alloc = match to_process.buffer_alloc(len, is_oneway) { + let secctx_off = adata_size + aoffsets_size; + let mut alloc = match to_process.buffer_alloc(len, is_oneway) { Ok(alloc) => alloc, Err(err) => { pr_warn!( @@ -461,8 +519,56 @@ pub(crate) fn copy_transaction_data( let mut buffer_reader = unsafe { UserSlicePtr::new(trd.data.ptr.buffer as _, data_size) }.reader(); + let mut end_of_previous_object = 0; + + // Copy offsets if there are any. + if offsets_size > 0 { + { + let mut reader = + unsafe { UserSlicePtr::new(trd.data.ptr.offsets as _, offsets_size) }.reader(); + alloc.copy_into(&mut reader, adata_size, offsets_size)?; + } + + let offsets_start = adata_size; + let offsets_end = adata_size + aoffsets_size; + + // Traverse the objects specified. + let mut view = AllocationView::new(&mut alloc, data_size); + for index_offset in (offsets_start..offsets_end).step_by(size_of::()) { + let offset = view.alloc.read(index_offset)?; + + // Copy data between two objects. + if end_of_previous_object < offset { + view.alloc.copy_into( + &mut buffer_reader, + end_of_previous_object, + offset - end_of_previous_object, + )?; + } + + let mut object = BinderObject::read_from(&mut buffer_reader)?; + + match self.translate_object(offset, object.as_ref(), &mut view) { + Ok(()) => end_of_previous_object = offset + object.size(), + Err(err) => { + pr_warn!("Error while translating object."); + return Err(err); + } + } + + // Update the indexes containing objects to clean up. 
+ let offset_after_object = index_offset + size_of::(); + view.alloc + .set_info_offsets(offsets_start..offset_after_object); + } + } - alloc.copy_into(&mut buffer_reader, 0, data_size)?; + // Copy remaining raw data. + alloc.copy_into( + &mut buffer_reader, + end_of_previous_object, + data_size - end_of_previous_object, + )?; if let Some((off_out, secctx)) = secctx.as_mut() { if let Err(err) = alloc.write(secctx_off, secctx.as_bytes()) { diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs index a4ffe0a3878c..2faba6e1f47f 100644 --- a/drivers/android/transaction.rs +++ b/drivers/android/transaction.rs @@ -32,6 +32,7 @@ pub(crate) struct Transaction { code: u32, pub(crate) flags: u32, data_size: usize, + offsets_size: usize, data_address: usize, sender_euid: Kuid, txn_security_ctx_off: Option, @@ -85,6 +86,7 @@ pub(crate) fn new( code: trd.code, flags: trd.flags, data_size: trd.data_size as _, + offsets_size: trd.offsets_size as _, data_address, allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"), txn_security_ctx_off, @@ -116,6 +118,7 @@ pub(crate) fn new_reply( code: trd.code, flags: trd.flags, data_size: trd.data_size as _, + offsets_size: trd.offsets_size as _, data_address: alloc.ptr, allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"), txn_security_ctx_off: None, @@ -229,7 +232,7 @@ fn do_work(self: DArc, thread: &Thread, writer: &mut UserSlicePtrWriter) - tr.flags = self.flags; tr.data_size = self.data_size as _; tr.data.ptr.buffer = self.data_address as _; - tr.offsets_size = 0; + tr.offsets_size = self.offsets_size as _; if tr.offsets_size > 0 { tr.data.ptr.offsets = (self.data_address + ptr_align(self.data_size)) as _; } diff --git a/rust/helpers.c b/rust/helpers.c index e70255f3774f..924c7a00f433 100644 --- a/rust/helpers.c +++ b/rust/helpers.c @@ -342,6 +342,13 @@ int rust_helper_security_binder_transaction(const struct cred *from, return security_binder_transaction(from, to); } EXPORT_SYMBOL_GPL(rust_helper_security_binder_transaction); + +int rust_helper_security_binder_transfer_binder(const struct cred *from, + const struct cred *to) +{ + return security_binder_transfer_binder(from, to); +} +EXPORT_SYMBOL_GPL(rust_helper_security_binder_transfer_binder); #endif /* diff --git a/rust/kernel/security.rs b/rust/kernel/security.rs index 9e3e4cf08ecb..9179fc225406 100644 --- a/rust/kernel/security.rs +++ b/rust/kernel/security.rs @@ -24,6 +24,13 @@ pub fn binder_transaction(from: &Credential, to: &Credential) -> Result { to_result(unsafe { bindings::security_binder_transaction(from.0.get(), to.0.get()) }) } +/// Calls the security modules to determine if task `from` is allowed to send binder objects +/// (owned by itself or other processes) to task `to` through a binder transaction. +pub fn binder_transfer_binder(from: &Credential, to: &Credential) -> Result { + // SAFETY: `from` and `to` are valid because the shared references guarantee nonzero refcounts. + to_result(unsafe { bindings::security_binder_transfer_binder(from.0.get(), to.0.get()) }) +} + /// A security context string. /// /// The struct has the invariant that it always contains a valid security context. 
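[Editorial note: the sketch below is not part of the patch. BinderObject::read_from_inner above zero-initialises the union, copies the user-supplied bytes over it, and only then validates the header type before assume_init. This is a rough std-Rust rendering of that zero-then-validate pattern with a stand-in union; the layout and the accepted type values are made up for illustration.]

use std::mem::{size_of, MaybeUninit};

#[repr(C)]
#[derive(Clone, Copy)]
struct Header { type_: u32 }

#[repr(C)]
union Object {
    hdr: Header,
    small: [u8; 8],
    large: [u8; 24],
}

fn read_object(src: &[u8]) -> Result<Object, &'static str> {
    let mut obj = MaybeUninit::<Object>::zeroed();
    // Copy at most size_of::<Object>() bytes over the zeroed storage, so every
    // byte of the union is initialized afterwards.
    let dst = obj.as_mut_ptr() as *mut u8;
    let n = src.len().min(size_of::<Object>());
    unsafe { std::ptr::copy_nonoverlapping(src.as_ptr(), dst, n) };
    // SAFETY: all bytes were zeroed and then (partially) overwritten, and every
    // field of this union tolerates any initialized bit pattern.
    let obj = unsafe { obj.assume_init() };
    // Reject unknown header types before handing the object to the caller,
    // just as the driver returns EINVAL for an unrecognised type.
    let type_ = unsafe { obj.hdr.type_ };
    match type_ {
        1 | 2 => Ok(obj), // hypothetical "known" object types
        _ => Err("invalid object header type"),
    }
}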
From patchwork Wed Nov 1 18:01:42 2023
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 160633
Date: Wed, 01 Nov 2023 18:01:42 +0000
Message-ID: <20231101-rust-binder-v1-12-08ba9197f637@google.com>
Subject: [PATCH RFC 12/20] rust_binder: add BINDER_TYPE_PTR support
From: Alice Ryhl

Implement support for the scatter-gather feature of binder, which lets you embed pointers in binder transactions and have them be translated so that the recipient gets a pointer that also works for them.

This works by adding a second kind of object to the offset array, namely the BINDER_TYPE_PTR object. This object has a pointer and length embedded. The kernel will copy the data behind the pointer, and update the address of the pointer so that the recipient will be able to follow the pointer and see the same data.

These objects are supported recursively. Other than the pointer in the main transaction buffer, each buffer may be pointed at by a pointer in one of the other buffers. This can be used to build arbitrary trees of buffers.
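[Editorial note: the sketch below is not part of the patch. The recursion is kept safe by the parent/offset rules that validate_parent_fixup enforces in this patch; this is a condensed std-Rust model of just that rule. The field names follow the patch, but everything else is simplified for illustration.]

struct SgEntry { obj_index: usize, length: usize, fixup_min_offset: usize }

struct SgState { entries: Vec<SgEntry>, ancestors: Vec<usize> }

impl SgState {
    // Returns the new fixup_min_offset for the parent if the fixup is allowed.
    fn validate_parent_fixup(&self, parent: usize, parent_offset: usize, len: usize)
        -> Result<usize, &'static str>
    {
        // Search from the innermost ancestor outwards: the parent must be the
        // most recently processed buffer or one of its ancestors.
        let pos = self.ancestors.iter().copied()
            .rposition(|i| self.entries[i].obj_index == parent)
            .ok_or("parent is not the last buffer or one of its ancestors")?;
        let entry = &self.entries[self.ancestors[pos]];
        if parent_offset < entry.fixup_min_offset {
            return Err("fixup offsets must not decrease within a buffer");
        }
        let end = parent_offset.checked_add(len).ok_or("offset overflow")?;
        if end > entry.length {
            return Err("fixup out of bounds of parent buffer");
        }
        Ok(end)
    }
}

A caller would record the returned value as the parent entry's new fixup_min_offset before queueing the pointer fixup, which is what the driver does with ParentFixupInfo::new_min_offset.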
Signed-off-by: Alice Ryhl --- drivers/android/defs.rs | 1 + drivers/android/error.rs | 9 ++ drivers/android/thread.rs | 340 +++++++++++++++++++++++++++++++++++++++++++++- 3 files changed, 344 insertions(+), 6 deletions(-) diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index 68f32a779a3c..267266f3ad76 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -105,6 +105,7 @@ fn default() -> Self { decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref); decl_wrapper!(FlatBinderObject, bindings::flat_binder_object); decl_wrapper!(BinderObjectHeader, bindings::binder_object_header); +decl_wrapper!(BinderBufferObject, bindings::binder_buffer_object); decl_wrapper!(BinderTransactionData, bindings::binder_transaction_data); decl_wrapper!( BinderTransactionDataSecctx, diff --git a/drivers/android/error.rs b/drivers/android/error.rs index 430b0994affa..c9b991d133d9 100644 --- a/drivers/android/error.rs +++ b/drivers/android/error.rs @@ -50,6 +50,15 @@ fn from(_: core::alloc::AllocError) -> Self { } } +impl From for BinderError { + fn from(_: alloc::collections::TryReserveError) -> Self { + Self { + reply: BR_FAILED_REPLY, + source: Some(ENOMEM), + } + } +} + impl core::fmt::Debug for BinderError { fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result { match self.reply { diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index a9afc7b706c6..86bb32bbabd9 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -35,6 +35,184 @@ sync::atomic::{AtomicU32, Ordering}, }; +/// Stores the layout of the scatter-gather entries. This is used during the `translate_objects` +/// call and is discarded when it returns. +struct ScatterGatherState { + /// A struct that tracks the amount of unused buffer space. + unused_buffer_space: UnusedBufferSpace, + /// Scatter-gather entries to copy. + sg_entries: Vec, + /// Indexes into `sg_entries` corresponding to the last binder_buffer_object that + /// was processed and all of its ancestors. The array is in sorted order. + ancestors: Vec, +} + +/// This entry specifies an additional buffer that should be copied using the scatter-gather +/// mechanism. +struct ScatterGatherEntry { + /// The index in the offset array of the BINDER_TYPE_PTR that this entry originates from. + obj_index: usize, + /// Offset in target buffer. + offset: usize, + /// User address in source buffer. + sender_uaddr: usize, + /// Number of bytes to copy. + length: usize, + /// The minimum offset of the next fixup in this buffer. + fixup_min_offset: usize, + /// The offsets within this buffer that contain pointers which should be translated. + pointer_fixups: Vec, +} + +/// This entry specifies that a fixup should happen at `target_offset` of the +/// buffer. If `skip` is nonzero, then the fixup is a `binder_fd_array_object` +/// and is applied later. Otherwise if `skip` is zero, then the size of the +/// fixup is `sizeof::()` and `pointer_value` is written to the buffer. +struct PointerFixupEntry { + /// The number of bytes to skip, or zero for a `binder_buffer_object` fixup. + skip: usize, + /// The translated pointer to write when `skip` is zero. + pointer_value: u64, + /// The offset at which the value should be written. The offset is relative + /// to the original buffer. + target_offset: usize, +} + +/// Return type of `apply_and_validate_fixup_in_parent`. +struct ParentFixupInfo { + /// The index of the parent buffer in `sg_entries`. + parent_sg_index: usize, + /// The number of ancestors of the buffer. 
+ /// + /// The buffer is considered an ancestor of itself, so this is always at + /// least one. + num_ancestors: usize, + /// New value of `fixup_min_offset` if this fixup is applied. + new_min_offset: usize, + /// The offset of the fixup in the target buffer. + target_offset: usize, +} + +impl ScatterGatherState { + /// Called when a `binder_buffer_object` or `binder_fd_array_object` tries + /// to access a region in its parent buffer. These accesses have various + /// restrictions, which this method verifies. + /// + /// The `parent_offset` and `length` arguments describe the offset and + /// length of the access in the parent buffer. + /// + /// # Detailed restrictions + /// + /// Obviously the fixup must be in-bounds for the parent buffer. + /// + /// For safety reasons, we only allow fixups inside a buffer to happen + /// at increasing offsets; additionally, we only allow fixup on the last + /// buffer object that was verified, or one of its parents. + /// + /// Example of what is allowed: + /// + /// A + /// B (parent = A, offset = 0) + /// C (parent = A, offset = 16) + /// D (parent = C, offset = 0) + /// E (parent = A, offset = 32) // min_offset is 16 (C.parent_offset) + /// + /// Examples of what is not allowed: + /// + /// Decreasing offsets within the same parent: + /// A + /// C (parent = A, offset = 16) + /// B (parent = A, offset = 0) // decreasing offset within A + /// + /// Arcerring to a parent that wasn't the last object or any of its parents: + /// A + /// B (parent = A, offset = 0) + /// C (parent = A, offset = 0) + /// C (parent = A, offset = 16) + /// D (parent = B, offset = 0) // B is not A or any of A's parents + fn validate_parent_fixup( + &self, + parent: usize, + parent_offset: usize, + length: usize, + ) -> Result { + // Using `position` would also be correct, but `rposition` avoids + // quadratic running times. + let ancestors_i = self + .ancestors + .iter() + .copied() + .rposition(|sg_idx| self.sg_entries[sg_idx].obj_index == parent) + .ok_or(EINVAL)?; + let sg_idx = self.ancestors[ancestors_i]; + let sg_entry = match self.sg_entries.get(sg_idx) { + Some(sg_entry) => sg_entry, + None => { + pr_err!( + "self.ancestors[{}] is {}, but self.sg_entries.len() is {}", + ancestors_i, + sg_idx, + self.sg_entries.len() + ); + return Err(EINVAL); + } + }; + if sg_entry.fixup_min_offset > parent_offset { + pr_warn!( + "validate_parent_fixup: fixup_min_offset={}, parent_offset={}", + sg_entry.fixup_min_offset, + parent_offset + ); + return Err(EINVAL); + } + let new_min_offset = parent_offset.checked_add(length).ok_or(EINVAL)?; + if new_min_offset > sg_entry.length { + pr_warn!( + "validate_parent_fixup: new_min_offset={}, sg_entry.length={}", + new_min_offset, + sg_entry.length + ); + return Err(EINVAL); + } + let target_offset = sg_entry.offset.checked_add(parent_offset).ok_or(EINVAL)?; + // The `ancestors_i + 1` operation can't overflow since the output of the addition is at + // most `self.ancestors.len()`, which also fits in a usize. + Ok(ParentFixupInfo { + parent_sg_index: sg_idx, + num_ancestors: ancestors_i + 1, + new_min_offset, + target_offset, + }) + } +} + +/// Keeps track of how much unused buffer space is left. The initial amount is the number of bytes +/// requested by the user using the `buffers_size` field of `binder_transaction_data_sg`. Each time +/// we translate an object of type `BINDER_TYPE_PTR`, some of the unused buffer space is consumed. +struct UnusedBufferSpace { + /// The start of the remaining space. 
+ offset: usize, + /// The end of the remaining space. + limit: usize, +} +impl UnusedBufferSpace { + /// Claim the next `size` bytes from the unused buffer space. The offset for the claimed chunk + /// into the buffer is returned. + fn claim_next(&mut self, size: usize) -> Result { + // We require every chunk to be aligned. + let size = ptr_align(size); + let new_offset = self.offset.checked_add(size).ok_or(EINVAL)?; + + if new_offset <= self.limit { + let offset = self.offset; + self.offset = new_offset; + Ok(offset) + } else { + Err(EINVAL) + } + } +} + pub(crate) enum PushWorkRes { Ok, FailedDead(DLArc), @@ -419,9 +597,11 @@ pub(crate) fn push_return_work(&self, reply: u32) { fn translate_object( &self, + obj_index: usize, offset: usize, object: BinderObjectRef<'_>, view: &mut AllocationView<'_>, + sg_state: &mut ScatterGatherState, ) -> BinderResult { match object { BinderObjectRef::Binder(obj) => { @@ -453,9 +633,78 @@ fn translate_object( pr_warn!("Using unsupported binder object type fd."); return Err(EINVAL.into()); } - BinderObjectRef::Ptr(_obj) => { - pr_warn!("Using unsupported binder object type ptr."); - return Err(EINVAL.into()); + BinderObjectRef::Ptr(obj) => { + let obj_length = obj.length.try_into().map_err(|_| EINVAL)?; + let alloc_offset = match sg_state.unused_buffer_space.claim_next(obj_length) { + Ok(alloc_offset) => alloc_offset, + Err(err) => { + pr_warn!( + "Failed to claim space for a BINDER_TYPE_PTR. (offset: {}, limit: {}, size: {})", + sg_state.unused_buffer_space.offset, + sg_state.unused_buffer_space.limit, + obj_length, + ); + return Err(err.into()); + } + }; + + let sg_state_idx = sg_state.sg_entries.len(); + sg_state.sg_entries.try_push(ScatterGatherEntry { + obj_index, + offset: alloc_offset, + sender_uaddr: obj.buffer as _, + length: obj_length, + pointer_fixups: Vec::new(), + fixup_min_offset: 0, + })?; + + let buffer_ptr_in_user_space = (view.alloc.ptr + alloc_offset) as u64; + + if obj.flags & bindings::BINDER_BUFFER_FLAG_HAS_PARENT == 0 { + sg_state.ancestors.clear(); + sg_state.ancestors.try_push(sg_state_idx)?; + } else { + // Another buffer also has a pointer to this buffer, and we need to fixup that + // pointer too. 
+ + let parent_index = usize::try_from(obj.parent).map_err(|_| EINVAL)?; + let parent_offset = usize::try_from(obj.parent_offset).map_err(|_| EINVAL)?; + + let info = sg_state.validate_parent_fixup( + parent_index, + parent_offset, + size_of::(), + )?; + + sg_state.ancestors.truncate(info.num_ancestors); + sg_state.ancestors.try_push(sg_state_idx)?; + + let parent_entry = match sg_state.sg_entries.get_mut(info.parent_sg_index) { + Some(parent_entry) => parent_entry, + None => { + pr_err!( + "validate_parent_fixup returned index out of bounds for sg.entries" + ); + return Err(EINVAL.into()); + } + }; + + parent_entry.fixup_min_offset = info.new_min_offset; + parent_entry.pointer_fixups.try_push(PointerFixupEntry { + skip: 0, + pointer_value: buffer_ptr_in_user_space, + target_offset: info.target_offset, + })?; + } + + let mut obj_write = BinderBufferObject::default(); + obj_write.hdr.type_ = BINDER_TYPE_PTR; + obj_write.flags = obj.flags; + obj_write.buffer = buffer_ptr_in_user_space; + obj_write.length = obj.length; + obj_write.parent = obj.parent; + obj_write.parent_offset = obj.parent_offset; + view.write::(offset, &obj_write)?; } BinderObjectRef::Fda(_obj) => { pr_warn!("Using unsupported binder object type fda."); @@ -465,6 +714,61 @@ fn translate_object( Ok(()) } + fn apply_sg(&self, alloc: &mut Allocation, sg_state: &mut ScatterGatherState) -> BinderResult { + for sg_entry in &mut sg_state.sg_entries { + let mut end_of_previous_fixup = sg_entry.offset; + let offset_end = sg_entry.offset.checked_add(sg_entry.length).ok_or(EINVAL)?; + + let mut reader = + UserSlicePtr::new(sg_entry.sender_uaddr as _, sg_entry.length).reader(); + for fixup in &mut sg_entry.pointer_fixups { + let fixup_len = if fixup.skip == 0 { + size_of::() + } else { + fixup.skip + }; + + let target_offset_end = fixup.target_offset.checked_add(fixup_len).ok_or(EINVAL)?; + if fixup.target_offset < end_of_previous_fixup || offset_end < target_offset_end { + pr_warn!( + "Fixups oob {} {} {} {}", + fixup.target_offset, + end_of_previous_fixup, + offset_end, + target_offset_end + ); + return Err(EINVAL.into()); + } + + let copy_off = end_of_previous_fixup; + let copy_len = fixup.target_offset - end_of_previous_fixup; + if let Err(err) = alloc.copy_into(&mut reader, copy_off, copy_len) { + pr_warn!("Failed copying into alloc: {:?}", err); + return Err(err.into()); + } + if fixup.skip == 0 { + let res = alloc.write::(fixup.target_offset, &fixup.pointer_value); + if let Err(err) = res { + pr_warn!("Failed copying ptr into alloc: {:?}", err); + return Err(err.into()); + } + } + if let Err(err) = reader.skip(fixup_len) { + pr_warn!("Failed skipping {} from reader: {:?}", fixup_len, err); + return Err(err.into()); + } + end_of_previous_fixup = target_offset_end; + } + let copy_off = end_of_previous_fixup; + let copy_len = offset_end - end_of_previous_fixup; + if let Err(err) = alloc.copy_into(&mut reader, copy_off, copy_len) { + pr_warn!("Failed copying remainder into alloc: {:?}", err); + return Err(err.into()); + } + } + Ok(()) + } + pub(crate) fn copy_transaction_data( &self, to_process: Arc, @@ -491,6 +795,8 @@ pub(crate) fn copy_transaction_data( let adata_size = ptr_align(data_size); let offsets_size = trd.offsets_size.try_into().map_err(|_| EINVAL)?; let aoffsets_size = ptr_align(offsets_size); + let buffers_size = tr.buffers_size.try_into().map_err(|_| EINVAL)?; + let abuffers_size = ptr_align(buffers_size); let asecctx_size = secctx .as_ref() .map(|(_, ctx)| ptr_align(ctx.len())) @@ -500,11 +806,12 @@ pub(crate) fn 
copy_transaction_data( let len = usize::max( adata_size .checked_add(aoffsets_size) + .and_then(|sum| sum.checked_add(abuffers_size)) .and_then(|sum| sum.checked_add(asecctx_size)) .ok_or(ENOMEM)?, size_of::(), ); - let secctx_off = adata_size + aoffsets_size; + let secctx_off = adata_size + aoffsets_size + abuffers_size; let mut alloc = match to_process.buffer_alloc(len, is_oneway) { Ok(alloc) => alloc, Err(err) => { @@ -520,6 +827,7 @@ pub(crate) fn copy_transaction_data( let mut buffer_reader = unsafe { UserSlicePtr::new(trd.data.ptr.buffer as _, data_size) }.reader(); let mut end_of_previous_object = 0; + let mut sg_state = None; // Copy offsets if there are any. if offsets_size > 0 { @@ -532,9 +840,22 @@ pub(crate) fn copy_transaction_data( let offsets_start = adata_size; let offsets_end = adata_size + aoffsets_size; + // This state is used for BINDER_TYPE_PTR objects. + let sg_state = sg_state.insert(ScatterGatherState { + unused_buffer_space: UnusedBufferSpace { + offset: offsets_end, + limit: len, + }, + sg_entries: Vec::new(), + ancestors: Vec::new(), + }); + // Traverse the objects specified. let mut view = AllocationView::new(&mut alloc, data_size); - for index_offset in (offsets_start..offsets_end).step_by(size_of::()) { + for (index, index_offset) in (offsets_start..offsets_end) + .step_by(size_of::()) + .enumerate() + { let offset = view.alloc.read(index_offset)?; // Copy data between two objects. @@ -548,7 +869,7 @@ pub(crate) fn copy_transaction_data( let mut object = BinderObject::read_from(&mut buffer_reader)?; - match self.translate_object(offset, object.as_ref(), &mut view) { + match self.translate_object(index, offset, object.as_ref(), &mut view, sg_state) { Ok(()) => end_of_previous_object = offset + object.size(), Err(err) => { pr_warn!("Error while translating object."); @@ -570,6 +891,13 @@ pub(crate) fn copy_transaction_data( data_size - end_of_previous_object, )?; + if let Some(sg_state) = sg_state.as_mut() { + if let Err(err) = self.apply_sg(&mut alloc, sg_state) { + pr_warn!("Failure in apply_sg: {:?}", err); + return Err(err); + } + } + if let Some((off_out, secctx)) = secctx.as_mut() { if let Err(err) = alloc.write(secctx_off, secctx.as_bytes()) { pr_warn!("Failed to write security context: {:?}", err); From patchwork Wed Nov 1 18:01:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160638 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp605599vqx; Wed, 1 Nov 2023 11:04:08 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFD1gFbadoGeAIji91YNzGwEqwNSXk2KAZ83SB/GRxz74HG/D9H1aKH5irEB32lrfCIBEGg X-Received: by 2002:a17:90b:3a85:b0:279:1367:b9a3 with SMTP id om5-20020a17090b3a8500b002791367b9a3mr13331077pjb.4.1698861848494; Wed, 01 Nov 2023 11:04:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861848; cv=none; d=google.com; s=arc-20160816; b=v288dKC3HUfxly565XDmHwh0c44wpLVPkCghLG77o1ldPuET96Kpv0g6FnjFg9LcJ3 5aI0qDB68JVEiVI5yo+qQTTGhuyEncrk1M5qr6TaS4/fuXflBigoII4yLW9YeYAMEi20 Onemzo5RzUMqqK7LwQCLw+uBbpJ3ZKZp90hYi5ZqtweLEsD39aolbaxXxCa1XTHY3eLJ v2ZD/0JIxqG1yPuoPzlLA86bODyohZ6TDyIT5KTHpNObyqsSaejk7vh9lYLZfHN7PjEZ MxnEBNxZ+uEPjA8f5WDoNNuwR3Ge4np6Dix+muyH7u4ZfxHlkMqqBohy7Fr6GlGu9ZX9 40xA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; 
Date: Wed, 01 Nov 2023 18:01:43 +0000
Message-ID: <20231101-rust-binder-v1-13-08ba9197f637@google.com>
Subject: [PATCH RFC 13/20] rust_binder: add BINDER_TYPE_FD support
From: Alice Ryhl

Add support for sending fds over binder. Unlike the other object types, file descriptors are not translated until the transaction is actually received by the recipient. Until that happens, we store `u32::MAX` as the fd.

Translating fds is done in a two-phase process. First, the file descriptors are allocated and written to the allocation. Then, once we have allocated all of them, we commit them to the files in question. Using this strategy, we are able to guarantee that we either send all of the fds, or none of them.
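[Editorial note: the sketch below is not part of the patch. The all-or-nothing guarantee comes from splitting the work into a fallible reservation phase and an infallible commit phase. This toy std-Rust illustration shows only that shape; the Reservation type and the fd numbering are invented, whereas the driver uses FileDescriptorReservation and the recipient's real fd table.]

struct Reservation { fd: u32 }

impl Reservation {
    fn new(next: &mut u32) -> Result<Reservation, &'static str> {
        let fd = *next;
        *next += 1;
        Ok(Reservation { fd })
    }
    // Committing consumes the reservation and cannot fail.
    fn commit(self, name: &str) {
        println!("fd {} now refers to {}", self.fd, name);
    }
}

fn install_all(files: &[&str]) -> Result<(), &'static str> {
    // Phase 1: reserve an fd for every file; any failure here aborts before
    // anything becomes visible to the recipient.
    let mut next_fd = 3;
    let mut reservations = Vec::with_capacity(files.len());
    for f in files {
        reservations.push((Reservation::new(&mut next_fd)?, *f));
    }
    // Phase 2: commit every reservation; since this part cannot fail, the
    // recipient sees either all of the fds or none of them.
    for (res, f) in reservations {
        res.commit(f);
    }
    Ok(())
}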
Co-developed-by: Wedson Almeida Filho Signed-off-by: Wedson Almeida Filho Signed-off-by: Alice Ryhl --- drivers/android/allocation.rs | 71 ++++++++++++++++++++++++++++++++++++++++++ drivers/android/defs.rs | 4 ++- drivers/android/error.rs | 6 ++++ drivers/android/thread.rs | 42 ++++++++++++++++++++++--- drivers/android/transaction.rs | 53 ++++++++++++++++++++++++------- rust/helpers.c | 8 +++++ rust/kernel/file.rs | 2 +- rust/kernel/security.rs | 11 +++++++ 8 files changed, 179 insertions(+), 18 deletions(-) diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs index 32bc268956f2..9d777ffb7176 100644 --- a/drivers/android/allocation.rs +++ b/drivers/android/allocation.rs @@ -4,10 +4,12 @@ use kernel::{ bindings, + file::{File, FileDescriptorReservation}, io_buffer::{IoBufferReader, ReadableFromBytes, WritableToBytes}, pages::Pages, prelude::*, sync::Arc, + types::ARef, user_ptr::UserSlicePtrReader, }; @@ -32,6 +34,8 @@ pub(crate) struct AllocationInfo { pub(crate) oneway_node: Option>, /// Zero the data in the buffer on free. pub(crate) clear_on_free: bool, + /// List of files embedded in this transaction. + file_list: FileList, } /// Represents an allocation that the kernel is currently using. @@ -160,6 +164,38 @@ pub(crate) fn set_info_clear_on_drop(&mut self) { pub(crate) fn set_info_target_node(&mut self, target_node: NodeRef) { self.get_or_init_info().target_node = Some(target_node); } + + pub(crate) fn info_add_fd(&mut self, file: ARef, buffer_offset: usize) -> Result { + self.get_or_init_info() + .file_list + .files_to_translate + .try_push(FileEntry { + file, + buffer_offset, + })?; + + Ok(()) + } + + pub(crate) fn translate_fds(&mut self) -> Result { + let file_list = match self.allocation_info.as_mut() { + Some(info) => &mut info.file_list, + None => return Ok(TranslatedFds::new()), + }; + + let files = core::mem::take(&mut file_list.files_to_translate); + let mut reservations = Vec::try_with_capacity(files.len())?; + for file_info in files { + let res = FileDescriptorReservation::new(bindings::O_CLOEXEC)?; + self.write::(file_info.buffer_offset, &res.reserved_fd())?; + reservations.try_push(Reservation { + res, + file: file_info.file, + })?; + } + + Ok(TranslatedFds { reservations }) + } } impl Drop for Allocation { @@ -417,3 +453,38 @@ fn type_to_size(type_: u32) -> Option { } } } + +#[derive(Default)] +struct FileList { + files_to_translate: Vec, +} + +struct FileEntry { + /// The file for which a descriptor will be created in the recipient process. + file: ARef, + /// The offset in the buffer where the file descriptor is stored. + buffer_offset: usize, +} + +pub(crate) struct TranslatedFds { + reservations: Vec, +} + +struct Reservation { + res: FileDescriptorReservation, + file: ARef, +} + +impl TranslatedFds { + pub(crate) fn new() -> Self { + Self { + reservations: Vec::new(), + } + } + + pub(crate) fn commit(self) { + for entry in self.reservations { + entry.res.commit(entry.file); + } + } +} diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index 267266f3ad76..fa4ec3eff424 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -54,9 +54,10 @@ macro_rules! 
pub_no_prefix { BC_DEAD_BINDER_DONE ); +pub(crate) const FLAT_BINDER_FLAG_ACCEPTS_FDS: u32 = kernel::bindings::FLAT_BINDER_FLAG_ACCEPTS_FDS; pub(crate) const FLAT_BINDER_FLAG_TXN_SECURITY_CTX: u32 = kernel::bindings::FLAT_BINDER_FLAG_TXN_SECURITY_CTX; -pub_no_prefix!(transaction_flags_, TF_ONE_WAY, TF_CLEAR_BUF); +pub_no_prefix!(transaction_flags_, TF_ONE_WAY, TF_ACCEPT_FDS, TF_CLEAR_BUF); pub(crate) use bindings::{ BINDER_TYPE_BINDER, BINDER_TYPE_FD, BINDER_TYPE_FDA, BINDER_TYPE_HANDLE, BINDER_TYPE_PTR, @@ -104,6 +105,7 @@ fn default() -> Self { decl_wrapper!(BinderNodeDebugInfo, bindings::binder_node_debug_info); decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref); decl_wrapper!(FlatBinderObject, bindings::flat_binder_object); +decl_wrapper!(BinderFdObject, bindings::binder_fd_object); decl_wrapper!(BinderObjectHeader, bindings::binder_object_header); decl_wrapper!(BinderBufferObject, bindings::binder_buffer_object); decl_wrapper!(BinderTransactionData, bindings::binder_transaction_data); diff --git a/drivers/android/error.rs b/drivers/android/error.rs index c9b991d133d9..6735636d2a1c 100644 --- a/drivers/android/error.rs +++ b/drivers/android/error.rs @@ -41,6 +41,12 @@ fn from(source: Error) -> Self { } } +impl From for BinderError { + fn from(source: kernel::file::BadFdError) -> Self { + BinderError::from(Error::from(source)) + } +} + impl From for BinderError { fn from(_: core::alloc::AllocError) -> Self { Self { diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index 86bb32bbabd9..56b36dc43bcc 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -601,6 +601,7 @@ fn translate_object( offset: usize, object: BinderObjectRef<'_>, view: &mut AllocationView<'_>, + allow_fds: bool, sg_state: &mut ScatterGatherState, ) -> BinderResult { match object { @@ -629,9 +630,31 @@ fn translate_object( security::binder_transfer_binder(&self.process.cred, &view.alloc.process.cred)?; view.transfer_binder_object(offset, obj, strong, node)?; } - BinderObjectRef::Fd(_obj) => { - pr_warn!("Using unsupported binder object type fd."); - return Err(EINVAL.into()); + BinderObjectRef::Fd(obj) => { + if !allow_fds { + return Err(EPERM.into()); + } + + // SAFETY: `fd` is a `u32`; any bit pattern is a valid representation. + let fd = unsafe { obj.__bindgen_anon_1.fd }; + let file = File::from_fd(fd)?; + security::binder_transfer_file( + &self.process.cred, + &view.alloc.process.cred, + &file, + )?; + + let mut obj_write = BinderFdObject::default(); + obj_write.hdr.type_ = BINDER_TYPE_FD; + // This will be overwritten with the actual fd when the transaction is received. 
+ obj_write.__bindgen_anon_1.fd = u32::MAX; + obj_write.cookie = obj.cookie; + view.write::(offset, &obj_write)?; + + const FD_FIELD_OFFSET: usize = + ::core::mem::offset_of!(bindings::binder_fd_object, __bindgen_anon_1.fd) + as usize; + view.alloc.info_add_fd(file, offset + FD_FIELD_OFFSET)?; } BinderObjectRef::Ptr(obj) => { let obj_length = obj.length.try_into().map_err(|_| EINVAL)?; @@ -773,6 +796,7 @@ pub(crate) fn copy_transaction_data( &self, to_process: Arc, tr: &BinderTransactionDataSg, + allow_fds: bool, txn_security_ctx_offset: Option<&mut usize>, ) -> BinderResult { let trd = &tr.transaction_data; @@ -869,7 +893,14 @@ pub(crate) fn copy_transaction_data( let mut object = BinderObject::read_from(&mut buffer_reader)?; - match self.translate_object(index, offset, object.as_ref(), &mut view, sg_state) { + match self.translate_object( + index, + offset, + object.as_ref(), + &mut view, + allow_fds, + sg_state, + ) { Ok(()) => end_of_previous_object = offset + object.size(), Err(err) => { pr_warn!("Error while translating object."); @@ -1059,7 +1090,8 @@ fn reply_inner(self: &Arc, tr: &BinderTransactionDataSg) -> BinderResult { (|| -> BinderResult<_> { let completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?; let process = orig.from.process.clone(); - let reply = Transaction::new_reply(self, process, tr)?; + let allow_fds = orig.flags & TF_ACCEPT_FDS != 0; + let reply = Transaction::new_reply(self, process, tr, allow_fds)?; self.inner.lock().push_work(completion); orig.from.deliver_reply(Either::Left(reply), &orig); Ok(()) diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs index 2faba6e1f47f..3230ea490a5b 100644 --- a/drivers/android/transaction.rs +++ b/drivers/android/transaction.rs @@ -11,7 +11,7 @@ }; use crate::{ - allocation::Allocation, + allocation::{Allocation, TranslatedFds}, defs::*, error::{BinderError, BinderResult}, node::{Node, NodeRef}, @@ -50,19 +50,24 @@ pub(crate) fn new( tr: &BinderTransactionDataSg, ) -> BinderResult> { let trd = &tr.transaction_data; + let allow_fds = node_ref.node.flags & FLAT_BINDER_FLAG_ACCEPTS_FDS != 0; let txn_security_ctx = node_ref.node.flags & FLAT_BINDER_FLAG_TXN_SECURITY_CTX != 0; let mut txn_security_ctx_off = if txn_security_ctx { Some(0) } else { None }; let to = node_ref.node.owner.clone(); - let mut alloc = - match from.copy_transaction_data(to.clone(), tr, txn_security_ctx_off.as_mut()) { - Ok(alloc) => alloc, - Err(err) => { - if !err.is_dead() { - pr_warn!("Failure in copy_transaction_data: {:?}", err); - } - return Err(err); + let mut alloc = match from.copy_transaction_data( + to.clone(), + tr, + allow_fds, + txn_security_ctx_off.as_mut(), + ) { + Ok(alloc) => alloc, + Err(err) => { + if !err.is_dead() { + pr_warn!("Failure in copy_transaction_data: {:?}", err); } - }; + return Err(err); + } + }; if trd.flags & TF_ONE_WAY != 0 { if stack_next.is_some() { pr_warn!("Oneway transaction should not be in a transaction stack."); @@ -97,9 +102,10 @@ pub(crate) fn new_reply( from: &Arc, to: Arc, tr: &BinderTransactionDataSg, + allow_fds: bool, ) -> BinderResult> { let trd = &tr.transaction_data; - let mut alloc = match from.copy_transaction_data(to.clone(), tr, None) { + let mut alloc = match from.copy_transaction_data(to.clone(), tr, allow_fds, None) { Ok(alloc) => alloc, Err(err) => { pr_warn!("Failure in copy_transaction_data: {:?}", err); @@ -210,6 +216,22 @@ pub(crate) fn submit(self: DLArc) -> BinderResult { } } } + + fn prepare_file_list(&self) -> Result { + let mut alloc = 
self.allocation.lock().take().ok_or(ESRCH)?; + + match alloc.translate_fds() { + Ok(translated) => { + *self.allocation.lock() = Some(alloc); + Ok(translated) + } + Err(err) => { + // Free the allocation eagerly. + drop(alloc); + Err(err) + } + } + } } impl DeliverToRead for Transaction { @@ -220,6 +242,13 @@ fn do_work(self: DArc, thread: &Thread, writer: &mut UserSlicePtrWriter) - self.from.deliver_reply(reply, &self); } }); + let files = if let Ok(list) = self.prepare_file_list() { + list + } else { + // On failure to process the list, we send a reply back to the sender and ignore the + // transaction on the recipient. + return Ok(true); + }; let mut tr_sec = BinderTransactionDataSecctx::default(); let tr = tr_sec.tr_data(); @@ -269,6 +298,8 @@ fn do_work(self: DArc, thread: &Thread, writer: &mut UserSlicePtrWriter) - alloc.keep_alive(); } + files.commit(); + // When this is not a reply and not a oneway transaction, update `current_transaction`. If // it's a reply, `current_transaction` has already been updated appropriately. if self.target_node.is_some() && tr_sec.transaction_data.flags & TF_ONE_WAY == 0 { diff --git a/rust/helpers.c b/rust/helpers.c index 924c7a00f433..be295d8bdb46 100644 --- a/rust/helpers.c +++ b/rust/helpers.c @@ -349,6 +349,14 @@ int rust_helper_security_binder_transfer_binder(const struct cred *from, return security_binder_transfer_binder(from, to); } EXPORT_SYMBOL_GPL(rust_helper_security_binder_transfer_binder); + +int rust_helper_security_binder_transfer_file(const struct cred *from, + const struct cred *to, + struct file *file) +{ + return security_binder_transfer_file(from, to, file); +} +EXPORT_SYMBOL_GPL(rust_helper_security_binder_transfer_file); #endif /* diff --git a/rust/kernel/file.rs b/rust/kernel/file.rs index 2e983285cc16..a0319c93f367 100644 --- a/rust/kernel/file.rs +++ b/rust/kernel/file.rs @@ -107,7 +107,7 @@ pub mod flags { /// Instances of this type are always ref-counted, that is, a call to `get_file` ensures that the /// allocation remains valid at least until the matching call to `fput`. #[repr(transparent)] -pub struct File(Opaque); +pub struct File(pub(crate) Opaque); // SAFETY: By design, the only way to access a `File` is via an immutable reference or an `ARef`. // This means that the only situation in which a `File` can be accessed mutably is when the diff --git a/rust/kernel/security.rs b/rust/kernel/security.rs index 9179fc225406..d308b8183c59 100644 --- a/rust/kernel/security.rs +++ b/rust/kernel/security.rs @@ -8,6 +8,7 @@ bindings, cred::Credential, error::{to_result, Result}, + file::File, }; /// Calls the security modules to determine if the given task can become the manager of a binder @@ -31,6 +32,16 @@ pub fn binder_transfer_binder(from: &Credential, to: &Credential) -> Result { to_result(unsafe { bindings::security_binder_transfer_binder(from.0.get(), to.0.get()) }) } +/// Calls the security modules to determine if task `from` is allowed to send the given file to +/// task `to` (which would get its own file descriptor) through a binder transaction. +pub fn binder_transfer_file(from: &Credential, to: &Credential, file: &File) -> Result { + // SAFETY: `from`, `to` and `file` are valid because the shared references guarantee nonzero + // refcounts. + to_result(unsafe { + bindings::security_binder_transfer_file(from.0.get(), to.0.get(), file.0.get()) + }) +} + /// A security context string. /// /// The struct has the invariant that it always contains a valid security context. 
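For readers unfamiliar with the reserve-then-commit pattern that `translate_fds` relies on, here is a rough, userspace-only sketch of the idea. Everything in it is illustrative: the names, the Vec-based descriptor table and the String stand-in for a file are assumptions made for the example, not the kernel's FileDescriptorReservation API.

// Illustrative userspace model only; the kernel's FileDescriptorReservation
// and per-process file table are stood in for by plain Vecs and Strings.

struct FdReservation {
    fd: u32,
}

struct Translated {
    reservations: Vec<(FdReservation, String)>,
}

impl Translated {
    // Installing the files is deferred until delivery can no longer fail.
    fn commit(self, table: &mut Vec<Option<String>>) {
        for (res, file) in self.reservations {
            table[res.fd as usize] = Some(file);
        }
    }
}

fn translate(
    files: Vec<String>,
    buffer: &mut Vec<u32>,
    table: &mut Vec<Option<String>>,
) -> Translated {
    let mut reservations = Vec::with_capacity(files.len());
    for file in files {
        // Reserve a descriptor number now, but leave the slot empty.
        let fd = table.len() as u32;
        table.push(None);
        // The recipient reads this number out of the transaction buffer.
        buffer.push(fd);
        reservations.push((FdReservation { fd }, file));
    }
    Translated { reservations }
}

fn main() {
    let mut table: Vec<Option<String>> = Vec::new();
    let mut buffer: Vec<u32> = Vec::new();
    let t = translate(vec!["fileA".into(), "fileB".into()], &mut buffer, &mut table);
    // If an error had occurred before this point, dropping `t` would simply
    // leave the reserved slots unused instead of leaking half-installed fds.
    t.commit(&mut table);
    println!("buffer = {:?}, table = {:?}", buffer, table);
}

The useful property is that the descriptor number can be written into the transaction buffer early, while the file is only bound to that number once delivery can no longer fail.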
From patchwork Wed Nov 1 18:01:44 2023
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 160635
Date: Wed, 01 Nov 2023 18:01:44 +0000
In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
References: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
X-Mailer: b4
0.13-dev-26615 Message-ID: <20231101-rust-binder-v1-14-08ba9197f637@google.com> Subject: [PATCH RFC 14/20] rust_binder: add BINDER_TYPE_FDA support From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:03:40 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385756966197432 X-GMAIL-MSGID: 1781385756966197432 In the previous patch, we introduced support for `BINDER_TYPE_FD` objects that let you send a single fd, and in this patch, we add support for FD arrays. One important difference between `BINDER_TYPE_FD` and `BINDER_TYPE_FDA` is that FD arrays will close the file descriptors when the transaction allocation is freed, whereas FDs sent using `BINDER_TYPE_FD` are not closed. Note that `BINDER_TYPE_FDA` is used only with hwbinder. Signed-off-by: Alice Ryhl --- drivers/android/allocation.rs | 74 ++++++++++++++++++++++++++++++++--- drivers/android/defs.rs | 1 + drivers/android/thread.rs | 87 +++++++++++++++++++++++++++++++++++++++--- drivers/android/transaction.rs | 13 ++++--- 4 files changed, 159 insertions(+), 16 deletions(-) diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs index 9d777ffb7176..c7f44a54b79b 100644 --- a/drivers/android/allocation.rs +++ b/drivers/android/allocation.rs @@ -4,7 +4,7 @@ use kernel::{ bindings, - file::{File, FileDescriptorReservation}, + file::{DeferredFdCloser, File, FileDescriptorReservation}, io_buffer::{IoBufferReader, ReadableFromBytes, WritableToBytes}, pages::Pages, prelude::*, @@ -165,18 +165,38 @@ pub(crate) fn set_info_target_node(&mut self, target_node: NodeRef) { self.get_or_init_info().target_node = Some(target_node); } - pub(crate) fn info_add_fd(&mut self, file: ARef, buffer_offset: usize) -> Result { + /// Reserve enough space to push at least `num_fds` fds. 
+ pub(crate) fn info_add_fd_reserve(&mut self, num_fds: usize) -> Result { + self.get_or_init_info() + .file_list + .files_to_translate + .try_reserve(num_fds)?; + + Ok(()) + } + + pub(crate) fn info_add_fd( + &mut self, + file: ARef, + buffer_offset: usize, + close_on_free: bool, + ) -> Result { self.get_or_init_info() .file_list .files_to_translate .try_push(FileEntry { file, buffer_offset, + close_on_free, })?; Ok(()) } + pub(crate) fn set_info_close_on_free(&mut self, cof: FdsCloseOnFree) { + self.get_or_init_info().file_list.close_on_free = cof.0; + } + pub(crate) fn translate_fds(&mut self) -> Result { let file_list = match self.allocation_info.as_mut() { Some(info) => &mut info.file_list, @@ -184,17 +204,38 @@ pub(crate) fn translate_fds(&mut self) -> Result { }; let files = core::mem::take(&mut file_list.files_to_translate); + + let num_close_on_free = files.iter().filter(|entry| entry.close_on_free).count(); + let mut close_on_free = Vec::try_with_capacity(num_close_on_free)?; + let mut reservations = Vec::try_with_capacity(files.len())?; for file_info in files { let res = FileDescriptorReservation::new(bindings::O_CLOEXEC)?; - self.write::(file_info.buffer_offset, &res.reserved_fd())?; + let fd = res.reserved_fd(); + self.write::(file_info.buffer_offset, &fd)?; reservations.try_push(Reservation { res, file: file_info.file, })?; + if file_info.close_on_free { + close_on_free.try_push(fd)?; + } } - Ok(TranslatedFds { reservations }) + Ok(TranslatedFds { + reservations, + close_on_free: FdsCloseOnFree(close_on_free), + }) + } + + /// Should the looper return to userspace when freeing this allocation? + pub(crate) fn looper_need_return_on_free(&self) -> bool { + // Closing fds involves pushing task_work for execution when we return to userspace. Hence, + // we should return to userspace asap if we are closing fds. + match self.allocation_info { + Some(ref info) => !info.file_list.close_on_free.is_empty(), + None => false, + } } } @@ -220,6 +261,18 @@ fn drop(&mut self) { } } + for &fd in &info.file_list.close_on_free { + let closer = match DeferredFdCloser::new() { + Ok(closer) => closer, + Err(core::alloc::AllocError) => { + // Ignore allocation failures. + break; + } + }; + + closer.close_fd(fd); + } + if info.clear_on_free { if let Err(e) = self.fill_zero() { pr_warn!("Failed to clear data on free: {:?}", e); @@ -457,6 +510,7 @@ fn type_to_size(type_: u32) -> Option { #[derive(Default)] struct FileList { files_to_translate: Vec, + close_on_free: Vec, } struct FileEntry { @@ -464,10 +518,15 @@ struct FileEntry { file: ARef, /// The offset in the buffer where the file descriptor is stored. buffer_offset: usize, + /// Whether this fd should be closed when the allocation is freed. + close_on_free: bool, } pub(crate) struct TranslatedFds { reservations: Vec, + /// If commit is called, then these fds should be closed. (If commit is not called, then they + /// shouldn't be closed.) 
+ close_on_free: FdsCloseOnFree, } struct Reservation { @@ -479,12 +538,17 @@ impl TranslatedFds { pub(crate) fn new() -> Self { Self { reservations: Vec::new(), + close_on_free: FdsCloseOnFree(Vec::new()), } } - pub(crate) fn commit(self) { + pub(crate) fn commit(self) -> FdsCloseOnFree { for entry in self.reservations { entry.res.commit(entry.file); } + + self.close_on_free } } + +pub(crate) struct FdsCloseOnFree(Vec); diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index fa4ec3eff424..8f9419d474de 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -106,6 +106,7 @@ fn default() -> Self { decl_wrapper!(BinderNodeInfoForRef, bindings::binder_node_info_for_ref); decl_wrapper!(FlatBinderObject, bindings::flat_binder_object); decl_wrapper!(BinderFdObject, bindings::binder_fd_object); +decl_wrapper!(BinderFdArrayObject, bindings::binder_fd_array_object); decl_wrapper!(BinderObjectHeader, bindings::binder_object_header); decl_wrapper!(BinderBufferObject, bindings::binder_buffer_object); decl_wrapper!(BinderTransactionData, bindings::binder_transaction_data); diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index 56b36dc43bcc..2e86592fb61f 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -654,7 +654,8 @@ fn translate_object( const FD_FIELD_OFFSET: usize = ::core::mem::offset_of!(bindings::binder_fd_object, __bindgen_anon_1.fd) as usize; - view.alloc.info_add_fd(file, offset + FD_FIELD_OFFSET)?; + view.alloc + .info_add_fd(file, offset + FD_FIELD_OFFSET, false)?; } BinderObjectRef::Ptr(obj) => { let obj_length = obj.length.try_into().map_err(|_| EINVAL)?; @@ -729,9 +730,77 @@ fn translate_object( obj_write.parent_offset = obj.parent_offset; view.write::(offset, &obj_write)?; } - BinderObjectRef::Fda(_obj) => { - pr_warn!("Using unsupported binder object type fda."); - return Err(EINVAL.into()); + BinderObjectRef::Fda(obj) => { + if !allow_fds { + return Err(EPERM.into()); + } + let parent_index = usize::try_from(obj.parent).map_err(|_| EINVAL)?; + let parent_offset = usize::try_from(obj.parent_offset).map_err(|_| EINVAL)?; + let num_fds = usize::try_from(obj.num_fds).map_err(|_| EINVAL)?; + let fds_len = num_fds.checked_mul(size_of::()).ok_or(EINVAL)?; + + view.alloc.info_add_fd_reserve(num_fds)?; + + let info = sg_state.validate_parent_fixup(parent_index, parent_offset, fds_len)?; + + sg_state.ancestors.truncate(info.num_ancestors); + let parent_entry = match sg_state.sg_entries.get_mut(info.parent_sg_index) { + Some(parent_entry) => parent_entry, + None => { + pr_err!( + "validate_parent_fixup returned index out of bounds for sg.entries" + ); + return Err(EINVAL.into()); + } + }; + + parent_entry.fixup_min_offset = info.new_min_offset; + parent_entry + .pointer_fixups + .try_push(PointerFixupEntry { + skip: fds_len, + pointer_value: 0, + target_offset: info.target_offset, + }) + .map_err(|_| ENOMEM)?; + + let fda_uaddr = parent_entry + .sender_uaddr + .checked_add(parent_offset) + .ok_or(EINVAL)?; + let fda_bytes = UserSlicePtr::new(fda_uaddr as _, fds_len).read_all()?; + + if fds_len != fda_bytes.len() { + pr_err!("UserSlicePtr::read_all returned wrong length in BINDER_TYPE_FDA"); + return Err(EINVAL.into()); + } + + for i in (0..fds_len).step_by(size_of::()) { + let fd = { + let mut fd_bytes = [0u8; size_of::()]; + fd_bytes.copy_from_slice(&fda_bytes[i..i + size_of::()]); + u32::from_ne_bytes(fd_bytes) + }; + + let file = File::from_fd(fd)?; + security::binder_transfer_file( + &self.process.cred, + 
&view.alloc.process.cred, &file, )?; + + // The `validate_parent_fixup` call ensures that this addition will not + // overflow. + view.alloc.info_add_fd(file, info.target_offset + i, true)?; + } + drop(fda_bytes); + + let mut obj_write = BinderFdArrayObject::default(); + obj_write.hdr.type_ = BINDER_TYPE_FDA; + obj_write.num_fds = obj.num_fds; + obj_write.parent = obj.parent; + obj_write.parent_offset = obj.parent_offset; + view.write::(offset, &obj_write)?; } } Ok(()) @@ -1160,7 +1229,15 @@ fn write(self: &Arc, req: &mut BinderWriteRead) -> Result { let tr = reader.read::()?; self.transaction(&tr, Self::reply_inner) } - BC_FREE_BUFFER => drop(self.process.buffer_get(reader.read()?)), + BC_FREE_BUFFER => { + let buffer = self.process.buffer_get(reader.read()?); + if let Some(buffer) = &buffer { + if buffer.looper_need_return_on_free() { + self.inner.lock().looper_need_return = true; + } + } + drop(buffer); + } BC_INCREFS => self.process.update_ref(reader.read()?, true, false)?, BC_ACQUIRE => self.process.update_ref(reader.read()?, true, true)?, BC_RELEASE => self.process.update_ref(reader.read()?, false, true)?, diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs index 3230ea490a5b..ec32a9fd0ff1 100644 --- a/drivers/android/transaction.rs +++ b/drivers/android/transaction.rs @@ -288,17 +288,18 @@ fn do_work(self: DArc, thread: &Thread, writer: &mut UserSlicePtrWriter) - writer.write(&*tr)?; } + let mut alloc = self.allocation.lock().take().ok_or(ESRCH)?; + // Dismiss the completion of transaction with a failure. No failure paths are allowed from // here on out. send_failed_reply.dismiss(); - // It is now the user's responsibility to clear the allocation. - let alloc = self.allocation.lock().take(); - if let Some(alloc) = alloc { - alloc.keep_alive(); - } + // Commit files, and set FDs in FDA to be closed on buffer free. + let close_on_free = files.commit(); + alloc.set_info_close_on_free(close_on_free); - files.commit(); + // It is now the user's responsibility to clear the allocation. + alloc.keep_alive(); // When this is not a reply and not a oneway transaction, update `current_transaction`. If // it's a reply, `current_transaction` has already been updated appropriately.
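As a side note, the BINDER_TYPE_FDA loop above decodes the sender's fd array by reading it as raw bytes and splitting it into native-endian u32 values. The standalone sketch below mirrors that decoding step against an in-memory buffer; the function name and the error type are illustrative, and the driver instead reads through a userspace pointer with bounds established by validate_parent_fixup.

use core::mem::size_of;

// Reject arrays whose byte length is not a whole number of fds, then split
// the raw bytes into native-endian u32 values, one per fd.
fn decode_fd_array(fda_bytes: &[u8]) -> Result<Vec<u32>, &'static str> {
    if fda_bytes.len() % size_of::<u32>() != 0 {
        return Err("fd array length is not a multiple of 4");
    }
    let mut fds = Vec::with_capacity(fda_bytes.len() / size_of::<u32>());
    for chunk in fda_bytes.chunks_exact(size_of::<u32>()) {
        let mut bytes = [0u8; size_of::<u32>()];
        bytes.copy_from_slice(chunk);
        fds.push(u32::from_ne_bytes(bytes));
    }
    Ok(fds)
}

fn main() {
    // Two fds (3 and 7) laid out the way userspace would place them in the
    // parent buffer object.
    let mut raw = Vec::new();
    raw.extend_from_slice(&3u32.to_ne_bytes());
    raw.extend_from_slice(&7u32.to_ne_bytes());
    assert_eq!(decode_fd_array(&raw).unwrap(), vec![3, 7]);
}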
From patchwork Wed Nov 1 18:01:45 2023
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 160631
Date: Wed, 01 Nov 2023 18:01:45 +0000
In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
References: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
X-Mailer: b4 0.13-dev-26615
Message-ID: <20231101-rust-binder-v1-15-08ba9197f637@google.com> Subject: [PATCH RFC 15/20] rust_binder: add process freezing From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:03:46 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385739163830171 X-GMAIL-MSGID: 1781385739163830171 When you want to freeze a process, you should process all incoming transactions before you freeze it. This patch helps with that. The idea is that before you freeze the process, you mark it as frozen in the binder driver. When this happens, all new incoming transactions are rejected, which lets you empty the queue of incoming transactions that were sent before you decided to freeze the process. Once you have processed every transaction in that queue, you can perform the actual freeze operation. Signed-off-by: Alice Ryhl --- drivers/android/context.rs | 39 +++++++++++ drivers/android/defs.rs | 3 + drivers/android/error.rs | 8 +++ drivers/android/process.rs | 155 ++++++++++++++++++++++++++++++++++++++++- drivers/android/thread.rs | 12 +++- drivers/android/transaction.rs | 50 ++++++++++++- rust/kernel/sync/condvar.rs | 10 +++ 7 files changed, 272 insertions(+), 5 deletions(-) diff --git a/drivers/android/context.rs b/drivers/android/context.rs index b5de9d98a6b0..925c368238db 100644 --- a/drivers/android/context.rs +++ b/drivers/android/context.rs @@ -69,6 +69,18 @@ pub(crate) struct ContextList { list: List, } +pub(crate) fn get_all_contexts() -> Result>> { + let lock = CONTEXTS.lock(); + + let count = lock.list.iter().count(); + + let mut ctxs = Vec::try_with_capacity(count)?; + for ctx in &lock.list { + ctxs.try_push(Arc::from(ctx))?; + } + Ok(ctxs) +} + /// This struct keeps track of the processes using this context, and which process is the context /// manager. 
struct Manager { @@ -183,4 +195,31 @@ pub(crate) fn get_manager_node(&self, strong: bool) -> Result(&self, mut func: F) + where + F: FnMut(&Process), + { + let lock = self.manager.lock(); + for proc in &lock.all_procs { + func(&proc); + } + } + + pub(crate) fn get_all_procs(&self) -> Result>> { + let lock = self.manager.lock(); + let count = lock.all_procs.iter().count(); + + let mut procs = Vec::try_with_capacity(count)?; + for proc in &lock.all_procs { + procs.try_push(Arc::from(proc))?; + } + Ok(procs) + } + + pub(crate) fn get_procs_with_pid(&self, pid: i32) -> Result>> { + let mut procs = self.get_all_procs()?; + procs.retain(|proc| proc.task.pid() == pid); + Ok(procs) + } } diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index 8f9419d474de..30659bd26bff 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -20,6 +20,7 @@ macro_rules! pub_no_prefix { BR_REPLY, BR_DEAD_REPLY, BR_FAILED_REPLY, + BR_FROZEN_REPLY, BR_NOOP, BR_SPAWN_LOOPER, BR_TRANSACTION_COMPLETE, @@ -120,6 +121,8 @@ fn default() -> Self { ); decl_wrapper!(BinderWriteRead, bindings::binder_write_read); decl_wrapper!(BinderVersion, bindings::binder_version); +decl_wrapper!(BinderFrozenStatusInfo, bindings::binder_frozen_status_info); +decl_wrapper!(BinderFreezeInfo, bindings::binder_freeze_info); decl_wrapper!(ExtendedError, bindings::binder_extended_error); impl BinderVersion { diff --git a/drivers/android/error.rs b/drivers/android/error.rs index 6735636d2a1c..5cc724931bd3 100644 --- a/drivers/android/error.rs +++ b/drivers/android/error.rs @@ -21,6 +21,13 @@ pub(crate) fn new_dead() -> Self { } } + pub(crate) fn new_frozen() -> Self { + Self { + reply: BR_FROZEN_REPLY, + source: None, + } + } + pub(crate) fn is_dead(&self) -> bool { self.reply == BR_DEAD_REPLY } @@ -76,6 +83,7 @@ fn fmt(&self, f: &mut core::fmt::Formatter<'_>) -> core::fmt::Result { None => f.pad("BR_FAILED_REPLY"), }, BR_DEAD_REPLY => f.pad("BR_DEAD_REPLY"), + BR_FROZEN_REPLY => f.pad("BR_FROZEN_REPLY"), BR_TRANSACTION_COMPLETE => f.pad("BR_TRANSACTION_COMPLETE"), _ => f .debug_struct("BinderError") diff --git a/drivers/android/process.rs b/drivers/android/process.rs index 944297b7403c..44baf9e3f998 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -20,7 +20,9 @@ pages::Pages, prelude::*, rbtree::RBTree, - sync::{lock::Guard, Arc, ArcBorrow, Mutex, SpinLock, UniqueArc}, + sync::{ + lock::Guard, Arc, ArcBorrow, CondVar, CondVarTimeoutResult, Mutex, SpinLock, UniqueArc, + }, task::Task, types::{ARef, Either}, user_ptr::{UserSlicePtr, UserSlicePtrReader}, @@ -80,6 +82,16 @@ pub(crate) struct ProcessInner { /// Bitmap of deferred work to do. defer_work: u8, + + /// Number of transactions to be transmitted before processes in freeze_wait + /// are woken up. + outstanding_txns: u32, + /// Process is frozen and unable to service binder transactions. + pub(crate) is_frozen: bool, + /// Process received sync transactions since last frozen. + pub(crate) sync_recv: bool, + /// Process received async transactions since last frozen. 
+ pub(crate) async_recv: bool, } impl ProcessInner { @@ -97,6 +109,10 @@ fn new() -> Self { max_threads: 0, started_thread_count: 0, defer_work: 0, + outstanding_txns: 0, + is_frozen: false, + sync_recv: false, + async_recv: false, } } @@ -248,6 +264,22 @@ pub(crate) fn death_delivered(&mut self, death: DArc) { pr_warn!("Notification added to `delivered_deaths` twice."); } } + + pub(crate) fn add_outstanding_txn(&mut self) { + self.outstanding_txns += 1; + } + + fn txns_pending_locked(&self) -> bool { + if self.outstanding_txns > 0 { + return true; + } + for thread in self.threads.values() { + if thread.has_current_transaction() { + return true; + } + } + false + } } struct NodeRefInfo { @@ -296,6 +328,11 @@ pub(crate) struct Process { #[pin] pub(crate) inner: SpinLock, + // Waitqueue of processes waiting for all outstanding transactions to be + // processed. + #[pin] + freeze_wait: CondVar, + // Node references are in a different lock to avoid recursive acquisition when // incrementing/decrementing a node in another process. #[pin] @@ -353,6 +390,7 @@ fn new(ctx: Arc, cred: ARef) -> Result> { cred, inner <- kernel::new_spinlock!(ProcessInner::new(), "Process::inner"), node_refs <- kernel::new_mutex!(ProcessNodeRefs::new(), "Process::node_refs"), + freeze_wait <- kernel::new_condvar!("Process::freeze_wait"), task: kernel::current!().group_leader().into(), defer_work <- kernel::new_work!("Process::defer_work"), links <- ListLinks::new(), @@ -878,6 +916,9 @@ fn deferred_release(self: Arc) { let is_manager = { let mut inner = self.inner.lock(); inner.is_dead = true; + inner.is_frozen = false; + inner.sync_recv = false; + inner.async_recv = false; inner.is_manager }; @@ -975,6 +1016,116 @@ pub(crate) fn flush(this: ArcBorrow<'_, Process>) -> Result { } Ok(()) } + + pub(crate) fn drop_outstanding_txn(&self) { + let wake = { + let mut inner = self.inner.lock(); + if inner.outstanding_txns == 0 { + pr_err!("outstanding_txns underflow"); + return; + } + inner.outstanding_txns -= 1; + inner.is_frozen && inner.outstanding_txns == 0 + }; + + if wake { + self.freeze_wait.notify_all(); + } + } + + pub(crate) fn ioctl_freeze(&self, info: &BinderFreezeInfo) -> Result { + if info.enable != 0 { + let mut inner = self.inner.lock(); + inner.sync_recv = false; + inner.async_recv = false; + inner.is_frozen = false; + return Ok(()); + } + + let mut inner = self.inner.lock(); + inner.sync_recv = false; + inner.async_recv = false; + inner.is_frozen = true; + + if info.timeout_ms > 0 { + // Safety: Just an FFI call. + let mut jiffies = unsafe { bindings::__msecs_to_jiffies(info.timeout_ms) }; + while jiffies > 0 { + if inner.outstanding_txns == 0 { + break; + } + + match self.freeze_wait.wait_timeout(&mut inner, jiffies) { + CondVarTimeoutResult::Signal { .. } => { + inner.is_frozen = false; + return Err(ERESTARTSYS); + } + CondVarTimeoutResult::Woken { jiffies: remaining } => { + jiffies = remaining; + } + CondVarTimeoutResult::Timeout => { + jiffies = 0; + } + } + } + } + + if inner.txns_pending_locked() { + inner.is_frozen = false; + Err(EAGAIN) + } else { + Ok(()) + } + } +} + +fn get_frozen_status(data: UserSlicePtr) -> Result { + let (mut reader, mut writer) = data.reader_writer(); + + let mut info = reader.read::()?; + info.sync_recv = 0; + info.async_recv = 0; + let mut found = false; + + for ctx in crate::context::get_all_contexts()? 
{ + ctx.for_each_proc(|proc| { + if proc.task.pid() == info.pid as _ { + found = true; + let inner = proc.inner.lock(); + let txns_pending = inner.txns_pending_locked(); + info.async_recv |= inner.async_recv as u32; + info.sync_recv |= inner.sync_recv as u32; + info.sync_recv |= (txns_pending as u32) << 1; + } + }); + } + + if found { + writer.write(&info)?; + Ok(()) + } else { + Err(EINVAL) + } +} + +fn ioctl_freeze(reader: &mut UserSlicePtrReader) -> Result { + let info = reader.read::()?; + + // Very unlikely for there to be more than 3, since a process normally uses at most binder and + // hwbinder. + let mut procs = Vec::try_with_capacity(3)?; + + let ctxs = crate::context::get_all_contexts()?; + for ctx in ctxs { + for proc in ctx.get_procs_with_pid(info.pid as i32)? { + procs.try_push(proc)?; + } + } + + for proc in procs { + proc.ioctl_freeze(&info)?; + } + Ok(()) } /// The ioctl handler. @@ -993,6 +1144,7 @@ fn write( bindings::BINDER_SET_CONTEXT_MGR_EXT => { this.set_as_manager(Some(reader.read()?), &thread)? } + bindings::BINDER_FREEZE => ioctl_freeze(reader)?, _ => return Err(EINVAL), } Ok(0) @@ -1011,6 +1163,7 @@ fn read_write( bindings::BINDER_GET_NODE_DEBUG_INFO => this.get_node_debug_info(data)?, bindings::BINDER_GET_NODE_INFO_FOR_REF => this.get_node_info_from_ref(data)?, bindings::BINDER_VERSION => this.version(data)?, + bindings::BINDER_GET_FROZEN_INFO => get_frozen_status(data)?, bindings::BINDER_GET_EXTENDED_ERROR => thread.get_extended_error(data)?, _ => return Err(EINVAL), } diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index 2e86592fb61f..0238c15604f6 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -458,6 +458,10 @@ pub(crate) fn set_current_transaction(&self, transaction: DArc) { self.inner.lock().current_transaction = Some(transaction); } + pub(crate) fn has_current_transaction(&self) -> bool { + self.inner.lock().current_transaction.is_some() + } + /// Attempts to fetch a work item from the thread-local queue. The behaviour if the queue is /// empty depends on `wait`: if it is true, the function waits for some work to be queued (or a /// signal); otherwise it returns indicating that none is available. 
@@ -482,7 +486,7 @@ fn get_work_local(self: &Arc, wait: bool) -> Result, wait: bool) -> Result, u32>, transaction: &DArc, ) -> bool { + if let Either::Left(transaction) = &reply { + transaction.set_outstanding(&mut self.process.inner.lock()); + } + { let mut inner = self.inner.lock(); if !inner.pop_transaction_replied(transaction) { diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs index ec32a9fd0ff1..96f63684b1a3 100644 --- a/drivers/android/transaction.rs +++ b/drivers/android/transaction.rs @@ -1,5 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 +use core::sync::atomic::{AtomicBool, Ordering}; use kernel::{ io_buffer::IoBufferWriter, list::ListArcSafe, @@ -15,13 +16,13 @@ defs::*, error::{BinderError, BinderResult}, node::{Node, NodeRef}, - process::Process, + process::{Process, ProcessInner}, ptr_align, thread::{PushWorkRes, Thread}, DArc, DLArc, DTRWrap, DeliverToRead, }; -#[pin_data] +#[pin_data(PinnedDrop)] pub(crate) struct Transaction { target_node: Option>, stack_next: Option>, @@ -29,6 +30,7 @@ pub(crate) struct Transaction { to: Arc, #[pin] allocation: SpinLock>, + is_outstanding: AtomicBool, code: u32, pub(crate) flags: u32, data_size: usize, @@ -94,6 +96,7 @@ pub(crate) fn new( offsets_size: trd.offsets_size as _, data_address, allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"), + is_outstanding: AtomicBool::new(false), txn_security_ctx_off, }))?) } @@ -127,6 +130,7 @@ pub(crate) fn new_reply( offsets_size: trd.offsets_size as _, data_address: alloc.ptr, allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"), + is_outstanding: AtomicBool::new(false), txn_security_ctx_off: None, }))?) } @@ -172,6 +176,26 @@ pub(crate) fn find_from(&self, thread: &Thread) -> Option> { None } + pub(crate) fn set_outstanding(&self, to_process: &mut ProcessInner) { + // No race because this method is only called once. + if !self.is_outstanding.load(Ordering::Relaxed) { + self.is_outstanding.store(true, Ordering::Relaxed); + to_process.add_outstanding_txn(); + } + } + + /// Decrement `outstanding_txns` in `to` if it hasn't already been decremented. + fn drop_outstanding_txn(&self) { + // No race because this is called at most twice, and one of the calls are in the + // destructor, which is guaranteed to not race with any other operations on the + // transaction. It also cannot race with `set_outstanding`, since submission happens + // before delivery. + if self.is_outstanding.load(Ordering::Relaxed) { + self.is_outstanding.store(false, Ordering::Relaxed); + self.to.drop_outstanding_txn(); + } + } + /// Submits the transaction to a work queue. Uses a thread if there is one in the transaction /// stack, otherwise uses the destination process. 
/// @@ -181,8 +205,13 @@ pub(crate) fn submit(self: DLArc) -> BinderResult { let process = self.to.clone(); let mut process_inner = process.inner.lock(); + self.set_outstanding(&mut process_inner); + if oneway { if let Some(target_node) = self.target_node.clone() { + if process_inner.is_frozen { + process_inner.async_recv = true; + } match target_node.submit_oneway(self, &mut process_inner) { Ok(()) => return Ok(()), Err((err, work)) => { @@ -197,6 +226,11 @@ pub(crate) fn submit(self: DLArc) -> BinderResult { } } + if process_inner.is_frozen { + process_inner.sync_recv = true; + return Err(BinderError::new_frozen()); + } + let res = if let Some(thread) = self.find_target_thread() { match thread.push_work(self) { PushWorkRes::Ok => Ok(()), @@ -241,6 +275,7 @@ fn do_work(self: DArc, thread: &Thread, writer: &mut UserSlicePtrWriter) - let reply = Either::Right(BR_FAILED_REPLY); self.from.deliver_reply(reply, &self); } + self.drop_outstanding_txn(); }); let files = if let Ok(list) = self.prepare_file_list() { list @@ -301,6 +336,8 @@ fn do_work(self: DArc, thread: &Thread, writer: &mut UserSlicePtrWriter) - // It is now the user's responsibility to clear the allocation. alloc.keep_alive(); + self.drop_outstanding_txn(); + // When this is not a reply and not a oneway transaction, update `current_transaction`. If // it's a reply, `current_transaction` has already been updated appropriately. if self.target_node.is_some() && tr_sec.transaction_data.flags & TF_ONE_WAY == 0 { @@ -318,9 +355,18 @@ fn cancel(self: DArc) { let reply = Either::Right(BR_DEAD_REPLY); self.from.deliver_reply(reply, &self); } + + self.drop_outstanding_txn(); } fn should_sync_wakeup(&self) -> bool { self.flags & TF_ONE_WAY == 0 } } + +#[pinned_drop] +impl PinnedDrop for Transaction { + fn drop(self: Pin<&mut Self>) { + self.drop_outstanding_txn(); + } +} diff --git a/rust/kernel/sync/condvar.rs b/rust/kernel/sync/condvar.rs index 07cf6ba2e757..490fdf378e42 100644 --- a/rust/kernel/sync/condvar.rs +++ b/rust/kernel/sync/condvar.rs @@ -191,6 +191,16 @@ pub fn wait(&self, guard: &mut Guard<'_, T, B>) -> bool { crate::current!().signal_pending() } + /// Releases the lock and waits for a notification in interruptible and freezable mode. + #[must_use = "wait returns if a signal is pending, so the caller must check the return value"] + pub fn wait_freezable(&self, guard: &mut Guard<'_, T, B>) -> bool { + self.wait_internal( + bindings::TASK_INTERRUPTIBLE | bindings::TASK_FREEZABLE, + guard, + ); + crate::current!().signal_pending() + } + /// Releases the lock and waits for a notification in uninterruptible mode. /// /// Similar to [`CondVar::wait`], except that the wait is not interruptible. 
That is, the

From patchwork Wed Nov 1 18:01:46 2023
X-Patchwork-Submitter: Alice Ryhl
X-Patchwork-Id: 160642
Date: Wed, 01 Nov 2023 18:01:46 +0000
In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
References: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
X-Mailer: b4
0.13-dev-26615 Message-ID: <20231101-rust-binder-v1-16-08ba9197f637@google.com> Subject: [PATCH RFC 16/20] rust_binder: add TF_UPDATE_TXN support From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-8.4 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE, URIBL_BLOCKED,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lipwig.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (lipwig.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:04:27 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385795857790003 X-GMAIL-MSGID: 1781385795857790003 When a process is frozen, incoming oneway transactions are held in a queue until the process is unfrozen. If many oneway transactions are sent, then the process could run out of space for them. This patch adds a flag that avoids this by replacing previous oneway transactions in the queue to avoid having transactions of the same type build up. This can be useful when only the most recent transaction is necessary. Signed-off-by: Alice Ryhl --- drivers/android/defs.rs | 8 +++++++- drivers/android/node.rs | 19 +++++++++++++++++++ drivers/android/transaction.rs | 26 ++++++++++++++++++++++++++ 3 files changed, 52 insertions(+), 1 deletion(-) diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index 30659bd26bff..b1a54f85b365 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -58,7 +58,13 @@ macro_rules! pub_no_prefix { pub(crate) const FLAT_BINDER_FLAG_ACCEPTS_FDS: u32 = kernel::bindings::FLAT_BINDER_FLAG_ACCEPTS_FDS; pub(crate) const FLAT_BINDER_FLAG_TXN_SECURITY_CTX: u32 = kernel::bindings::FLAT_BINDER_FLAG_TXN_SECURITY_CTX; -pub_no_prefix!(transaction_flags_, TF_ONE_WAY, TF_ACCEPT_FDS, TF_CLEAR_BUF); +pub_no_prefix!( + transaction_flags_, + TF_ONE_WAY, + TF_ACCEPT_FDS, + TF_CLEAR_BUF, + TF_UPDATE_TXN +); pub(crate) use bindings::{ BINDER_TYPE_BINDER, BINDER_TYPE_FD, BINDER_TYPE_FDA, BINDER_TYPE_HANDLE, BINDER_TYPE_PTR, diff --git a/drivers/android/node.rs b/drivers/android/node.rs index 7ed494bf9f7c..2c056bd7582e 100644 --- a/drivers/android/node.rs +++ b/drivers/android/node.rs @@ -298,6 +298,25 @@ pub(crate) fn pending_oneway_finished(&self) { } } } + + /// Finds an outdated transaction that the given transaction can replace. + /// + /// If one is found, it is removed from the list and returned. 
+ pub(crate) fn take_outdated_transaction( + &self, + new: &Transaction, + guard: &mut Guard<'_, ProcessInner, SpinLockBackend>, + ) -> Option> { + let inner = self.inner.access_mut(guard); + let mut cursor_opt = inner.oneway_todo.cursor_front(); + while let Some(cursor) = cursor_opt { + if new.can_replace(&cursor.current()) { + return Some(cursor.remove()); + } + cursor_opt = cursor.next(); + } + None + } } impl DeliverToRead for Node { diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs index 96f63684b1a3..7028c504ef8c 100644 --- a/drivers/android/transaction.rs +++ b/drivers/android/transaction.rs @@ -201,6 +201,9 @@ fn drop_outstanding_txn(&self) { /// /// Not used for replies. pub(crate) fn submit(self: DLArc) -> BinderResult { + // Defined before `process_inner` so that the destructor runs after releasing the lock. + let mut _t_outdated = None; + let oneway = self.flags & TF_ONE_WAY != 0; let process = self.to.clone(); let mut process_inner = process.inner.lock(); @@ -211,6 +214,10 @@ pub(crate) fn submit(self: DLArc) -> BinderResult { if let Some(target_node) = self.target_node.clone() { if process_inner.is_frozen { process_inner.async_recv = true; + if self.flags & TF_UPDATE_TXN != 0 { + _t_outdated = + target_node.take_outdated_transaction(&self, &mut process_inner); + } } match target_node.submit_oneway(self, &mut process_inner) { Ok(()) => return Ok(()), @@ -251,6 +258,25 @@ pub(crate) fn submit(self: DLArc) -> BinderResult { } } + /// Check whether one oneway transaction can supersede another. + pub(crate) fn can_replace(&self, old: &Transaction) -> bool { + if self.from.process.task.pid() != old.from.process.task.pid() { + return false; + } + + if self.flags & old.flags & (TF_ONE_WAY | TF_UPDATE_TXN) != (TF_ONE_WAY | TF_UPDATE_TXN) { + return false; + } + + let target_node_match = match (self.target_node.as_ref(), old.target_node.as_ref()) { + (None, None) => true, + (Some(tn1), Some(tn2)) => Arc::ptr_eq(tn1, tn2), + _ => false, + }; + + self.code == old.code && self.flags == old.flags && target_node_match + } + fn prepare_file_list(&self) -> Result { let mut alloc = self.allocation.lock().take().ok_or(ESRCH)?; From patchwork Wed Nov 1 18:01:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160644 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp606067vqx; Wed, 1 Nov 2023 11:04:44 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFxKHUGG1QNycAKs/SvZoBvQuqjlLzdK3rfJ4UX5k4HpHL51ZC6zoJnndzmaY4nuHotIbgy X-Received: by 2002:a05:6358:6f09:b0:168:e2d5:ddcc with SMTP id r9-20020a0563586f0900b00168e2d5ddccmr21007758rwn.7.1698861884072; Wed, 01 Nov 2023 11:04:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861884; cv=none; d=google.com; s=arc-20160816; b=d2D9OmJjPRnfUqXvUOTWzxBu3CYONo8u7iCRSHkWM8y6mwA6z9WNOEJpI8wF1IUJCZ X3fX2lymlRqvxrbOexLcHIxr3mbEobGGUJDz3EfMpr3kQYbix8G+X29ve6Kq8K/zh/gU d7gC35kiKqd1C+Ge8RRVXJlLVH8NVzwQi/y0uYWqVCQagqMsGh/jzpsdyHJjOo4upGe+ RwXD0dpLTwHf3gQuVsuPJmwbboScqRFNoLCvfL/mtUPD52n8MhqMXwz8E5bGo+VL22KA FLzVGMUFjqhE2QoWIbMPZY6EjLFC/1/1zgFWZf1XHmMoA86apT0GiKjckmjzt/u99I6Z GWYw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=nYW75oSrNQi79Zio08wfQNCgl+W4tz8r9dl1ZKyv6O4=; fh=cHdDrFPTfwdP0/Ip9jHI/T24Yd8xIIOhbocUOLU1mtg=; 
Date: Wed, 01 Nov 2023 18:01:47 +0000
In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
Message-ID: <20231101-rust-binder-v1-17-08ba9197f637@google.com>
Subject: [PATCH RFC 17/20] rust_binder: add oneway spam detection
From: Alice Ryhl

From: Matt Gilbride

The idea is that once we cross a certain threshold of free async space, whoever is responsible for the low async space is likely to try to send another async transaction. This change allows servers to turn on oneway spam detection and return a different binder reply when it is detected.
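Concretely, a server opts in with the BINDER_ENABLE_ONEWAY_SPAM_DETECTION ioctl, and when the heuristic fires the sender receives BR_ONEWAY_SPAM_SUSPECT instead of BR_TRANSACTION_COMPLETE (see the defs.rs, process.rs and thread.rs hunks below). As a rough, self-contained sketch of the two thresholds added to range_alloc.rs (illustrative Rust with made-up names, not the driver's types), the heuristic amounts to:

/// Sketch of the oneway spam heuristic (illustrative only; the real driver
/// keeps this state inside its range allocator).
struct OnewaySpace {
    /// Total size of the mmap'ed transaction buffer.
    total: usize,
    /// Remaining free async (oneway) space; it starts out at `total / 2`.
    free_oneway: usize,
}

impl OnewaySpace {
    /// Returns true if the calling pid should be flagged as a oneway-spam
    /// suspect after reserving `size` bytes for a oneway transaction.
    fn suspect(&self, pid_buffers: usize, pid_bytes: usize, size: usize) -> bool {
        let new_free = self.free_oneway.saturating_sub(size);
        // Only start looking once less than 20% of async space is left,
        // which is less than 10% of the total buffer size.
        let low_space = new_free < self.total / 10;
        // Flag the caller if it already holds more than 50 oneway buffers,
        // or more than 50% of the async space (25% of the total buffer).
        low_space && (pid_buffers > 50 || pid_bytes > self.total / 4)
    }
}

The real patch only walks the allocation tree to compute the per-pid totals once the low-space condition is met, which is why it can avoid keeping a per-pid map.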
Signed-off-by: Matt Gilbride Co-developed-by: Alice Ryhl Signed-off-by: Alice Ryhl --- drivers/android/allocation.rs | 3 ++ drivers/android/defs.rs | 1 + drivers/android/process.rs | 39 ++++++++++++++++++++++++-- drivers/android/range_alloc.rs | 62 ++++++++++++++++++++++++++++++++++++++++-- drivers/android/rust_binder.rs | 1 - drivers/android/thread.rs | 11 ++++++-- drivers/android/transaction.rs | 5 ++++ rust/kernel/task.rs | 2 +- 8 files changed, 115 insertions(+), 9 deletions(-) diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs index c7f44a54b79b..7b64e7fcce4d 100644 --- a/drivers/android/allocation.rs +++ b/drivers/android/allocation.rs @@ -49,6 +49,7 @@ pub(crate) struct Allocation { pub(crate) process: Arc, allocation_info: Option, free_on_drop: bool, + pub(crate) oneway_spam_detected: bool, } impl Allocation { @@ -58,6 +59,7 @@ pub(crate) fn new( size: usize, ptr: usize, pages: Arc>>, + oneway_spam_detected: bool, ) -> Self { Self { process, @@ -65,6 +67,7 @@ pub(crate) fn new( size, ptr, pages, + oneway_spam_detected, allocation_info: None, free_on_drop: true, } diff --git a/drivers/android/defs.rs b/drivers/android/defs.rs index b1a54f85b365..e345b6ea45cc 100644 --- a/drivers/android/defs.rs +++ b/drivers/android/defs.rs @@ -24,6 +24,7 @@ macro_rules! pub_no_prefix { BR_NOOP, BR_SPAWN_LOOPER, BR_TRANSACTION_COMPLETE, + BR_ONEWAY_SPAM_SUSPECT, BR_OK, BR_ERROR, BR_INCREFS, diff --git a/drivers/android/process.rs b/drivers/android/process.rs index 44baf9e3f998..4ac5d09041a4 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -92,6 +92,8 @@ pub(crate) struct ProcessInner { pub(crate) sync_recv: bool, /// Process received async transactions since last frozen. pub(crate) async_recv: bool, + /// Check for oneway spam + oneway_spam_detection_enabled: bool, } impl ProcessInner { @@ -113,6 +115,7 @@ fn new() -> Self { is_frozen: false, sync_recv: false, async_recv: false, + oneway_spam_detection_enabled: false, } } @@ -658,17 +661,21 @@ pub(crate) fn buffer_alloc( self: &Arc, size: usize, is_oneway: bool, + from_pid: i32, ) -> BinderResult { let alloc = range_alloc::ReserveNewBox::try_new()?; let mut inner = self.inner.lock(); let mapping = inner.mapping.as_mut().ok_or_else(BinderError::new_dead)?; - let offset = mapping.alloc.reserve_new(size, is_oneway, alloc)?; + let offset = mapping + .alloc + .reserve_new(size, is_oneway, from_pid, alloc)?; Ok(Allocation::new( self.clone(), offset, size, mapping.address + offset, mapping.pages.clone(), + mapping.alloc.oneway_spam_detected, )) } @@ -677,7 +684,14 @@ pub(crate) fn buffer_get(self: &Arc, ptr: usize) -> Option { let mapping = inner.mapping.as_mut()?; let offset = ptr.checked_sub(mapping.address)?; let (size, odata) = mapping.alloc.reserve_existing(offset).ok()?; - let mut alloc = Allocation::new(self.clone(), offset, size, ptr, mapping.pages.clone()); + let mut alloc = Allocation::new( + self.clone(), + offset, + size, + ptr, + mapping.pages.clone(), + mapping.alloc.oneway_spam_detected, + ); if let Some(data) = odata { alloc.set_info(data); } @@ -762,6 +776,14 @@ fn set_max_threads(&self, max: u32) { self.inner.lock().max_threads = max; } + fn set_oneway_spam_detection_enabled(&self, enabled: u32) { + self.inner.lock().oneway_spam_detection_enabled = enabled != 0; + } + + pub(crate) fn is_oneway_spam_detection_enabled(&self) -> bool { + self.inner.lock().oneway_spam_detection_enabled + } + fn get_node_debug_info(&self, data: UserSlicePtr) -> Result { let (mut reader, mut writer) = 
data.reader_writer(); @@ -948,9 +970,17 @@ fn deferred_release(self: Arc) { if let Some(mut mapping) = omapping { let address = mapping.address; let pages = mapping.pages.clone(); + let oneway_spam_detected = mapping.alloc.oneway_spam_detected; mapping.alloc.take_for_each(|offset, size, odata| { let ptr = offset + address; - let mut alloc = Allocation::new(self.clone(), offset, size, ptr, pages.clone()); + let mut alloc = Allocation::new( + self.clone(), + offset, + size, + ptr, + pages.clone(), + oneway_spam_detected, + ); if let Some(data) = odata { alloc.set_info(data); } @@ -1144,6 +1174,9 @@ fn write( bindings::BINDER_SET_CONTEXT_MGR_EXT => { this.set_as_manager(Some(reader.read()?), &thread)? } + bindings::BINDER_ENABLE_ONEWAY_SPAM_DETECTION => { + this.set_oneway_spam_detection_enabled(reader.read()?) + } bindings::BINDER_FREEZE => ioctl_freeze(reader)?, _ => return Err(EINVAL), } diff --git a/drivers/android/range_alloc.rs b/drivers/android/range_alloc.rs index e757129613cf..c1d47115e54d 100644 --- a/drivers/android/range_alloc.rs +++ b/drivers/android/range_alloc.rs @@ -3,6 +3,7 @@ use kernel::{ prelude::*, rbtree::{RBTree, RBTreeNode, RBTreeNodeReservation}, + task::Pid, }; /// Keeps track of allocations in a process' mmap. @@ -13,7 +14,9 @@ pub(crate) struct RangeAllocator { tree: RBTree>, free_tree: RBTree, + size: usize, free_oneway_space: usize, + pub(crate) oneway_spam_detected: bool, } impl RangeAllocator { @@ -26,6 +29,8 @@ pub(crate) fn new(size: usize) -> Result { free_oneway_space: size / 2, tree, free_tree, + oneway_spam_detected: false, + size, }) } @@ -40,6 +45,7 @@ pub(crate) fn reserve_new( &mut self, size: usize, is_oneway: bool, + pid: Pid, alloc: ReserveNewBox, ) -> Result { // Compute new value of free_oneway_space, which is set only on success. @@ -52,6 +58,15 @@ pub(crate) fn reserve_new( self.free_oneway_space }; + // Start detecting spammers once we have less than 20% + // of async space left (which is less than 10% of total + // buffer size). + // + // (This will short-circut, so `low_oneway_space` is + // only called when necessary.) + self.oneway_spam_detected = + is_oneway && new_oneway_space < self.size / 10 && self.low_oneway_space(pid); + let (found_size, found_off, tree_node, free_tree_node) = match self.find_best_match(size) { None => { pr_warn!("ENOSPC from range_alloc.reserve_new - size: {}", size); @@ -65,7 +80,7 @@ pub(crate) fn reserve_new( let new_desc = Descriptor::new(found_offset + size, found_size - size); let (tree_node, free_tree_node, desc_node_res) = alloc.initialize(new_desc); - desc.state = Some(DescriptorState::new(is_oneway, desc_node_res)); + desc.state = Some(DescriptorState::new(is_oneway, pid, desc_node_res)); desc.size = size; (found_size, found_offset, tree_node, free_tree_node) @@ -224,6 +239,30 @@ pub(crate) fn take_for_each)>(&mut self, callback: } } } + + /// Find the amount and size of buffers allocated by the current caller. + /// + /// The idea is that once we cross the threshold, whoever is responsible + /// for the low async space is likely to try to send another async transaction, + /// and at some point we'll catch them in the act. This is more efficient + /// than keeping a map per pid. 
+ fn low_oneway_space(&self, calling_pid: Pid) -> bool { + let mut total_alloc_size = 0; + let mut num_buffers = 0; + for (_, desc) in self.tree.iter() { + if let Some(state) = &desc.state { + if state.is_oneway() && state.pid() == calling_pid { + total_alloc_size += desc.size; + num_buffers += 1; + } + } + } + + // Warn if this pid has more than 50 transactions, or more than 50% of + // async space (which is 25% of total buffer size). Oneway spam is only + // detected when the threshold is exceeded. + num_buffers > 50 || total_alloc_size > self.size / 4 + } } struct Descriptor { @@ -257,16 +296,32 @@ enum DescriptorState { } impl DescriptorState { - fn new(is_oneway: bool, free_res: FreeNodeRes) -> Self { + fn new(is_oneway: bool, pid: Pid, free_res: FreeNodeRes) -> Self { DescriptorState::Reserved(Reservation { is_oneway, + pid, free_res, }) } + + fn pid(&self) -> Pid { + match self { + DescriptorState::Reserved(inner) => inner.pid, + DescriptorState::Allocated(inner) => inner.pid, + } + } + + fn is_oneway(&self) -> bool { + match self { + DescriptorState::Reserved(inner) => inner.is_oneway, + DescriptorState::Allocated(inner) => inner.is_oneway, + } + } } struct Reservation { is_oneway: bool, + pid: Pid, free_res: FreeNodeRes, } @@ -275,6 +330,7 @@ fn allocate(self, data: Option) -> Allocation { Allocation { data, is_oneway: self.is_oneway, + pid: self.pid, free_res: self.free_res, } } @@ -282,6 +338,7 @@ fn allocate(self, data: Option) -> Allocation { struct Allocation { is_oneway: bool, + pid: Pid, free_res: FreeNodeRes, data: Option, } @@ -291,6 +348,7 @@ fn deallocate(self) -> (Reservation, Option) { ( Reservation { is_oneway: self.is_oneway, + pid: self.pid, free_res: self.free_res, }, self.data, diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs index 04477ff7e5a0..adf542872f36 100644 --- a/drivers/android/rust_binder.rs +++ b/drivers/android/rust_binder.rs @@ -107,7 +107,6 @@ fn new(val: impl PinInit) -> impl PinInit { }) } - #[allow(dead_code)] fn arc_try_new(val: T) -> Result, alloc::alloc::AllocError> { ListArc::pin_init(pin_init!(Self { links <- ListLinksSelfPtr::new(), diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index 0238c15604f6..414ffb1387a0 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -909,7 +909,7 @@ pub(crate) fn copy_transaction_data( size_of::(), ); let secctx_off = adata_size + aoffsets_size + abuffers_size; - let mut alloc = match to_process.buffer_alloc(len, is_oneway) { + let mut alloc = match to_process.buffer_alloc(len, is_oneway, self.process.task.pid()) { Ok(alloc) => alloc, Err(err) => { pr_warn!( @@ -1191,8 +1191,15 @@ fn oneway_transaction_inner(self: &Arc, tr: &BinderTransactionDataSg) -> B let handle = unsafe { tr.transaction_data.target.handle }; let node_ref = self.process.get_transaction_node(handle)?; security::binder_transaction(&self.process.cred, &node_ref.node.owner.cred)?; - let list_completion = DTRWrap::arc_try_new(DeliverCode::new(BR_TRANSACTION_COMPLETE))?; let transaction = Transaction::new(node_ref, None, self, tr)?; + let code = if self.process.is_oneway_spam_detection_enabled() + && transaction.oneway_spam_detected + { + BR_ONEWAY_SPAM_SUSPECT + } else { + BR_TRANSACTION_COMPLETE + }; + let list_completion = DTRWrap::arc_try_new(DeliverCode::new(code))?; let completion = list_completion.clone_arc(); self.inner.lock().push_work(list_completion); match transaction.submit() { diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs index 
7028c504ef8c..84b9fe58fe3e 100644 --- a/drivers/android/transaction.rs +++ b/drivers/android/transaction.rs @@ -38,6 +38,7 @@ pub(crate) struct Transaction { data_address: usize, sender_euid: Kuid, txn_security_ctx_off: Option, + pub(crate) oneway_spam_detected: bool, } kernel::list::impl_list_arc_safe! { @@ -70,6 +71,7 @@ pub(crate) fn new( return Err(err); } }; + let oneway_spam_detected = alloc.oneway_spam_detected; if trd.flags & TF_ONE_WAY != 0 { if stack_next.is_some() { pr_warn!("Oneway transaction should not be in a transaction stack."); @@ -98,6 +100,7 @@ pub(crate) fn new( allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"), is_outstanding: AtomicBool::new(false), txn_security_ctx_off, + oneway_spam_detected, }))?) } @@ -115,6 +118,7 @@ pub(crate) fn new_reply( return Err(err); } }; + let oneway_spam_detected = alloc.oneway_spam_detected; if trd.flags & TF_CLEAR_BUF != 0 { alloc.set_info_clear_on_drop(); } @@ -132,6 +136,7 @@ pub(crate) fn new_reply( allocation <- kernel::new_spinlock!(Some(alloc), "Transaction::new"), is_outstanding: AtomicBool::new(false), txn_security_ctx_off: None, + oneway_spam_detected, }))?) } diff --git a/rust/kernel/task.rs b/rust/kernel/task.rs index 1a27b968a907..81649f12758b 100644 --- a/rust/kernel/task.rs +++ b/rust/kernel/task.rs @@ -81,7 +81,7 @@ unsafe impl Send for Task {} unsafe impl Sync for Task {} /// The type of process identifiers (PIDs). -type Pid = bindings::pid_t; +pub type Pid = bindings::pid_t; /// The type of user identifiers (UIDs). #[derive(Copy, Clone)] From patchwork Wed Nov 1 18:01:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160637 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp605597vqx; Wed, 1 Nov 2023 11:04:08 -0700 (PDT) X-Google-Smtp-Source: AGHT+IESLeb7qjXqXthfYUSv3EUwC2ycIqpb3x/WZZeU3lE2ZknC/ZdVhbtvTviyFargJjWgi+IU X-Received: by 2002:a05:6358:5924:b0:168:dfe6:c549 with SMTP id g36-20020a056358592400b00168dfe6c549mr14754637rwf.9.1698861848272; Wed, 01 Nov 2023 11:04:08 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861848; cv=none; d=google.com; s=arc-20160816; b=GfW7BPJQRmZ3217lunyx5VMjBapG9Yndw59+l6bXp2utUzXVu3MwCoBNSAVGs78XWy pdIJqbSWfrBFbHPpTLf9XMdPNv10WuikA2lXye4LTlCeOELIN4kApRPUwIwESJedkM7p wXTDoeotnFy6D/JnBEcXNspUxL0FLEwtBiqFnJPJzo9U6nkNaody5iTc3yjPjO+WK26P zT9/4J0GKeI62m1WxVNJk7NBpbGglqnFScBidK8U2woLD0nE9BvP+CKDY3e6YsWFm0ft XgCVcCIG48NjsiSfJNDn3aNcAmjr5W3woF174GUs/LL1aEQuzUeghb3Ro5DR0Lr8NhXQ dKoA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=YNne2uV93ckwMpGguK52oBZlOAZFXC7MfuI9180HtAk=; fh=cHdDrFPTfwdP0/Ip9jHI/T24Yd8xIIOhbocUOLU1mtg=; b=awipZtkjTcZuS/IkstfZ3/l/H366iaEGwBOz8qtFoDAI29AKVu2CsjqoidERPScwru ABnHa0puUBTM9ZtmUMobALCPtEPiJWym8DD5uvH0eRM/3wjTN+297fU7RlTvYlFt9MP6 ALlwGOJoHl/2ZQO/B9bi6TPFyqfhVNa6aRUsry/eaMJRespHSNk1pc5McqMo2EMHd/DS dHCzIJ1oWjbw9u+BF5nL13XsYPbOV00g6Mic0ioBq70FiVrlzvdoxIx4BKTtzAWBFOLA fdyhjoa5ICcR0gBH9BlgbG+zeeW+dtjI7toWOv5I1uhgO/rqjA9D0MPuvPkHDkyjp+MH 7skA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=cbIJ2ins; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT 
Date: Wed, 01 Nov 2023 18:01:48 +0000
In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
Mime-Version: 1.0
References:
<20231101-rust-binder-v1-0-08ba9197f637@google.com> X-Mailer: b4 0.13-dev-26615 Message-ID: <20231101-rust-binder-v1-18-08ba9197f637@google.com> Subject: [PATCH RFC 18/20] rust_binder: add binder_logs/state From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:03:59 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385761270667614 X-GMAIL-MSGID: 1781385761270667614 The binderfs directory has four files intended for debugging the driver. This patch implements the state file so that you can use it to view the current state of the driver. Signed-off-by: Alice Ryhl --- drivers/android/node.rs | 37 ++++++++++++++++++++ drivers/android/process.rs | 75 +++++++++++++++++++++++++++++++++++++++++ drivers/android/rust_binder.rs | 23 ++++++++++++- drivers/android/thread.rs | 68 +++++++++++++++++++++++++++++++++++++ drivers/android/transaction.rs | 25 ++++++++++++++ rust/bindings/bindings_helper.h | 1 + rust/kernel/lib.rs | 1 + rust/kernel/seq_file.rs | 47 ++++++++++++++++++++++++++ 8 files changed, 276 insertions(+), 1 deletion(-) diff --git a/drivers/android/node.rs b/drivers/android/node.rs index 2c056bd7582e..3acad1c2b963 100644 --- a/drivers/android/node.rs +++ b/drivers/android/node.rs @@ -7,6 +7,8 @@ TryNewListArc, }, prelude::*, + seq_file::SeqFile, + seq_print, sync::lock::{spinlock::SpinLockBackend, Guard}, sync::{Arc, LockedBy, SpinLock}, user_ptr::UserSlicePtrWriter, @@ -111,6 +113,41 @@ pub(crate) fn new( }) } + #[inline(never)] + pub(crate) fn debug_print(&self, m: &mut SeqFile) -> Result<()> { + let weak; + let strong; + let has_weak; + let has_strong; + let active_inc_refs; + { + let mut guard = self.owner.inner.lock(); + let inner = self.inner.access_mut(&mut guard); + weak = inner.weak.count; + has_weak = inner.weak.has_count; + strong = inner.strong.count; + has_strong = inner.strong.has_count; + active_inc_refs = inner.active_inc_refs; + } + + let has_weak = if has_weak { "Y" } else { "N" }; + let has_strong = if has_strong { "Y" } else { "N" }; + + seq_print!( + m, + "node gid:{},ptr:{:#x},cookie:{:#x}: strong{}{} weak{}{} active{}\n", + self.global_id, + self.ptr, + self.cookie, + strong, + has_strong, + weak, + has_weak, + active_inc_refs + ); + Ok(()) + } + pub(crate) fn get_id(&self) -> (usize, usize) { (self.ptr, self.cookie) } diff --git a/drivers/android/process.rs b/drivers/android/process.rs index 4ac5d09041a4..b5e44f9f2a14 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -20,6 +20,8 @@ pages::Pages, prelude::*, rbtree::RBTree, + seq_file::SeqFile, + seq_print, sync::{ 
lock::Guard, Arc, ArcBorrow, CondVar, CondVarTimeoutResult, Mutex, SpinLock, UniqueArc, }, @@ -405,6 +407,79 @@ fn new(ctx: Arc, cred: ARef) -> Result> { Ok(process) } + #[inline(never)] + pub(crate) fn debug_print(&self, m: &mut SeqFile) -> Result<()> { + seq_print!(m, "pid: {}\n", self.task.pid_in_current_ns()); + + let is_manager; + let started_threads; + let has_proc_work; + let mut ready_threads = Vec::new(); + let mut all_threads = Vec::new(); + let mut all_nodes = Vec::new(); + loop { + let inner = self.inner.lock(); + let ready_threads_len = inner.ready_threads.iter().count(); + let all_threads_len = inner.threads.values().count(); + let all_nodes_len = inner.nodes.values().count(); + + let resize_ready_threads = ready_threads_len > ready_threads.capacity(); + let resize_all_threads = all_threads_len > all_threads.capacity(); + let resize_all_nodes = all_nodes_len > all_nodes.capacity(); + if resize_ready_threads || resize_all_threads || resize_all_nodes { + drop(inner); + ready_threads.try_reserve(ready_threads_len)?; + all_threads.try_reserve(all_threads_len)?; + all_nodes.try_reserve(all_nodes_len)?; + continue; + } + + is_manager = inner.is_manager; + started_threads = inner.started_thread_count; + has_proc_work = !inner.work.is_empty(); + + for thread in &inner.ready_threads { + assert!(ready_threads.len() < ready_threads.capacity()); + ready_threads.try_push(thread.id)?; + } + + for thread in inner.threads.values() { + assert!(all_threads.len() < all_threads.capacity()); + all_threads.try_push(thread.clone())?; + } + + for node in inner.nodes.values() { + assert!(all_nodes.len() < all_nodes.capacity()); + all_nodes.try_push(node.clone())?; + } + + break; + } + + seq_print!(m, "is_manager: {}\n", is_manager); + seq_print!(m, "started_threads: {}\n", started_threads); + seq_print!(m, "has_proc_work: {}\n", has_proc_work); + if ready_threads.is_empty() { + seq_print!(m, "ready_thread_ids: none\n"); + } else { + seq_print!(m, "ready_thread_ids:"); + for thread_id in ready_threads { + seq_print!(m, " {}", thread_id); + } + seq_print!(m, "\n"); + } + + for node in all_nodes { + node.debug_print(m)?; + } + + seq_print!(m, "all threads:\n"); + for thread in all_threads { + thread.debug_print(m); + } + Ok(()) + } + /// Attempts to fetch a work item from the process queue. pub(crate) fn get_work(&self) -> Option> { self.inner.lock().work.pop_front() diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs index adf542872f36..a1c95a1609d5 100644 --- a/drivers/android/rust_binder.rs +++ b/drivers/android/rust_binder.rs @@ -10,6 +10,8 @@ HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks, ListLinksSelfPtr, TryNewListArc, }, prelude::*, + seq_file::SeqFile, + seq_print, sync::Arc, types::ForeignOwnable, user_ptr::UserSlicePtrWriter, @@ -347,7 +349,13 @@ unsafe impl Sync for AssertSync {} } #[no_mangle] -unsafe extern "C" fn rust_binder_state_show(_: *mut seq_file) -> core::ffi::c_int { +unsafe extern "C" fn rust_binder_state_show(ptr: *mut seq_file) -> core::ffi::c_int { + // SAFETY: The caller ensures that the pointer is valid and exclusive for the duration in which + // this method is called. 
+ let m = unsafe { SeqFile::from_raw(ptr) }; + if let Err(err) = rust_binder_state_show_impl(m) { + seq_print!(m, "failed to generate state: {:?}\n", err); + } 0 } @@ -360,3 +368,16 @@ unsafe impl Sync for AssertSync {} unsafe extern "C" fn rust_binder_transaction_log_show(_: *mut seq_file) -> core::ffi::c_int { 0 } + +fn rust_binder_state_show_impl(m: &mut SeqFile) -> Result<()> { + let contexts = context::get_all_contexts()?; + for ctx in contexts { + let procs = ctx.get_all_procs()?; + seq_print!(m, "context {}: ({} processes)\n", &*ctx.name, procs.len()); + for proc in procs { + proc.debug_print(m)?; + seq_print!(m, "\n"); + } + } + Ok(()) +} diff --git a/drivers/android/thread.rs b/drivers/android/thread.rs index 414ffb1387a0..d5a56119cc19 100644 --- a/drivers/android/thread.rs +++ b/drivers/android/thread.rs @@ -15,6 +15,8 @@ }, prelude::*, security, + seq_file::SeqFile, + seq_print, sync::{Arc, SpinLock}, types::Either, user_ptr::{UserSlicePtr, UserSlicePtrWriter}, @@ -447,6 +449,72 @@ pub(crate) fn new(id: i32, process: Arc) -> Result> { })) } + #[inline(never)] + pub(crate) fn debug_print(&self, m: &mut SeqFile) { + let looper_flags; + let looper_need_return; + let is_dead; + let has_work; + let process_work_list; + let current_transaction; + { + let inner = self.inner.lock(); + looper_flags = inner.looper_flags; + looper_need_return = inner.looper_need_return; + is_dead = inner.is_dead; + has_work = !inner.work_list.is_empty(); + process_work_list = inner.process_work_list; + current_transaction = inner.current_transaction.clone(); + } + seq_print!(m, " tid: {}\n", self.id); + seq_print!(m, " state:"); + if is_dead { + seq_print!(m, " dead"); + } + if looper_need_return { + seq_print!(m, " pending_flush_wakeup"); + } + if has_work && process_work_list { + seq_print!(m, " has_work"); + } + if has_work && !process_work_list { + seq_print!(m, " has_deferred_work"); + } + if looper_flags & LOOPER_REGISTERED != 0 { + seq_print!(m, " registered"); + } + if looper_flags & LOOPER_ENTERED != 0 { + seq_print!(m, " entered"); + } + if looper_flags & LOOPER_EXITED != 0 { + seq_print!(m, " exited"); + } + if looper_flags & LOOPER_INVALID != 0 { + seq_print!(m, " invalid"); + } + if looper_flags & LOOPER_WAITING != 0 { + if looper_flags & LOOPER_WAITING_PROC != 0 { + seq_print!(m, " in_get_work"); + } else { + seq_print!(m, " in_get_work_local"); + } + } + if looper_flags & LOOPER_POLL != 0 { + seq_print!(m, " poll_is_initialized"); + } + seq_print!(m, "\n"); + if current_transaction.is_some() { + seq_print!(m, " tstack:"); + let mut t = current_transaction; + while let Some(tt) = t.as_ref() { + seq_print!(m, " "); + tt.debug_print(m); + t = tt.clone_next(); + } + seq_print!(m, "\n"); + } + } + pub(crate) fn get_extended_error(&self, data: UserSlicePtr) -> Result { let mut writer = data.writer(); let ee = self.inner.lock().extended_error; diff --git a/drivers/android/transaction.rs b/drivers/android/transaction.rs index 84b9fe58fe3e..30c411ab0778 100644 --- a/drivers/android/transaction.rs +++ b/drivers/android/transaction.rs @@ -5,6 +5,8 @@ io_buffer::IoBufferWriter, list::ListArcSafe, prelude::*, + seq_file::SeqFile, + seq_print, sync::{Arc, SpinLock}, task::Kuid, types::{Either, ScopeGuard}, @@ -140,6 +142,29 @@ pub(crate) fn new_reply( }))?) 
} + #[inline(never)] + pub(crate) fn debug_print(&self, m: &mut SeqFile) { + let from_pid = self.from.process.task.pid_in_current_ns(); + let to_pid = self.to.task.pid_in_current_ns(); + let from_tid = self.from.id; + match self.target_node.as_ref() { + Some(target_node) => { + let node_id = target_node.global_id; + seq_print!( + m, + "{}(tid:{})->{}(nid:{})", + from_pid, + from_tid, + to_pid, + node_id + ); + } + None => { + seq_print!(m, "{}(tid:{})->{}(nid:_)", from_pid, from_tid, to_pid); + } + } + } + /// Determines if the transaction is stacked on top of the given transaction. pub(crate) fn is_stacked_on(&self, onext: &Option>) -> bool { match (&self.stack_next, onext) { diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h index ffeea312f2fd..b2d60b4a9df6 100644 --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs index f4d58da9202e..d46187783464 100644 --- a/rust/kernel/lib.rs +++ b/rust/kernel/lib.rs @@ -48,6 +48,7 @@ pub mod print; pub mod rbtree; pub mod security; +pub mod seq_file; mod static_assert; #[doc(hidden)] pub mod std_vendor; diff --git a/rust/kernel/seq_file.rs b/rust/kernel/seq_file.rs new file mode 100644 index 000000000000..997d527b2e9e --- /dev/null +++ b/rust/kernel/seq_file.rs @@ -0,0 +1,47 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! Seq file bindings. +//! +//! C header: [`include/linux/seq_file.h`](../../../../include/linux/seq_file.h) + +use crate::{bindings, c_str, types::Opaque}; + +/// A helper for implementing special files, where the complete contents can be generated on each +/// access. +pub struct SeqFile(Opaque); + +impl SeqFile { + /// Creates a new [`SeqFile`] from a raw pointer. + /// + /// # Safety + /// + /// The caller must ensure that, for the duration of 'a, the pointer must point at a valid + /// `seq_file` and that it will not be accessed via anything other than the returned reference. + pub unsafe fn from_raw<'a>(ptr: *mut bindings::seq_file) -> &'a mut SeqFile { + // SAFETY: The safety requirements guarantee the validity of the dereference, while the + // `Credential` type being transparent makes the cast ok. + unsafe { &mut *ptr.cast() } + } + + /// Used by the [`seq_print`] macro. + /// + /// [`seq_print`]: crate::seq_print + pub fn call_printf(&mut self, args: core::fmt::Arguments<'_>) { + // SAFETY: Passing a void pointer to `Arguments` is valid for `%pA`. + unsafe { + bindings::seq_printf( + self.0.get(), + c_str!("%pA").as_char_ptr(), + &args as *const _ as *const core::ffi::c_void, + ); + } + } +} + +/// Use for writing to a [`SeqFile`] with the ordinary Rust formatting syntax. +#[macro_export] +macro_rules! 
seq_print { + ($m:expr, $($arg:tt)+) => ( + $m.call_printf(format_args!($($arg)+)) + ); +}
From patchwork Wed Nov 1 18:01:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160640
Date: Wed, 01 Nov 2023 18:01:49 +0000
In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
Mime-Version: 1.0
References: <20231101-rust-binder-v1-0-08ba9197f637@google.com>
X-Mailer: b4 0.13-dev-26615
Message-ID:
<20231101-rust-binder-v1-19-08ba9197f637@google.com>
Subject: [PATCH RFC 19/20] rust_binder: add vma shrinker
From: Alice Ryhl

When the system is under memory pressure, we want the driver to release unused pages. We do this by registering a memory shrinker with the kernel.

The data for incoming transactions is stored in an mmap'ed region. Previously in this patch series, we just allocated all of the pages in that region immediately. With this patch series, we do not allocate the pages until we need them. Furthermore, when we no longer need a page, we mark it as "available" using an lru list. If the system is under memory pressure, this allows the shrinker to free that page by removing it from the lru list. If we need to use the page again before the shrinker frees it, then we just remove it from the lru list, and we don't need to reallocate the page.

The page range abstraction is split into a fast path and slow path. The slow path is only used when a page is not allocated, which should only happen on first use, and when the system is under memory pressure.

I'm not yet completely happy with this implementation. Specifically, I would like to improve the robustness of the unsafe code found in `allocation.rs`. The slow-path/fast-path implementation in `page_range.rs` is different from C Binder's current implementation, and was suggested to me by Carlos.
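To make the lifecycle above concrete, the per-page behaviour roughly forms the following state machine. This is an illustrative sketch with hypothetical names, not the ShrinkablePageRange API added in page_range.rs:

/// Illustrative per-page states for a shrinkable mmap'ed range (sketch only).
enum PageState {
    /// Never allocated, or already reclaimed by the shrinker.
    Empty,
    /// Resident and backing a live buffer; the shrinker must not touch it.
    InUse,
    /// Resident but idle: the page sits on the lru list and may be
    /// reclaimed when the system is under memory pressure.
    Available,
}

impl PageState {
    /// A buffer overlapping this page has been reserved.
    fn use_page(&mut self) {
        *self = match self {
            // Slow path: the page is not resident, allocate a fresh one.
            PageState::Empty => PageState::InUse,
            // Fast path: already resident, just take it off the lru list
            // so the shrinker can no longer reclaim it.
            PageState::Available => PageState::InUse,
            // Already pinned by another overlapping buffer; nothing to do.
            PageState::InUse => PageState::InUse,
        };
    }

    /// The last buffer overlapping this page has been freed.
    fn stop_using_page(&mut self) {
        // The page stays resident but becomes eligible for reclaim.
        *self = PageState::Available;
    }

    /// Called from the shrinker under memory pressure.
    fn shrink_page(&mut self) {
        // Only idle pages on the lru list can be returned to the system.
        if matches!(self, PageState::Available) {
            *self = PageState::Empty;
        }
    }
}

The Available to InUse transition is the fast path mentioned above (an lru list operation only), while Empty to InUse is the slow path that actually allocates and maps a page.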
Suggested-by: Carlos Llamas Signed-off-by: Alice Ryhl --- drivers/android/allocation.rs | 80 ++--- drivers/android/process.rs | 129 +++---- drivers/android/range_alloc.rs | 44 ++- drivers/android/rust_binder.rs | 6 + rust/bindings/bindings_helper.h | 2 + rust/helpers.c | 20 ++ rust/kernel/lib.rs | 1 + rust/kernel/page_range.rs | 715 ++++++++++++++++++++++++++++++++++++++ rust/kernel/sync/lock.rs | 24 ++ rust/kernel/sync/lock/mutex.rs | 10 + rust/kernel/sync/lock/spinlock.rs | 10 + 11 files changed, 931 insertions(+), 110 deletions(-) diff --git a/drivers/android/allocation.rs b/drivers/android/allocation.rs index 7b64e7fcce4d..4a9f8e7f2de3 100644 --- a/drivers/android/allocation.rs +++ b/drivers/android/allocation.rs @@ -6,7 +6,6 @@ bindings, file::{DeferredFdCloser, File, FileDescriptorReservation}, io_buffer::{IoBufferReader, ReadableFromBytes, WritableToBytes}, - pages::Pages, prelude::*, sync::Arc, types::ARef, @@ -41,11 +40,15 @@ pub(crate) struct AllocationInfo { /// Represents an allocation that the kernel is currently using. /// /// When allocations are idle, the range allocator holds the data related to them. +/// +/// # Invariants +/// +/// This allocation corresponds to an allocation in the range allocator, so the relevant pages are +/// marked in use in the page range. pub(crate) struct Allocation { pub(crate) offset: usize, size: usize, pub(crate) ptr: usize, - pages: Arc>>, pub(crate) process: Arc, allocation_info: Option, free_on_drop: bool, @@ -58,7 +61,6 @@ pub(crate) fn new( offset: usize, size: usize, ptr: usize, - pages: Arc>>, oneway_spam_detected: bool, ) -> Self { Self { @@ -66,30 +68,17 @@ pub(crate) fn new( offset, size, ptr, - pages, oneway_spam_detected, allocation_info: None, free_on_drop: true, } } - fn iterate(&self, mut offset: usize, mut size: usize, mut cb: T) -> Result - where - T: FnMut(&Pages<0>, usize, usize) -> Result, - { - // Check that the request is within the buffer. - if offset.checked_add(size).ok_or(EINVAL)? > self.size { - return Err(EINVAL); - } - offset += self.offset; - let mut page_index = offset >> bindings::PAGE_SHIFT; - offset &= (1 << bindings::PAGE_SHIFT) - 1; - while size > 0 { - let available = core::cmp::min(size, (1 << bindings::PAGE_SHIFT) - offset); - cb(&self.pages[page_index], offset, available)?; - size -= available; - page_index += 1; - offset = 0; + fn size_check(&self, offset: usize, size: usize) -> Result { + let overflow_fail = offset.checked_add(size).is_none(); + let cmp_size_fail = offset.wrapping_add(size) > self.size; + if overflow_fail || cmp_size_fail { + return Err(EFAULT); } Ok(()) } @@ -100,42 +89,37 @@ pub(crate) fn copy_into( offset: usize, size: usize, ) -> Result { - self.iterate(offset, size, |page, offset, to_copy| { - page.copy_into_page(reader, offset, to_copy) - }) + self.size_check(offset, size)?; + + // SAFETY: While this object exists, the range allocator will keep the range allocated, and + // in turn, the pages will be marked as in use. + unsafe { + self.process + .pages + .copy_into(reader, self.offset + offset, size) + } } pub(crate) fn read(&self, offset: usize) -> Result { - let mut out = MaybeUninit::::uninit(); - let mut out_offset = 0; - self.iterate(offset, size_of::(), |page, offset, to_copy| { - // SAFETY: The sum of `offset` and `to_copy` is bounded by the size of T. - let obj_ptr = unsafe { (out.as_mut_ptr() as *mut u8).add(out_offset) }; - // SAFETY: The pointer points is in-bounds of the `out` variable, so it is valid. 
- unsafe { page.read(obj_ptr, offset, to_copy) }?; - out_offset += to_copy; - Ok(()) - })?; - // SAFETY: We just initialised the data. - Ok(unsafe { out.assume_init() }) + self.size_check(offset, size_of::())?; + + // SAFETY: While this object exists, the range allocator will keep the range allocated, and + // in turn, the pages will be marked as in use. + unsafe { self.process.pages.read(self.offset + offset) } } pub(crate) fn write(&self, offset: usize, obj: &T) -> Result { - let mut obj_offset = 0; - self.iterate(offset, size_of_val(obj), |page, offset, to_copy| { - // SAFETY: The sum of `offset` and `to_copy` is bounded by the size of T. - let obj_ptr = unsafe { (obj as *const T as *const u8).add(obj_offset) }; - // SAFETY: We have a reference to the object, so the pointer is valid. - unsafe { page.write(obj_ptr, offset, to_copy) }?; - obj_offset += to_copy; - Ok(()) - }) + self.size_check(offset, size_of_val::(obj))?; + + // SAFETY: While this object exists, the range allocator will keep the range allocated, and + // in turn, the pages will be marked as in use. + unsafe { self.process.pages.write(self.offset + offset, obj) } } pub(crate) fn fill_zero(&self) -> Result { - self.iterate(0, self.size, |page, offset, len| { - page.fill_zero(offset, len) - }) + // SAFETY: While this object exists, the range allocator will keep the range allocated, and + // in turn, the pages will be marked as in use. + unsafe { self.process.pages.fill_zero(self.offset, self.size) } } pub(crate) fn keep_alive(mut self) { diff --git a/drivers/android/process.rs b/drivers/android/process.rs index b5e44f9f2a14..61809e496a48 100644 --- a/drivers/android/process.rs +++ b/drivers/android/process.rs @@ -17,7 +17,7 @@ io_buffer::{IoBufferReader, IoBufferWriter}, list::{HasListLinks, List, ListArc, ListArcSafe, ListItem, ListLinks}, mm, - pages::Pages, + page_range::ShrinkablePageRange, prelude::*, rbtree::RBTree, seq_file::SeqFile, @@ -47,17 +47,12 @@ struct Mapping { address: usize, alloc: RangeAllocator, - pages: Arc>>, } impl Mapping { - fn new(address: usize, size: usize, pages: Arc>>) -> Result { + fn new(address: usize, size: usize) -> Result { let alloc = RangeAllocator::new(size)?; - Ok(Self { - address, - alloc, - pages, - }) + Ok(Self { address, alloc }) } } @@ -333,6 +328,9 @@ pub(crate) struct Process { #[pin] pub(crate) inner: SpinLock, + #[pin] + pub(crate) pages: ShrinkablePageRange, + // Waitqueue of processes waiting for all outstanding transactions to be // processed. 
#[pin] @@ -390,10 +388,11 @@ fn run(me: Arc) { impl Process { fn new(ctx: Arc, cred: ARef) -> Result> { - let list_process = ListArc::pin_init(pin_init!(Process { + let list_process = ListArc::pin_init(try_pin_init!(Process { ctx, cred, inner <- kernel::new_spinlock!(ProcessInner::new(), "Process::inner"), + pages <- ShrinkablePageRange::new(&super::BINDER_SHRINKER), node_refs <- kernel::new_mutex!(ProcessNodeRefs::new(), "Process::node_refs"), freeze_wait <- kernel::new_condvar!("Process::freeze_wait"), task: kernel::current!().group_leader().into(), @@ -738,20 +737,46 @@ pub(crate) fn buffer_alloc( is_oneway: bool, from_pid: i32, ) -> BinderResult { + use kernel::bindings::PAGE_SIZE; + let alloc = range_alloc::ReserveNewBox::try_new()?; let mut inner = self.inner.lock(); let mapping = inner.mapping.as_mut().ok_or_else(BinderError::new_dead)?; let offset = mapping .alloc .reserve_new(size, is_oneway, from_pid, alloc)?; - Ok(Allocation::new( + + let res = Allocation::new( self.clone(), offset, size, mapping.address + offset, - mapping.pages.clone(), mapping.alloc.oneway_spam_detected, - )) + ); + drop(inner); + + // This allocation will be marked as in use until the `Allocation` is used to free it. + // + // This method can't be called while holding a lock, so we release the lock first. It's + // okay for several threads to use the method on the same index at the same time. In that + // case, one of the calls will allocate the given page (if missing), and the other call + // will wait for the other call to finish allocating the page. + // + // We will not call `stop_using_range` in parallel with this on the same page, because the + // allocation can only be removed via the destructor of the `Allocation` object that we + // currently own. + match self.pages.use_range( + offset / PAGE_SIZE, + (offset + size + (PAGE_SIZE - 1)) / PAGE_SIZE, + ) { + Ok(()) => {} + Err(err) => { + pr_warn!("use_range failure {:?}", err); + return Err(err.into()); + } + } + + Ok(res) } pub(crate) fn buffer_get(self: &Arc, ptr: usize) -> Option { @@ -764,7 +789,6 @@ pub(crate) fn buffer_get(self: &Arc, ptr: usize) -> Option { offset, size, ptr, - mapping.pages.clone(), mapping.alloc.oneway_spam_detected, ); if let Some(data) = odata { @@ -776,18 +800,29 @@ pub(crate) fn buffer_get(self: &Arc, ptr: usize) -> Option { pub(crate) fn buffer_raw_free(&self, ptr: usize) { let mut inner = self.inner.lock(); if let Some(ref mut mapping) = &mut inner.mapping { - if ptr < mapping.address - || mapping - .alloc - .reservation_abort(ptr - mapping.address) - .is_err() - { - pr_warn!( - "Pointer {:x} failed to free, base = {:x}\n", - ptr, - mapping.address - ); - } + let offset = match ptr.checked_sub(mapping.address) { + Some(offset) => offset, + None => return, + }; + + let freed_range = match mapping.alloc.reservation_abort(offset) { + Ok(freed_range) => freed_range, + Err(_) => { + pr_warn!( + "Pointer {:x} failed to free, base = {:x}\n", + ptr, + mapping.address + ); + return; + } + }; + + // No more allocations in this range. Mark them as not in use. + // + // Must be done before we release the lock so that `use_range` is not used on these + // indices until `stop_using_range` returns. 
+ self.pages + .stop_using_range(freed_range.start_page_idx, freed_range.end_page_idx); } } @@ -802,35 +837,16 @@ pub(crate) fn buffer_make_freeable(&self, offset: usize, data: Option Result { use kernel::bindings::PAGE_SIZE; - let size = core::cmp::min(vma.end() - vma.start(), bindings::SZ_4M as usize); - let page_count = size / PAGE_SIZE; - - // Allocate and map all pages. - // - // N.B. If we fail halfway through mapping these pages, the kernel will unmap them. - let mut pages = Vec::new(); - pages.try_reserve_exact(page_count)?; - let mut address = vma.start(); - for _ in 0..page_count { - let page = Pages::<0>::new()?; - vma.insert_page(address, &page)?; - pages.try_push(page)?; - address += PAGE_SIZE; + let size = usize::min(vma.end() - vma.start(), bindings::SZ_4M as usize); + let mapping = Mapping::new(vma.start(), size)?; + let page_count = self.pages.register_with_vma(vma)?; + if page_count * PAGE_SIZE != size { + return Err(EINVAL); } - let ref_pages = Arc::try_new(pages)?; - let mapping = Mapping::new(vma.start(), size, ref_pages)?; + // Save range allocator for later. + self.inner.lock().mapping = Some(mapping); - // Save pages for later. - let mut inner = self.inner.lock(); - match &inner.mapping { - None => inner.mapping = Some(mapping), - Some(_) => { - drop(inner); - drop(mapping); - return Err(EBUSY); - } - } Ok(()) } @@ -1044,18 +1060,11 @@ fn deferred_release(self: Arc) { let omapping = self.inner.lock().mapping.take(); if let Some(mut mapping) = omapping { let address = mapping.address; - let pages = mapping.pages.clone(); let oneway_spam_detected = mapping.alloc.oneway_spam_detected; mapping.alloc.take_for_each(|offset, size, odata| { let ptr = offset + address; - let mut alloc = Allocation::new( - self.clone(), - offset, - size, - ptr, - pages.clone(), - oneway_spam_detected, - ); + let mut alloc = + Allocation::new(self.clone(), offset, size, ptr, oneway_spam_detected); if let Some(data) = odata { alloc.set_info(data); } diff --git a/drivers/android/range_alloc.rs b/drivers/android/range_alloc.rs index c1d47115e54d..4aa1b5236bf5 100644 --- a/drivers/android/range_alloc.rs +++ b/drivers/android/range_alloc.rs @@ -19,6 +19,26 @@ pub(crate) struct RangeAllocator { pub(crate) oneway_spam_detected: bool, } +const PAGE_SIZE: usize = kernel::bindings::PAGE_SIZE; + +/// Represents a range of pages that have just become completely free. +#[derive(Copy, Clone)] +pub(crate) struct FreedRange { + pub(crate) start_page_idx: usize, + pub(crate) end_page_idx: usize, +} + +impl FreedRange { + fn interior_pages(offset: usize, size: usize) -> FreedRange { + FreedRange { + // Divide round up + start_page_idx: (offset + (PAGE_SIZE - 1)) / PAGE_SIZE, + // Divide round down + end_page_idx: (offset + size) / PAGE_SIZE, + } + } +} + impl RangeAllocator { pub(crate) fn new(size: usize) -> Result { let mut tree = RBTree::new(); @@ -97,7 +117,7 @@ pub(crate) fn reserve_new( Ok(found_off) } - pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result { + pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result { let mut cursor = self.tree.cursor_lower_bound(&offset).ok_or_else(|| { pr_warn!( "EINVAL from range_alloc.reservation_abort - offset: {}", @@ -140,9 +160,26 @@ pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result { self.free_oneway_space += free_oneway_space_add; + let mut freed_range = FreedRange::interior_pages(offset, size); + // Compute how large the next free region needs to be to include one more page in + // the newly freed range. 
+ let add_next_page_needed = match (offset + size) % PAGE_SIZE { + 0 => usize::MAX, + unalign => PAGE_SIZE - unalign, + }; + // Compute how large the previous free region needs to be to include one more page + // in the newly freed range. + let add_prev_page_needed = match offset % PAGE_SIZE { + 0 => usize::MAX, + unalign => unalign, + }; + // Merge next into current if next is free let remove_next = match cursor.peek_next() { Some((_, next)) if next.state.is_none() => { + if next.size >= add_next_page_needed { + freed_range.end_page_idx += 1; + } self.free_tree.remove(&(next.size, next.offset)); size += next.size; true @@ -159,6 +196,9 @@ pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result { // Merge current into prev if prev is free match cursor.peek_prev_mut() { Some((_, prev)) if prev.state.is_none() => { + if prev.size >= add_prev_page_needed { + freed_range.start_page_idx -= 1; + } // merge previous with current, remove current self.free_tree.remove(&(prev.size, prev.offset)); offset = prev.offset; @@ -172,7 +212,7 @@ pub(crate) fn reservation_abort(&mut self, offset: usize) -> Result { self.free_tree .insert(reservation.free_res.into_node((size, offset), ())); - Ok(()) + Ok(freed_range) } pub(crate) fn reservation_commit(&mut self, offset: usize, data: Option) -> Result { diff --git a/drivers/android/rust_binder.rs b/drivers/android/rust_binder.rs index a1c95a1609d5..0e4033dc8e71 100644 --- a/drivers/android/rust_binder.rs +++ b/drivers/android/rust_binder.rs @@ -9,6 +9,7 @@ list::{ HasListLinks, ListArc, ListArcSafe, ListItem, ListLinks, ListLinksSelfPtr, TryNewListArc, }, + page_range::Shrinker, prelude::*, seq_file::SeqFile, seq_print, @@ -173,12 +174,17 @@ const fn ptr_align(value: usize) -> usize { (value + size) & !size } +// SAFETY: We call register in `init`. +static BINDER_SHRINKER: Shrinker = unsafe { Shrinker::new() }; + struct BinderModule {} impl kernel::Module for BinderModule { fn init(_module: &'static kernel::ThisModule) -> Result { crate::context::CONTEXTS.init(); + BINDER_SHRINKER.register(kernel::c_str!("android-binder"))?; + // SAFETY: The module is being loaded, so we can initialize binderfs. 
#[cfg(CONFIG_ANDROID_BINDERFS_RUST)] unsafe { diff --git a/rust/bindings/bindings_helper.h b/rust/bindings/bindings_helper.h index b2d60b4a9df6..2f37e5192ce4 100644 --- a/rust/bindings/bindings_helper.h +++ b/rust/bindings/bindings_helper.h @@ -12,6 +12,7 @@ #include #include #include +#include #include #include #include @@ -21,6 +22,7 @@ #include #include #include +#include #include #include #include diff --git a/rust/helpers.c b/rust/helpers.c index be295d8bdb46..3392d2d4ee2c 100644 --- a/rust/helpers.c +++ b/rust/helpers.c @@ -93,6 +93,12 @@ void rust_helper_spin_unlock(spinlock_t *lock) } EXPORT_SYMBOL_GPL(rust_helper_spin_unlock); +int rust_helper_spin_trylock(spinlock_t *lock) +{ + return spin_trylock(lock); +} +EXPORT_SYMBOL_GPL(rust_helper_spin_trylock); + void rust_helper_init_wait(struct wait_queue_entry *wq_entry) { init_wait(wq_entry); @@ -310,6 +316,20 @@ struct vm_area_struct *rust_helper_vma_lookup(struct mm_struct *mm, } EXPORT_SYMBOL_GPL(rust_helper_vma_lookup); +unsigned long rust_helper_list_lru_count(struct list_lru *lru) +{ + return list_lru_count(lru); +} +EXPORT_SYMBOL_GPL(rust_helper_list_lru_count); + +unsigned long rust_helper_list_lru_walk(struct list_lru *lru, + list_lru_walk_cb isolate, void *cb_arg, + unsigned long nr_to_walk) +{ + return list_lru_walk(lru, isolate, cb_arg, nr_to_walk); +} +EXPORT_SYMBOL_GPL(rust_helper_list_lru_walk); + void rust_helper_rb_link_node(struct rb_node *node, struct rb_node *parent, struct rb_node **rb_link) { diff --git a/rust/kernel/lib.rs b/rust/kernel/lib.rs index d46187783464..02e670b92426 100644 --- a/rust/kernel/lib.rs +++ b/rust/kernel/lib.rs @@ -43,6 +43,7 @@ pub mod kunit; pub mod list; pub mod mm; +pub mod page_range; pub mod pages; pub mod prelude; pub mod print; diff --git a/rust/kernel/page_range.rs b/rust/kernel/page_range.rs new file mode 100644 index 000000000000..b13f8cd62b77 --- /dev/null +++ b/rust/kernel/page_range.rs @@ -0,0 +1,715 @@ +// SPDX-License-Identifier: GPL-2.0 + +//! This module has utilities for managing a page range where unused pages may be reclaimed by a +//! vma shrinker. + +// To avoid deadlocks, locks are taken in the order: +// +// 1. mmap lock +// 2. spinlock +// 3. lru spinlock +// +// The shrinker will use trylock methods because it locks them in a different order. + +use core::{ + alloc::Layout, + ffi::{c_ulong, c_void}, + marker::PhantomPinned, + mem::{size_of, size_of_val, MaybeUninit}, + ptr, +}; + +use crate::{ + bindings, + error::Result, + io_buffer::ReadableFromBytes, + mm::{virt, MmGrab}, + new_spinlock, + pages::Pages, + prelude::*, + str::CStr, + sync::SpinLock, + types::Opaque, + user_ptr::UserSlicePtrReader, +}; + +const PAGE_SIZE: usize = bindings::PAGE_SIZE; +const PAGE_SHIFT: usize = bindings::PAGE_SHIFT; +const PAGE_MASK: usize = bindings::PAGE_MASK; + +/// Represents a shrinker that can be registered with the kernel. +/// +/// Each shrinker can be used by many `ShrinkablePageRange` objects. +#[repr(C)] +pub struct Shrinker { + inner: Opaque, + list_lru: Opaque, +} + +unsafe impl Send for Shrinker {} +unsafe impl Sync for Shrinker {} + +impl Shrinker { + /// Create a new shrinker. + /// + /// # Safety + /// + /// Before using this shrinker with a `ShrinkablePageRange`, the `register` method must have + /// been called exactly once, and it must not have returned an error. + pub const unsafe fn new() -> Self { + Self { + inner: Opaque::uninit(), + list_lru: Opaque::uninit(), + } + } + + /// Register this shrinker with the kernel. 
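+ ///
+ /// Illustrative usage (editorial sketch, not part of the original patch): a
+ /// shrinker created with the unsafe `new` must be registered exactly once
+ /// before any `ShrinkablePageRange` uses it, typically from module init:
+ ///
+ /// ```ignore
+ /// static EXAMPLE_SHRINKER: Shrinker = unsafe { Shrinker::new() };
+ ///
+ /// fn init() -> Result {
+ ///     // Hypothetical name; the binder module uses "android-binder".
+ ///     EXAMPLE_SHRINKER.register(kernel::c_str!("example-shrinker"))?;
+ ///     Ok(())
+ /// }
+ /// ```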
+ pub fn register(&'static self, name: &CStr) -> Result<()> { + // SAFETY: These fields are not yet used, so it's okay to zero them. + unsafe { + self.inner.get().write_bytes(0, 1); + self.list_lru.get().write_bytes(0, 1); + } + + // SAFETY: The field is not yet used, so we can initialize it. + let ret = unsafe { + bindings::__list_lru_init(self.list_lru.get(), false, ptr::null_mut(), ptr::null_mut()) + }; + if ret != 0 { + return Err(Error::from_errno(ret)); + } + + // SAFETY: We're about to register the shrinker, and these are the fields we need to + // initialize. (All other fields are already zeroed.) + unsafe { + let inner = self.inner.get(); + ptr::addr_of_mut!((*inner).count_objects).write(Some(rust_shrink_count)); + ptr::addr_of_mut!((*inner).scan_objects).write(Some(rust_shrink_scan)); + ptr::addr_of_mut!((*inner).seeks).write(bindings::DEFAULT_SEEKS as _); + } + + // SAFETY: We've initialized the shrinker fields we need to, so we can call this method. + let ret = unsafe { bindings::register_shrinker(self.inner.get(), name.as_char_ptr()) }; + if ret != 0 { + // SAFETY: We initialized it, so its okay to destroy it. + unsafe { bindings::list_lru_destroy(self.list_lru.get()) }; + return Err(Error::from_errno(ret)); + } + + Ok(()) + } +} + +/// A container that manages a page range in a vma. +/// +/// The pages can be thought of as an array of booleans of whether the pages are usable. The +/// methods `use_range` and `stop_using_range` set all booleans in a range to true or false +/// respectively. Initially, no pages are allocated. When a page is not used, it is not freed +/// immediately. Instead, it is made available to the memory shrinker to free it if the device is +/// under memory pressure. +/// +/// It's okay for `use_range` and `stop_using_range` to race with each other, although there's no +/// way to know whether an index ends up with true or false if a call to `use_range` races with +/// another call to `stop_using_range` on a given index. +/// +/// It's also okay for the two methods to race with themselves, e.g. if two threads call +/// `use_range` on the same index, then that's fine and neither call will return until the page is +/// allocated and mapped. +/// +/// The methods that read or write to a range require that the page is marked as in use. So it is +/// _not_ okay to call `stop_using_range` on a page that is in use by the methods that read or +/// write to the page. +#[pin_data(PinnedDrop)] +pub struct ShrinkablePageRange { + /// Shrinker object registered with the kernel. + shrinker: &'static Shrinker, + /// The mm for the relevant process. + mm: MmGrab, + /// Spinlock protecting changes to pages. + #[pin] + lock: SpinLock, + + /// Must not move, since page info has pointers back. + #[pin] + _pin: PhantomPinned, +} + +struct Inner { + /// Array of pages. + /// + /// Since this is also accessed by the shrinker, we can't use a `Box`, which asserts exclusive + /// ownership. To deal with that, we manage it using raw pointers. + pages: *mut PageInfo, + /// Length of the `pages` array. + size: usize, + /// The address of the vma to insert the pages into. + vma_addr: usize, +} + +unsafe impl Send for ShrinkablePageRange {} +unsafe impl Sync for ShrinkablePageRange {} + +/// An array element that describes the current state of a page. +/// +/// There are three states: +/// +/// * Free. The page is None. The `lru` element is not queued. +/// * Available. The page is Some. The `lru` element is queued to the shrinker's lru. +/// * Used. The page is Some. 
The `lru` element is not queued. +/// +/// When an element is available, the shrinker is able to free the page. +#[repr(C)] +struct PageInfo { + lru: bindings::list_head, + page: Option>, + range: *const ShrinkablePageRange, +} + +impl PageInfo { + /// # Safety + /// + /// The caller ensures that reading from `me.page` is ok. + unsafe fn has_page(me: *const PageInfo) -> bool { + // SAFETY: This pointer offset is in bounds. + let page = unsafe { ptr::addr_of!((*me).page) }; + + unsafe { (*page).is_some() } + } + + /// # Safety + /// + /// The caller ensures that writing to `me.page` is ok, and that the page is not currently set. + unsafe fn set_page(me: *mut PageInfo, page: Pages<0>) { + // SAFETY: This pointer offset is in bounds. + let ptr = unsafe { ptr::addr_of_mut!((*me).page) }; + + // SAFETY: The pointer is valid for writing, so also valid for reading. + if unsafe { (*ptr).is_some() } { + pr_err!("set_page called when there is already a page"); + // SAFETY: We will initialize the page again below. + unsafe { ptr::drop_in_place(ptr) }; + } + + // SAFETY: The pointer is valid for writing. + unsafe { ptr::write(ptr, Some(page)) }; + } + + /// # Safety + /// + /// The caller ensures that reading from `me.page` is ok for the duration of 'a. + unsafe fn get_page<'a>(me: *const PageInfo) -> Option<&'a Pages<0>> { + // SAFETY: This pointer offset is in bounds. + let ptr = unsafe { ptr::addr_of!((*me).page) }; + + // SAFETY: The pointer is valid for reading. + unsafe { (*ptr).as_ref() } + } + + /// # Safety + /// + /// The caller ensures that writing to `me.page` is ok for the duration of 'a. + unsafe fn take_page(me: *mut PageInfo) -> Option> { + // SAFETY: This pointer offset is in bounds. + let ptr = unsafe { ptr::addr_of_mut!((*me).page) }; + + // SAFETY: The pointer is valid for reading. + unsafe { (*ptr).take() } + } + + /// Add this page to the lru list, if not already in the list. + /// + /// # Safety + /// + /// The pointer must be valid, and it must be the right shrinker. + unsafe fn list_lru_add(me: *mut PageInfo, shrinker: &'static Shrinker) { + // SAFETY: This pointer offset is in bounds. + let lru_ptr = unsafe { ptr::addr_of_mut!((*me).lru) }; + // SAFETY: The lru pointer is valid, and we're not using it with any other lru list. + unsafe { bindings::list_lru_add(shrinker.list_lru.get(), lru_ptr) }; + } + + /// Remove this page from the lru list, if it is in the list. + /// + /// # Safety + /// + /// The pointer must be valid, and it must be the right shrinker. + unsafe fn list_lru_del(me: *mut PageInfo, shrinker: &'static Shrinker) { + // SAFETY: This pointer offset is in bounds. + let lru_ptr = unsafe { ptr::addr_of_mut!((*me).lru) }; + // SAFETY: The lru pointer is valid, and we're not using it with any other lru list. + unsafe { bindings::list_lru_del(shrinker.list_lru.get(), lru_ptr) }; + } +} + +impl ShrinkablePageRange { + /// Create a new `ShrinkablePageRange` using the given shrinker. + pub fn new(shrinker: &'static Shrinker) -> impl PinInit { + try_pin_init!(Self { + shrinker, + mm: MmGrab::mmgrab_current().ok_or(ESRCH)?, + lock <- new_spinlock!(Inner { + pages: ptr::null_mut(), + size: 0, + vma_addr: 0, + }, "ShrinkablePageRange"), + _pin: PhantomPinned, + }) + } + + /// Register a vma with this page range. Returns the size of the region. 
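+ ///
+ /// Editorial note (not part of the original patch): the returned value is a
+ /// page count, not a byte count. For example, with 4 KiB pages an 8 MiB vma
+ /// is clamped to `SZ_4M`, so this returns 1024.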
+ pub fn register_with_vma(&self, vma: &virt::Area) -> Result { + let num_bytes = usize::min(vma.end() - vma.start(), bindings::SZ_4M as usize); + let num_pages = num_bytes >> PAGE_SHIFT; + + if !self.mm.is_same_mm(vma) { + pr_debug!("Failed to register with vma: invalid vma->vm_mm"); + return Err(EINVAL); + } + if num_pages == 0 { + pr_debug!("Failed to register with vma: size zero"); + return Err(EINVAL); + } + + let layout = Layout::array::(num_pages).map_err(|_| ENOMEM)?; + // SAFETY: The layout has non-zero size. + let pages = unsafe { alloc::alloc::alloc(layout) as *mut PageInfo }; + if pages.is_null() { + return Err(ENOMEM); + } + + // SAFETY: This just initializes the pages array. + unsafe { + let self_ptr = self as *const ShrinkablePageRange; + for i in 0..num_pages { + let info = pages.add(i); + ptr::addr_of_mut!((*info).range).write(self_ptr); + ptr::addr_of_mut!((*info).page).write(None); + let lru = ptr::addr_of_mut!((*info).lru); + ptr::addr_of_mut!((*lru).next).write(lru); + ptr::addr_of_mut!((*lru).prev).write(lru); + } + } + + let mut inner = self.lock.lock(); + if inner.size > 0 { + pr_debug!("Failed to register with vma: already registered"); + drop(inner); + // SAFETY: The `pages` array was allocated with the same layout. + unsafe { alloc::alloc::dealloc(pages.cast(), layout) }; + return Err(EBUSY); + } + + inner.pages = pages; + inner.size = num_pages; + inner.vma_addr = vma.start(); + + Ok(num_pages) + } + + /// Make sure that the given pages are allocated and mapped. + /// + /// Must not be called from an atomic context. + pub fn use_range(&self, start: usize, end: usize) -> Result<()> { + if start >= end { + return Ok(()); + } + let mut inner = self.lock.lock(); + assert!(end <= inner.size); + + for i in start..end { + // SAFETY: This pointer offset is in bounds. + let page_info = unsafe { inner.pages.add(i) }; + + // SAFETY: The pointer is valid, and we hold the lock so reading from the page is okay. + if unsafe { PageInfo::has_page(page_info) } { + // Since we're going to use the page, we should remove it from the lru list so that + // the shrinker will not free it. + // + // SAFETY: The pointer is valid, and this is the right shrinker. + // + // The shrinker can't free the page between the check and this call to + // `list_lru_del` because we hold the lock. + unsafe { PageInfo::list_lru_del(page_info, self.shrinker) }; + } else { + // We have to allocate a new page. Use the slow path. + drop(inner); + match self.use_page_slow(i) { + Ok(()) => {} + Err(err) => { + pr_warn!("Error in use_page_slow: {:?}", err); + return Err(err); + } + } + inner = self.lock.lock(); + } + } + Ok(()) + } + + /// Mark the given page as in use, slow path. + /// + /// Must not be called from an atomic context. + /// + /// # Safety + /// + /// Assumes that `i` is in bounds. + #[cold] + fn use_page_slow(&self, i: usize) -> Result<()> { + let new_page = Pages::new()?; + // We use `mmput_async` when dropping the `mm` because `use_page_slow` is usually used from + // a remote process. If the call to `mmput` races with the process shutting down, then the + // caller of `use_page_slow` becomes responsible for cleaning up the `mm`, which doesn't + // happen until it returns to userspace. However, the caller might instead go to sleep and + // wait for the owner of the `mm` to wake it up, which doesn't happen because it's in the + // middle of a shutdown process that wont complete until the `mm` is dropped. This can + // amount to a deadlock. 
+ // + // Using `mmput_async` avoids this, because then the `mm` cleanup is instead queued to a + // workqueue. + let mm = self.mm.mmget_not_zero().ok_or(ESRCH)?.use_async_put(); + let mut mmap_lock = mm.mmap_write_lock(); + let inner = self.lock.lock(); + + // SAFETY: This pointer offset is in bounds. + let page_info = unsafe { inner.pages.add(i) }; + + // SAFETY: The pointer is valid, and we hold the lock so reading from the page is okay. + if unsafe { PageInfo::has_page(page_info) } { + // The page was already there, or someone else added the page while we didn't hold the + // spinlock. + // + // SAFETY: The pointer is valid, and this is the right shrinker. + // + // The shrinker can't free the page between the check and this call to + // `list_lru_del` because we hold the lock. + unsafe { PageInfo::list_lru_del(page_info, self.shrinker) }; + return Ok(()); + } + + let vma_addr = inner.vma_addr; + // Release the spinlock while we insert the page into the vma. + drop(inner); + + let vma = mmap_lock.vma_lookup(vma_addr).ok_or(ESRCH)?; + + // No overflow since we stay in bounds of the vma. + let user_page_addr = vma_addr + (i << PAGE_SHIFT); + match vma.insert_page(user_page_addr, &new_page) { + Ok(()) => {} + Err(err) => { + pr_warn!( + "Error in insert_page({}): vma_addr:{} i:{} err:{:?}", + user_page_addr, + vma_addr, + i, + err + ); + return Err(err); + } + } + + let inner = self.lock.lock(); + + // SAFETY: The `page_info` pointer is valid and currently does not have a page. The page + // can be written to since we hold the lock. + // + // We released and reacquired the spinlock since we checked that the page is null, but we + // always hold the mmap write lock when setting the page to a non-null value, so it's not + // possible for someone else to have changed it since our check. + unsafe { PageInfo::set_page(page_info, new_page) }; + + drop(inner); + + Ok(()) + } + + /// If the given page is in use, then mark it as available so that the shrinker can free it. + /// + /// May be called from an atomic context. + pub fn stop_using_range(&self, start: usize, end: usize) { + if start >= end { + return; + } + let inner = self.lock.lock(); + assert!(end <= inner.size); + + for i in (start..end).rev() { + // SAFETY: The pointer is in bounds. + let page_info = unsafe { inner.pages.add(i) }; + + // SAFETY: Okay for reading since we have the lock. + if unsafe { PageInfo::has_page(page_info) } { + // SAFETY: The pointer is valid, and it's the right shrinker. + unsafe { PageInfo::list_lru_add(page_info, self.shrinker) }; + } + } + } + + /// Helper for reading or writing to a range of bytes that may overlap with several pages. + /// + /// # Safety + /// + /// All pages touched by this operation must be in use for the duration of this call. + unsafe fn iterate(&self, mut offset: usize, mut size: usize, mut cb: T) -> Result + where + T: FnMut(&Pages<0>, usize, usize) -> Result, + { + if size == 0 { + return Ok(()); + } + + // SAFETY: The caller promises that the pages touched by this call are in use. It's only + // possible for a page to be in use if we have already been registered with a vma, and we + // only change the `pages` and `size` fields during registration with a vma, so there is no + // race when we read them here without taking the lock. 
+ let (pages, num_pages) = unsafe { + let inner = self.lock.get_ptr(); + ( + ptr::addr_of!((*inner).pages).read(), + ptr::addr_of!((*inner).size).read(), + ) + }; + let num_bytes = num_pages << PAGE_SHIFT; + + // Check that the request is within the buffer. + if offset.checked_add(size).ok_or(EFAULT)? > num_bytes { + return Err(EFAULT); + } + + let mut page_index = offset >> PAGE_SHIFT; + offset &= PAGE_MASK; + while size > 0 { + let available = usize::min(size, PAGE_SIZE - offset); + // SAFETY: The pointer is in bounds. + let page_info = unsafe { pages.add(page_index) }; + // SAFETY: The caller guarantees that this page is in the "in use" state for the + // duration of this call to `iterate`, so nobody will change the page. + let page = unsafe { PageInfo::get_page(page_info) }; + if page.is_none() { + pr_warn!("Page is null!"); + } + let page = page.ok_or(EFAULT)?; + cb(page, offset, available)?; + size -= available; + page_index += 1; + offset = 0; + } + Ok(()) + } + + /// Copy from userspace into this page range. + /// + /// # Safety + /// + /// All pages touched by this operation must be in use for the duration of this call. + pub unsafe fn copy_into( + &self, + reader: &mut UserSlicePtrReader, + offset: usize, + size: usize, + ) -> Result { + // SAFETY: `self.iterate` has the same safety requirements as `copy_into`. + unsafe { + self.iterate(offset, size, |page, offset, to_copy| { + page.copy_into_page(reader, offset, to_copy) + }) + } + } + + /// Copy from this page range into kernel space. + /// + /// # Safety + /// + /// All pages touched by this operation must be in use for the duration of this call. + pub unsafe fn read(&self, offset: usize) -> Result { + let mut out = MaybeUninit::::uninit(); + let mut out_offset = 0; + // SAFETY: `self.iterate` has the same safety requirements as `copy_into`. + unsafe { + self.iterate(offset, size_of::(), |page, offset, to_copy| { + // SAFETY: The sum of `offset` and `to_copy` is bounded by the size of T. + let obj_ptr = (out.as_mut_ptr() as *mut u8).add(out_offset); + // SAFETY: The pointer points is in-bounds of the `out` variable, so it is valid. + page.read(obj_ptr, offset, to_copy)?; + out_offset += to_copy; + Ok(()) + })?; + } + // SAFETY: We just initialised the data. + Ok(unsafe { out.assume_init() }) + } + + /// Copy from kernel space into this page range. + /// + /// # Safety + /// + /// All pages touched by this operation must be in use for the duration of this call. + pub unsafe fn write(&self, offset: usize, obj: &T) -> Result { + let mut obj_offset = 0; + // SAFETY: `self.iterate` has the same safety requirements as `copy_into`. + unsafe { + self.iterate(offset, size_of_val(obj), |page, offset, to_copy| { + // SAFETY: The sum of `offset` and `to_copy` is bounded by the size of T. + let obj_ptr = (obj as *const T as *const u8).add(obj_offset); + // SAFETY: We have a reference to the object, so the pointer is valid. + page.write(obj_ptr, offset, to_copy)?; + obj_offset += to_copy; + Ok(()) + }) + } + } + + /// Write zeroes to the given range. + /// + /// # Safety + /// + /// All pages touched by this operation must be in use for the duration of this call. + pub unsafe fn fill_zero(&self, offset: usize, size: usize) -> Result { + // SAFETY: `self.iterate` has the same safety requirements as `copy_into`. 
+ unsafe { + self.iterate(offset, size, |page, offset, len| { + page.fill_zero(offset, len) + }) + } + } +} + +#[pinned_drop] +impl PinnedDrop for ShrinkablePageRange { + fn drop(self: Pin<&mut Self>) { + let (pages, size) = { + let lock = self.lock.lock(); + (lock.pages, lock.size) + }; + + if size == 0 { + return; + } + + // This is the destructor, so unlike the other methods, we only need to worry about races + // with the shrinker here. + for i in 0..size { + // SAFETY: The pointer is valid and it's the right shrinker. + unsafe { PageInfo::list_lru_del(pages.add(i), self.shrinker) }; + // SAFETY: If the shrinker was going to free this page, then it would have taken it + // from the PageInfo before releasing the lru lock. Thus, the call to `list_lru_del` + // will either remove it before the shrinker can access it, or the shrinker will + // already have taken the page at this point. + unsafe { drop(PageInfo::take_page(pages.add(i))) }; + } + + // SAFETY: This computation did not overflow when allocating the pages array, so it will + // not overflow this time. + let layout = unsafe { Layout::array::(size).unwrap_unchecked() }; + + // SAFETY: The `pages` array was allocated with the same layout. + unsafe { alloc::alloc::dealloc(pages.cast(), layout) }; + } +} + +#[no_mangle] +unsafe extern "C" fn rust_shrink_count( + shrink: *mut bindings::shrinker, + _sc: *mut bindings::shrink_control, +) -> c_ulong { + // SAFETY: This method is only used with the `Shrinker` type, and the cast is valid since + // `shrinker` is the first field of a #[repr(C)] struct. + let shrinker = unsafe { &*shrink.cast::() }; + // SAFETY: Accessing the lru list is okay. Just an FFI call. + unsafe { bindings::list_lru_count(shrinker.list_lru.get()) } +} + +#[no_mangle] +unsafe extern "C" fn rust_shrink_scan( + shrink: *mut bindings::shrinker, + sc: *mut bindings::shrink_control, +) -> c_ulong { + // SAFETY: This method is only used with the `Shrinker` type, and the cast is valid since + // `shrinker` is the first field of a #[repr(C)] struct. + let shrinker = unsafe { &*shrink.cast::() }; + // SAFETY: Caller guarantees that it is safe to read this field. + let nr_to_scan = unsafe { (*sc).nr_to_scan }; + // SAFETY: Accessing the lru list is okay. Just an FFI call. + unsafe { + bindings::list_lru_walk( + shrinker.list_lru.get(), + Some(rust_shrink_free_page), + ptr::null_mut(), + nr_to_scan, + ) + } +} + +const LRU_SKIP: bindings::lru_status = bindings::lru_status_LRU_SKIP; +const LRU_REMOVED_ENTRY: bindings::lru_status = bindings::lru_status_LRU_REMOVED_RETRY; + +#[no_mangle] +unsafe extern "C" fn rust_shrink_free_page( + item: *mut bindings::list_head, + lru: *mut bindings::list_lru_one, + lru_lock: *mut bindings::spinlock_t, + _cb_arg: *mut c_void, +) -> bindings::lru_status { + // Fields that should survive after unlocking the lru lock. + let page; + let page_index; + let mm; + let mmap_read; + let vma_addr; + + { + // SAFETY: The `list_head` field is first in `PageInfo`. + let info = item as *mut PageInfo; + let range = unsafe { &*((*info).range) }; + + mm = match range.mm.mmget_not_zero() { + Some(mm) => mm.use_async_put(), + None => return LRU_SKIP, + }; + + mmap_read = match mm.mmap_read_trylock() { + Some(guard) => guard, + None => return LRU_SKIP, + }; + + // We can't lock it normally here, since we hold the lru lock. + let inner = match range.lock.trylock() { + Some(inner) => inner, + None => return LRU_SKIP, + }; + + // SAFETY: The item is in this lru list, so it's okay to remove it. 
+ unsafe { bindings::list_lru_isolate(lru, item) }; + + // SAFETY: Both pointers are in bounds of the same allocation. + page_index = unsafe { info.offset_from(inner.pages) } as usize; + + // SAFETY: We hold the spinlock, so we can take the page. + // + // This sets the page pointer to zero before we unmap it from the vma. However, we call + // `zap_page_range` before we release the mmap lock, so `use_page_slow` will not be able to + // insert a new page until after our call to `zap_page_range`. + page = unsafe { PageInfo::take_page(info) }; + vma_addr = inner.vma_addr; + + // From this point on, we don't access this PageInfo or ShrinkablePageRange again, because + // they can be freed at any point after we unlock `lru_lock`. + } + + // SAFETY: The lru lock is locked when this method is called. + unsafe { bindings::spin_unlock(lru_lock) }; + + if let Some(vma) = mmap_read.vma_lookup(vma_addr) { + let user_page_addr = vma_addr + (page_index << PAGE_SHIFT); + vma.zap_page_range(user_page_addr, PAGE_SIZE); + } + + drop(mmap_read); + drop(mm); + drop(page); + + // SAFETY: We just unlocked the lru lock, but it should be locked when we return. + unsafe { bindings::spin_lock(lru_lock) }; + + LRU_REMOVED_ENTRY +} diff --git a/rust/kernel/sync/lock.rs b/rust/kernel/sync/lock.rs index 149a5259d431..8cf02edb6f4a 100644 --- a/rust/kernel/sync/lock.rs +++ b/rust/kernel/sync/lock.rs @@ -51,6 +51,14 @@ unsafe fn init( #[must_use] unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState; + /// Tries to acquire the lock, making the caller its owner. + /// + /// # Safety + /// + /// Callers must ensure that [`Backend::init`] has been previously called. + #[must_use] + unsafe fn trylock(ptr: *mut Self::State) -> Option; + /// Releases the lock, giving up its ownership. /// /// # Safety @@ -121,6 +129,22 @@ pub fn lock(&self) -> Guard<'_, T, B> { // SAFETY: The lock was just acquired. unsafe { Guard::new(self, state) } } + + /// Acquires the lock and gives the caller access to the data protected by it. + pub fn trylock(&self) -> Option> { + // SAFETY: The constructor of the type calls `init`, so the existence of the object proves + // that `init` was called. + let state = unsafe { B::trylock(self.state.get())? }; + // SAFETY: The lock was just acquired. + unsafe { Some(Guard::new(self, state)) } + } + + /// Get a raw pointer to the data without touching the lock. + /// + /// It is up to the user to make sure that the pointer is used correctly. + pub fn get_ptr(&self) -> *mut T { + self.data.get() + } } /// A lock guard. diff --git a/rust/kernel/sync/lock/mutex.rs b/rust/kernel/sync/lock/mutex.rs index 09276fedc091..0871d0034174 100644 --- a/rust/kernel/sync/lock/mutex.rs +++ b/rust/kernel/sync/lock/mutex.rs @@ -111,6 +111,16 @@ unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState { unsafe { bindings::mutex_lock(ptr) }; } + unsafe fn trylock(ptr: *mut Self::State) -> Option { + // SAFETY: The safety requirements of this function ensure that `ptr` points to valid + // memory, and that it has been initialised before. + if unsafe { bindings::mutex_trylock(ptr) } != 0 { + Some(()) + } else { + None + } + } + unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) { // SAFETY: The safety requirements of this function ensure that `ptr` is valid and that the // caller is the owner of the mutex. 
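[Editorial aside, not part of the patch: a minimal sketch of how the new
`trylock` backend hook is meant to be consumed. The helper below is
hypothetical; in the series itself the same pattern appears in
`rust_shrink_free_page`, which cannot sleep because it runs with the lru
spinlock held and therefore must give up instead of blocking.]

    // Hypothetical caller: try to take a SpinLock without blocking and skip
    // the work item if the lock is currently contended.
    fn try_do_work(counter: &kernel::sync::SpinLock<usize>) -> bool {
        match counter.trylock() {
            Some(mut guard) => {
                // The guard gives access to the protected data and releases
                // the lock when dropped.
                *guard += 1;
                true
            }
            // Someone else holds the lock; report that the caller should
            // retry later rather than spin or sleep here.
            None => false,
        }
    }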
diff --git a/rust/kernel/sync/lock/spinlock.rs b/rust/kernel/sync/lock/spinlock.rs index 91eb2c9e9123..64ff1fcf36c4 100644 --- a/rust/kernel/sync/lock/spinlock.rs +++ b/rust/kernel/sync/lock/spinlock.rs @@ -110,6 +110,16 @@ unsafe fn lock(ptr: *mut Self::State) -> Self::GuardState { unsafe { bindings::spin_lock(ptr) } } + unsafe fn trylock(ptr: *mut Self::State) -> Option { + // SAFETY: The safety requirements of this function ensure that `ptr` points to valid + // memory, and that it has been initialised before. + if unsafe { bindings::spin_trylock(ptr) } != 0 { + Some(()) + } else { + None + } + } + unsafe fn unlock(ptr: *mut Self::State, _guard_state: &Self::GuardState) { // SAFETY: The safety requirements of this function ensure that `ptr` is valid and that the // caller is the owner of the mutex. From patchwork Wed Nov 1 18:01:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alice Ryhl X-Patchwork-Id: 160641 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:abcd:0:b0:403:3b70:6f57 with SMTP id f13csp605869vqx; Wed, 1 Nov 2023 11:04:27 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGHi53l/kEnPfiA8kQISSk2c+MLwuCWDdZuvyZ8qfejtzO5OyvcGm0uBkHJihcEl/SJbvij X-Received: by 2002:a05:6870:148a:b0:1e9:d4fd:6552 with SMTP id k10-20020a056870148a00b001e9d4fd6552mr21873547oab.32.1698861863013; Wed, 01 Nov 2023 11:04:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1698861862; cv=none; d=google.com; s=arc-20160816; b=VNg/aaSUd1KQCKnCBqWoB9U2xGqqWavlgrZFk/onyk27icQ74IrtqcVW2TMghYv/Rv ynJJJd3gq6mQOsDdGjZ+MDc6UWEc/Vlv8Vnwg32y91FOrliOLc9okTD8lA0lcF+LlJg0 ahCmiQQQd+HrQeBzyBTgPrwfxs0SlWzMVd+WwmHlcUAIiGD9kYtdqOQrrPDkOZUJOZtV EkjIy3ukE4Zfn5RKQttCXtgemCnDxSCyE65lrvD2CVqvbJUFKe0y0hLc0S0qiLlcrI3C iJODloQryoL6BweBVTdF6gawt2UNhmMQhKQyAyO36YipyzFMTLTD7dBlAZc3kOFwrlDn UxTw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=VdKUmiVmQbsyVIysnOazbE8PIEvqYXHLEkhNT2cED+s=; fh=cHdDrFPTfwdP0/Ip9jHI/T24Yd8xIIOhbocUOLU1mtg=; b=E88cFms55/bL+/Qg785PxFJV6MwIzN+69VnQ4Vt2jQnRw1J6Ram8nW0JBeHo0KL/mB j1bffaSbrW4IxejVRL/4br44w2/t6VTRVro7owguVk2qJ2m/az+YS/rGKIP2THbXpopH J17BOa5hG7Gtefgc6bXH2Qw06JjGYAUY4zeH2V/NLiFJ73XdGGcdqCdcwPQOGcILw6ca f4HE0FUAMOk01Ldk4tVaDt9sSLybxw7lbb7Ds3wGEja93tWwGtM5vT8Dryz/5r9AqB7L cgA6hIQHU2vvw4FSL4Gx8Q/4+8qoeWtLfuRUszrP7Vxv541YoiO3xOjF3WCryO4YNehW CDFg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=KpYkHNwV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from snail.vger.email (snail.vger.email. 
[23.128.96.37]) by mx.google.com with ESMTPS id y5-20020a655285000000b0057d7cff25besi351159pgp.829.2023.11.01.11.04.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 01 Nov 2023 11:04:22 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=KpYkHNwV; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 4E431819144B; Wed, 1 Nov 2023 11:04:10 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1344949AbjKASD7 (ORCPT + 34 others); Wed, 1 Nov 2023 14:03:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50432 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345031AbjKASDP (ORCPT ); Wed, 1 Nov 2023 14:03:15 -0400 Received: from mail-ej1-x64a.google.com (mail-ej1-x64a.google.com [IPv6:2a00:1450:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4F88112E for ; Wed, 1 Nov 2023 11:03:08 -0700 (PDT) Received: by mail-ej1-x64a.google.com with SMTP id a640c23a62f3a-9d2606301eeso11006466b.0 for ; Wed, 01 Nov 2023 11:03:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1698861787; x=1699466587; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=VdKUmiVmQbsyVIysnOazbE8PIEvqYXHLEkhNT2cED+s=; b=KpYkHNwVGmbwdVj78FC0A3F+b6eHnM+jYisrpdzQpn5klrQXQnXbmfhlRpWb+3Bhmz m4CdGRoraS7lNbMcNG+7NGlRXvfzmrvgUu4sybUsUM3f+ybiwGEheMNr4GzmEuQxDwPk jt6w/nvovRQTKRdYzhYQl5CDCgOfe2ORXM5YbfdgFwtIqgHyLqJgTbtnrtfvHeXZkqNE r/HK1AjrSr0QzykyY+DWotRZs8HKefOyzTHR2VyYYiz8FuqK1nRuVSh9GAw9PCXTc1PM 9tkqzQCVWsZu98bDj6JhSiR1binwPrFDxoa5OHIXm8JgwJ7MC8idyFgjQhUndTw1xuyT BTqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1698861787; x=1699466587; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=VdKUmiVmQbsyVIysnOazbE8PIEvqYXHLEkhNT2cED+s=; b=OHwAP//Ge5GemZB5zdkoLAe2km4IMpJRFmo3WDKDYQQOADAN0lczIx8e8WqOUR5ikt 3O4fJMg4S5S3KnpeVNHDN/1k0ZDqhhKXnayZmT+iLqBwWHEKgmN+HvGa+GWbptzALp0D qQbkpUlz8AyKcYW2IJUwZJLcXZFjjSUNRG8EoeBCVA3x5L1kZ/f8bUNd7vYb5mS8wk01 qV3Ys6vjiI/QyEBdnz7SIikEU1xkTaC5Xa3MM3Kbuh1XHHE8urhyPFc8onyQ2eLYYGaD 5N/zO3w8CwYkb9n0/mQaLG6A4V+fHKYDdjv/Z6aDSx1qP65KIVInnpsGh2ERfMMOCgGj LLng== X-Gm-Message-State: AOJu0Yw22fNplBxtXcPtXAqwCbjzvw+qFlukplTY8mIy1lEfKqgdkpy8 veTy7Ky7XLzR8Wcv3BkzgmtHl8lBjxSz7Rs= X-Received: from aliceryhl.c.googlers.com ([fda3:e722:ac3:cc00:31:98fb:c0a8:6c8]) (user=aliceryhl job=sendgmr) by 2002:a17:906:32d4:b0:9a9:f5b0:4091 with SMTP id k20-20020a17090632d400b009a9f5b04091mr35794ejk.5.1698861786779; Wed, 01 Nov 2023 11:03:06 -0700 (PDT) Date: Wed, 01 Nov 2023 18:01:50 +0000 In-Reply-To: <20231101-rust-binder-v1-0-08ba9197f637@google.com> Mime-Version: 1.0 References: <20231101-rust-binder-v1-0-08ba9197f637@google.com> X-Mailer: b4 0.13-dev-26615 Message-ID: 
<20231101-rust-binder-v1-20-08ba9197f637@google.com> Subject: [PATCH RFC 20/20] binder: delete the C implementation From: Alice Ryhl To: Greg Kroah-Hartman , " =?utf-8?q?Arve_Hj?= =?utf-8?q?=C3=B8nnev=C3=A5g?= " , Todd Kjos , Martijn Coenen , Joel Fernandes , Christian Brauner , Carlos Llamas , Suren Baghdasaryan , Miguel Ojeda , Alex Gaynor , Wedson Almeida Filho Cc: linux-kernel@vger.kernel.org, rust-for-linux@vger.kernel.org, Boqun Feng , Gary Guo , " =?utf-8?q?Bj=C3=B6rn_Roy_Baron?= " , Benno Lossin , Andreas Hindborg , Matt Gilbride , Jeffrey Vander Stoep , Matthew Maurer , Alice Ryhl X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 01 Nov 2023 11:04:10 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1781385777121997248 X-GMAIL-MSGID: 1781385777121997248 The ultimate goal of this project is to replace the C implementation. Signed-off-by: Alice Ryhl Acked-by: Greg Kroah-Hartman Acked-by: Carlos Llamas --- drivers/android/Kconfig | 36 - drivers/android/binder.c | 6630 ---------------------------------------- drivers/android/binder_alloc.c | 1284 -------- drivers/android/binderfs.c | 827 ----- 4 files changed, 8777 deletions(-) diff --git a/drivers/android/Kconfig b/drivers/android/Kconfig index 82ed6ddabe1a..8b8badd87dc5 100644 --- a/drivers/android/Kconfig +++ b/drivers/android/Kconfig @@ -1,18 +1,6 @@ # SPDX-License-Identifier: GPL-2.0 menu "Android" -config ANDROID_BINDER_IPC - bool "Android Binder IPC Driver" - depends on MMU - default n - help - Binder is used in Android for both communication between processes, - and remote method invocation. - - This means one Android process can call a method/routine in another - Android process, using Binder to identify, invoke and pass arguments - between said processes. - config ANDROID_BINDER_IPC_RUST bool "Android Binder IPC Driver in Rust" depends on MMU && RUST @@ -24,18 +12,6 @@ config ANDROID_BINDER_IPC_RUST Android process, using Binder to identify, invoke and pass arguments between said processes. -config ANDROID_BINDERFS - bool "Android Binderfs filesystem" - depends on ANDROID_BINDER_IPC - default n - help - Binderfs is a pseudo-filesystem for the Android Binder IPC driver - which can be mounted per-ipc namespace allowing to run multiple - instances of Android. - Each binderfs mount initially only contains a binder-control device. - It can be used to dynamically allocate new binder IPC devices via - ioctls. - config ANDROID_BINDERFS_RUST bool "Android Binderfs filesystem in Rust" depends on ANDROID_BINDER_IPC_RUST @@ -48,18 +24,6 @@ config ANDROID_BINDERFS_RUST It can be used to dynamically allocate new binder IPC devices via ioctls. -config ANDROID_BINDER_DEVICES - string "Android Binder devices" - depends on ANDROID_BINDER_IPC - default "binder,hwbinder,vndbinder" - help - Default value for the binder.devices parameter. - - The binder.devices parameter is a comma-separated list of strings - that specifies the names of the binder device nodes that will be - created. 
Each binder device has its own context manager, and is - therefore logically separated from the other devices. - config ANDROID_BINDER_DEVICES_RUST string "Android Binder devices in Rust" depends on ANDROID_BINDER_IPC_RUST diff --git a/drivers/android/binder.c b/drivers/android/binder.c deleted file mode 100644 index 92128aae2d06..000000000000 --- a/drivers/android/binder.c +++ /dev/null @@ -1,6630 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* binder.c - * - * Android IPC Subsystem - * - * Copyright (C) 2007-2008 Google, Inc. - */ - -/* - * Locking overview - * - * There are 3 main spinlocks which must be acquired in the - * order shown: - * - * 1) proc->outer_lock : protects binder_ref - * binder_proc_lock() and binder_proc_unlock() are - * used to acq/rel. - * 2) node->lock : protects most fields of binder_node. - * binder_node_lock() and binder_node_unlock() are - * used to acq/rel - * 3) proc->inner_lock : protects the thread and node lists - * (proc->threads, proc->waiting_threads, proc->nodes) - * and all todo lists associated with the binder_proc - * (proc->todo, thread->todo, proc->delivered_death and - * node->async_todo), as well as thread->transaction_stack - * binder_inner_proc_lock() and binder_inner_proc_unlock() - * are used to acq/rel - * - * Any lock under procA must never be nested under any lock at the same - * level or below on procB. - * - * Functions that require a lock held on entry indicate which lock - * in the suffix of the function name: - * - * foo_olocked() : requires node->outer_lock - * foo_nlocked() : requires node->lock - * foo_ilocked() : requires proc->inner_lock - * foo_oilocked(): requires proc->outer_lock and proc->inner_lock - * foo_nilocked(): requires node->lock and proc->inner_lock - * ... - */ - -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include - -#include - -#include "binder_internal.h" -#include "binder_trace.h" - -static HLIST_HEAD(binder_deferred_list); -static DEFINE_MUTEX(binder_deferred_lock); - -static HLIST_HEAD(binder_devices); -static HLIST_HEAD(binder_procs); -static DEFINE_MUTEX(binder_procs_lock); - -static HLIST_HEAD(binder_dead_nodes); -static DEFINE_SPINLOCK(binder_dead_nodes_lock); - -static struct dentry *binder_debugfs_dir_entry_root; -static struct dentry *binder_debugfs_dir_entry_proc; -static atomic_t binder_last_id; - -static int proc_show(struct seq_file *m, void *unused); -DEFINE_SHOW_ATTRIBUTE(proc); - -#define FORBIDDEN_MMAP_FLAGS (VM_WRITE) - -enum { - BINDER_DEBUG_USER_ERROR = 1U << 0, - BINDER_DEBUG_FAILED_TRANSACTION = 1U << 1, - BINDER_DEBUG_DEAD_TRANSACTION = 1U << 2, - BINDER_DEBUG_OPEN_CLOSE = 1U << 3, - BINDER_DEBUG_DEAD_BINDER = 1U << 4, - BINDER_DEBUG_DEATH_NOTIFICATION = 1U << 5, - BINDER_DEBUG_READ_WRITE = 1U << 6, - BINDER_DEBUG_USER_REFS = 1U << 7, - BINDER_DEBUG_THREADS = 1U << 8, - BINDER_DEBUG_TRANSACTION = 1U << 9, - BINDER_DEBUG_TRANSACTION_COMPLETE = 1U << 10, - BINDER_DEBUG_FREE_BUFFER = 1U << 11, - BINDER_DEBUG_INTERNAL_REFS = 1U << 12, - BINDER_DEBUG_PRIORITY_CAP = 1U << 13, - BINDER_DEBUG_SPINLOCKS = 1U << 14, -}; -static uint32_t binder_debug_mask = BINDER_DEBUG_USER_ERROR | - BINDER_DEBUG_FAILED_TRANSACTION | BINDER_DEBUG_DEAD_TRANSACTION; -module_param_named(debug_mask, binder_debug_mask, uint, 0644); - -char *binder_devices_param 
= CONFIG_ANDROID_BINDER_DEVICES; -module_param_named(devices, binder_devices_param, charp, 0444); - -static DECLARE_WAIT_QUEUE_HEAD(binder_user_error_wait); -static int binder_stop_on_user_error; - -static int binder_set_stop_on_user_error(const char *val, - const struct kernel_param *kp) -{ - int ret; - - ret = param_set_int(val, kp); - if (binder_stop_on_user_error < 2) - wake_up(&binder_user_error_wait); - return ret; -} -module_param_call(stop_on_user_error, binder_set_stop_on_user_error, - param_get_int, &binder_stop_on_user_error, 0644); - -static __printf(2, 3) void binder_debug(int mask, const char *format, ...) -{ - struct va_format vaf; - va_list args; - - if (binder_debug_mask & mask) { - va_start(args, format); - vaf.va = &args; - vaf.fmt = format; - pr_info_ratelimited("%pV", &vaf); - va_end(args); - } -} - -#define binder_txn_error(x...) \ - binder_debug(BINDER_DEBUG_FAILED_TRANSACTION, x) - -static __printf(1, 2) void binder_user_error(const char *format, ...) -{ - struct va_format vaf; - va_list args; - - if (binder_debug_mask & BINDER_DEBUG_USER_ERROR) { - va_start(args, format); - vaf.va = &args; - vaf.fmt = format; - pr_info_ratelimited("%pV", &vaf); - va_end(args); - } - - if (binder_stop_on_user_error) - binder_stop_on_user_error = 2; -} - -#define binder_set_extended_error(ee, _id, _command, _param) \ - do { \ - (ee)->id = _id; \ - (ee)->command = _command; \ - (ee)->param = _param; \ - } while (0) - -#define to_flat_binder_object(hdr) \ - container_of(hdr, struct flat_binder_object, hdr) - -#define to_binder_fd_object(hdr) container_of(hdr, struct binder_fd_object, hdr) - -#define to_binder_buffer_object(hdr) \ - container_of(hdr, struct binder_buffer_object, hdr) - -#define to_binder_fd_array_object(hdr) \ - container_of(hdr, struct binder_fd_array_object, hdr) - -static struct binder_stats binder_stats; - -static inline void binder_stats_deleted(enum binder_stat_types type) -{ - atomic_inc(&binder_stats.obj_deleted[type]); -} - -static inline void binder_stats_created(enum binder_stat_types type) -{ - atomic_inc(&binder_stats.obj_created[type]); -} - -struct binder_transaction_log_entry { - int debug_id; - int debug_id_done; - int call_type; - int from_proc; - int from_thread; - int target_handle; - int to_proc; - int to_thread; - int to_node; - int data_size; - int offsets_size; - int return_error_line; - uint32_t return_error; - uint32_t return_error_param; - char context_name[BINDERFS_MAX_NAME + 1]; -}; - -struct binder_transaction_log { - atomic_t cur; - bool full; - struct binder_transaction_log_entry entry[32]; -}; - -static struct binder_transaction_log binder_transaction_log; -static struct binder_transaction_log binder_transaction_log_failed; - -static struct binder_transaction_log_entry *binder_transaction_log_add( - struct binder_transaction_log *log) -{ - struct binder_transaction_log_entry *e; - unsigned int cur = atomic_inc_return(&log->cur); - - if (cur >= ARRAY_SIZE(log->entry)) - log->full = true; - e = &log->entry[cur % ARRAY_SIZE(log->entry)]; - WRITE_ONCE(e->debug_id_done, 0); - /* - * write-barrier to synchronize access to e->debug_id_done. - * We make sure the initialized 0 value is seen before - * memset() other fields are zeroed by memset. 
- */ - smp_wmb(); - memset(e, 0, sizeof(*e)); - return e; -} - -enum binder_deferred_state { - BINDER_DEFERRED_FLUSH = 0x01, - BINDER_DEFERRED_RELEASE = 0x02, -}; - -enum { - BINDER_LOOPER_STATE_REGISTERED = 0x01, - BINDER_LOOPER_STATE_ENTERED = 0x02, - BINDER_LOOPER_STATE_EXITED = 0x04, - BINDER_LOOPER_STATE_INVALID = 0x08, - BINDER_LOOPER_STATE_WAITING = 0x10, - BINDER_LOOPER_STATE_POLL = 0x20, -}; - -/** - * binder_proc_lock() - Acquire outer lock for given binder_proc - * @proc: struct binder_proc to acquire - * - * Acquires proc->outer_lock. Used to protect binder_ref - * structures associated with the given proc. - */ -#define binder_proc_lock(proc) _binder_proc_lock(proc, __LINE__) -static void -_binder_proc_lock(struct binder_proc *proc, int line) - __acquires(&proc->outer_lock) -{ - binder_debug(BINDER_DEBUG_SPINLOCKS, - "%s: line=%d\n", __func__, line); - spin_lock(&proc->outer_lock); -} - -/** - * binder_proc_unlock() - Release spinlock for given binder_proc - * @proc: struct binder_proc to acquire - * - * Release lock acquired via binder_proc_lock() - */ -#define binder_proc_unlock(proc) _binder_proc_unlock(proc, __LINE__) -static void -_binder_proc_unlock(struct binder_proc *proc, int line) - __releases(&proc->outer_lock) -{ - binder_debug(BINDER_DEBUG_SPINLOCKS, - "%s: line=%d\n", __func__, line); - spin_unlock(&proc->outer_lock); -} - -/** - * binder_inner_proc_lock() - Acquire inner lock for given binder_proc - * @proc: struct binder_proc to acquire - * - * Acquires proc->inner_lock. Used to protect todo lists - */ -#define binder_inner_proc_lock(proc) _binder_inner_proc_lock(proc, __LINE__) -static void -_binder_inner_proc_lock(struct binder_proc *proc, int line) - __acquires(&proc->inner_lock) -{ - binder_debug(BINDER_DEBUG_SPINLOCKS, - "%s: line=%d\n", __func__, line); - spin_lock(&proc->inner_lock); -} - -/** - * binder_inner_proc_unlock() - Release inner lock for given binder_proc - * @proc: struct binder_proc to acquire - * - * Release lock acquired via binder_inner_proc_lock() - */ -#define binder_inner_proc_unlock(proc) _binder_inner_proc_unlock(proc, __LINE__) -static void -_binder_inner_proc_unlock(struct binder_proc *proc, int line) - __releases(&proc->inner_lock) -{ - binder_debug(BINDER_DEBUG_SPINLOCKS, - "%s: line=%d\n", __func__, line); - spin_unlock(&proc->inner_lock); -} - -/** - * binder_node_lock() - Acquire spinlock for given binder_node - * @node: struct binder_node to acquire - * - * Acquires node->lock. Used to protect binder_node fields - */ -#define binder_node_lock(node) _binder_node_lock(node, __LINE__) -static void -_binder_node_lock(struct binder_node *node, int line) - __acquires(&node->lock) -{ - binder_debug(BINDER_DEBUG_SPINLOCKS, - "%s: line=%d\n", __func__, line); - spin_lock(&node->lock); -} - -/** - * binder_node_unlock() - Release spinlock for given binder_proc - * @node: struct binder_node to acquire - * - * Release lock acquired via binder_node_lock() - */ -#define binder_node_unlock(node) _binder_node_unlock(node, __LINE__) -static void -_binder_node_unlock(struct binder_node *node, int line) - __releases(&node->lock) -{ - binder_debug(BINDER_DEBUG_SPINLOCKS, - "%s: line=%d\n", __func__, line); - spin_unlock(&node->lock); -} - -/** - * binder_node_inner_lock() - Acquire node and inner locks - * @node: struct binder_node to acquire - * - * Acquires node->lock. If node->proc also acquires - * proc->inner_lock. 
Used to protect binder_node fields - */ -#define binder_node_inner_lock(node) _binder_node_inner_lock(node, __LINE__) -static void -_binder_node_inner_lock(struct binder_node *node, int line) - __acquires(&node->lock) __acquires(&node->proc->inner_lock) -{ - binder_debug(BINDER_DEBUG_SPINLOCKS, - "%s: line=%d\n", __func__, line); - spin_lock(&node->lock); - if (node->proc) - binder_inner_proc_lock(node->proc); - else - /* annotation for sparse */ - __acquire(&node->proc->inner_lock); -} - -/** - * binder_node_inner_unlock() - Release node and inner locks - * @node: struct binder_node to acquire - * - * Release lock acquired via binder_node_lock() - */ -#define binder_node_inner_unlock(node) _binder_node_inner_unlock(node, __LINE__) -static void -_binder_node_inner_unlock(struct binder_node *node, int line) - __releases(&node->lock) __releases(&node->proc->inner_lock) -{ - struct binder_proc *proc = node->proc; - - binder_debug(BINDER_DEBUG_SPINLOCKS, - "%s: line=%d\n", __func__, line); - if (proc) - binder_inner_proc_unlock(proc); - else - /* annotation for sparse */ - __release(&node->proc->inner_lock); - spin_unlock(&node->lock); -} - -static bool binder_worklist_empty_ilocked(struct list_head *list) -{ - return list_empty(list); -} - -/** - * binder_worklist_empty() - Check if no items on the work list - * @proc: binder_proc associated with list - * @list: list to check - * - * Return: true if there are no items on list, else false - */ -static bool binder_worklist_empty(struct binder_proc *proc, - struct list_head *list) -{ - bool ret; - - binder_inner_proc_lock(proc); - ret = binder_worklist_empty_ilocked(list); - binder_inner_proc_unlock(proc); - return ret; -} - -/** - * binder_enqueue_work_ilocked() - Add an item to the work list - * @work: struct binder_work to add to list - * @target_list: list to add work to - * - * Adds the work to the specified list. Asserts that work - * is not already on a list. - * - * Requires the proc->inner_lock to be held. - */ -static void -binder_enqueue_work_ilocked(struct binder_work *work, - struct list_head *target_list) -{ - BUG_ON(target_list == NULL); - BUG_ON(work->entry.next && !list_empty(&work->entry)); - list_add_tail(&work->entry, target_list); -} - -/** - * binder_enqueue_deferred_thread_work_ilocked() - Add deferred thread work - * @thread: thread to queue work to - * @work: struct binder_work to add to list - * - * Adds the work to the todo list of the thread. Doesn't set the process_todo - * flag, which means that (if it wasn't already set) the thread will go to - * sleep without handling this work when it calls read. - * - * Requires the proc->inner_lock to be held. - */ -static void -binder_enqueue_deferred_thread_work_ilocked(struct binder_thread *thread, - struct binder_work *work) -{ - WARN_ON(!list_empty(&thread->waiting_thread_node)); - binder_enqueue_work_ilocked(work, &thread->todo); -} - -/** - * binder_enqueue_thread_work_ilocked() - Add an item to the thread work list - * @thread: thread to queue work to - * @work: struct binder_work to add to list - * - * Adds the work to the todo list of the thread, and enables processing - * of the todo queue. - * - * Requires the proc->inner_lock to be held. 
- */ -static void -binder_enqueue_thread_work_ilocked(struct binder_thread *thread, - struct binder_work *work) -{ - WARN_ON(!list_empty(&thread->waiting_thread_node)); - binder_enqueue_work_ilocked(work, &thread->todo); - thread->process_todo = true; -} - -/** - * binder_enqueue_thread_work() - Add an item to the thread work list - * @thread: thread to queue work to - * @work: struct binder_work to add to list - * - * Adds the work to the todo list of the thread, and enables processing - * of the todo queue. - */ -static void -binder_enqueue_thread_work(struct binder_thread *thread, - struct binder_work *work) -{ - binder_inner_proc_lock(thread->proc); - binder_enqueue_thread_work_ilocked(thread, work); - binder_inner_proc_unlock(thread->proc); -} - -static void -binder_dequeue_work_ilocked(struct binder_work *work) -{ - list_del_init(&work->entry); -} - -/** - * binder_dequeue_work() - Removes an item from the work list - * @proc: binder_proc associated with list - * @work: struct binder_work to remove from list - * - * Removes the specified work item from whatever list it is on. - * Can safely be called if work is not on any list. - */ -static void -binder_dequeue_work(struct binder_proc *proc, struct binder_work *work) -{ - binder_inner_proc_lock(proc); - binder_dequeue_work_ilocked(work); - binder_inner_proc_unlock(proc); -} - -static struct binder_work *binder_dequeue_work_head_ilocked( - struct list_head *list) -{ - struct binder_work *w; - - w = list_first_entry_or_null(list, struct binder_work, entry); - if (w) - list_del_init(&w->entry); - return w; -} - -static void -binder_defer_work(struct binder_proc *proc, enum binder_deferred_state defer); -static void binder_free_thread(struct binder_thread *thread); -static void binder_free_proc(struct binder_proc *proc); -static void binder_inc_node_tmpref_ilocked(struct binder_node *node); - -static bool binder_has_work_ilocked(struct binder_thread *thread, - bool do_proc_work) -{ - return thread->process_todo || - thread->looper_need_return || - (do_proc_work && - !binder_worklist_empty_ilocked(&thread->proc->todo)); -} - -static bool binder_has_work(struct binder_thread *thread, bool do_proc_work) -{ - bool has_work; - - binder_inner_proc_lock(thread->proc); - has_work = binder_has_work_ilocked(thread, do_proc_work); - binder_inner_proc_unlock(thread->proc); - - return has_work; -} - -static bool binder_available_for_proc_work_ilocked(struct binder_thread *thread) -{ - return !thread->transaction_stack && - binder_worklist_empty_ilocked(&thread->todo) && - (thread->looper & (BINDER_LOOPER_STATE_ENTERED | - BINDER_LOOPER_STATE_REGISTERED)); -} - -static void binder_wakeup_poll_threads_ilocked(struct binder_proc *proc, - bool sync) -{ - struct rb_node *n; - struct binder_thread *thread; - - for (n = rb_first(&proc->threads); n != NULL; n = rb_next(n)) { - thread = rb_entry(n, struct binder_thread, rb_node); - if (thread->looper & BINDER_LOOPER_STATE_POLL && - binder_available_for_proc_work_ilocked(thread)) { - if (sync) - wake_up_interruptible_sync(&thread->wait); - else - wake_up_interruptible(&thread->wait); - } - } -} - -/** - * binder_select_thread_ilocked() - selects a thread for doing proc work. - * @proc: process to select a thread from - * - * Note that calling this function moves the thread off the waiting_threads - * list, so it can only be woken up by the caller of this function, or a - * signal. Therefore, callers *should* always wake up the thread this function - * returns. 
- * - * Return: If there's a thread currently waiting for process work, - * returns that thread. Otherwise returns NULL. - */ -static struct binder_thread * -binder_select_thread_ilocked(struct binder_proc *proc) -{ - struct binder_thread *thread; - - assert_spin_locked(&proc->inner_lock); - thread = list_first_entry_or_null(&proc->waiting_threads, - struct binder_thread, - waiting_thread_node); - - if (thread) - list_del_init(&thread->waiting_thread_node); - - return thread; -} - -/** - * binder_wakeup_thread_ilocked() - wakes up a thread for doing proc work. - * @proc: process to wake up a thread in - * @thread: specific thread to wake-up (may be NULL) - * @sync: whether to do a synchronous wake-up - * - * This function wakes up a thread in the @proc process. - * The caller may provide a specific thread to wake-up in - * the @thread parameter. If @thread is NULL, this function - * will wake up threads that have called poll(). - * - * Note that for this function to work as expected, callers - * should first call binder_select_thread() to find a thread - * to handle the work (if they don't have a thread already), - * and pass the result into the @thread parameter. - */ -static void binder_wakeup_thread_ilocked(struct binder_proc *proc, - struct binder_thread *thread, - bool sync) -{ - assert_spin_locked(&proc->inner_lock); - - if (thread) { - if (sync) - wake_up_interruptible_sync(&thread->wait); - else - wake_up_interruptible(&thread->wait); - return; - } - - /* Didn't find a thread waiting for proc work; this can happen - * in two scenarios: - * 1. All threads are busy handling transactions - * In that case, one of those threads should call back into - * the kernel driver soon and pick up this work. - * 2. Threads are using the (e)poll interface, in which case - * they may be blocked on the waitqueue without having been - * added to waiting_threads. For this case, we just iterate - * over all threads not handling transaction work, and - * wake them all up. We wake all because we don't know whether - * a thread that called into (e)poll is handling non-binder - * work currently. 
- */ - binder_wakeup_poll_threads_ilocked(proc, sync); -} - -static void binder_wakeup_proc_ilocked(struct binder_proc *proc) -{ - struct binder_thread *thread = binder_select_thread_ilocked(proc); - - binder_wakeup_thread_ilocked(proc, thread, /* sync = */false); -} - -static void binder_set_nice(long nice) -{ - long min_nice; - - if (can_nice(current, nice)) { - set_user_nice(current, nice); - return; - } - min_nice = rlimit_to_nice(rlimit(RLIMIT_NICE)); - binder_debug(BINDER_DEBUG_PRIORITY_CAP, - "%d: nice value %ld not allowed use %ld instead\n", - current->pid, nice, min_nice); - set_user_nice(current, min_nice); - if (min_nice <= MAX_NICE) - return; - binder_user_error("%d RLIMIT_NICE not set\n", current->pid); -} - -static struct binder_node *binder_get_node_ilocked(struct binder_proc *proc, - binder_uintptr_t ptr) -{ - struct rb_node *n = proc->nodes.rb_node; - struct binder_node *node; - - assert_spin_locked(&proc->inner_lock); - - while (n) { - node = rb_entry(n, struct binder_node, rb_node); - - if (ptr < node->ptr) - n = n->rb_left; - else if (ptr > node->ptr) - n = n->rb_right; - else { - /* - * take an implicit weak reference - * to ensure node stays alive until - * call to binder_put_node() - */ - binder_inc_node_tmpref_ilocked(node); - return node; - } - } - return NULL; -} - -static struct binder_node *binder_get_node(struct binder_proc *proc, - binder_uintptr_t ptr) -{ - struct binder_node *node; - - binder_inner_proc_lock(proc); - node = binder_get_node_ilocked(proc, ptr); - binder_inner_proc_unlock(proc); - return node; -} - -static struct binder_node *binder_init_node_ilocked( - struct binder_proc *proc, - struct binder_node *new_node, - struct flat_binder_object *fp) -{ - struct rb_node **p = &proc->nodes.rb_node; - struct rb_node *parent = NULL; - struct binder_node *node; - binder_uintptr_t ptr = fp ? fp->binder : 0; - binder_uintptr_t cookie = fp ? fp->cookie : 0; - __u32 flags = fp ? fp->flags : 0; - - assert_spin_locked(&proc->inner_lock); - - while (*p) { - - parent = *p; - node = rb_entry(parent, struct binder_node, rb_node); - - if (ptr < node->ptr) - p = &(*p)->rb_left; - else if (ptr > node->ptr) - p = &(*p)->rb_right; - else { - /* - * A matching node is already in - * the rb tree. Abandon the init - * and return it. 
- */ - binder_inc_node_tmpref_ilocked(node); - return node; - } - } - node = new_node; - binder_stats_created(BINDER_STAT_NODE); - node->tmp_refs++; - rb_link_node(&node->rb_node, parent, p); - rb_insert_color(&node->rb_node, &proc->nodes); - node->debug_id = atomic_inc_return(&binder_last_id); - node->proc = proc; - node->ptr = ptr; - node->cookie = cookie; - node->work.type = BINDER_WORK_NODE; - node->min_priority = flags & FLAT_BINDER_FLAG_PRIORITY_MASK; - node->accept_fds = !!(flags & FLAT_BINDER_FLAG_ACCEPTS_FDS); - node->txn_security_ctx = !!(flags & FLAT_BINDER_FLAG_TXN_SECURITY_CTX); - spin_lock_init(&node->lock); - INIT_LIST_HEAD(&node->work.entry); - INIT_LIST_HEAD(&node->async_todo); - binder_debug(BINDER_DEBUG_INTERNAL_REFS, - "%d:%d node %d u%016llx c%016llx created\n", - proc->pid, current->pid, node->debug_id, - (u64)node->ptr, (u64)node->cookie); - - return node; -} - -static struct binder_node *binder_new_node(struct binder_proc *proc, - struct flat_binder_object *fp) -{ - struct binder_node *node; - struct binder_node *new_node = kzalloc(sizeof(*node), GFP_KERNEL); - - if (!new_node) - return NULL; - binder_inner_proc_lock(proc); - node = binder_init_node_ilocked(proc, new_node, fp); - binder_inner_proc_unlock(proc); - if (node != new_node) - /* - * The node was already added by another thread - */ - kfree(new_node); - - return node; -} - -static void binder_free_node(struct binder_node *node) -{ - kfree(node); - binder_stats_deleted(BINDER_STAT_NODE); -} - -static int binder_inc_node_nilocked(struct binder_node *node, int strong, - int internal, - struct list_head *target_list) -{ - struct binder_proc *proc = node->proc; - - assert_spin_locked(&node->lock); - if (proc) - assert_spin_locked(&proc->inner_lock); - if (strong) { - if (internal) { - if (target_list == NULL && - node->internal_strong_refs == 0 && - !(node->proc && - node == node->proc->context->binder_context_mgr_node && - node->has_strong_ref)) { - pr_err("invalid inc strong node for %d\n", - node->debug_id); - return -EINVAL; - } - node->internal_strong_refs++; - } else - node->local_strong_refs++; - if (!node->has_strong_ref && target_list) { - struct binder_thread *thread = container_of(target_list, - struct binder_thread, todo); - binder_dequeue_work_ilocked(&node->work); - BUG_ON(&thread->todo != target_list); - binder_enqueue_deferred_thread_work_ilocked(thread, - &node->work); - } - } else { - if (!internal) - node->local_weak_refs++; - if (!node->has_weak_ref && list_empty(&node->work.entry)) { - if (target_list == NULL) { - pr_err("invalid inc weak node for %d\n", - node->debug_id); - return -EINVAL; - } - /* - * See comment above - */ - binder_enqueue_work_ilocked(&node->work, target_list); - } - } - return 0; -} - -static int binder_inc_node(struct binder_node *node, int strong, int internal, - struct list_head *target_list) -{ - int ret; - - binder_node_inner_lock(node); - ret = binder_inc_node_nilocked(node, strong, internal, target_list); - binder_node_inner_unlock(node); - - return ret; -} - -static bool binder_dec_node_nilocked(struct binder_node *node, - int strong, int internal) -{ - struct binder_proc *proc = node->proc; - - assert_spin_locked(&node->lock); - if (proc) - assert_spin_locked(&proc->inner_lock); - if (strong) { - if (internal) - node->internal_strong_refs--; - else - node->local_strong_refs--; - if (node->local_strong_refs || node->internal_strong_refs) - return false; - } else { - if (!internal) - node->local_weak_refs--; - if (node->local_weak_refs || node->tmp_refs || - 
!hlist_empty(&node->refs)) - return false; - } - - if (proc && (node->has_strong_ref || node->has_weak_ref)) { - if (list_empty(&node->work.entry)) { - binder_enqueue_work_ilocked(&node->work, &proc->todo); - binder_wakeup_proc_ilocked(proc); - } - } else { - if (hlist_empty(&node->refs) && !node->local_strong_refs && - !node->local_weak_refs && !node->tmp_refs) { - if (proc) { - binder_dequeue_work_ilocked(&node->work); - rb_erase(&node->rb_node, &proc->nodes); - binder_debug(BINDER_DEBUG_INTERNAL_REFS, - "refless node %d deleted\n", - node->debug_id); - } else { - BUG_ON(!list_empty(&node->work.entry)); - spin_lock(&binder_dead_nodes_lock); - /* - * tmp_refs could have changed so - * check it again - */ - if (node->tmp_refs) { - spin_unlock(&binder_dead_nodes_lock); - return false; - } - hlist_del(&node->dead_node); - spin_unlock(&binder_dead_nodes_lock); - binder_debug(BINDER_DEBUG_INTERNAL_REFS, - "dead node %d deleted\n", - node->debug_id); - } - return true; - } - } - return false; -} - -static void binder_dec_node(struct binder_node *node, int strong, int internal) -{ - bool free_node; - - binder_node_inner_lock(node); - free_node = binder_dec_node_nilocked(node, strong, internal); - binder_node_inner_unlock(node); - if (free_node) - binder_free_node(node); -} - -static void binder_inc_node_tmpref_ilocked(struct binder_node *node) -{ - /* - * No call to binder_inc_node() is needed since we - * don't need to inform userspace of any changes to - * tmp_refs - */ - node->tmp_refs++; -} - -/** - * binder_inc_node_tmpref() - take a temporary reference on node - * @node: node to reference - * - * Take reference on node to prevent the node from being freed - * while referenced only by a local variable. The inner lock is - * needed to serialize with the node work on the queue (which - * isn't needed after the node is dead). If the node is dead - * (node->proc is NULL), use binder_dead_nodes_lock to protect - * node->tmp_refs against dead-node-only cases where the node - * lock cannot be acquired (eg traversing the dead node list to - * print nodes) - */ -static void binder_inc_node_tmpref(struct binder_node *node) -{ - binder_node_lock(node); - if (node->proc) - binder_inner_proc_lock(node->proc); - else - spin_lock(&binder_dead_nodes_lock); - binder_inc_node_tmpref_ilocked(node); - if (node->proc) - binder_inner_proc_unlock(node->proc); - else - spin_unlock(&binder_dead_nodes_lock); - binder_node_unlock(node); -} - -/** - * binder_dec_node_tmpref() - remove a temporary reference on node - * @node: node to reference - * - * Release temporary reference on node taken via binder_inc_node_tmpref() - */ -static void binder_dec_node_tmpref(struct binder_node *node) -{ - bool free_node; - - binder_node_inner_lock(node); - if (!node->proc) - spin_lock(&binder_dead_nodes_lock); - else - __acquire(&binder_dead_nodes_lock); - node->tmp_refs--; - BUG_ON(node->tmp_refs < 0); - if (!node->proc) - spin_unlock(&binder_dead_nodes_lock); - else - __release(&binder_dead_nodes_lock); - /* - * Call binder_dec_node() to check if all refcounts are 0 - * and cleanup is needed. Calling with strong=0 and internal=1 - * causes no actual reference to be released in binder_dec_node(). - * If that changes, a change is needed here too. 
- */ - free_node = binder_dec_node_nilocked(node, 0, 1); - binder_node_inner_unlock(node); - if (free_node) - binder_free_node(node); -} - -static void binder_put_node(struct binder_node *node) -{ - binder_dec_node_tmpref(node); -} - -static struct binder_ref *binder_get_ref_olocked(struct binder_proc *proc, - u32 desc, bool need_strong_ref) -{ - struct rb_node *n = proc->refs_by_desc.rb_node; - struct binder_ref *ref; - - while (n) { - ref = rb_entry(n, struct binder_ref, rb_node_desc); - - if (desc < ref->data.desc) { - n = n->rb_left; - } else if (desc > ref->data.desc) { - n = n->rb_right; - } else if (need_strong_ref && !ref->data.strong) { - binder_user_error("tried to use weak ref as strong ref\n"); - return NULL; - } else { - return ref; - } - } - return NULL; -} - -/** - * binder_get_ref_for_node_olocked() - get the ref associated with given node - * @proc: binder_proc that owns the ref - * @node: binder_node of target - * @new_ref: newly allocated binder_ref to be initialized or %NULL - * - * Look up the ref for the given node and return it if it exists - * - * If it doesn't exist and the caller provides a newly allocated - * ref, initialize the fields of the newly allocated ref and insert - * into the given proc rb_trees and node refs list. - * - * Return: the ref for node. It is possible that another thread - * allocated/initialized the ref first in which case the - * returned ref would be different than the passed-in - * new_ref. new_ref must be kfree'd by the caller in - * this case. - */ -static struct binder_ref *binder_get_ref_for_node_olocked( - struct binder_proc *proc, - struct binder_node *node, - struct binder_ref *new_ref) -{ - struct binder_context *context = proc->context; - struct rb_node **p = &proc->refs_by_node.rb_node; - struct rb_node *parent = NULL; - struct binder_ref *ref; - struct rb_node *n; - - while (*p) { - parent = *p; - ref = rb_entry(parent, struct binder_ref, rb_node_node); - - if (node < ref->node) - p = &(*p)->rb_left; - else if (node > ref->node) - p = &(*p)->rb_right; - else - return ref; - } - if (!new_ref) - return NULL; - - binder_stats_created(BINDER_STAT_REF); - new_ref->data.debug_id = atomic_inc_return(&binder_last_id); - new_ref->proc = proc; - new_ref->node = node; - rb_link_node(&new_ref->rb_node_node, parent, p); - rb_insert_color(&new_ref->rb_node_node, &proc->refs_by_node); - - new_ref->data.desc = (node == context->binder_context_mgr_node) ? 
0 : 1; - for (n = rb_first(&proc->refs_by_desc); n != NULL; n = rb_next(n)) { - ref = rb_entry(n, struct binder_ref, rb_node_desc); - if (ref->data.desc > new_ref->data.desc) - break; - new_ref->data.desc = ref->data.desc + 1; - } - - p = &proc->refs_by_desc.rb_node; - while (*p) { - parent = *p; - ref = rb_entry(parent, struct binder_ref, rb_node_desc); - - if (new_ref->data.desc < ref->data.desc) - p = &(*p)->rb_left; - else if (new_ref->data.desc > ref->data.desc) - p = &(*p)->rb_right; - else - BUG(); - } - rb_link_node(&new_ref->rb_node_desc, parent, p); - rb_insert_color(&new_ref->rb_node_desc, &proc->refs_by_desc); - - binder_node_lock(node); - hlist_add_head(&new_ref->node_entry, &node->refs); - - binder_debug(BINDER_DEBUG_INTERNAL_REFS, - "%d new ref %d desc %d for node %d\n", - proc->pid, new_ref->data.debug_id, new_ref->data.desc, - node->debug_id); - binder_node_unlock(node); - return new_ref; -} - -static void binder_cleanup_ref_olocked(struct binder_ref *ref) -{ - bool delete_node = false; - - binder_debug(BINDER_DEBUG_INTERNAL_REFS, - "%d delete ref %d desc %d for node %d\n", - ref->proc->pid, ref->data.debug_id, ref->data.desc, - ref->node->debug_id); - - rb_erase(&ref->rb_node_desc, &ref->proc->refs_by_desc); - rb_erase(&ref->rb_node_node, &ref->proc->refs_by_node); - - binder_node_inner_lock(ref->node); - if (ref->data.strong) - binder_dec_node_nilocked(ref->node, 1, 1); - - hlist_del(&ref->node_entry); - delete_node = binder_dec_node_nilocked(ref->node, 0, 1); - binder_node_inner_unlock(ref->node); - /* - * Clear ref->node unless we want the caller to free the node - */ - if (!delete_node) { - /* - * The caller uses ref->node to determine - * whether the node needs to be freed. Clear - * it since the node is still alive. - */ - ref->node = NULL; - } - - if (ref->death) { - binder_debug(BINDER_DEBUG_DEAD_BINDER, - "%d delete ref %d desc %d has death notification\n", - ref->proc->pid, ref->data.debug_id, - ref->data.desc); - binder_dequeue_work(ref->proc, &ref->death->work); - binder_stats_deleted(BINDER_STAT_DEATH); - } - binder_stats_deleted(BINDER_STAT_REF); -} - -/** - * binder_inc_ref_olocked() - increment the ref for given handle - * @ref: ref to be incremented - * @strong: if true, strong increment, else weak - * @target_list: list to queue node work on - * - * Increment the ref. @ref->proc->outer_lock must be held on entry - * - * Return: 0, if successful, else errno - */ -static int binder_inc_ref_olocked(struct binder_ref *ref, int strong, - struct list_head *target_list) -{ - int ret; - - if (strong) { - if (ref->data.strong == 0) { - ret = binder_inc_node(ref->node, 1, 1, target_list); - if (ret) - return ret; - } - ref->data.strong++; - } else { - if (ref->data.weak == 0) { - ret = binder_inc_node(ref->node, 0, 1, target_list); - if (ret) - return ret; - } - ref->data.weak++; - } - return 0; -} - -/** - * binder_dec_ref_olocked() - dec the ref for given handle - * @ref: ref to be decremented - * @strong: if true, strong decrement, else weak - * - * Decrement the ref. - * - * Return: %true if ref is cleaned up and ready to be freed. 
- */ -static bool binder_dec_ref_olocked(struct binder_ref *ref, int strong) -{ - if (strong) { - if (ref->data.strong == 0) { - binder_user_error("%d invalid dec strong, ref %d desc %d s %d w %d\n", - ref->proc->pid, ref->data.debug_id, - ref->data.desc, ref->data.strong, - ref->data.weak); - return false; - } - ref->data.strong--; - if (ref->data.strong == 0) - binder_dec_node(ref->node, strong, 1); - } else { - if (ref->data.weak == 0) { - binder_user_error("%d invalid dec weak, ref %d desc %d s %d w %d\n", - ref->proc->pid, ref->data.debug_id, - ref->data.desc, ref->data.strong, - ref->data.weak); - return false; - } - ref->data.weak--; - } - if (ref->data.strong == 0 && ref->data.weak == 0) { - binder_cleanup_ref_olocked(ref); - return true; - } - return false; -} - -/** - * binder_get_node_from_ref() - get the node from the given proc/desc - * @proc: proc containing the ref - * @desc: the handle associated with the ref - * @need_strong_ref: if true, only return node if ref is strong - * @rdata: the id/refcount data for the ref - * - * Given a proc and ref handle, return the associated binder_node - * - * Return: a binder_node or NULL if not found or not strong when strong required - */ -static struct binder_node *binder_get_node_from_ref( - struct binder_proc *proc, - u32 desc, bool need_strong_ref, - struct binder_ref_data *rdata) -{ - struct binder_node *node; - struct binder_ref *ref; - - binder_proc_lock(proc); - ref = binder_get_ref_olocked(proc, desc, need_strong_ref); - if (!ref) - goto err_no_ref; - node = ref->node; - /* - * Take an implicit reference on the node to ensure - * it stays alive until the call to binder_put_node() - */ - binder_inc_node_tmpref(node); - if (rdata) - *rdata = ref->data; - binder_proc_unlock(proc); - - return node; - -err_no_ref: - binder_proc_unlock(proc); - return NULL; -} - -/** - * binder_free_ref() - free the binder_ref - * @ref: ref to free - * - * Free the binder_ref. Free the binder_node indicated by ref->node - * (if non-NULL) and the binder_ref_death indicated by ref->death. - */ -static void binder_free_ref(struct binder_ref *ref) -{ - if (ref->node) - binder_free_node(ref->node); - kfree(ref->death); - kfree(ref); -} - -/** - * binder_update_ref_for_handle() - inc/dec the ref for given handle - * @proc: proc containing the ref - * @desc: the handle associated with the ref - * @increment: true=inc reference, false=dec reference - * @strong: true=strong reference, false=weak reference - * @rdata: the id/refcount data for the ref - * - * Given a proc and ref handle, increment or decrement the ref - * according to "increment" arg. 
- * - * Return: 0 if successful, else errno - */ -static int binder_update_ref_for_handle(struct binder_proc *proc, - uint32_t desc, bool increment, bool strong, - struct binder_ref_data *rdata) -{ - int ret = 0; - struct binder_ref *ref; - bool delete_ref = false; - - binder_proc_lock(proc); - ref = binder_get_ref_olocked(proc, desc, strong); - if (!ref) { - ret = -EINVAL; - goto err_no_ref; - } - if (increment) - ret = binder_inc_ref_olocked(ref, strong, NULL); - else - delete_ref = binder_dec_ref_olocked(ref, strong); - - if (rdata) - *rdata = ref->data; - binder_proc_unlock(proc); - - if (delete_ref) - binder_free_ref(ref); - return ret; - -err_no_ref: - binder_proc_unlock(proc); - return ret; -} - -/** - * binder_dec_ref_for_handle() - dec the ref for given handle - * @proc: proc containing the ref - * @desc: the handle associated with the ref - * @strong: true=strong reference, false=weak reference - * @rdata: the id/refcount data for the ref - * - * Just calls binder_update_ref_for_handle() to decrement the ref. - * - * Return: 0 if successful, else errno - */ -static int binder_dec_ref_for_handle(struct binder_proc *proc, - uint32_t desc, bool strong, struct binder_ref_data *rdata) -{ - return binder_update_ref_for_handle(proc, desc, false, strong, rdata); -} - - -/** - * binder_inc_ref_for_node() - increment the ref for given proc/node - * @proc: proc containing the ref - * @node: target node - * @strong: true=strong reference, false=weak reference - * @target_list: worklist to use if node is incremented - * @rdata: the id/refcount data for the ref - * - * Given a proc and node, increment the ref. Create the ref if it - * doesn't already exist - * - * Return: 0 if successful, else errno - */ -static int binder_inc_ref_for_node(struct binder_proc *proc, - struct binder_node *node, - bool strong, - struct list_head *target_list, - struct binder_ref_data *rdata) -{ - struct binder_ref *ref; - struct binder_ref *new_ref = NULL; - int ret = 0; - - binder_proc_lock(proc); - ref = binder_get_ref_for_node_olocked(proc, node, NULL); - if (!ref) { - binder_proc_unlock(proc); - new_ref = kzalloc(sizeof(*ref), GFP_KERNEL); - if (!new_ref) - return -ENOMEM; - binder_proc_lock(proc); - ref = binder_get_ref_for_node_olocked(proc, node, new_ref); - } - ret = binder_inc_ref_olocked(ref, strong, target_list); - *rdata = ref->data; - if (ret && ref == new_ref) { - /* - * Cleanup the failed reference here as the target - * could now be dead and have already released its - * references by now. Calling on the new reference - * with strong=0 and a tmp_refs will not decrement - * the node. The new_ref gets kfree'd below. - */ - binder_cleanup_ref_olocked(new_ref); - ref = NULL; - } - - binder_proc_unlock(proc); - if (new_ref && ref != new_ref) - /* - * Another thread created the ref first so - * free the one we allocated - */ - kfree(new_ref); - return ret; -} - -static void binder_pop_transaction_ilocked(struct binder_thread *target_thread, - struct binder_transaction *t) -{ - BUG_ON(!target_thread); - assert_spin_locked(&target_thread->proc->inner_lock); - BUG_ON(target_thread->transaction_stack != t); - BUG_ON(target_thread->transaction_stack->from != target_thread); - target_thread->transaction_stack = - target_thread->transaction_stack->from_parent; - t->from = NULL; -} - -/** - * binder_thread_dec_tmpref() - decrement thread->tmp_ref - * @thread: thread to decrement - * - * A thread needs to be kept alive while being used to create or - * handle a transaction. 
binder_get_txn_from() is used to safely - * extract t->from from a binder_transaction and keep the thread - * indicated by t->from from being freed. When done with that - * binder_thread, this function is called to decrement the - * tmp_ref and free if appropriate (thread has been released - * and no transaction being processed by the driver) - */ -static void binder_thread_dec_tmpref(struct binder_thread *thread) -{ - /* - * atomic is used to protect the counter value while - * it cannot reach zero or thread->is_dead is false - */ - binder_inner_proc_lock(thread->proc); - atomic_dec(&thread->tmp_ref); - if (thread->is_dead && !atomic_read(&thread->tmp_ref)) { - binder_inner_proc_unlock(thread->proc); - binder_free_thread(thread); - return; - } - binder_inner_proc_unlock(thread->proc); -} - -/** - * binder_proc_dec_tmpref() - decrement proc->tmp_ref - * @proc: proc to decrement - * - * A binder_proc needs to be kept alive while being used to create or - * handle a transaction. proc->tmp_ref is incremented when - * creating a new transaction or the binder_proc is currently in-use - * by threads that are being released. When done with the binder_proc, - * this function is called to decrement the counter and free the - * proc if appropriate (proc has been released, all threads have - * been released and not currenly in-use to process a transaction). - */ -static void binder_proc_dec_tmpref(struct binder_proc *proc) -{ - binder_inner_proc_lock(proc); - proc->tmp_ref--; - if (proc->is_dead && RB_EMPTY_ROOT(&proc->threads) && - !proc->tmp_ref) { - binder_inner_proc_unlock(proc); - binder_free_proc(proc); - return; - } - binder_inner_proc_unlock(proc); -} - -/** - * binder_get_txn_from() - safely extract the "from" thread in transaction - * @t: binder transaction for t->from - * - * Atomically return the "from" thread and increment the tmp_ref - * count for the thread to ensure it stays alive until - * binder_thread_dec_tmpref() is called. - * - * Return: the value of t->from - */ -static struct binder_thread *binder_get_txn_from( - struct binder_transaction *t) -{ - struct binder_thread *from; - - spin_lock(&t->lock); - from = t->from; - if (from) - atomic_inc(&from->tmp_ref); - spin_unlock(&t->lock); - return from; -} - -/** - * binder_get_txn_from_and_acq_inner() - get t->from and acquire inner lock - * @t: binder transaction for t->from - * - * Same as binder_get_txn_from() except it also acquires the proc->inner_lock - * to guarantee that the thread cannot be released while operating on it. - * The caller must call binder_inner_proc_unlock() to release the inner lock - * as well as call binder_dec_thread_txn() to release the reference. - * - * Return: the value of t->from - */ -static struct binder_thread *binder_get_txn_from_and_acq_inner( - struct binder_transaction *t) - __acquires(&t->from->proc->inner_lock) -{ - struct binder_thread *from; - - from = binder_get_txn_from(t); - if (!from) { - __acquire(&from->proc->inner_lock); - return NULL; - } - binder_inner_proc_lock(from->proc); - if (t->from) { - BUG_ON(from != t->from); - return from; - } - binder_inner_proc_unlock(from->proc); - __acquire(&from->proc->inner_lock); - binder_thread_dec_tmpref(from); - return NULL; -} - -/** - * binder_free_txn_fixups() - free unprocessed fd fixups - * @t: binder transaction for t->from - * - * If the transaction is being torn down prior to being - * processed by the target process, free all of the - * fd fixups and fput the file structs. 
It is safe to - * call this function after the fixups have been - * processed -- in that case, the list will be empty. - */ -static void binder_free_txn_fixups(struct binder_transaction *t) -{ - struct binder_txn_fd_fixup *fixup, *tmp; - - list_for_each_entry_safe(fixup, tmp, &t->fd_fixups, fixup_entry) { - fput(fixup->file); - if (fixup->target_fd >= 0) - put_unused_fd(fixup->target_fd); - list_del(&fixup->fixup_entry); - kfree(fixup); - } -} - -static void binder_txn_latency_free(struct binder_transaction *t) -{ - int from_proc, from_thread, to_proc, to_thread; - - spin_lock(&t->lock); - from_proc = t->from ? t->from->proc->pid : 0; - from_thread = t->from ? t->from->pid : 0; - to_proc = t->to_proc ? t->to_proc->pid : 0; - to_thread = t->to_thread ? t->to_thread->pid : 0; - spin_unlock(&t->lock); - - trace_binder_txn_latency_free(t, from_proc, from_thread, to_proc, to_thread); -} - -static void binder_free_transaction(struct binder_transaction *t) -{ - struct binder_proc *target_proc = t->to_proc; - - if (target_proc) { - binder_inner_proc_lock(target_proc); - target_proc->outstanding_txns--; - if (target_proc->outstanding_txns < 0) - pr_warn("%s: Unexpected outstanding_txns %d\n", - __func__, target_proc->outstanding_txns); - if (!target_proc->outstanding_txns && target_proc->is_frozen) - wake_up_interruptible_all(&target_proc->freeze_wait); - if (t->buffer) - t->buffer->transaction = NULL; - binder_inner_proc_unlock(target_proc); - } - if (trace_binder_txn_latency_free_enabled()) - binder_txn_latency_free(t); - /* - * If the transaction has no target_proc, then - * t->buffer->transaction has already been cleared. - */ - binder_free_txn_fixups(t); - kfree(t); - binder_stats_deleted(BINDER_STAT_TRANSACTION); -} - -static void binder_send_failed_reply(struct binder_transaction *t, - uint32_t error_code) -{ - struct binder_thread *target_thread; - struct binder_transaction *next; - - BUG_ON(t->flags & TF_ONE_WAY); - while (1) { - target_thread = binder_get_txn_from_and_acq_inner(t); - if (target_thread) { - binder_debug(BINDER_DEBUG_FAILED_TRANSACTION, - "send failed reply for transaction %d to %d:%d\n", - t->debug_id, - target_thread->proc->pid, - target_thread->pid); - - binder_pop_transaction_ilocked(target_thread, t); - if (target_thread->reply_error.cmd == BR_OK) { - target_thread->reply_error.cmd = error_code; - binder_enqueue_thread_work_ilocked( - target_thread, - &target_thread->reply_error.work); - wake_up_interruptible(&target_thread->wait); - } else { - /* - * Cannot get here for normal operation, but - * we can if multiple synchronous transactions - * are sent without blocking for responses. - * Just ignore the 2nd error in this case. 
- */ - pr_warn("Unexpected reply error: %u\n", - target_thread->reply_error.cmd); - } - binder_inner_proc_unlock(target_thread->proc); - binder_thread_dec_tmpref(target_thread); - binder_free_transaction(t); - return; - } - __release(&target_thread->proc->inner_lock); - next = t->from_parent; - - binder_debug(BINDER_DEBUG_FAILED_TRANSACTION, - "send failed reply for transaction %d, target dead\n", - t->debug_id); - - binder_free_transaction(t); - if (next == NULL) { - binder_debug(BINDER_DEBUG_DEAD_BINDER, - "reply failed, no target thread at root\n"); - return; - } - t = next; - binder_debug(BINDER_DEBUG_DEAD_BINDER, - "reply failed, no target thread -- retry %d\n", - t->debug_id); - } -} - -/** - * binder_cleanup_transaction() - cleans up undelivered transaction - * @t: transaction that needs to be cleaned up - * @reason: reason the transaction wasn't delivered - * @error_code: error to return to caller (if synchronous call) - */ -static void binder_cleanup_transaction(struct binder_transaction *t, - const char *reason, - uint32_t error_code) -{ - if (t->buffer->target_node && !(t->flags & TF_ONE_WAY)) { - binder_send_failed_reply(t, error_code); - } else { - binder_debug(BINDER_DEBUG_DEAD_TRANSACTION, - "undelivered transaction %d, %s\n", - t->debug_id, reason); - binder_free_transaction(t); - } -} - -/** - * binder_get_object() - gets object and checks for valid metadata - * @proc: binder_proc owning the buffer - * @u: sender's user pointer to base of buffer - * @buffer: binder_buffer that we're parsing. - * @offset: offset in the @buffer at which to validate an object. - * @object: struct binder_object to read into - * - * Copy the binder object at the given offset into @object. If @u is - * provided then the copy is from the sender's buffer. If not, then - * it is copied from the target's @buffer. - * - * Return: If there's a valid metadata object at @offset, the - * size of that object. Otherwise, it returns zero. The object - * is read into the struct binder_object pointed to by @object. - */ -static size_t binder_get_object(struct binder_proc *proc, - const void __user *u, - struct binder_buffer *buffer, - unsigned long offset, - struct binder_object *object) -{ - size_t read_size; - struct binder_object_header *hdr; - size_t object_size = 0; - - read_size = min_t(size_t, sizeof(*object), buffer->data_size - offset); - if (offset > buffer->data_size || read_size < sizeof(*hdr)) - return 0; - if (u) { - if (copy_from_user(object, u + offset, read_size)) - return 0; - } else { - if (binder_alloc_copy_from_buffer(&proc->alloc, object, buffer, - offset, read_size)) - return 0; - } - - /* Ok, now see if we read a complete object. */ - hdr = &object->hdr; - switch (hdr->type) { - case BINDER_TYPE_BINDER: - case BINDER_TYPE_WEAK_BINDER: - case BINDER_TYPE_HANDLE: - case BINDER_TYPE_WEAK_HANDLE: - object_size = sizeof(struct flat_binder_object); - break; - case BINDER_TYPE_FD: - object_size = sizeof(struct binder_fd_object); - break; - case BINDER_TYPE_PTR: - object_size = sizeof(struct binder_buffer_object); - break; - case BINDER_TYPE_FDA: - object_size = sizeof(struct binder_fd_array_object); - break; - default: - return 0; - } - if (offset <= buffer->data_size - object_size && - buffer->data_size >= object_size) - return object_size; - else - return 0; -} - -/** - * binder_validate_ptr() - validates binder_buffer_object in a binder_buffer. 
- * @proc: binder_proc owning the buffer - * @b: binder_buffer containing the object - * @object: struct binder_object to read into - * @index: index in offset array at which the binder_buffer_object is - * located - * @start_offset: points to the start of the offset array - * @object_offsetp: offset of @object read from @b - * @num_valid: the number of valid offsets in the offset array - * - * Return: If @index is within the valid range of the offset array - * described by @start and @num_valid, and if there's a valid - * binder_buffer_object at the offset found in index @index - * of the offset array, that object is returned. Otherwise, - * %NULL is returned. - * Note that the offset found in index @index itself is not - * verified; this function assumes that @num_valid elements - * from @start were previously verified to have valid offsets. - * If @object_offsetp is non-NULL, then the offset within - * @b is written to it. - */ -static struct binder_buffer_object *binder_validate_ptr( - struct binder_proc *proc, - struct binder_buffer *b, - struct binder_object *object, - binder_size_t index, - binder_size_t start_offset, - binder_size_t *object_offsetp, - binder_size_t num_valid) -{ - size_t object_size; - binder_size_t object_offset; - unsigned long buffer_offset; - - if (index >= num_valid) - return NULL; - - buffer_offset = start_offset + sizeof(binder_size_t) * index; - if (binder_alloc_copy_from_buffer(&proc->alloc, &object_offset, - b, buffer_offset, - sizeof(object_offset))) - return NULL; - object_size = binder_get_object(proc, NULL, b, object_offset, object); - if (!object_size || object->hdr.type != BINDER_TYPE_PTR) - return NULL; - if (object_offsetp) - *object_offsetp = object_offset; - - return &object->bbo; -} - -/** - * binder_validate_fixup() - validates pointer/fd fixups happen in order. - * @proc: binder_proc owning the buffer - * @b: transaction buffer - * @objects_start_offset: offset to start of objects buffer - * @buffer_obj_offset: offset to binder_buffer_object in which to fix up - * @fixup_offset: start offset in @buffer to fix up - * @last_obj_offset: offset to last binder_buffer_object that we fixed - * @last_min_offset: minimum fixup offset in object at @last_obj_offset - * - * Return: %true if a fixup in buffer @buffer at offset @offset is - * allowed. - * - * For safety reasons, we only allow fixups inside a buffer to happen - * at increasing offsets; additionally, we only allow fixup on the last - * buffer object that was verified, or one of its parents. 
- * - * Example of what is allowed: - * - * A - * B (parent = A, offset = 0) - * C (parent = A, offset = 16) - * D (parent = C, offset = 0) - * E (parent = A, offset = 32) // min_offset is 16 (C.parent_offset) - * - * Examples of what is not allowed: - * - * Decreasing offsets within the same parent: - * A - * C (parent = A, offset = 16) - * B (parent = A, offset = 0) // decreasing offset within A - * - * Referring to a parent that wasn't the last object or any of its parents: - * A - * B (parent = A, offset = 0) - * C (parent = A, offset = 0) - * C (parent = A, offset = 16) - * D (parent = B, offset = 0) // B is not A or any of A's parents - */ -static bool binder_validate_fixup(struct binder_proc *proc, - struct binder_buffer *b, - binder_size_t objects_start_offset, - binder_size_t buffer_obj_offset, - binder_size_t fixup_offset, - binder_size_t last_obj_offset, - binder_size_t last_min_offset) -{ - if (!last_obj_offset) { - /* Nothing to fix up in */ - return false; - } - - while (last_obj_offset != buffer_obj_offset) { - unsigned long buffer_offset; - struct binder_object last_object; - struct binder_buffer_object *last_bbo; - size_t object_size = binder_get_object(proc, NULL, b, - last_obj_offset, - &last_object); - if (object_size != sizeof(*last_bbo)) - return false; - - last_bbo = &last_object.bbo; - /* - * Safe to retrieve the parent of last_obj, since it - * was already previously verified by the driver. - */ - if ((last_bbo->flags & BINDER_BUFFER_FLAG_HAS_PARENT) == 0) - return false; - last_min_offset = last_bbo->parent_offset + sizeof(uintptr_t); - buffer_offset = objects_start_offset + - sizeof(binder_size_t) * last_bbo->parent; - if (binder_alloc_copy_from_buffer(&proc->alloc, - &last_obj_offset, - b, buffer_offset, - sizeof(last_obj_offset))) - return false; - } - return (fixup_offset >= last_min_offset); -} - -/** - * struct binder_task_work_cb - for deferred close - * - * @twork: callback_head for task work - * @fd: fd to close - * - * Structure to pass task work to be handled after - * returning from binder_ioctl() via task_work_add(). - */ -struct binder_task_work_cb { - struct callback_head twork; - struct file *file; -}; - -/** - * binder_do_fd_close() - close list of file descriptors - * @twork: callback head for task work - * - * It is not safe to call ksys_close() during the binder_ioctl() - * function if there is a chance that binder's own file descriptor - * might be closed. This is to meet the requirements for using - * fdget() (see comments for __fget_light()). Therefore use - * task_work_add() to schedule the close operation once we have - * returned from binder_ioctl(). This function is a callback - * for that mechanism and does the actual ksys_close() on the - * given file descriptor. - */ -static void binder_do_fd_close(struct callback_head *twork) -{ - struct binder_task_work_cb *twcb = container_of(twork, - struct binder_task_work_cb, twork); - - fput(twcb->file); - kfree(twcb); -} - -/** - * binder_deferred_fd_close() - schedule a close for the given file-descriptor - * @fd: file-descriptor to close - * - * See comments in binder_do_fd_close(). This function is used to schedule - * a file-descriptor to be closed after returning from binder_ioctl(). 
- */ -static void binder_deferred_fd_close(int fd) -{ - struct binder_task_work_cb *twcb; - - twcb = kzalloc(sizeof(*twcb), GFP_KERNEL); - if (!twcb) - return; - init_task_work(&twcb->twork, binder_do_fd_close); - twcb->file = close_fd_get_file(fd); - if (twcb->file) { - // pin it until binder_do_fd_close(); see comments there - get_file(twcb->file); - filp_close(twcb->file, current->files); - task_work_add(current, &twcb->twork, TWA_RESUME); - } else { - kfree(twcb); - } -} - -static void binder_transaction_buffer_release(struct binder_proc *proc, - struct binder_thread *thread, - struct binder_buffer *buffer, - binder_size_t off_end_offset, - bool is_failure) -{ - int debug_id = buffer->debug_id; - binder_size_t off_start_offset, buffer_offset; - - binder_debug(BINDER_DEBUG_TRANSACTION, - "%d buffer release %d, size %zd-%zd, failed at %llx\n", - proc->pid, buffer->debug_id, - buffer->data_size, buffer->offsets_size, - (unsigned long long)off_end_offset); - - if (buffer->target_node) - binder_dec_node(buffer->target_node, 1, 0); - - off_start_offset = ALIGN(buffer->data_size, sizeof(void *)); - - for (buffer_offset = off_start_offset; buffer_offset < off_end_offset; - buffer_offset += sizeof(binder_size_t)) { - struct binder_object_header *hdr; - size_t object_size = 0; - struct binder_object object; - binder_size_t object_offset; - - if (!binder_alloc_copy_from_buffer(&proc->alloc, &object_offset, - buffer, buffer_offset, - sizeof(object_offset))) - object_size = binder_get_object(proc, NULL, buffer, - object_offset, &object); - if (object_size == 0) { - pr_err("transaction release %d bad object at offset %lld, size %zd\n", - debug_id, (u64)object_offset, buffer->data_size); - continue; - } - hdr = &object.hdr; - switch (hdr->type) { - case BINDER_TYPE_BINDER: - case BINDER_TYPE_WEAK_BINDER: { - struct flat_binder_object *fp; - struct binder_node *node; - - fp = to_flat_binder_object(hdr); - node = binder_get_node(proc, fp->binder); - if (node == NULL) { - pr_err("transaction release %d bad node %016llx\n", - debug_id, (u64)fp->binder); - break; - } - binder_debug(BINDER_DEBUG_TRANSACTION, - " node %d u%016llx\n", - node->debug_id, (u64)node->ptr); - binder_dec_node(node, hdr->type == BINDER_TYPE_BINDER, - 0); - binder_put_node(node); - } break; - case BINDER_TYPE_HANDLE: - case BINDER_TYPE_WEAK_HANDLE: { - struct flat_binder_object *fp; - struct binder_ref_data rdata; - int ret; - - fp = to_flat_binder_object(hdr); - ret = binder_dec_ref_for_handle(proc, fp->handle, - hdr->type == BINDER_TYPE_HANDLE, &rdata); - - if (ret) { - pr_err("transaction release %d bad handle %d, ret = %d\n", - debug_id, fp->handle, ret); - break; - } - binder_debug(BINDER_DEBUG_TRANSACTION, - " ref %d desc %d\n", - rdata.debug_id, rdata.desc); - } break; - - case BINDER_TYPE_FD: { - /* - * No need to close the file here since user-space - * closes it for successfully delivered - * transactions. For transactions that weren't - * delivered, the new fd was never allocated so - * there is no need to close and the fput on the - * file is done when the transaction is torn - * down. 
- */ - } break; - case BINDER_TYPE_PTR: - /* - * Nothing to do here, this will get cleaned up when the - * transaction buffer gets freed - */ - break; - case BINDER_TYPE_FDA: { - struct binder_fd_array_object *fda; - struct binder_buffer_object *parent; - struct binder_object ptr_object; - binder_size_t fda_offset; - size_t fd_index; - binder_size_t fd_buf_size; - binder_size_t num_valid; - - if (is_failure) { - /* - * The fd fixups have not been applied so no - * fds need to be closed. - */ - continue; - } - - num_valid = (buffer_offset - off_start_offset) / - sizeof(binder_size_t); - fda = to_binder_fd_array_object(hdr); - parent = binder_validate_ptr(proc, buffer, &ptr_object, - fda->parent, - off_start_offset, - NULL, - num_valid); - if (!parent) { - pr_err("transaction release %d bad parent offset\n", - debug_id); - continue; - } - fd_buf_size = sizeof(u32) * fda->num_fds; - if (fda->num_fds >= SIZE_MAX / sizeof(u32)) { - pr_err("transaction release %d invalid number of fds (%lld)\n", - debug_id, (u64)fda->num_fds); - continue; - } - if (fd_buf_size > parent->length || - fda->parent_offset > parent->length - fd_buf_size) { - /* No space for all file descriptors here. */ - pr_err("transaction release %d not enough space for %lld fds in buffer\n", - debug_id, (u64)fda->num_fds); - continue; - } - /* - * the source data for binder_buffer_object is visible - * to user-space and the @buffer element is the user - * pointer to the buffer_object containing the fd_array. - * Convert the address to an offset relative to - * the base of the transaction buffer. - */ - fda_offset = - (parent->buffer - (uintptr_t)buffer->user_data) + - fda->parent_offset; - for (fd_index = 0; fd_index < fda->num_fds; - fd_index++) { - u32 fd; - int err; - binder_size_t offset = fda_offset + - fd_index * sizeof(fd); - - err = binder_alloc_copy_from_buffer( - &proc->alloc, &fd, buffer, - offset, sizeof(fd)); - WARN_ON(err); - if (!err) { - binder_deferred_fd_close(fd); - /* - * Need to make sure the thread goes - * back to userspace to complete the - * deferred close - */ - if (thread) - thread->looper_need_return = true; - } - } - } break; - default: - pr_err("transaction release %d bad object type %x\n", - debug_id, hdr->type); - break; - } - } -} - -/* Clean up all the objects in the buffer */ -static inline void binder_release_entire_buffer(struct binder_proc *proc, - struct binder_thread *thread, - struct binder_buffer *buffer, - bool is_failure) -{ - binder_size_t off_end_offset; - - off_end_offset = ALIGN(buffer->data_size, sizeof(void *)); - off_end_offset += buffer->offsets_size; - - binder_transaction_buffer_release(proc, thread, buffer, - off_end_offset, is_failure); -} - -static int binder_translate_binder(struct flat_binder_object *fp, - struct binder_transaction *t, - struct binder_thread *thread) -{ - struct binder_node *node; - struct binder_proc *proc = thread->proc; - struct binder_proc *target_proc = t->to_proc; - struct binder_ref_data rdata; - int ret = 0; - - node = binder_get_node(proc, fp->binder); - if (!node) { - node = binder_new_node(proc, fp); - if (!node) - return -ENOMEM; - } - if (fp->cookie != node->cookie) { - binder_user_error("%d:%d sending u%016llx node %d, cookie mismatch %016llx != %016llx\n", - proc->pid, thread->pid, (u64)fp->binder, - node->debug_id, (u64)fp->cookie, - (u64)node->cookie); - ret = -EINVAL; - goto done; - } - if (security_binder_transfer_binder(proc->cred, target_proc->cred)) { - ret = -EPERM; - goto done; - } - - ret = binder_inc_ref_for_node(target_proc, 
node, - fp->hdr.type == BINDER_TYPE_BINDER, - &thread->todo, &rdata); - if (ret) - goto done; - - if (fp->hdr.type == BINDER_TYPE_BINDER) - fp->hdr.type = BINDER_TYPE_HANDLE; - else - fp->hdr.type = BINDER_TYPE_WEAK_HANDLE; - fp->binder = 0; - fp->handle = rdata.desc; - fp->cookie = 0; - - trace_binder_transaction_node_to_ref(t, node, &rdata); - binder_debug(BINDER_DEBUG_TRANSACTION, - " node %d u%016llx -> ref %d desc %d\n", - node->debug_id, (u64)node->ptr, - rdata.debug_id, rdata.desc); -done: - binder_put_node(node); - return ret; -} - -static int binder_translate_handle(struct flat_binder_object *fp, - struct binder_transaction *t, - struct binder_thread *thread) -{ - struct binder_proc *proc = thread->proc; - struct binder_proc *target_proc = t->to_proc; - struct binder_node *node; - struct binder_ref_data src_rdata; - int ret = 0; - - node = binder_get_node_from_ref(proc, fp->handle, - fp->hdr.type == BINDER_TYPE_HANDLE, &src_rdata); - if (!node) { - binder_user_error("%d:%d got transaction with invalid handle, %d\n", - proc->pid, thread->pid, fp->handle); - return -EINVAL; - } - if (security_binder_transfer_binder(proc->cred, target_proc->cred)) { - ret = -EPERM; - goto done; - } - - binder_node_lock(node); - if (node->proc == target_proc) { - if (fp->hdr.type == BINDER_TYPE_HANDLE) - fp->hdr.type = BINDER_TYPE_BINDER; - else - fp->hdr.type = BINDER_TYPE_WEAK_BINDER; - fp->binder = node->ptr; - fp->cookie = node->cookie; - if (node->proc) - binder_inner_proc_lock(node->proc); - else - __acquire(&node->proc->inner_lock); - binder_inc_node_nilocked(node, - fp->hdr.type == BINDER_TYPE_BINDER, - 0, NULL); - if (node->proc) - binder_inner_proc_unlock(node->proc); - else - __release(&node->proc->inner_lock); - trace_binder_transaction_ref_to_node(t, node, &src_rdata); - binder_debug(BINDER_DEBUG_TRANSACTION, - " ref %d desc %d -> node %d u%016llx\n", - src_rdata.debug_id, src_rdata.desc, node->debug_id, - (u64)node->ptr); - binder_node_unlock(node); - } else { - struct binder_ref_data dest_rdata; - - binder_node_unlock(node); - ret = binder_inc_ref_for_node(target_proc, node, - fp->hdr.type == BINDER_TYPE_HANDLE, - NULL, &dest_rdata); - if (ret) - goto done; - - fp->binder = 0; - fp->handle = dest_rdata.desc; - fp->cookie = 0; - trace_binder_transaction_ref_to_ref(t, node, &src_rdata, - &dest_rdata); - binder_debug(BINDER_DEBUG_TRANSACTION, - " ref %d desc %d -> ref %d desc %d (node %d)\n", - src_rdata.debug_id, src_rdata.desc, - dest_rdata.debug_id, dest_rdata.desc, - node->debug_id); - } -done: - binder_put_node(node); - return ret; -} - -static int binder_translate_fd(u32 fd, binder_size_t fd_offset, - struct binder_transaction *t, - struct binder_thread *thread, - struct binder_transaction *in_reply_to) -{ - struct binder_proc *proc = thread->proc; - struct binder_proc *target_proc = t->to_proc; - struct binder_txn_fd_fixup *fixup; - struct file *file; - int ret = 0; - bool target_allows_fd; - - if (in_reply_to) - target_allows_fd = !!(in_reply_to->flags & TF_ACCEPT_FDS); - else - target_allows_fd = t->buffer->target_node->accept_fds; - if (!target_allows_fd) { - binder_user_error("%d:%d got %s with fd, %d, but target does not allow fds\n", - proc->pid, thread->pid, - in_reply_to ? 
"reply" : "transaction", - fd); - ret = -EPERM; - goto err_fd_not_accepted; - } - - file = fget(fd); - if (!file) { - binder_user_error("%d:%d got transaction with invalid fd, %d\n", - proc->pid, thread->pid, fd); - ret = -EBADF; - goto err_fget; - } - ret = security_binder_transfer_file(proc->cred, target_proc->cred, file); - if (ret < 0) { - ret = -EPERM; - goto err_security; - } - - /* - * Add fixup record for this transaction. The allocation - * of the fd in the target needs to be done from a - * target thread. - */ - fixup = kzalloc(sizeof(*fixup), GFP_KERNEL); - if (!fixup) { - ret = -ENOMEM; - goto err_alloc; - } - fixup->file = file; - fixup->offset = fd_offset; - fixup->target_fd = -1; - trace_binder_transaction_fd_send(t, fd, fixup->offset); - list_add_tail(&fixup->fixup_entry, &t->fd_fixups); - - return ret; - -err_alloc: -err_security: - fput(file); -err_fget: -err_fd_not_accepted: - return ret; -} - -/** - * struct binder_ptr_fixup - data to be fixed-up in target buffer - * @offset offset in target buffer to fixup - * @skip_size bytes to skip in copy (fixup will be written later) - * @fixup_data data to write at fixup offset - * @node list node - * - * This is used for the pointer fixup list (pf) which is created and consumed - * during binder_transaction() and is only accessed locally. No - * locking is necessary. - * - * The list is ordered by @offset. - */ -struct binder_ptr_fixup { - binder_size_t offset; - size_t skip_size; - binder_uintptr_t fixup_data; - struct list_head node; -}; - -/** - * struct binder_sg_copy - scatter-gather data to be copied - * @offset offset in target buffer - * @sender_uaddr user address in source buffer - * @length bytes to copy - * @node list node - * - * This is used for the sg copy list (sgc) which is created and consumed - * during binder_transaction() and is only accessed locally. No - * locking is necessary. - * - * The list is ordered by @offset. - */ -struct binder_sg_copy { - binder_size_t offset; - const void __user *sender_uaddr; - size_t length; - struct list_head node; -}; - -/** - * binder_do_deferred_txn_copies() - copy and fixup scatter-gather data - * @alloc: binder_alloc associated with @buffer - * @buffer: binder buffer in target process - * @sgc_head: list_head of scatter-gather copy list - * @pf_head: list_head of pointer fixup list - * - * Processes all elements of @sgc_head, applying fixups from @pf_head - * and copying the scatter-gather data from the source process' user - * buffer to the target's buffer. It is expected that the list creation - * and processing all occurs during binder_transaction() so these lists - * are only accessed in local context. - * - * Return: 0=success, else -errno - */ -static int binder_do_deferred_txn_copies(struct binder_alloc *alloc, - struct binder_buffer *buffer, - struct list_head *sgc_head, - struct list_head *pf_head) -{ - int ret = 0; - struct binder_sg_copy *sgc, *tmpsgc; - struct binder_ptr_fixup *tmppf; - struct binder_ptr_fixup *pf = - list_first_entry_or_null(pf_head, struct binder_ptr_fixup, - node); - - list_for_each_entry_safe(sgc, tmpsgc, sgc_head, node) { - size_t bytes_copied = 0; - - while (bytes_copied < sgc->length) { - size_t copy_size; - size_t bytes_left = sgc->length - bytes_copied; - size_t offset = sgc->offset + bytes_copied; - - /* - * We copy up to the fixup (pointed to by pf) - */ - copy_size = pf ? 
min(bytes_left, (size_t)pf->offset - offset) - : bytes_left; - if (!ret && copy_size) - ret = binder_alloc_copy_user_to_buffer( - alloc, buffer, - offset, - sgc->sender_uaddr + bytes_copied, - copy_size); - bytes_copied += copy_size; - if (copy_size != bytes_left) { - BUG_ON(!pf); - /* we stopped at a fixup offset */ - if (pf->skip_size) { - /* - * we are just skipping. This is for - * BINDER_TYPE_FDA where the translated - * fds will be fixed up when we get - * to target context. - */ - bytes_copied += pf->skip_size; - } else { - /* apply the fixup indicated by pf */ - if (!ret) - ret = binder_alloc_copy_to_buffer( - alloc, buffer, - pf->offset, - &pf->fixup_data, - sizeof(pf->fixup_data)); - bytes_copied += sizeof(pf->fixup_data); - } - list_del(&pf->node); - kfree(pf); - pf = list_first_entry_or_null(pf_head, - struct binder_ptr_fixup, node); - } - } - list_del(&sgc->node); - kfree(sgc); - } - list_for_each_entry_safe(pf, tmppf, pf_head, node) { - BUG_ON(pf->skip_size == 0); - list_del(&pf->node); - kfree(pf); - } - BUG_ON(!list_empty(sgc_head)); - - return ret > 0 ? -EINVAL : ret; -} - -/** - * binder_cleanup_deferred_txn_lists() - free specified lists - * @sgc_head: list_head of scatter-gather copy list - * @pf_head: list_head of pointer fixup list - * - * Called to clean up @sgc_head and @pf_head if there is an - * error. - */ -static void binder_cleanup_deferred_txn_lists(struct list_head *sgc_head, - struct list_head *pf_head) -{ - struct binder_sg_copy *sgc, *tmpsgc; - struct binder_ptr_fixup *pf, *tmppf; - - list_for_each_entry_safe(sgc, tmpsgc, sgc_head, node) { - list_del(&sgc->node); - kfree(sgc); - } - list_for_each_entry_safe(pf, tmppf, pf_head, node) { - list_del(&pf->node); - kfree(pf); - } -} - -/** - * binder_defer_copy() - queue a scatter-gather buffer for copy - * @sgc_head: list_head of scatter-gather copy list - * @offset: binder buffer offset in target process - * @sender_uaddr: user address in source process - * @length: bytes to copy - * - * Specify a scatter-gather block to be copied. The actual copy must - * be deferred until all the needed fixups are identified and queued. - * Then the copy and fixups are done together so un-translated values - * from the source are never visible in the target buffer. - * - * We are guaranteed that repeated calls to this function will have - * monotonically increasing @offset values so the list will naturally - * be ordered. - * - * Return: 0=success, else -errno - */ -static int binder_defer_copy(struct list_head *sgc_head, binder_size_t offset, - const void __user *sender_uaddr, size_t length) -{ - struct binder_sg_copy *bc = kzalloc(sizeof(*bc), GFP_KERNEL); - - if (!bc) - return -ENOMEM; - - bc->offset = offset; - bc->sender_uaddr = sender_uaddr; - bc->length = length; - INIT_LIST_HEAD(&bc->node); - - /* - * We are guaranteed that the deferred copies are in-order - * so just add to the tail. - */ - list_add_tail(&bc->node, sgc_head); - - return 0; -} - -/** - * binder_add_fixup() - queue a fixup to be applied to sg copy - * @pf_head: list_head of binder ptr fixup list - * @offset: binder buffer offset in target process - * @fixup: bytes to be copied for fixup - * @skip_size: bytes to skip when copying (fixup will be applied later) - * - * Add the specified fixup to a list ordered by @offset. When copying - * the scatter-gather buffers, the fixup will be copied instead of - * data from the source buffer. 
For BINDER_TYPE_FDA fixups, the fixup - * will be applied later (in target process context), so we just skip - * the bytes specified by @skip_size. If @skip_size is 0, we copy the - * value in @fixup. - * - * This function is called *mostly* in @offset order, but there are - * exceptions. Since out-of-order inserts are relatively uncommon, - * we insert the new element by searching backward from the tail of - * the list. - * - * Return: 0=success, else -errno - */ -static int binder_add_fixup(struct list_head *pf_head, binder_size_t offset, - binder_uintptr_t fixup, size_t skip_size) -{ - struct binder_ptr_fixup *pf = kzalloc(sizeof(*pf), GFP_KERNEL); - struct binder_ptr_fixup *tmppf; - - if (!pf) - return -ENOMEM; - - pf->offset = offset; - pf->fixup_data = fixup; - pf->skip_size = skip_size; - INIT_LIST_HEAD(&pf->node); - - /* Fixups are *mostly* added in-order, but there are some - * exceptions. Look backwards through list for insertion point. - */ - list_for_each_entry_reverse(tmppf, pf_head, node) { - if (tmppf->offset < pf->offset) { - list_add(&pf->node, &tmppf->node); - return 0; - } - } - /* - * if we get here, then the new offset is the lowest so - * insert at the head - */ - list_add(&pf->node, pf_head); - return 0; -} - -static int binder_translate_fd_array(struct list_head *pf_head, - struct binder_fd_array_object *fda, - const void __user *sender_ubuffer, - struct binder_buffer_object *parent, - struct binder_buffer_object *sender_uparent, - struct binder_transaction *t, - struct binder_thread *thread, - struct binder_transaction *in_reply_to) -{ - binder_size_t fdi, fd_buf_size; - binder_size_t fda_offset; - const void __user *sender_ufda_base; - struct binder_proc *proc = thread->proc; - int ret; - - if (fda->num_fds == 0) - return 0; - - fd_buf_size = sizeof(u32) * fda->num_fds; - if (fda->num_fds >= SIZE_MAX / sizeof(u32)) { - binder_user_error("%d:%d got transaction with invalid number of fds (%lld)\n", - proc->pid, thread->pid, (u64)fda->num_fds); - return -EINVAL; - } - if (fd_buf_size > parent->length || - fda->parent_offset > parent->length - fd_buf_size) { - /* No space for all file descriptors here. */ - binder_user_error("%d:%d not enough space to store %lld fds in buffer\n", - proc->pid, thread->pid, (u64)fda->num_fds); - return -EINVAL; - } - /* - * the source data for binder_buffer_object is visible - * to user-space and the @buffer element is the user - * pointer to the buffer_object containing the fd_array. - * Convert the address to an offset relative to - * the base of the transaction buffer. - */ - fda_offset = (parent->buffer - (uintptr_t)t->buffer->user_data) + - fda->parent_offset; - sender_ufda_base = (void __user *)(uintptr_t)sender_uparent->buffer + - fda->parent_offset; - - if (!IS_ALIGNED((unsigned long)fda_offset, sizeof(u32)) || - !IS_ALIGNED((unsigned long)sender_ufda_base, sizeof(u32))) { - binder_user_error("%d:%d parent offset not aligned correctly.\n", - proc->pid, thread->pid); - return -EINVAL; - } - ret = binder_add_fixup(pf_head, fda_offset, 0, fda->num_fds * sizeof(u32)); - if (ret) - return ret; - - for (fdi = 0; fdi < fda->num_fds; fdi++) { - u32 fd; - binder_size_t offset = fda_offset + fdi * sizeof(fd); - binder_size_t sender_uoffset = fdi * sizeof(fd); - - ret = copy_from_user(&fd, sender_ufda_base + sender_uoffset, sizeof(fd)); - if (!ret) - ret = binder_translate_fd(fd, offset, t, thread, - in_reply_to); - if (ret) - return ret > 0 ? 
-EINVAL : ret; - } - return 0; -} - -static int binder_fixup_parent(struct list_head *pf_head, - struct binder_transaction *t, - struct binder_thread *thread, - struct binder_buffer_object *bp, - binder_size_t off_start_offset, - binder_size_t num_valid, - binder_size_t last_fixup_obj_off, - binder_size_t last_fixup_min_off) -{ - struct binder_buffer_object *parent; - struct binder_buffer *b = t->buffer; - struct binder_proc *proc = thread->proc; - struct binder_proc *target_proc = t->to_proc; - struct binder_object object; - binder_size_t buffer_offset; - binder_size_t parent_offset; - - if (!(bp->flags & BINDER_BUFFER_FLAG_HAS_PARENT)) - return 0; - - parent = binder_validate_ptr(target_proc, b, &object, bp->parent, - off_start_offset, &parent_offset, - num_valid); - if (!parent) { - binder_user_error("%d:%d got transaction with invalid parent offset or type\n", - proc->pid, thread->pid); - return -EINVAL; - } - - if (!binder_validate_fixup(target_proc, b, off_start_offset, - parent_offset, bp->parent_offset, - last_fixup_obj_off, - last_fixup_min_off)) { - binder_user_error("%d:%d got transaction with out-of-order buffer fixup\n", - proc->pid, thread->pid); - return -EINVAL; - } - - if (parent->length < sizeof(binder_uintptr_t) || - bp->parent_offset > parent->length - sizeof(binder_uintptr_t)) { - /* No space for a pointer here! */ - binder_user_error("%d:%d got transaction with invalid parent offset\n", - proc->pid, thread->pid); - return -EINVAL; - } - buffer_offset = bp->parent_offset + - (uintptr_t)parent->buffer - (uintptr_t)b->user_data; - return binder_add_fixup(pf_head, buffer_offset, bp->buffer, 0); -} - -/** - * binder_can_update_transaction() - Can a txn be superseded by an updated one? - * @t1: the pending async txn in the frozen process - * @t2: the new async txn to supersede the outdated pending one - * - * Return: true if t2 can supersede t1 - * false if t2 can not supersede t1 - */ -static bool binder_can_update_transaction(struct binder_transaction *t1, - struct binder_transaction *t2) -{ - if ((t1->flags & t2->flags & (TF_ONE_WAY | TF_UPDATE_TXN)) != - (TF_ONE_WAY | TF_UPDATE_TXN) || !t1->to_proc || !t2->to_proc) - return false; - if (t1->to_proc->tsk == t2->to_proc->tsk && t1->code == t2->code && - t1->flags == t2->flags && t1->buffer->pid == t2->buffer->pid && - t1->buffer->target_node->ptr == t2->buffer->target_node->ptr && - t1->buffer->target_node->cookie == t2->buffer->target_node->cookie) - return true; - return false; -} - -/** - * binder_find_outdated_transaction_ilocked() - Find the outdated transaction - * @t: new async transaction - * @target_list: list to find outdated transaction - * - * Return: the outdated transaction if found - * NULL if no outdated transacton can be found - * - * Requires the proc->inner_lock to be held. 
- */ -static struct binder_transaction * -binder_find_outdated_transaction_ilocked(struct binder_transaction *t, - struct list_head *target_list) -{ - struct binder_work *w; - - list_for_each_entry(w, target_list, entry) { - struct binder_transaction *t_queued; - - if (w->type != BINDER_WORK_TRANSACTION) - continue; - t_queued = container_of(w, struct binder_transaction, work); - if (binder_can_update_transaction(t_queued, t)) - return t_queued; - } - return NULL; -} - -/** - * binder_proc_transaction() - sends a transaction to a process and wakes it up - * @t: transaction to send - * @proc: process to send the transaction to - * @thread: thread in @proc to send the transaction to (may be NULL) - * - * This function queues a transaction to the specified process. It will try - * to find a thread in the target process to handle the transaction and - * wake it up. If no thread is found, the work is queued to the proc - * waitqueue. - * - * If the @thread parameter is not NULL, the transaction is always queued - * to the waitlist of that specific thread. - * - * Return: 0 if the transaction was successfully queued - * BR_DEAD_REPLY if the target process or thread is dead - * BR_FROZEN_REPLY if the target process or thread is frozen and - * the sync transaction was rejected - * BR_TRANSACTION_PENDING_FROZEN if the target process is frozen - * and the async transaction was successfully queued - */ -static int binder_proc_transaction(struct binder_transaction *t, - struct binder_proc *proc, - struct binder_thread *thread) -{ - struct binder_node *node = t->buffer->target_node; - bool oneway = !!(t->flags & TF_ONE_WAY); - bool pending_async = false; - struct binder_transaction *t_outdated = NULL; - bool frozen = false; - - BUG_ON(!node); - binder_node_lock(node); - if (oneway) { - BUG_ON(thread); - if (node->has_async_transaction) - pending_async = true; - else - node->has_async_transaction = true; - } - - binder_inner_proc_lock(proc); - if (proc->is_frozen) { - frozen = true; - proc->sync_recv |= !oneway; - proc->async_recv |= oneway; - } - - if ((frozen && !oneway) || proc->is_dead || - (thread && thread->is_dead)) { - binder_inner_proc_unlock(proc); - binder_node_unlock(node); - return frozen ? BR_FROZEN_REPLY : BR_DEAD_REPLY; - } - - if (!thread && !pending_async) - thread = binder_select_thread_ilocked(proc); - - if (thread) { - binder_enqueue_thread_work_ilocked(thread, &t->work); - } else if (!pending_async) { - binder_enqueue_work_ilocked(&t->work, &proc->todo); - } else { - if ((t->flags & TF_UPDATE_TXN) && frozen) { - t_outdated = binder_find_outdated_transaction_ilocked(t, - &node->async_todo); - if (t_outdated) { - binder_debug(BINDER_DEBUG_TRANSACTION, - "txn %d supersedes %d\n", - t->debug_id, t_outdated->debug_id); - list_del_init(&t_outdated->work.entry); - proc->outstanding_txns--; - } - } - binder_enqueue_work_ilocked(&t->work, &node->async_todo); - } - - if (!pending_async) - binder_wakeup_thread_ilocked(proc, thread, !oneway /* sync */); - - proc->outstanding_txns++; - binder_inner_proc_unlock(proc); - binder_node_unlock(node); - - /* - * To reduce potential contention, free the outdated transaction and - * buffer after releasing the locks. 
- */ - if (t_outdated) { - struct binder_buffer *buffer = t_outdated->buffer; - - t_outdated->buffer = NULL; - buffer->transaction = NULL; - trace_binder_transaction_update_buffer_release(buffer); - binder_release_entire_buffer(proc, NULL, buffer, false); - binder_alloc_free_buf(&proc->alloc, buffer); - kfree(t_outdated); - binder_stats_deleted(BINDER_STAT_TRANSACTION); - } - - if (oneway && frozen) - return BR_TRANSACTION_PENDING_FROZEN; - - return 0; -} - -/** - * binder_get_node_refs_for_txn() - Get required refs on node for txn - * @node: struct binder_node for which to get refs - * @procp: returns @node->proc if valid - * @error: if no @procp then returns BR_DEAD_REPLY - * - * User-space normally keeps the node alive when creating a transaction - * since it has a reference to the target. The local strong ref keeps it - * alive if the sending process dies before the target process processes - * the transaction. If the source process is malicious or has a reference - * counting bug, relying on the local strong ref can fail. - * - * Since user-space can cause the local strong ref to go away, we also take - * a tmpref on the node to ensure it survives while we are constructing - * the transaction. We also need a tmpref on the proc while we are - * constructing the transaction, so we take that here as well. - * - * Return: The target_node with refs taken or NULL if no @node->proc is NULL. - * Also sets @procp if valid. If the @node->proc is NULL indicating that the - * target proc has died, @error is set to BR_DEAD_REPLY. - */ -static struct binder_node *binder_get_node_refs_for_txn( - struct binder_node *node, - struct binder_proc **procp, - uint32_t *error) -{ - struct binder_node *target_node = NULL; - - binder_node_inner_lock(node); - if (node->proc) { - target_node = node; - binder_inc_node_nilocked(node, 1, 0, NULL); - binder_inc_node_tmpref_ilocked(node); - node->proc->tmp_ref++; - *procp = node->proc; - } else - *error = BR_DEAD_REPLY; - binder_node_inner_unlock(node); - - return target_node; -} - -static void binder_set_txn_from_error(struct binder_transaction *t, int id, - uint32_t command, int32_t param) -{ - struct binder_thread *from = binder_get_txn_from_and_acq_inner(t); - - if (!from) { - /* annotation for sparse */ - __release(&from->proc->inner_lock); - return; - } - - /* don't override existing errors */ - if (from->ee.command == BR_OK) - binder_set_extended_error(&from->ee, id, command, param); - binder_inner_proc_unlock(from->proc); - binder_thread_dec_tmpref(from); -} - -static void binder_transaction(struct binder_proc *proc, - struct binder_thread *thread, - struct binder_transaction_data *tr, int reply, - binder_size_t extra_buffers_size) -{ - int ret; - struct binder_transaction *t; - struct binder_work *w; - struct binder_work *tcomplete; - binder_size_t buffer_offset = 0; - binder_size_t off_start_offset, off_end_offset; - binder_size_t off_min; - binder_size_t sg_buf_offset, sg_buf_end_offset; - binder_size_t user_offset = 0; - struct binder_proc *target_proc = NULL; - struct binder_thread *target_thread = NULL; - struct binder_node *target_node = NULL; - struct binder_transaction *in_reply_to = NULL; - struct binder_transaction_log_entry *e; - uint32_t return_error = 0; - uint32_t return_error_param = 0; - uint32_t return_error_line = 0; - binder_size_t last_fixup_obj_off = 0; - binder_size_t last_fixup_min_off = 0; - struct binder_context *context = proc->context; - int t_debug_id = atomic_inc_return(&binder_last_id); - ktime_t t_start_time = ktime_get(); - 
char *secctx = NULL; - u32 secctx_sz = 0; - struct list_head sgc_head; - struct list_head pf_head; - const void __user *user_buffer = (const void __user *) - (uintptr_t)tr->data.ptr.buffer; - INIT_LIST_HEAD(&sgc_head); - INIT_LIST_HEAD(&pf_head); - - e = binder_transaction_log_add(&binder_transaction_log); - e->debug_id = t_debug_id; - e->call_type = reply ? 2 : !!(tr->flags & TF_ONE_WAY); - e->from_proc = proc->pid; - e->from_thread = thread->pid; - e->target_handle = tr->target.handle; - e->data_size = tr->data_size; - e->offsets_size = tr->offsets_size; - strscpy(e->context_name, proc->context->name, BINDERFS_MAX_NAME); - - binder_inner_proc_lock(proc); - binder_set_extended_error(&thread->ee, t_debug_id, BR_OK, 0); - binder_inner_proc_unlock(proc); - - if (reply) { - binder_inner_proc_lock(proc); - in_reply_to = thread->transaction_stack; - if (in_reply_to == NULL) { - binder_inner_proc_unlock(proc); - binder_user_error("%d:%d got reply transaction with no transaction stack\n", - proc->pid, thread->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EPROTO; - return_error_line = __LINE__; - goto err_empty_call_stack; - } - if (in_reply_to->to_thread != thread) { - spin_lock(&in_reply_to->lock); - binder_user_error("%d:%d got reply transaction with bad transaction stack, transaction %d has target %d:%d\n", - proc->pid, thread->pid, in_reply_to->debug_id, - in_reply_to->to_proc ? - in_reply_to->to_proc->pid : 0, - in_reply_to->to_thread ? - in_reply_to->to_thread->pid : 0); - spin_unlock(&in_reply_to->lock); - binder_inner_proc_unlock(proc); - return_error = BR_FAILED_REPLY; - return_error_param = -EPROTO; - return_error_line = __LINE__; - in_reply_to = NULL; - goto err_bad_call_stack; - } - thread->transaction_stack = in_reply_to->to_parent; - binder_inner_proc_unlock(proc); - binder_set_nice(in_reply_to->saved_priority); - target_thread = binder_get_txn_from_and_acq_inner(in_reply_to); - if (target_thread == NULL) { - /* annotation for sparse */ - __release(&target_thread->proc->inner_lock); - binder_txn_error("%d:%d reply target not found\n", - thread->pid, proc->pid); - return_error = BR_DEAD_REPLY; - return_error_line = __LINE__; - goto err_dead_binder; - } - if (target_thread->transaction_stack != in_reply_to) { - binder_user_error("%d:%d got reply transaction with bad target transaction stack %d, expected %d\n", - proc->pid, thread->pid, - target_thread->transaction_stack ? - target_thread->transaction_stack->debug_id : 0, - in_reply_to->debug_id); - binder_inner_proc_unlock(target_thread->proc); - return_error = BR_FAILED_REPLY; - return_error_param = -EPROTO; - return_error_line = __LINE__; - in_reply_to = NULL; - target_thread = NULL; - goto err_dead_binder; - } - target_proc = target_thread->proc; - target_proc->tmp_ref++; - binder_inner_proc_unlock(target_thread->proc); - } else { - if (tr->target.handle) { - struct binder_ref *ref; - - /* - * There must already be a strong ref - * on this node. If so, do a strong - * increment on the node to ensure it - * stays alive until the transaction is - * done. 
- */ - binder_proc_lock(proc); - ref = binder_get_ref_olocked(proc, tr->target.handle, - true); - if (ref) { - target_node = binder_get_node_refs_for_txn( - ref->node, &target_proc, - &return_error); - } else { - binder_user_error("%d:%d got transaction to invalid handle, %u\n", - proc->pid, thread->pid, tr->target.handle); - return_error = BR_FAILED_REPLY; - } - binder_proc_unlock(proc); - } else { - mutex_lock(&context->context_mgr_node_lock); - target_node = context->binder_context_mgr_node; - if (target_node) - target_node = binder_get_node_refs_for_txn( - target_node, &target_proc, - &return_error); - else - return_error = BR_DEAD_REPLY; - mutex_unlock(&context->context_mgr_node_lock); - if (target_node && target_proc->pid == proc->pid) { - binder_user_error("%d:%d got transaction to context manager from process owning it\n", - proc->pid, thread->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_invalid_target_handle; - } - } - if (!target_node) { - binder_txn_error("%d:%d cannot find target node\n", - thread->pid, proc->pid); - /* - * return_error is set above - */ - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_dead_binder; - } - e->to_node = target_node->debug_id; - if (WARN_ON(proc == target_proc)) { - binder_txn_error("%d:%d self transactions not allowed\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_invalid_target_handle; - } - if (security_binder_transaction(proc->cred, - target_proc->cred) < 0) { - binder_txn_error("%d:%d transaction credentials failed\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EPERM; - return_error_line = __LINE__; - goto err_invalid_target_handle; - } - binder_inner_proc_lock(proc); - - w = list_first_entry_or_null(&thread->todo, - struct binder_work, entry); - if (!(tr->flags & TF_ONE_WAY) && w && - w->type == BINDER_WORK_TRANSACTION) { - /* - * Do not allow new outgoing transaction from a - * thread that has a transaction at the head of - * its todo list. Only need to check the head - * because binder_select_thread_ilocked picks a - * thread from proc->waiting_threads to enqueue - * the transaction, and nothing is queued to the - * todo list while the thread is on waiting_threads. - */ - binder_user_error("%d:%d new transaction not allowed when there is a transaction on thread todo\n", - proc->pid, thread->pid); - binder_inner_proc_unlock(proc); - return_error = BR_FAILED_REPLY; - return_error_param = -EPROTO; - return_error_line = __LINE__; - goto err_bad_todo_list; - } - - if (!(tr->flags & TF_ONE_WAY) && thread->transaction_stack) { - struct binder_transaction *tmp; - - tmp = thread->transaction_stack; - if (tmp->to_thread != thread) { - spin_lock(&tmp->lock); - binder_user_error("%d:%d got new transaction with bad transaction stack, transaction %d has target %d:%d\n", - proc->pid, thread->pid, tmp->debug_id, - tmp->to_proc ? tmp->to_proc->pid : 0, - tmp->to_thread ? 
- tmp->to_thread->pid : 0); - spin_unlock(&tmp->lock); - binder_inner_proc_unlock(proc); - return_error = BR_FAILED_REPLY; - return_error_param = -EPROTO; - return_error_line = __LINE__; - goto err_bad_call_stack; - } - while (tmp) { - struct binder_thread *from; - - spin_lock(&tmp->lock); - from = tmp->from; - if (from && from->proc == target_proc) { - atomic_inc(&from->tmp_ref); - target_thread = from; - spin_unlock(&tmp->lock); - break; - } - spin_unlock(&tmp->lock); - tmp = tmp->from_parent; - } - } - binder_inner_proc_unlock(proc); - } - if (target_thread) - e->to_thread = target_thread->pid; - e->to_proc = target_proc->pid; - - /* TODO: reuse incoming transaction for reply */ - t = kzalloc(sizeof(*t), GFP_KERNEL); - if (t == NULL) { - binder_txn_error("%d:%d cannot allocate transaction\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -ENOMEM; - return_error_line = __LINE__; - goto err_alloc_t_failed; - } - INIT_LIST_HEAD(&t->fd_fixups); - binder_stats_created(BINDER_STAT_TRANSACTION); - spin_lock_init(&t->lock); - - tcomplete = kzalloc(sizeof(*tcomplete), GFP_KERNEL); - if (tcomplete == NULL) { - binder_txn_error("%d:%d cannot allocate work for transaction\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -ENOMEM; - return_error_line = __LINE__; - goto err_alloc_tcomplete_failed; - } - binder_stats_created(BINDER_STAT_TRANSACTION_COMPLETE); - - t->debug_id = t_debug_id; - t->start_time = t_start_time; - - if (reply) - binder_debug(BINDER_DEBUG_TRANSACTION, - "%d:%d BC_REPLY %d -> %d:%d, data %016llx-%016llx size %lld-%lld-%lld\n", - proc->pid, thread->pid, t->debug_id, - target_proc->pid, target_thread->pid, - (u64)tr->data.ptr.buffer, - (u64)tr->data.ptr.offsets, - (u64)tr->data_size, (u64)tr->offsets_size, - (u64)extra_buffers_size); - else - binder_debug(BINDER_DEBUG_TRANSACTION, - "%d:%d BC_TRANSACTION %d -> %d - node %d, data %016llx-%016llx size %lld-%lld-%lld\n", - proc->pid, thread->pid, t->debug_id, - target_proc->pid, target_node->debug_id, - (u64)tr->data.ptr.buffer, - (u64)tr->data.ptr.offsets, - (u64)tr->data_size, (u64)tr->offsets_size, - (u64)extra_buffers_size); - - if (!reply && !(tr->flags & TF_ONE_WAY)) - t->from = thread; - else - t->from = NULL; - t->from_pid = proc->pid; - t->from_tid = thread->pid; - t->sender_euid = task_euid(proc->tsk); - t->to_proc = target_proc; - t->to_thread = target_thread; - t->code = tr->code; - t->flags = tr->flags; - t->priority = task_nice(current); - - if (target_node && target_node->txn_security_ctx) { - u32 secid; - size_t added_size; - - security_cred_getsecid(proc->cred, &secid); - ret = security_secid_to_secctx(secid, &secctx, &secctx_sz); - if (ret) { - binder_txn_error("%d:%d failed to get security context\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = ret; - return_error_line = __LINE__; - goto err_get_secctx_failed; - } - added_size = ALIGN(secctx_sz, sizeof(u64)); - extra_buffers_size += added_size; - if (extra_buffers_size < added_size) { - binder_txn_error("%d:%d integer overflow of extra_buffers_size\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_bad_extra_size; - } - } - - trace_binder_transaction(reply, t, target_node); - - t->buffer = binder_alloc_new_buf(&target_proc->alloc, tr->data_size, - tr->offsets_size, extra_buffers_size, - !reply && (t->flags & TF_ONE_WAY), current->tgid); - if (IS_ERR(t->buffer)) { 
- char *s; - - ret = PTR_ERR(t->buffer); - s = (ret == -ESRCH) ? ": vma cleared, target dead or dying" - : (ret == -ENOSPC) ? ": no space left" - : (ret == -ENOMEM) ? ": memory allocation failed" - : ""; - binder_txn_error("cannot allocate buffer%s", s); - - return_error_param = PTR_ERR(t->buffer); - return_error = return_error_param == -ESRCH ? - BR_DEAD_REPLY : BR_FAILED_REPLY; - return_error_line = __LINE__; - t->buffer = NULL; - goto err_binder_alloc_buf_failed; - } - if (secctx) { - int err; - size_t buf_offset = ALIGN(tr->data_size, sizeof(void *)) + - ALIGN(tr->offsets_size, sizeof(void *)) + - ALIGN(extra_buffers_size, sizeof(void *)) - - ALIGN(secctx_sz, sizeof(u64)); - - t->security_ctx = (uintptr_t)t->buffer->user_data + buf_offset; - err = binder_alloc_copy_to_buffer(&target_proc->alloc, - t->buffer, buf_offset, - secctx, secctx_sz); - if (err) { - t->security_ctx = 0; - WARN_ON(1); - } - security_release_secctx(secctx, secctx_sz); - secctx = NULL; - } - t->buffer->debug_id = t->debug_id; - t->buffer->transaction = t; - t->buffer->target_node = target_node; - t->buffer->clear_on_free = !!(t->flags & TF_CLEAR_BUF); - trace_binder_transaction_alloc_buf(t->buffer); - - if (binder_alloc_copy_user_to_buffer( - &target_proc->alloc, - t->buffer, - ALIGN(tr->data_size, sizeof(void *)), - (const void __user *) - (uintptr_t)tr->data.ptr.offsets, - tr->offsets_size)) { - binder_user_error("%d:%d got transaction with invalid offsets ptr\n", - proc->pid, thread->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EFAULT; - return_error_line = __LINE__; - goto err_copy_data_failed; - } - if (!IS_ALIGNED(tr->offsets_size, sizeof(binder_size_t))) { - binder_user_error("%d:%d got transaction with invalid offsets size, %lld\n", - proc->pid, thread->pid, (u64)tr->offsets_size); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_bad_offset; - } - if (!IS_ALIGNED(extra_buffers_size, sizeof(u64))) { - binder_user_error("%d:%d got transaction with unaligned buffers size, %lld\n", - proc->pid, thread->pid, - (u64)extra_buffers_size); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_bad_offset; - } - off_start_offset = ALIGN(tr->data_size, sizeof(void *)); - buffer_offset = off_start_offset; - off_end_offset = off_start_offset + tr->offsets_size; - sg_buf_offset = ALIGN(off_end_offset, sizeof(void *)); - sg_buf_end_offset = sg_buf_offset + extra_buffers_size - - ALIGN(secctx_sz, sizeof(u64)); - off_min = 0; - for (buffer_offset = off_start_offset; buffer_offset < off_end_offset; - buffer_offset += sizeof(binder_size_t)) { - struct binder_object_header *hdr; - size_t object_size; - struct binder_object object; - binder_size_t object_offset; - binder_size_t copy_size; - - if (binder_alloc_copy_from_buffer(&target_proc->alloc, - &object_offset, - t->buffer, - buffer_offset, - sizeof(object_offset))) { - binder_txn_error("%d:%d copy offset from buffer failed\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_bad_offset; - } - - /* - * Copy the source user buffer up to the next object - * that will be processed. 
- */ - copy_size = object_offset - user_offset; - if (copy_size && (user_offset > object_offset || - binder_alloc_copy_user_to_buffer( - &target_proc->alloc, - t->buffer, user_offset, - user_buffer + user_offset, - copy_size))) { - binder_user_error("%d:%d got transaction with invalid data ptr\n", - proc->pid, thread->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EFAULT; - return_error_line = __LINE__; - goto err_copy_data_failed; - } - object_size = binder_get_object(target_proc, user_buffer, - t->buffer, object_offset, &object); - if (object_size == 0 || object_offset < off_min) { - binder_user_error("%d:%d got transaction with invalid offset (%lld, min %lld max %lld) or object.\n", - proc->pid, thread->pid, - (u64)object_offset, - (u64)off_min, - (u64)t->buffer->data_size); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_bad_offset; - } - /* - * Set offset to the next buffer fragment to be - * copied - */ - user_offset = object_offset + object_size; - - hdr = &object.hdr; - off_min = object_offset + object_size; - switch (hdr->type) { - case BINDER_TYPE_BINDER: - case BINDER_TYPE_WEAK_BINDER: { - struct flat_binder_object *fp; - - fp = to_flat_binder_object(hdr); - ret = binder_translate_binder(fp, t, thread); - - if (ret < 0 || - binder_alloc_copy_to_buffer(&target_proc->alloc, - t->buffer, - object_offset, - fp, sizeof(*fp))) { - binder_txn_error("%d:%d translate binder failed\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = ret; - return_error_line = __LINE__; - goto err_translate_failed; - } - } break; - case BINDER_TYPE_HANDLE: - case BINDER_TYPE_WEAK_HANDLE: { - struct flat_binder_object *fp; - - fp = to_flat_binder_object(hdr); - ret = binder_translate_handle(fp, t, thread); - if (ret < 0 || - binder_alloc_copy_to_buffer(&target_proc->alloc, - t->buffer, - object_offset, - fp, sizeof(*fp))) { - binder_txn_error("%d:%d translate handle failed\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = ret; - return_error_line = __LINE__; - goto err_translate_failed; - } - } break; - - case BINDER_TYPE_FD: { - struct binder_fd_object *fp = to_binder_fd_object(hdr); - binder_size_t fd_offset = object_offset + - (uintptr_t)&fp->fd - (uintptr_t)fp; - int ret = binder_translate_fd(fp->fd, fd_offset, t, - thread, in_reply_to); - - fp->pad_binder = 0; - if (ret < 0 || - binder_alloc_copy_to_buffer(&target_proc->alloc, - t->buffer, - object_offset, - fp, sizeof(*fp))) { - binder_txn_error("%d:%d translate fd failed\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = ret; - return_error_line = __LINE__; - goto err_translate_failed; - } - } break; - case BINDER_TYPE_FDA: { - struct binder_object ptr_object; - binder_size_t parent_offset; - struct binder_object user_object; - size_t user_parent_size; - struct binder_fd_array_object *fda = - to_binder_fd_array_object(hdr); - size_t num_valid = (buffer_offset - off_start_offset) / - sizeof(binder_size_t); - struct binder_buffer_object *parent = - binder_validate_ptr(target_proc, t->buffer, - &ptr_object, fda->parent, - off_start_offset, - &parent_offset, - num_valid); - if (!parent) { - binder_user_error("%d:%d got transaction with invalid parent offset or type\n", - proc->pid, thread->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_bad_parent; - } - if (!binder_validate_fixup(target_proc, 
t->buffer, - off_start_offset, - parent_offset, - fda->parent_offset, - last_fixup_obj_off, - last_fixup_min_off)) { - binder_user_error("%d:%d got transaction with out-of-order buffer fixup\n", - proc->pid, thread->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_bad_parent; - } - /* - * We need to read the user version of the parent - * object to get the original user offset - */ - user_parent_size = - binder_get_object(proc, user_buffer, t->buffer, - parent_offset, &user_object); - if (user_parent_size != sizeof(user_object.bbo)) { - binder_user_error("%d:%d invalid ptr object size: %zd vs %zd\n", - proc->pid, thread->pid, - user_parent_size, - sizeof(user_object.bbo)); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_bad_parent; - } - ret = binder_translate_fd_array(&pf_head, fda, - user_buffer, parent, - &user_object.bbo, t, - thread, in_reply_to); - if (!ret) - ret = binder_alloc_copy_to_buffer(&target_proc->alloc, - t->buffer, - object_offset, - fda, sizeof(*fda)); - if (ret) { - binder_txn_error("%d:%d translate fd array failed\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = ret > 0 ? -EINVAL : ret; - return_error_line = __LINE__; - goto err_translate_failed; - } - last_fixup_obj_off = parent_offset; - last_fixup_min_off = - fda->parent_offset + sizeof(u32) * fda->num_fds; - } break; - case BINDER_TYPE_PTR: { - struct binder_buffer_object *bp = - to_binder_buffer_object(hdr); - size_t buf_left = sg_buf_end_offset - sg_buf_offset; - size_t num_valid; - - if (bp->length > buf_left) { - binder_user_error("%d:%d got transaction with too large buffer\n", - proc->pid, thread->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_bad_offset; - } - ret = binder_defer_copy(&sgc_head, sg_buf_offset, - (const void __user *)(uintptr_t)bp->buffer, - bp->length); - if (ret) { - binder_txn_error("%d:%d deferred copy failed\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = ret; - return_error_line = __LINE__; - goto err_translate_failed; - } - /* Fixup buffer pointer to target proc address space */ - bp->buffer = (uintptr_t) - t->buffer->user_data + sg_buf_offset; - sg_buf_offset += ALIGN(bp->length, sizeof(u64)); - - num_valid = (buffer_offset - off_start_offset) / - sizeof(binder_size_t); - ret = binder_fixup_parent(&pf_head, t, - thread, bp, - off_start_offset, - num_valid, - last_fixup_obj_off, - last_fixup_min_off); - if (ret < 0 || - binder_alloc_copy_to_buffer(&target_proc->alloc, - t->buffer, - object_offset, - bp, sizeof(*bp))) { - binder_txn_error("%d:%d failed to fixup parent\n", - thread->pid, proc->pid); - return_error = BR_FAILED_REPLY; - return_error_param = ret; - return_error_line = __LINE__; - goto err_translate_failed; - } - last_fixup_obj_off = object_offset; - last_fixup_min_off = 0; - } break; - default: - binder_user_error("%d:%d got transaction with invalid object type, %x\n", - proc->pid, thread->pid, hdr->type); - return_error = BR_FAILED_REPLY; - return_error_param = -EINVAL; - return_error_line = __LINE__; - goto err_bad_object_type; - } - } - /* Done processing objects, copy the rest of the buffer */ - if (binder_alloc_copy_user_to_buffer( - &target_proc->alloc, - t->buffer, user_offset, - user_buffer + user_offset, - tr->data_size - user_offset)) { - binder_user_error("%d:%d got transaction with invalid data 
ptr\n", - proc->pid, thread->pid); - return_error = BR_FAILED_REPLY; - return_error_param = -EFAULT; - return_error_line = __LINE__; - goto err_copy_data_failed; - } - - ret = binder_do_deferred_txn_copies(&target_proc->alloc, t->buffer, - &sgc_head, &pf_head); - if (ret) { - binder_user_error("%d:%d got transaction with invalid offsets ptr\n", - proc->pid, thread->pid); - return_error = BR_FAILED_REPLY; - return_error_param = ret; - return_error_line = __LINE__; - goto err_copy_data_failed; - } - if (t->buffer->oneway_spam_suspect) - tcomplete->type = BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT; - else - tcomplete->type = BINDER_WORK_TRANSACTION_COMPLETE; - t->work.type = BINDER_WORK_TRANSACTION; - - if (reply) { - binder_enqueue_thread_work(thread, tcomplete); - binder_inner_proc_lock(target_proc); - if (target_thread->is_dead) { - return_error = BR_DEAD_REPLY; - binder_inner_proc_unlock(target_proc); - goto err_dead_proc_or_thread; - } - BUG_ON(t->buffer->async_transaction != 0); - binder_pop_transaction_ilocked(target_thread, in_reply_to); - binder_enqueue_thread_work_ilocked(target_thread, &t->work); - target_proc->outstanding_txns++; - binder_inner_proc_unlock(target_proc); - wake_up_interruptible_sync(&target_thread->wait); - binder_free_transaction(in_reply_to); - } else if (!(t->flags & TF_ONE_WAY)) { - BUG_ON(t->buffer->async_transaction != 0); - binder_inner_proc_lock(proc); - /* - * Defer the TRANSACTION_COMPLETE, so we don't return to - * userspace immediately; this allows the target process to - * immediately start processing this transaction, reducing - * latency. We will then return the TRANSACTION_COMPLETE when - * the target replies (or there is an error). - */ - binder_enqueue_deferred_thread_work_ilocked(thread, tcomplete); - t->need_reply = 1; - t->from_parent = thread->transaction_stack; - thread->transaction_stack = t; - binder_inner_proc_unlock(proc); - return_error = binder_proc_transaction(t, - target_proc, target_thread); - if (return_error) { - binder_inner_proc_lock(proc); - binder_pop_transaction_ilocked(thread, t); - binder_inner_proc_unlock(proc); - goto err_dead_proc_or_thread; - } - } else { - BUG_ON(target_node == NULL); - BUG_ON(t->buffer->async_transaction != 1); - return_error = binder_proc_transaction(t, target_proc, NULL); - /* - * Let the caller know when async transaction reaches a frozen - * process and is put in a pending queue, waiting for the target - * process to be unfrozen. 
- */ - if (return_error == BR_TRANSACTION_PENDING_FROZEN) - tcomplete->type = BINDER_WORK_TRANSACTION_PENDING; - binder_enqueue_thread_work(thread, tcomplete); - if (return_error && - return_error != BR_TRANSACTION_PENDING_FROZEN) - goto err_dead_proc_or_thread; - } - if (target_thread) - binder_thread_dec_tmpref(target_thread); - binder_proc_dec_tmpref(target_proc); - if (target_node) - binder_dec_node_tmpref(target_node); - /* - * write barrier to synchronize with initialization - * of log entry - */ - smp_wmb(); - WRITE_ONCE(e->debug_id_done, t_debug_id); - return; - -err_dead_proc_or_thread: - binder_txn_error("%d:%d dead process or thread\n", - thread->pid, proc->pid); - return_error_line = __LINE__; - binder_dequeue_work(proc, tcomplete); -err_translate_failed: -err_bad_object_type: -err_bad_offset: -err_bad_parent: -err_copy_data_failed: - binder_cleanup_deferred_txn_lists(&sgc_head, &pf_head); - binder_free_txn_fixups(t); - trace_binder_transaction_failed_buffer_release(t->buffer); - binder_transaction_buffer_release(target_proc, NULL, t->buffer, - buffer_offset, true); - if (target_node) - binder_dec_node_tmpref(target_node); - target_node = NULL; - t->buffer->transaction = NULL; - binder_alloc_free_buf(&target_proc->alloc, t->buffer); -err_binder_alloc_buf_failed: -err_bad_extra_size: - if (secctx) - security_release_secctx(secctx, secctx_sz); -err_get_secctx_failed: - kfree(tcomplete); - binder_stats_deleted(BINDER_STAT_TRANSACTION_COMPLETE); -err_alloc_tcomplete_failed: - if (trace_binder_txn_latency_free_enabled()) - binder_txn_latency_free(t); - kfree(t); - binder_stats_deleted(BINDER_STAT_TRANSACTION); -err_alloc_t_failed: -err_bad_todo_list: -err_bad_call_stack: -err_empty_call_stack: -err_dead_binder: -err_invalid_target_handle: - if (target_node) { - binder_dec_node(target_node, 1, 0); - binder_dec_node_tmpref(target_node); - } - - binder_debug(BINDER_DEBUG_FAILED_TRANSACTION, - "%d:%d transaction %s to %d:%d failed %d/%d/%d, size %lld-%lld line %d\n", - proc->pid, thread->pid, reply ? "reply" : - (tr->flags & TF_ONE_WAY ? "async" : "call"), - target_proc ? target_proc->pid : 0, - target_thread ? 
target_thread->pid : 0, - t_debug_id, return_error, return_error_param, - (u64)tr->data_size, (u64)tr->offsets_size, - return_error_line); - - if (target_thread) - binder_thread_dec_tmpref(target_thread); - if (target_proc) - binder_proc_dec_tmpref(target_proc); - - { - struct binder_transaction_log_entry *fe; - - e->return_error = return_error; - e->return_error_param = return_error_param; - e->return_error_line = return_error_line; - fe = binder_transaction_log_add(&binder_transaction_log_failed); - *fe = *e; - /* - * write barrier to synchronize with initialization - * of log entry - */ - smp_wmb(); - WRITE_ONCE(e->debug_id_done, t_debug_id); - WRITE_ONCE(fe->debug_id_done, t_debug_id); - } - - BUG_ON(thread->return_error.cmd != BR_OK); - if (in_reply_to) { - binder_set_txn_from_error(in_reply_to, t_debug_id, - return_error, return_error_param); - thread->return_error.cmd = BR_TRANSACTION_COMPLETE; - binder_enqueue_thread_work(thread, &thread->return_error.work); - binder_send_failed_reply(in_reply_to, return_error); - } else { - binder_inner_proc_lock(proc); - binder_set_extended_error(&thread->ee, t_debug_id, - return_error, return_error_param); - binder_inner_proc_unlock(proc); - thread->return_error.cmd = return_error; - binder_enqueue_thread_work(thread, &thread->return_error.work); - } -} - -/** - * binder_free_buf() - free the specified buffer - * @proc: binder proc that owns buffer - * @buffer: buffer to be freed - * @is_failure: failed to send transaction - * - * If buffer for an async transaction, enqueue the next async - * transaction from the node. - * - * Cleanup buffer and free it. - */ -static void -binder_free_buf(struct binder_proc *proc, - struct binder_thread *thread, - struct binder_buffer *buffer, bool is_failure) -{ - binder_inner_proc_lock(proc); - if (buffer->transaction) { - buffer->transaction->buffer = NULL; - buffer->transaction = NULL; - } - binder_inner_proc_unlock(proc); - if (buffer->async_transaction && buffer->target_node) { - struct binder_node *buf_node; - struct binder_work *w; - - buf_node = buffer->target_node; - binder_node_inner_lock(buf_node); - BUG_ON(!buf_node->has_async_transaction); - BUG_ON(buf_node->proc != proc); - w = binder_dequeue_work_head_ilocked( - &buf_node->async_todo); - if (!w) { - buf_node->has_async_transaction = false; - } else { - binder_enqueue_work_ilocked( - w, &proc->todo); - binder_wakeup_proc_ilocked(proc); - } - binder_node_inner_unlock(buf_node); - } - trace_binder_transaction_buffer_release(buffer); - binder_release_entire_buffer(proc, thread, buffer, is_failure); - binder_alloc_free_buf(&proc->alloc, buffer); -} - -static int binder_thread_write(struct binder_proc *proc, - struct binder_thread *thread, - binder_uintptr_t binder_buffer, size_t size, - binder_size_t *consumed) -{ - uint32_t cmd; - struct binder_context *context = proc->context; - void __user *buffer = (void __user *)(uintptr_t)binder_buffer; - void __user *ptr = buffer + *consumed; - void __user *end = buffer + size; - - while (ptr < end && thread->return_error.cmd == BR_OK) { - int ret; - - if (get_user(cmd, (uint32_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(uint32_t); - trace_binder_command(cmd); - if (_IOC_NR(cmd) < ARRAY_SIZE(binder_stats.bc)) { - atomic_inc(&binder_stats.bc[_IOC_NR(cmd)]); - atomic_inc(&proc->stats.bc[_IOC_NR(cmd)]); - atomic_inc(&thread->stats.bc[_IOC_NR(cmd)]); - } - switch (cmd) { - case BC_INCREFS: - case BC_ACQUIRE: - case BC_RELEASE: - case BC_DECREFS: { - uint32_t target; - const char *debug_string; - bool strong 
= cmd == BC_ACQUIRE || cmd == BC_RELEASE; - bool increment = cmd == BC_INCREFS || cmd == BC_ACQUIRE; - struct binder_ref_data rdata; - - if (get_user(target, (uint32_t __user *)ptr)) - return -EFAULT; - - ptr += sizeof(uint32_t); - ret = -1; - if (increment && !target) { - struct binder_node *ctx_mgr_node; - - mutex_lock(&context->context_mgr_node_lock); - ctx_mgr_node = context->binder_context_mgr_node; - if (ctx_mgr_node) { - if (ctx_mgr_node->proc == proc) { - binder_user_error("%d:%d context manager tried to acquire desc 0\n", - proc->pid, thread->pid); - mutex_unlock(&context->context_mgr_node_lock); - return -EINVAL; - } - ret = binder_inc_ref_for_node( - proc, ctx_mgr_node, - strong, NULL, &rdata); - } - mutex_unlock(&context->context_mgr_node_lock); - } - if (ret) - ret = binder_update_ref_for_handle( - proc, target, increment, strong, - &rdata); - if (!ret && rdata.desc != target) { - binder_user_error("%d:%d tried to acquire reference to desc %d, got %d instead\n", - proc->pid, thread->pid, - target, rdata.desc); - } - switch (cmd) { - case BC_INCREFS: - debug_string = "IncRefs"; - break; - case BC_ACQUIRE: - debug_string = "Acquire"; - break; - case BC_RELEASE: - debug_string = "Release"; - break; - case BC_DECREFS: - default: - debug_string = "DecRefs"; - break; - } - if (ret) { - binder_user_error("%d:%d %s %d refcount change on invalid ref %d ret %d\n", - proc->pid, thread->pid, debug_string, - strong, target, ret); - break; - } - binder_debug(BINDER_DEBUG_USER_REFS, - "%d:%d %s ref %d desc %d s %d w %d\n", - proc->pid, thread->pid, debug_string, - rdata.debug_id, rdata.desc, rdata.strong, - rdata.weak); - break; - } - case BC_INCREFS_DONE: - case BC_ACQUIRE_DONE: { - binder_uintptr_t node_ptr; - binder_uintptr_t cookie; - struct binder_node *node; - bool free_node; - - if (get_user(node_ptr, (binder_uintptr_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(binder_uintptr_t); - if (get_user(cookie, (binder_uintptr_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(binder_uintptr_t); - node = binder_get_node(proc, node_ptr); - if (node == NULL) { - binder_user_error("%d:%d %s u%016llx no match\n", - proc->pid, thread->pid, - cmd == BC_INCREFS_DONE ? - "BC_INCREFS_DONE" : - "BC_ACQUIRE_DONE", - (u64)node_ptr); - break; - } - if (cookie != node->cookie) { - binder_user_error("%d:%d %s u%016llx node %d cookie mismatch %016llx != %016llx\n", - proc->pid, thread->pid, - cmd == BC_INCREFS_DONE ? - "BC_INCREFS_DONE" : "BC_ACQUIRE_DONE", - (u64)node_ptr, node->debug_id, - (u64)cookie, (u64)node->cookie); - binder_put_node(node); - break; - } - binder_node_inner_lock(node); - if (cmd == BC_ACQUIRE_DONE) { - if (node->pending_strong_ref == 0) { - binder_user_error("%d:%d BC_ACQUIRE_DONE node %d has no pending acquire request\n", - proc->pid, thread->pid, - node->debug_id); - binder_node_inner_unlock(node); - binder_put_node(node); - break; - } - node->pending_strong_ref = 0; - } else { - if (node->pending_weak_ref == 0) { - binder_user_error("%d:%d BC_INCREFS_DONE node %d has no pending increfs request\n", - proc->pid, thread->pid, - node->debug_id); - binder_node_inner_unlock(node); - binder_put_node(node); - break; - } - node->pending_weak_ref = 0; - } - free_node = binder_dec_node_nilocked(node, - cmd == BC_ACQUIRE_DONE, 0); - WARN_ON(free_node); - binder_debug(BINDER_DEBUG_USER_REFS, - "%d:%d %s node %d ls %d lw %d tr %d\n", - proc->pid, thread->pid, - cmd == BC_INCREFS_DONE ? 
"BC_INCREFS_DONE" : "BC_ACQUIRE_DONE", - node->debug_id, node->local_strong_refs, - node->local_weak_refs, node->tmp_refs); - binder_node_inner_unlock(node); - binder_put_node(node); - break; - } - case BC_ATTEMPT_ACQUIRE: - pr_err("BC_ATTEMPT_ACQUIRE not supported\n"); - return -EINVAL; - case BC_ACQUIRE_RESULT: - pr_err("BC_ACQUIRE_RESULT not supported\n"); - return -EINVAL; - - case BC_FREE_BUFFER: { - binder_uintptr_t data_ptr; - struct binder_buffer *buffer; - - if (get_user(data_ptr, (binder_uintptr_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(binder_uintptr_t); - - buffer = binder_alloc_prepare_to_free(&proc->alloc, - data_ptr); - if (IS_ERR_OR_NULL(buffer)) { - if (PTR_ERR(buffer) == -EPERM) { - binder_user_error( - "%d:%d BC_FREE_BUFFER u%016llx matched unreturned or currently freeing buffer\n", - proc->pid, thread->pid, - (u64)data_ptr); - } else { - binder_user_error( - "%d:%d BC_FREE_BUFFER u%016llx no match\n", - proc->pid, thread->pid, - (u64)data_ptr); - } - break; - } - binder_debug(BINDER_DEBUG_FREE_BUFFER, - "%d:%d BC_FREE_BUFFER u%016llx found buffer %d for %s transaction\n", - proc->pid, thread->pid, (u64)data_ptr, - buffer->debug_id, - buffer->transaction ? "active" : "finished"); - binder_free_buf(proc, thread, buffer, false); - break; - } - - case BC_TRANSACTION_SG: - case BC_REPLY_SG: { - struct binder_transaction_data_sg tr; - - if (copy_from_user(&tr, ptr, sizeof(tr))) - return -EFAULT; - ptr += sizeof(tr); - binder_transaction(proc, thread, &tr.transaction_data, - cmd == BC_REPLY_SG, tr.buffers_size); - break; - } - case BC_TRANSACTION: - case BC_REPLY: { - struct binder_transaction_data tr; - - if (copy_from_user(&tr, ptr, sizeof(tr))) - return -EFAULT; - ptr += sizeof(tr); - binder_transaction(proc, thread, &tr, - cmd == BC_REPLY, 0); - break; - } - - case BC_REGISTER_LOOPER: - binder_debug(BINDER_DEBUG_THREADS, - "%d:%d BC_REGISTER_LOOPER\n", - proc->pid, thread->pid); - binder_inner_proc_lock(proc); - if (thread->looper & BINDER_LOOPER_STATE_ENTERED) { - thread->looper |= BINDER_LOOPER_STATE_INVALID; - binder_user_error("%d:%d ERROR: BC_REGISTER_LOOPER called after BC_ENTER_LOOPER\n", - proc->pid, thread->pid); - } else if (proc->requested_threads == 0) { - thread->looper |= BINDER_LOOPER_STATE_INVALID; - binder_user_error("%d:%d ERROR: BC_REGISTER_LOOPER called without request\n", - proc->pid, thread->pid); - } else { - proc->requested_threads--; - proc->requested_threads_started++; - } - thread->looper |= BINDER_LOOPER_STATE_REGISTERED; - binder_inner_proc_unlock(proc); - break; - case BC_ENTER_LOOPER: - binder_debug(BINDER_DEBUG_THREADS, - "%d:%d BC_ENTER_LOOPER\n", - proc->pid, thread->pid); - if (thread->looper & BINDER_LOOPER_STATE_REGISTERED) { - thread->looper |= BINDER_LOOPER_STATE_INVALID; - binder_user_error("%d:%d ERROR: BC_ENTER_LOOPER called after BC_REGISTER_LOOPER\n", - proc->pid, thread->pid); - } - thread->looper |= BINDER_LOOPER_STATE_ENTERED; - break; - case BC_EXIT_LOOPER: - binder_debug(BINDER_DEBUG_THREADS, - "%d:%d BC_EXIT_LOOPER\n", - proc->pid, thread->pid); - thread->looper |= BINDER_LOOPER_STATE_EXITED; - break; - - case BC_REQUEST_DEATH_NOTIFICATION: - case BC_CLEAR_DEATH_NOTIFICATION: { - uint32_t target; - binder_uintptr_t cookie; - struct binder_ref *ref; - struct binder_ref_death *death = NULL; - - if (get_user(target, (uint32_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(uint32_t); - if (get_user(cookie, (binder_uintptr_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(binder_uintptr_t); - if (cmd == 
BC_REQUEST_DEATH_NOTIFICATION) { - /* - * Allocate memory for death notification - * before taking lock - */ - death = kzalloc(sizeof(*death), GFP_KERNEL); - if (death == NULL) { - WARN_ON(thread->return_error.cmd != - BR_OK); - thread->return_error.cmd = BR_ERROR; - binder_enqueue_thread_work( - thread, - &thread->return_error.work); - binder_debug( - BINDER_DEBUG_FAILED_TRANSACTION, - "%d:%d BC_REQUEST_DEATH_NOTIFICATION failed\n", - proc->pid, thread->pid); - break; - } - } - binder_proc_lock(proc); - ref = binder_get_ref_olocked(proc, target, false); - if (ref == NULL) { - binder_user_error("%d:%d %s invalid ref %d\n", - proc->pid, thread->pid, - cmd == BC_REQUEST_DEATH_NOTIFICATION ? - "BC_REQUEST_DEATH_NOTIFICATION" : - "BC_CLEAR_DEATH_NOTIFICATION", - target); - binder_proc_unlock(proc); - kfree(death); - break; - } - - binder_debug(BINDER_DEBUG_DEATH_NOTIFICATION, - "%d:%d %s %016llx ref %d desc %d s %d w %d for node %d\n", - proc->pid, thread->pid, - cmd == BC_REQUEST_DEATH_NOTIFICATION ? - "BC_REQUEST_DEATH_NOTIFICATION" : - "BC_CLEAR_DEATH_NOTIFICATION", - (u64)cookie, ref->data.debug_id, - ref->data.desc, ref->data.strong, - ref->data.weak, ref->node->debug_id); - - binder_node_lock(ref->node); - if (cmd == BC_REQUEST_DEATH_NOTIFICATION) { - if (ref->death) { - binder_user_error("%d:%d BC_REQUEST_DEATH_NOTIFICATION death notification already set\n", - proc->pid, thread->pid); - binder_node_unlock(ref->node); - binder_proc_unlock(proc); - kfree(death); - break; - } - binder_stats_created(BINDER_STAT_DEATH); - INIT_LIST_HEAD(&death->work.entry); - death->cookie = cookie; - ref->death = death; - if (ref->node->proc == NULL) { - ref->death->work.type = BINDER_WORK_DEAD_BINDER; - - binder_inner_proc_lock(proc); - binder_enqueue_work_ilocked( - &ref->death->work, &proc->todo); - binder_wakeup_proc_ilocked(proc); - binder_inner_proc_unlock(proc); - } - } else { - if (ref->death == NULL) { - binder_user_error("%d:%d BC_CLEAR_DEATH_NOTIFICATION death notification not active\n", - proc->pid, thread->pid); - binder_node_unlock(ref->node); - binder_proc_unlock(proc); - break; - } - death = ref->death; - if (death->cookie != cookie) { - binder_user_error("%d:%d BC_CLEAR_DEATH_NOTIFICATION death notification cookie mismatch %016llx != %016llx\n", - proc->pid, thread->pid, - (u64)death->cookie, - (u64)cookie); - binder_node_unlock(ref->node); - binder_proc_unlock(proc); - break; - } - ref->death = NULL; - binder_inner_proc_lock(proc); - if (list_empty(&death->work.entry)) { - death->work.type = BINDER_WORK_CLEAR_DEATH_NOTIFICATION; - if (thread->looper & - (BINDER_LOOPER_STATE_REGISTERED | - BINDER_LOOPER_STATE_ENTERED)) - binder_enqueue_thread_work_ilocked( - thread, - &death->work); - else { - binder_enqueue_work_ilocked( - &death->work, - &proc->todo); - binder_wakeup_proc_ilocked( - proc); - } - } else { - BUG_ON(death->work.type != BINDER_WORK_DEAD_BINDER); - death->work.type = BINDER_WORK_DEAD_BINDER_AND_CLEAR; - } - binder_inner_proc_unlock(proc); - } - binder_node_unlock(ref->node); - binder_proc_unlock(proc); - } break; - case BC_DEAD_BINDER_DONE: { - struct binder_work *w; - binder_uintptr_t cookie; - struct binder_ref_death *death = NULL; - - if (get_user(cookie, (binder_uintptr_t __user *)ptr)) - return -EFAULT; - - ptr += sizeof(cookie); - binder_inner_proc_lock(proc); - list_for_each_entry(w, &proc->delivered_death, - entry) { - struct binder_ref_death *tmp_death = - container_of(w, - struct binder_ref_death, - work); - - if (tmp_death->cookie == cookie) { - death = tmp_death; - 
break; - } - } - binder_debug(BINDER_DEBUG_DEAD_BINDER, - "%d:%d BC_DEAD_BINDER_DONE %016llx found %pK\n", - proc->pid, thread->pid, (u64)cookie, - death); - if (death == NULL) { - binder_user_error("%d:%d BC_DEAD_BINDER_DONE %016llx not found\n", - proc->pid, thread->pid, (u64)cookie); - binder_inner_proc_unlock(proc); - break; - } - binder_dequeue_work_ilocked(&death->work); - if (death->work.type == BINDER_WORK_DEAD_BINDER_AND_CLEAR) { - death->work.type = BINDER_WORK_CLEAR_DEATH_NOTIFICATION; - if (thread->looper & - (BINDER_LOOPER_STATE_REGISTERED | - BINDER_LOOPER_STATE_ENTERED)) - binder_enqueue_thread_work_ilocked( - thread, &death->work); - else { - binder_enqueue_work_ilocked( - &death->work, - &proc->todo); - binder_wakeup_proc_ilocked(proc); - } - } - binder_inner_proc_unlock(proc); - } break; - - default: - pr_err("%d:%d unknown command %u\n", - proc->pid, thread->pid, cmd); - return -EINVAL; - } - *consumed = ptr - buffer; - } - return 0; -} - -static void binder_stat_br(struct binder_proc *proc, - struct binder_thread *thread, uint32_t cmd) -{ - trace_binder_return(cmd); - if (_IOC_NR(cmd) < ARRAY_SIZE(binder_stats.br)) { - atomic_inc(&binder_stats.br[_IOC_NR(cmd)]); - atomic_inc(&proc->stats.br[_IOC_NR(cmd)]); - atomic_inc(&thread->stats.br[_IOC_NR(cmd)]); - } -} - -static int binder_put_node_cmd(struct binder_proc *proc, - struct binder_thread *thread, - void __user **ptrp, - binder_uintptr_t node_ptr, - binder_uintptr_t node_cookie, - int node_debug_id, - uint32_t cmd, const char *cmd_name) -{ - void __user *ptr = *ptrp; - - if (put_user(cmd, (uint32_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(uint32_t); - - if (put_user(node_ptr, (binder_uintptr_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(binder_uintptr_t); - - if (put_user(node_cookie, (binder_uintptr_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(binder_uintptr_t); - - binder_stat_br(proc, thread, cmd); - binder_debug(BINDER_DEBUG_USER_REFS, "%d:%d %s %d u%016llx c%016llx\n", - proc->pid, thread->pid, cmd_name, node_debug_id, - (u64)node_ptr, (u64)node_cookie); - - *ptrp = ptr; - return 0; -} - -static int binder_wait_for_work(struct binder_thread *thread, - bool do_proc_work) -{ - DEFINE_WAIT(wait); - struct binder_proc *proc = thread->proc; - int ret = 0; - - binder_inner_proc_lock(proc); - for (;;) { - prepare_to_wait(&thread->wait, &wait, TASK_INTERRUPTIBLE|TASK_FREEZABLE); - if (binder_has_work_ilocked(thread, do_proc_work)) - break; - if (do_proc_work) - list_add(&thread->waiting_thread_node, - &proc->waiting_threads); - binder_inner_proc_unlock(proc); - schedule(); - binder_inner_proc_lock(proc); - list_del_init(&thread->waiting_thread_node); - if (signal_pending(current)) { - ret = -EINTR; - break; - } - } - finish_wait(&thread->wait, &wait); - binder_inner_proc_unlock(proc); - - return ret; -} - -/** - * binder_apply_fd_fixups() - finish fd translation - * @proc: binder_proc associated @t->buffer - * @t: binder transaction with list of fd fixups - * - * Now that we are in the context of the transaction target - * process, we can allocate and install fds. Process the - * list of fds to translate and fixup the buffer with the - * new fds first and only then install the files. - * - * If we fail to allocate an fd, skip the install and release - * any fds that have already been allocated. 
- */ -static int binder_apply_fd_fixups(struct binder_proc *proc, - struct binder_transaction *t) -{ - struct binder_txn_fd_fixup *fixup, *tmp; - int ret = 0; - - list_for_each_entry(fixup, &t->fd_fixups, fixup_entry) { - int fd = get_unused_fd_flags(O_CLOEXEC); - - if (fd < 0) { - binder_debug(BINDER_DEBUG_TRANSACTION, - "failed fd fixup txn %d fd %d\n", - t->debug_id, fd); - ret = -ENOMEM; - goto err; - } - binder_debug(BINDER_DEBUG_TRANSACTION, - "fd fixup txn %d fd %d\n", - t->debug_id, fd); - trace_binder_transaction_fd_recv(t, fd, fixup->offset); - fixup->target_fd = fd; - if (binder_alloc_copy_to_buffer(&proc->alloc, t->buffer, - fixup->offset, &fd, - sizeof(u32))) { - ret = -EINVAL; - goto err; - } - } - list_for_each_entry_safe(fixup, tmp, &t->fd_fixups, fixup_entry) { - fd_install(fixup->target_fd, fixup->file); - list_del(&fixup->fixup_entry); - kfree(fixup); - } - - return ret; - -err: - binder_free_txn_fixups(t); - return ret; -} - -static int binder_thread_read(struct binder_proc *proc, - struct binder_thread *thread, - binder_uintptr_t binder_buffer, size_t size, - binder_size_t *consumed, int non_block) -{ - void __user *buffer = (void __user *)(uintptr_t)binder_buffer; - void __user *ptr = buffer + *consumed; - void __user *end = buffer + size; - - int ret = 0; - int wait_for_proc_work; - - if (*consumed == 0) { - if (put_user(BR_NOOP, (uint32_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(uint32_t); - } - -retry: - binder_inner_proc_lock(proc); - wait_for_proc_work = binder_available_for_proc_work_ilocked(thread); - binder_inner_proc_unlock(proc); - - thread->looper |= BINDER_LOOPER_STATE_WAITING; - - trace_binder_wait_for_work(wait_for_proc_work, - !!thread->transaction_stack, - !binder_worklist_empty(proc, &thread->todo)); - if (wait_for_proc_work) { - if (!(thread->looper & (BINDER_LOOPER_STATE_REGISTERED | - BINDER_LOOPER_STATE_ENTERED))) { - binder_user_error("%d:%d ERROR: Thread waiting for process work before calling BC_REGISTER_LOOPER or BC_ENTER_LOOPER (state %x)\n", - proc->pid, thread->pid, thread->looper); - wait_event_interruptible(binder_user_error_wait, - binder_stop_on_user_error < 2); - } - binder_set_nice(proc->default_priority); - } - - if (non_block) { - if (!binder_has_work(thread, wait_for_proc_work)) - ret = -EAGAIN; - } else { - ret = binder_wait_for_work(thread, wait_for_proc_work); - } - - thread->looper &= ~BINDER_LOOPER_STATE_WAITING; - - if (ret) - return ret; - - while (1) { - uint32_t cmd; - struct binder_transaction_data_secctx tr; - struct binder_transaction_data *trd = &tr.transaction_data; - struct binder_work *w = NULL; - struct list_head *list = NULL; - struct binder_transaction *t = NULL; - struct binder_thread *t_from; - size_t trsize = sizeof(*trd); - - binder_inner_proc_lock(proc); - if (!binder_worklist_empty_ilocked(&thread->todo)) - list = &thread->todo; - else if (!binder_worklist_empty_ilocked(&proc->todo) && - wait_for_proc_work) - list = &proc->todo; - else { - binder_inner_proc_unlock(proc); - - /* no data added */ - if (ptr - buffer == 4 && !thread->looper_need_return) - goto retry; - break; - } - - if (end - ptr < sizeof(tr) + 4) { - binder_inner_proc_unlock(proc); - break; - } - w = binder_dequeue_work_head_ilocked(list); - if (binder_worklist_empty_ilocked(&thread->todo)) - thread->process_todo = false; - - switch (w->type) { - case BINDER_WORK_TRANSACTION: { - binder_inner_proc_unlock(proc); - t = container_of(w, struct binder_transaction, work); - } break; - case BINDER_WORK_RETURN_ERROR: { - struct binder_error 
*e = container_of( - w, struct binder_error, work); - - WARN_ON(e->cmd == BR_OK); - binder_inner_proc_unlock(proc); - if (put_user(e->cmd, (uint32_t __user *)ptr)) - return -EFAULT; - cmd = e->cmd; - e->cmd = BR_OK; - ptr += sizeof(uint32_t); - - binder_stat_br(proc, thread, cmd); - } break; - case BINDER_WORK_TRANSACTION_COMPLETE: - case BINDER_WORK_TRANSACTION_PENDING: - case BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT: { - if (proc->oneway_spam_detection_enabled && - w->type == BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT) - cmd = BR_ONEWAY_SPAM_SUSPECT; - else if (w->type == BINDER_WORK_TRANSACTION_PENDING) - cmd = BR_TRANSACTION_PENDING_FROZEN; - else - cmd = BR_TRANSACTION_COMPLETE; - binder_inner_proc_unlock(proc); - kfree(w); - binder_stats_deleted(BINDER_STAT_TRANSACTION_COMPLETE); - if (put_user(cmd, (uint32_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(uint32_t); - - binder_stat_br(proc, thread, cmd); - binder_debug(BINDER_DEBUG_TRANSACTION_COMPLETE, - "%d:%d BR_TRANSACTION_COMPLETE\n", - proc->pid, thread->pid); - } break; - case BINDER_WORK_NODE: { - struct binder_node *node = container_of(w, struct binder_node, work); - int strong, weak; - binder_uintptr_t node_ptr = node->ptr; - binder_uintptr_t node_cookie = node->cookie; - int node_debug_id = node->debug_id; - int has_weak_ref; - int has_strong_ref; - void __user *orig_ptr = ptr; - - BUG_ON(proc != node->proc); - strong = node->internal_strong_refs || - node->local_strong_refs; - weak = !hlist_empty(&node->refs) || - node->local_weak_refs || - node->tmp_refs || strong; - has_strong_ref = node->has_strong_ref; - has_weak_ref = node->has_weak_ref; - - if (weak && !has_weak_ref) { - node->has_weak_ref = 1; - node->pending_weak_ref = 1; - node->local_weak_refs++; - } - if (strong && !has_strong_ref) { - node->has_strong_ref = 1; - node->pending_strong_ref = 1; - node->local_strong_refs++; - } - if (!strong && has_strong_ref) - node->has_strong_ref = 0; - if (!weak && has_weak_ref) - node->has_weak_ref = 0; - if (!weak && !strong) { - binder_debug(BINDER_DEBUG_INTERNAL_REFS, - "%d:%d node %d u%016llx c%016llx deleted\n", - proc->pid, thread->pid, - node_debug_id, - (u64)node_ptr, - (u64)node_cookie); - rb_erase(&node->rb_node, &proc->nodes); - binder_inner_proc_unlock(proc); - binder_node_lock(node); - /* - * Acquire the node lock before freeing the - * node to serialize with other threads that - * may have been holding the node lock while - * decrementing this node (avoids race where - * this thread frees while the other thread - * is unlocking the node after the final - * decrement) - */ - binder_node_unlock(node); - binder_free_node(node); - } else - binder_inner_proc_unlock(proc); - - if (weak && !has_weak_ref) - ret = binder_put_node_cmd( - proc, thread, &ptr, node_ptr, - node_cookie, node_debug_id, - BR_INCREFS, "BR_INCREFS"); - if (!ret && strong && !has_strong_ref) - ret = binder_put_node_cmd( - proc, thread, &ptr, node_ptr, - node_cookie, node_debug_id, - BR_ACQUIRE, "BR_ACQUIRE"); - if (!ret && !strong && has_strong_ref) - ret = binder_put_node_cmd( - proc, thread, &ptr, node_ptr, - node_cookie, node_debug_id, - BR_RELEASE, "BR_RELEASE"); - if (!ret && !weak && has_weak_ref) - ret = binder_put_node_cmd( - proc, thread, &ptr, node_ptr, - node_cookie, node_debug_id, - BR_DECREFS, "BR_DECREFS"); - if (orig_ptr == ptr) - binder_debug(BINDER_DEBUG_INTERNAL_REFS, - "%d:%d node %d u%016llx c%016llx state unchanged\n", - proc->pid, thread->pid, - node_debug_id, - (u64)node_ptr, - (u64)node_cookie); - if (ret) - return 
ret; - } break; - case BINDER_WORK_DEAD_BINDER: - case BINDER_WORK_DEAD_BINDER_AND_CLEAR: - case BINDER_WORK_CLEAR_DEATH_NOTIFICATION: { - struct binder_ref_death *death; - uint32_t cmd; - binder_uintptr_t cookie; - - death = container_of(w, struct binder_ref_death, work); - if (w->type == BINDER_WORK_CLEAR_DEATH_NOTIFICATION) - cmd = BR_CLEAR_DEATH_NOTIFICATION_DONE; - else - cmd = BR_DEAD_BINDER; - cookie = death->cookie; - - binder_debug(BINDER_DEBUG_DEATH_NOTIFICATION, - "%d:%d %s %016llx\n", - proc->pid, thread->pid, - cmd == BR_DEAD_BINDER ? - "BR_DEAD_BINDER" : - "BR_CLEAR_DEATH_NOTIFICATION_DONE", - (u64)cookie); - if (w->type == BINDER_WORK_CLEAR_DEATH_NOTIFICATION) { - binder_inner_proc_unlock(proc); - kfree(death); - binder_stats_deleted(BINDER_STAT_DEATH); - } else { - binder_enqueue_work_ilocked( - w, &proc->delivered_death); - binder_inner_proc_unlock(proc); - } - if (put_user(cmd, (uint32_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(uint32_t); - if (put_user(cookie, - (binder_uintptr_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(binder_uintptr_t); - binder_stat_br(proc, thread, cmd); - if (cmd == BR_DEAD_BINDER) - goto done; /* DEAD_BINDER notifications can cause transactions */ - } break; - default: - binder_inner_proc_unlock(proc); - pr_err("%d:%d: bad work type %d\n", - proc->pid, thread->pid, w->type); - break; - } - - if (!t) - continue; - - BUG_ON(t->buffer == NULL); - if (t->buffer->target_node) { - struct binder_node *target_node = t->buffer->target_node; - - trd->target.ptr = target_node->ptr; - trd->cookie = target_node->cookie; - t->saved_priority = task_nice(current); - if (t->priority < target_node->min_priority && - !(t->flags & TF_ONE_WAY)) - binder_set_nice(t->priority); - else if (!(t->flags & TF_ONE_WAY) || - t->saved_priority > target_node->min_priority) - binder_set_nice(target_node->min_priority); - cmd = BR_TRANSACTION; - } else { - trd->target.ptr = 0; - trd->cookie = 0; - cmd = BR_REPLY; - } - trd->code = t->code; - trd->flags = t->flags; - trd->sender_euid = from_kuid(current_user_ns(), t->sender_euid); - - t_from = binder_get_txn_from(t); - if (t_from) { - struct task_struct *sender = t_from->proc->tsk; - - trd->sender_pid = - task_tgid_nr_ns(sender, - task_active_pid_ns(current)); - } else { - trd->sender_pid = 0; - } - - ret = binder_apply_fd_fixups(proc, t); - if (ret) { - struct binder_buffer *buffer = t->buffer; - bool oneway = !!(t->flags & TF_ONE_WAY); - int tid = t->debug_id; - - if (t_from) - binder_thread_dec_tmpref(t_from); - buffer->transaction = NULL; - binder_cleanup_transaction(t, "fd fixups failed", - BR_FAILED_REPLY); - binder_free_buf(proc, thread, buffer, true); - binder_debug(BINDER_DEBUG_FAILED_TRANSACTION, - "%d:%d %stransaction %d fd fixups failed %d/%d, line %d\n", - proc->pid, thread->pid, - oneway ? "async " : - (cmd == BR_REPLY ? 
"reply " : ""), - tid, BR_FAILED_REPLY, ret, __LINE__); - if (cmd == BR_REPLY) { - cmd = BR_FAILED_REPLY; - if (put_user(cmd, (uint32_t __user *)ptr)) - return -EFAULT; - ptr += sizeof(uint32_t); - binder_stat_br(proc, thread, cmd); - break; - } - continue; - } - trd->data_size = t->buffer->data_size; - trd->offsets_size = t->buffer->offsets_size; - trd->data.ptr.buffer = (uintptr_t)t->buffer->user_data; - trd->data.ptr.offsets = trd->data.ptr.buffer + - ALIGN(t->buffer->data_size, - sizeof(void *)); - - tr.secctx = t->security_ctx; - if (t->security_ctx) { - cmd = BR_TRANSACTION_SEC_CTX; - trsize = sizeof(tr); - } - if (put_user(cmd, (uint32_t __user *)ptr)) { - if (t_from) - binder_thread_dec_tmpref(t_from); - - binder_cleanup_transaction(t, "put_user failed", - BR_FAILED_REPLY); - - return -EFAULT; - } - ptr += sizeof(uint32_t); - if (copy_to_user(ptr, &tr, trsize)) { - if (t_from) - binder_thread_dec_tmpref(t_from); - - binder_cleanup_transaction(t, "copy_to_user failed", - BR_FAILED_REPLY); - - return -EFAULT; - } - ptr += trsize; - - trace_binder_transaction_received(t); - binder_stat_br(proc, thread, cmd); - binder_debug(BINDER_DEBUG_TRANSACTION, - "%d:%d %s %d %d:%d, cmd %u size %zd-%zd ptr %016llx-%016llx\n", - proc->pid, thread->pid, - (cmd == BR_TRANSACTION) ? "BR_TRANSACTION" : - (cmd == BR_TRANSACTION_SEC_CTX) ? - "BR_TRANSACTION_SEC_CTX" : "BR_REPLY", - t->debug_id, t_from ? t_from->proc->pid : 0, - t_from ? t_from->pid : 0, cmd, - t->buffer->data_size, t->buffer->offsets_size, - (u64)trd->data.ptr.buffer, - (u64)trd->data.ptr.offsets); - - if (t_from) - binder_thread_dec_tmpref(t_from); - t->buffer->allow_user_free = 1; - if (cmd != BR_REPLY && !(t->flags & TF_ONE_WAY)) { - binder_inner_proc_lock(thread->proc); - t->to_parent = thread->transaction_stack; - t->to_thread = thread; - thread->transaction_stack = t; - binder_inner_proc_unlock(thread->proc); - } else { - binder_free_transaction(t); - } - break; - } - -done: - - *consumed = ptr - buffer; - binder_inner_proc_lock(proc); - if (proc->requested_threads == 0 && - list_empty(&thread->proc->waiting_threads) && - proc->requested_threads_started < proc->max_threads && - (thread->looper & (BINDER_LOOPER_STATE_REGISTERED | - BINDER_LOOPER_STATE_ENTERED)) /* the user-space code fails to */ - /*spawn a new thread if we leave this out */) { - proc->requested_threads++; - binder_inner_proc_unlock(proc); - binder_debug(BINDER_DEBUG_THREADS, - "%d:%d BR_SPAWN_LOOPER\n", - proc->pid, thread->pid); - if (put_user(BR_SPAWN_LOOPER, (uint32_t __user *)buffer)) - return -EFAULT; - binder_stat_br(proc, thread, BR_SPAWN_LOOPER); - } else - binder_inner_proc_unlock(proc); - return 0; -} - -static void binder_release_work(struct binder_proc *proc, - struct list_head *list) -{ - struct binder_work *w; - enum binder_work_type wtype; - - while (1) { - binder_inner_proc_lock(proc); - w = binder_dequeue_work_head_ilocked(list); - wtype = w ? 
w->type : 0; - binder_inner_proc_unlock(proc); - if (!w) - return; - - switch (wtype) { - case BINDER_WORK_TRANSACTION: { - struct binder_transaction *t; - - t = container_of(w, struct binder_transaction, work); - - binder_cleanup_transaction(t, "process died.", - BR_DEAD_REPLY); - } break; - case BINDER_WORK_RETURN_ERROR: { - struct binder_error *e = container_of( - w, struct binder_error, work); - - binder_debug(BINDER_DEBUG_DEAD_TRANSACTION, - "undelivered TRANSACTION_ERROR: %u\n", - e->cmd); - } break; - case BINDER_WORK_TRANSACTION_PENDING: - case BINDER_WORK_TRANSACTION_ONEWAY_SPAM_SUSPECT: - case BINDER_WORK_TRANSACTION_COMPLETE: { - binder_debug(BINDER_DEBUG_DEAD_TRANSACTION, - "undelivered TRANSACTION_COMPLETE\n"); - kfree(w); - binder_stats_deleted(BINDER_STAT_TRANSACTION_COMPLETE); - } break; - case BINDER_WORK_DEAD_BINDER_AND_CLEAR: - case BINDER_WORK_CLEAR_DEATH_NOTIFICATION: { - struct binder_ref_death *death; - - death = container_of(w, struct binder_ref_death, work); - binder_debug(BINDER_DEBUG_DEAD_TRANSACTION, - "undelivered death notification, %016llx\n", - (u64)death->cookie); - kfree(death); - binder_stats_deleted(BINDER_STAT_DEATH); - } break; - case BINDER_WORK_NODE: - break; - default: - pr_err("unexpected work type, %d, not freed\n", - wtype); - break; - } - } - -} - -static struct binder_thread *binder_get_thread_ilocked( - struct binder_proc *proc, struct binder_thread *new_thread) -{ - struct binder_thread *thread = NULL; - struct rb_node *parent = NULL; - struct rb_node **p = &proc->threads.rb_node; - - while (*p) { - parent = *p; - thread = rb_entry(parent, struct binder_thread, rb_node); - - if (current->pid < thread->pid) - p = &(*p)->rb_left; - else if (current->pid > thread->pid) - p = &(*p)->rb_right; - else - return thread; - } - if (!new_thread) - return NULL; - thread = new_thread; - binder_stats_created(BINDER_STAT_THREAD); - thread->proc = proc; - thread->pid = current->pid; - atomic_set(&thread->tmp_ref, 0); - init_waitqueue_head(&thread->wait); - INIT_LIST_HEAD(&thread->todo); - rb_link_node(&thread->rb_node, parent, p); - rb_insert_color(&thread->rb_node, &proc->threads); - thread->looper_need_return = true; - thread->return_error.work.type = BINDER_WORK_RETURN_ERROR; - thread->return_error.cmd = BR_OK; - thread->reply_error.work.type = BINDER_WORK_RETURN_ERROR; - thread->reply_error.cmd = BR_OK; - thread->ee.command = BR_OK; - INIT_LIST_HEAD(&new_thread->waiting_thread_node); - return thread; -} - -static struct binder_thread *binder_get_thread(struct binder_proc *proc) -{ - struct binder_thread *thread; - struct binder_thread *new_thread; - - binder_inner_proc_lock(proc); - thread = binder_get_thread_ilocked(proc, NULL); - binder_inner_proc_unlock(proc); - if (!thread) { - new_thread = kzalloc(sizeof(*thread), GFP_KERNEL); - if (new_thread == NULL) - return NULL; - binder_inner_proc_lock(proc); - thread = binder_get_thread_ilocked(proc, new_thread); - binder_inner_proc_unlock(proc); - if (thread != new_thread) - kfree(new_thread); - } - return thread; -} - -static void binder_free_proc(struct binder_proc *proc) -{ - struct binder_device *device; - - BUG_ON(!list_empty(&proc->todo)); - BUG_ON(!list_empty(&proc->delivered_death)); - if (proc->outstanding_txns) - pr_warn("%s: Unexpected outstanding_txns %d\n", - __func__, proc->outstanding_txns); - device = container_of(proc->context, struct binder_device, context); - if (refcount_dec_and_test(&device->ref)) { - kfree(proc->context->name); - kfree(device); - } - 
binder_alloc_deferred_release(&proc->alloc); - put_task_struct(proc->tsk); - put_cred(proc->cred); - binder_stats_deleted(BINDER_STAT_PROC); - kfree(proc); -} - -static void binder_free_thread(struct binder_thread *thread) -{ - BUG_ON(!list_empty(&thread->todo)); - binder_stats_deleted(BINDER_STAT_THREAD); - binder_proc_dec_tmpref(thread->proc); - kfree(thread); -} - -static int binder_thread_release(struct binder_proc *proc, - struct binder_thread *thread) -{ - struct binder_transaction *t; - struct binder_transaction *send_reply = NULL; - int active_transactions = 0; - struct binder_transaction *last_t = NULL; - - binder_inner_proc_lock(thread->proc); - /* - * take a ref on the proc so it survives - * after we remove this thread from proc->threads. - * The corresponding dec is when we actually - * free the thread in binder_free_thread() - */ - proc->tmp_ref++; - /* - * take a ref on this thread to ensure it - * survives while we are releasing it - */ - atomic_inc(&thread->tmp_ref); - rb_erase(&thread->rb_node, &proc->threads); - t = thread->transaction_stack; - if (t) { - spin_lock(&t->lock); - if (t->to_thread == thread) - send_reply = t; - } else { - __acquire(&t->lock); - } - thread->is_dead = true; - - while (t) { - last_t = t; - active_transactions++; - binder_debug(BINDER_DEBUG_DEAD_TRANSACTION, - "release %d:%d transaction %d %s, still active\n", - proc->pid, thread->pid, - t->debug_id, - (t->to_thread == thread) ? "in" : "out"); - - if (t->to_thread == thread) { - thread->proc->outstanding_txns--; - t->to_proc = NULL; - t->to_thread = NULL; - if (t->buffer) { - t->buffer->transaction = NULL; - t->buffer = NULL; - } - t = t->to_parent; - } else if (t->from == thread) { - t->from = NULL; - t = t->from_parent; - } else - BUG(); - spin_unlock(&last_t->lock); - if (t) - spin_lock(&t->lock); - else - __acquire(&t->lock); - } - /* annotation for sparse, lock not acquired in last iteration above */ - __release(&t->lock); - - /* - * If this thread used poll, make sure we remove the waitqueue from any - * poll data structures holding it. - */ - if (thread->looper & BINDER_LOOPER_STATE_POLL) - wake_up_pollfree(&thread->wait); - - binder_inner_proc_unlock(thread->proc); - - /* - * This is needed to avoid races between wake_up_pollfree() above and - * someone else removing the last entry from the queue for other reasons - * (e.g. ep_remove_wait_queue() being called due to an epoll file - * descriptor being closed). Such other users hold an RCU read lock, so - * we can be sure they're done after we call synchronize_rcu(). 
- */ - if (thread->looper & BINDER_LOOPER_STATE_POLL) - synchronize_rcu(); - - if (send_reply) - binder_send_failed_reply(send_reply, BR_DEAD_REPLY); - binder_release_work(proc, &thread->todo); - binder_thread_dec_tmpref(thread); - return active_transactions; -} - -static __poll_t binder_poll(struct file *filp, - struct poll_table_struct *wait) -{ - struct binder_proc *proc = filp->private_data; - struct binder_thread *thread = NULL; - bool wait_for_proc_work; - - thread = binder_get_thread(proc); - if (!thread) - return POLLERR; - - binder_inner_proc_lock(thread->proc); - thread->looper |= BINDER_LOOPER_STATE_POLL; - wait_for_proc_work = binder_available_for_proc_work_ilocked(thread); - - binder_inner_proc_unlock(thread->proc); - - poll_wait(filp, &thread->wait, wait); - - if (binder_has_work(thread, wait_for_proc_work)) - return EPOLLIN; - - return 0; -} - -static int binder_ioctl_write_read(struct file *filp, unsigned long arg, - struct binder_thread *thread) -{ - int ret = 0; - struct binder_proc *proc = filp->private_data; - void __user *ubuf = (void __user *)arg; - struct binder_write_read bwr; - - if (copy_from_user(&bwr, ubuf, sizeof(bwr))) { - ret = -EFAULT; - goto out; - } - binder_debug(BINDER_DEBUG_READ_WRITE, - "%d:%d write %lld at %016llx, read %lld at %016llx\n", - proc->pid, thread->pid, - (u64)bwr.write_size, (u64)bwr.write_buffer, - (u64)bwr.read_size, (u64)bwr.read_buffer); - - if (bwr.write_size > 0) { - ret = binder_thread_write(proc, thread, - bwr.write_buffer, - bwr.write_size, - &bwr.write_consumed); - trace_binder_write_done(ret); - if (ret < 0) { - bwr.read_consumed = 0; - if (copy_to_user(ubuf, &bwr, sizeof(bwr))) - ret = -EFAULT; - goto out; - } - } - if (bwr.read_size > 0) { - ret = binder_thread_read(proc, thread, bwr.read_buffer, - bwr.read_size, - &bwr.read_consumed, - filp->f_flags & O_NONBLOCK); - trace_binder_read_done(ret); - binder_inner_proc_lock(proc); - if (!binder_worklist_empty_ilocked(&proc->todo)) - binder_wakeup_proc_ilocked(proc); - binder_inner_proc_unlock(proc); - if (ret < 0) { - if (copy_to_user(ubuf, &bwr, sizeof(bwr))) - ret = -EFAULT; - goto out; - } - } - binder_debug(BINDER_DEBUG_READ_WRITE, - "%d:%d wrote %lld of %lld, read return %lld of %lld\n", - proc->pid, thread->pid, - (u64)bwr.write_consumed, (u64)bwr.write_size, - (u64)bwr.read_consumed, (u64)bwr.read_size); - if (copy_to_user(ubuf, &bwr, sizeof(bwr))) { - ret = -EFAULT; - goto out; - } -out: - return ret; -} - -static int binder_ioctl_set_ctx_mgr(struct file *filp, - struct flat_binder_object *fbo) -{ - int ret = 0; - struct binder_proc *proc = filp->private_data; - struct binder_context *context = proc->context; - struct binder_node *new_node; - kuid_t curr_euid = current_euid(); - - mutex_lock(&context->context_mgr_node_lock); - if (context->binder_context_mgr_node) { - pr_err("BINDER_SET_CONTEXT_MGR already set\n"); - ret = -EBUSY; - goto out; - } - ret = security_binder_set_context_mgr(proc->cred); - if (ret < 0) - goto out; - if (uid_valid(context->binder_context_mgr_uid)) { - if (!uid_eq(context->binder_context_mgr_uid, curr_euid)) { - pr_err("BINDER_SET_CONTEXT_MGR bad uid %d != %d\n", - from_kuid(&init_user_ns, curr_euid), - from_kuid(&init_user_ns, - context->binder_context_mgr_uid)); - ret = -EPERM; - goto out; - } - } else { - context->binder_context_mgr_uid = curr_euid; - } - new_node = binder_new_node(proc, fbo); - if (!new_node) { - ret = -ENOMEM; - goto out; - } - binder_node_lock(new_node); - new_node->local_weak_refs++; - new_node->local_strong_refs++; - 
new_node->has_strong_ref = 1; - new_node->has_weak_ref = 1; - context->binder_context_mgr_node = new_node; - binder_node_unlock(new_node); - binder_put_node(new_node); -out: - mutex_unlock(&context->context_mgr_node_lock); - return ret; -} - -static int binder_ioctl_get_node_info_for_ref(struct binder_proc *proc, - struct binder_node_info_for_ref *info) -{ - struct binder_node *node; - struct binder_context *context = proc->context; - __u32 handle = info->handle; - - if (info->strong_count || info->weak_count || info->reserved1 || - info->reserved2 || info->reserved3) { - binder_user_error("%d BINDER_GET_NODE_INFO_FOR_REF: only handle may be non-zero.", - proc->pid); - return -EINVAL; - } - - /* This ioctl may only be used by the context manager */ - mutex_lock(&context->context_mgr_node_lock); - if (!context->binder_context_mgr_node || - context->binder_context_mgr_node->proc != proc) { - mutex_unlock(&context->context_mgr_node_lock); - return -EPERM; - } - mutex_unlock(&context->context_mgr_node_lock); - - node = binder_get_node_from_ref(proc, handle, true, NULL); - if (!node) - return -EINVAL; - - info->strong_count = node->local_strong_refs + - node->internal_strong_refs; - info->weak_count = node->local_weak_refs; - - binder_put_node(node); - - return 0; -} - -static int binder_ioctl_get_node_debug_info(struct binder_proc *proc, - struct binder_node_debug_info *info) -{ - struct rb_node *n; - binder_uintptr_t ptr = info->ptr; - - memset(info, 0, sizeof(*info)); - - binder_inner_proc_lock(proc); - for (n = rb_first(&proc->nodes); n != NULL; n = rb_next(n)) { - struct binder_node *node = rb_entry(n, struct binder_node, - rb_node); - if (node->ptr > ptr) { - info->ptr = node->ptr; - info->cookie = node->cookie; - info->has_strong_ref = node->has_strong_ref; - info->has_weak_ref = node->has_weak_ref; - break; - } - } - binder_inner_proc_unlock(proc); - - return 0; -} - -static bool binder_txns_pending_ilocked(struct binder_proc *proc) -{ - struct rb_node *n; - struct binder_thread *thread; - - if (proc->outstanding_txns > 0) - return true; - - for (n = rb_first(&proc->threads); n; n = rb_next(n)) { - thread = rb_entry(n, struct binder_thread, rb_node); - if (thread->transaction_stack) - return true; - } - return false; -} - -static int binder_ioctl_freeze(struct binder_freeze_info *info, - struct binder_proc *target_proc) -{ - int ret = 0; - - if (!info->enable) { - binder_inner_proc_lock(target_proc); - target_proc->sync_recv = false; - target_proc->async_recv = false; - target_proc->is_frozen = false; - binder_inner_proc_unlock(target_proc); - return 0; - } - - /* - * Freezing the target. Prevent new transactions by - * setting frozen state. If timeout specified, wait - * for transactions to drain. 
- */ - binder_inner_proc_lock(target_proc); - target_proc->sync_recv = false; - target_proc->async_recv = false; - target_proc->is_frozen = true; - binder_inner_proc_unlock(target_proc); - - if (info->timeout_ms > 0) - ret = wait_event_interruptible_timeout( - target_proc->freeze_wait, - (!target_proc->outstanding_txns), - msecs_to_jiffies(info->timeout_ms)); - - /* Check pending transactions that wait for reply */ - if (ret >= 0) { - binder_inner_proc_lock(target_proc); - if (binder_txns_pending_ilocked(target_proc)) - ret = -EAGAIN; - binder_inner_proc_unlock(target_proc); - } - - if (ret < 0) { - binder_inner_proc_lock(target_proc); - target_proc->is_frozen = false; - binder_inner_proc_unlock(target_proc); - } - - return ret; -} - -static int binder_ioctl_get_freezer_info( - struct binder_frozen_status_info *info) -{ - struct binder_proc *target_proc; - bool found = false; - __u32 txns_pending; - - info->sync_recv = 0; - info->async_recv = 0; - - mutex_lock(&binder_procs_lock); - hlist_for_each_entry(target_proc, &binder_procs, proc_node) { - if (target_proc->pid == info->pid) { - found = true; - binder_inner_proc_lock(target_proc); - txns_pending = binder_txns_pending_ilocked(target_proc); - info->sync_recv |= target_proc->sync_recv | - (txns_pending << 1); - info->async_recv |= target_proc->async_recv; - binder_inner_proc_unlock(target_proc); - } - } - mutex_unlock(&binder_procs_lock); - - if (!found) - return -EINVAL; - - return 0; -} - -static int binder_ioctl_get_extended_error(struct binder_thread *thread, - void __user *ubuf) -{ - struct binder_extended_error ee; - - binder_inner_proc_lock(thread->proc); - ee = thread->ee; - binder_set_extended_error(&thread->ee, 0, BR_OK, 0); - binder_inner_proc_unlock(thread->proc); - - if (copy_to_user(ubuf, &ee, sizeof(ee))) - return -EFAULT; - - return 0; -} - -static long binder_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) -{ - int ret; - struct binder_proc *proc = filp->private_data; - struct binder_thread *thread; - void __user *ubuf = (void __user *)arg; - - /*pr_info("binder_ioctl: %d:%d %x %lx\n", - proc->pid, current->pid, cmd, arg);*/ - - binder_selftest_alloc(&proc->alloc); - - trace_binder_ioctl(cmd, arg); - - ret = wait_event_interruptible(binder_user_error_wait, binder_stop_on_user_error < 2); - if (ret) - goto err_unlocked; - - thread = binder_get_thread(proc); - if (thread == NULL) { - ret = -ENOMEM; - goto err; - } - - switch (cmd) { - case BINDER_WRITE_READ: - ret = binder_ioctl_write_read(filp, arg, thread); - if (ret) - goto err; - break; - case BINDER_SET_MAX_THREADS: { - int max_threads; - - if (copy_from_user(&max_threads, ubuf, - sizeof(max_threads))) { - ret = -EINVAL; - goto err; - } - binder_inner_proc_lock(proc); - proc->max_threads = max_threads; - binder_inner_proc_unlock(proc); - break; - } - case BINDER_SET_CONTEXT_MGR_EXT: { - struct flat_binder_object fbo; - - if (copy_from_user(&fbo, ubuf, sizeof(fbo))) { - ret = -EINVAL; - goto err; - } - ret = binder_ioctl_set_ctx_mgr(filp, &fbo); - if (ret) - goto err; - break; - } - case BINDER_SET_CONTEXT_MGR: - ret = binder_ioctl_set_ctx_mgr(filp, NULL); - if (ret) - goto err; - break; - case BINDER_THREAD_EXIT: - binder_debug(BINDER_DEBUG_THREADS, "%d:%d exit\n", - proc->pid, thread->pid); - binder_thread_release(proc, thread); - thread = NULL; - break; - case BINDER_VERSION: { - struct binder_version __user *ver = ubuf; - - if (put_user(BINDER_CURRENT_PROTOCOL_VERSION, - &ver->protocol_version)) { - ret = -EINVAL; - goto err; - } - break; - } - case 
BINDER_GET_NODE_INFO_FOR_REF: { - struct binder_node_info_for_ref info; - - if (copy_from_user(&info, ubuf, sizeof(info))) { - ret = -EFAULT; - goto err; - } - - ret = binder_ioctl_get_node_info_for_ref(proc, &info); - if (ret < 0) - goto err; - - if (copy_to_user(ubuf, &info, sizeof(info))) { - ret = -EFAULT; - goto err; - } - - break; - } - case BINDER_GET_NODE_DEBUG_INFO: { - struct binder_node_debug_info info; - - if (copy_from_user(&info, ubuf, sizeof(info))) { - ret = -EFAULT; - goto err; - } - - ret = binder_ioctl_get_node_debug_info(proc, &info); - if (ret < 0) - goto err; - - if (copy_to_user(ubuf, &info, sizeof(info))) { - ret = -EFAULT; - goto err; - } - break; - } - case BINDER_FREEZE: { - struct binder_freeze_info info; - struct binder_proc **target_procs = NULL, *target_proc; - int target_procs_count = 0, i = 0; - - ret = 0; - - if (copy_from_user(&info, ubuf, sizeof(info))) { - ret = -EFAULT; - goto err; - } - - mutex_lock(&binder_procs_lock); - hlist_for_each_entry(target_proc, &binder_procs, proc_node) { - if (target_proc->pid == info.pid) - target_procs_count++; - } - - if (target_procs_count == 0) { - mutex_unlock(&binder_procs_lock); - ret = -EINVAL; - goto err; - } - - target_procs = kcalloc(target_procs_count, - sizeof(struct binder_proc *), - GFP_KERNEL); - - if (!target_procs) { - mutex_unlock(&binder_procs_lock); - ret = -ENOMEM; - goto err; - } - - hlist_for_each_entry(target_proc, &binder_procs, proc_node) { - if (target_proc->pid != info.pid) - continue; - - binder_inner_proc_lock(target_proc); - target_proc->tmp_ref++; - binder_inner_proc_unlock(target_proc); - - target_procs[i++] = target_proc; - } - mutex_unlock(&binder_procs_lock); - - for (i = 0; i < target_procs_count; i++) { - if (ret >= 0) - ret = binder_ioctl_freeze(&info, - target_procs[i]); - - binder_proc_dec_tmpref(target_procs[i]); - } - - kfree(target_procs); - - if (ret < 0) - goto err; - break; - } - case BINDER_GET_FROZEN_INFO: { - struct binder_frozen_status_info info; - - if (copy_from_user(&info, ubuf, sizeof(info))) { - ret = -EFAULT; - goto err; - } - - ret = binder_ioctl_get_freezer_info(&info); - if (ret < 0) - goto err; - - if (copy_to_user(ubuf, &info, sizeof(info))) { - ret = -EFAULT; - goto err; - } - break; - } - case BINDER_ENABLE_ONEWAY_SPAM_DETECTION: { - uint32_t enable; - - if (copy_from_user(&enable, ubuf, sizeof(enable))) { - ret = -EFAULT; - goto err; - } - binder_inner_proc_lock(proc); - proc->oneway_spam_detection_enabled = (bool)enable; - binder_inner_proc_unlock(proc); - break; - } - case BINDER_GET_EXTENDED_ERROR: - ret = binder_ioctl_get_extended_error(thread, ubuf); - if (ret < 0) - goto err; - break; - default: - ret = -EINVAL; - goto err; - } - ret = 0; -err: - if (thread) - thread->looper_need_return = false; - wait_event_interruptible(binder_user_error_wait, binder_stop_on_user_error < 2); - if (ret && ret != -EINTR) - pr_info("%d:%d ioctl %x %lx returned %d\n", proc->pid, current->pid, cmd, arg, ret); -err_unlocked: - trace_binder_ioctl_done(ret); - return ret; -} - -static void binder_vma_open(struct vm_area_struct *vma) -{ - struct binder_proc *proc = vma->vm_private_data; - - binder_debug(BINDER_DEBUG_OPEN_CLOSE, - "%d open vm area %lx-%lx (%ld K) vma %lx pagep %lx\n", - proc->pid, vma->vm_start, vma->vm_end, - (vma->vm_end - vma->vm_start) / SZ_1K, vma->vm_flags, - (unsigned long)pgprot_val(vma->vm_page_prot)); -} - -static void binder_vma_close(struct vm_area_struct *vma) -{ - struct binder_proc *proc = vma->vm_private_data; - - 
binder_debug(BINDER_DEBUG_OPEN_CLOSE, - "%d close vm area %lx-%lx (%ld K) vma %lx pagep %lx\n", - proc->pid, vma->vm_start, vma->vm_end, - (vma->vm_end - vma->vm_start) / SZ_1K, vma->vm_flags, - (unsigned long)pgprot_val(vma->vm_page_prot)); - binder_alloc_vma_close(&proc->alloc); -} - -static vm_fault_t binder_vm_fault(struct vm_fault *vmf) -{ - return VM_FAULT_SIGBUS; -} - -static const struct vm_operations_struct binder_vm_ops = { - .open = binder_vma_open, - .close = binder_vma_close, - .fault = binder_vm_fault, -}; - -static int binder_mmap(struct file *filp, struct vm_area_struct *vma) -{ - struct binder_proc *proc = filp->private_data; - - if (proc->tsk != current->group_leader) - return -EINVAL; - - binder_debug(BINDER_DEBUG_OPEN_CLOSE, - "%s: %d %lx-%lx (%ld K) vma %lx pagep %lx\n", - __func__, proc->pid, vma->vm_start, vma->vm_end, - (vma->vm_end - vma->vm_start) / SZ_1K, vma->vm_flags, - (unsigned long)pgprot_val(vma->vm_page_prot)); - - if (vma->vm_flags & FORBIDDEN_MMAP_FLAGS) { - pr_err("%s: %d %lx-%lx %s failed %d\n", __func__, - proc->pid, vma->vm_start, vma->vm_end, "bad vm_flags", -EPERM); - return -EPERM; - } - vm_flags_mod(vma, VM_DONTCOPY | VM_MIXEDMAP, VM_MAYWRITE); - - vma->vm_ops = &binder_vm_ops; - vma->vm_private_data = proc; - - return binder_alloc_mmap_handler(&proc->alloc, vma); -} - -static int binder_open(struct inode *nodp, struct file *filp) -{ - struct binder_proc *proc, *itr; - struct binder_device *binder_dev; - struct binderfs_info *info; - struct dentry *binder_binderfs_dir_entry_proc = NULL; - bool existing_pid = false; - - binder_debug(BINDER_DEBUG_OPEN_CLOSE, "%s: %d:%d\n", __func__, - current->group_leader->pid, current->pid); - - proc = kzalloc(sizeof(*proc), GFP_KERNEL); - if (proc == NULL) - return -ENOMEM; - spin_lock_init(&proc->inner_lock); - spin_lock_init(&proc->outer_lock); - get_task_struct(current->group_leader); - proc->tsk = current->group_leader; - proc->cred = get_cred(filp->f_cred); - INIT_LIST_HEAD(&proc->todo); - init_waitqueue_head(&proc->freeze_wait); - proc->default_priority = task_nice(current); - /* binderfs stashes devices in i_private */ - if (is_binderfs_device(nodp)) { - binder_dev = nodp->i_private; - info = nodp->i_sb->s_fs_info; - binder_binderfs_dir_entry_proc = info->proc_log_dir; - } else { - binder_dev = container_of(filp->private_data, - struct binder_device, miscdev); - } - refcount_inc(&binder_dev->ref); - proc->context = &binder_dev->context; - binder_alloc_init(&proc->alloc); - - binder_stats_created(BINDER_STAT_PROC); - proc->pid = current->group_leader->pid; - INIT_LIST_HEAD(&proc->delivered_death); - INIT_LIST_HEAD(&proc->waiting_threads); - filp->private_data = proc; - - mutex_lock(&binder_procs_lock); - hlist_for_each_entry(itr, &binder_procs, proc_node) { - if (itr->pid == proc->pid) { - existing_pid = true; - break; - } - } - hlist_add_head(&proc->proc_node, &binder_procs); - mutex_unlock(&binder_procs_lock); - - if (binder_debugfs_dir_entry_proc && !existing_pid) { - char strbuf[11]; - - snprintf(strbuf, sizeof(strbuf), "%u", proc->pid); - /* - * proc debug entries are shared between contexts. - * Only create for the first PID to avoid debugfs log spamming - * The printing code will anyway print all contexts for a given - * PID so this is not a problem. 
- */ - proc->debugfs_entry = debugfs_create_file(strbuf, 0444, - binder_debugfs_dir_entry_proc, - (void *)(unsigned long)proc->pid, - &proc_fops); - } - - if (binder_binderfs_dir_entry_proc && !existing_pid) { - char strbuf[11]; - struct dentry *binderfs_entry; - - snprintf(strbuf, sizeof(strbuf), "%u", proc->pid); - /* - * Similar to debugfs, the process specific log file is shared - * between contexts. Only create for the first PID. - * This is ok since same as debugfs, the log file will contain - * information on all contexts of a given PID. - */ - binderfs_entry = binderfs_create_file(binder_binderfs_dir_entry_proc, - strbuf, &proc_fops, (void *)(unsigned long)proc->pid); - if (!IS_ERR(binderfs_entry)) { - proc->binderfs_entry = binderfs_entry; - } else { - int error; - - error = PTR_ERR(binderfs_entry); - pr_warn("Unable to create file %s in binderfs (error %d)\n", - strbuf, error); - } - } - - return 0; -} - -static int binder_flush(struct file *filp, fl_owner_t id) -{ - struct binder_proc *proc = filp->private_data; - - binder_defer_work(proc, BINDER_DEFERRED_FLUSH); - - return 0; -} - -static void binder_deferred_flush(struct binder_proc *proc) -{ - struct rb_node *n; - int wake_count = 0; - - binder_inner_proc_lock(proc); - for (n = rb_first(&proc->threads); n != NULL; n = rb_next(n)) { - struct binder_thread *thread = rb_entry(n, struct binder_thread, rb_node); - - thread->looper_need_return = true; - if (thread->looper & BINDER_LOOPER_STATE_WAITING) { - wake_up_interruptible(&thread->wait); - wake_count++; - } - } - binder_inner_proc_unlock(proc); - - binder_debug(BINDER_DEBUG_OPEN_CLOSE, - "binder_flush: %d woke %d threads\n", proc->pid, - wake_count); -} - -static int binder_release(struct inode *nodp, struct file *filp) -{ - struct binder_proc *proc = filp->private_data; - - debugfs_remove(proc->debugfs_entry); - - if (proc->binderfs_entry) { - binderfs_remove_file(proc->binderfs_entry); - proc->binderfs_entry = NULL; - } - - binder_defer_work(proc, BINDER_DEFERRED_RELEASE); - - return 0; -} - -static int binder_node_release(struct binder_node *node, int refs) -{ - struct binder_ref *ref; - int death = 0; - struct binder_proc *proc = node->proc; - - binder_release_work(proc, &node->async_todo); - - binder_node_lock(node); - binder_inner_proc_lock(proc); - binder_dequeue_work_ilocked(&node->work); - /* - * The caller must have taken a temporary ref on the node, - */ - BUG_ON(!node->tmp_refs); - if (hlist_empty(&node->refs) && node->tmp_refs == 1) { - binder_inner_proc_unlock(proc); - binder_node_unlock(node); - binder_free_node(node); - - return refs; - } - - node->proc = NULL; - node->local_strong_refs = 0; - node->local_weak_refs = 0; - binder_inner_proc_unlock(proc); - - spin_lock(&binder_dead_nodes_lock); - hlist_add_head(&node->dead_node, &binder_dead_nodes); - spin_unlock(&binder_dead_nodes_lock); - - hlist_for_each_entry(ref, &node->refs, node_entry) { - refs++; - /* - * Need the node lock to synchronize - * with new notification requests and the - * inner lock to synchronize with queued - * death notifications. 
- */ - binder_inner_proc_lock(ref->proc); - if (!ref->death) { - binder_inner_proc_unlock(ref->proc); - continue; - } - - death++; - - BUG_ON(!list_empty(&ref->death->work.entry)); - ref->death->work.type = BINDER_WORK_DEAD_BINDER; - binder_enqueue_work_ilocked(&ref->death->work, - &ref->proc->todo); - binder_wakeup_proc_ilocked(ref->proc); - binder_inner_proc_unlock(ref->proc); - } - - binder_debug(BINDER_DEBUG_DEAD_BINDER, - "node %d now dead, refs %d, death %d\n", - node->debug_id, refs, death); - binder_node_unlock(node); - binder_put_node(node); - - return refs; -} - -static void binder_deferred_release(struct binder_proc *proc) -{ - struct binder_context *context = proc->context; - struct rb_node *n; - int threads, nodes, incoming_refs, outgoing_refs, active_transactions; - - mutex_lock(&binder_procs_lock); - hlist_del(&proc->proc_node); - mutex_unlock(&binder_procs_lock); - - mutex_lock(&context->context_mgr_node_lock); - if (context->binder_context_mgr_node && - context->binder_context_mgr_node->proc == proc) { - binder_debug(BINDER_DEBUG_DEAD_BINDER, - "%s: %d context_mgr_node gone\n", - __func__, proc->pid); - context->binder_context_mgr_node = NULL; - } - mutex_unlock(&context->context_mgr_node_lock); - binder_inner_proc_lock(proc); - /* - * Make sure proc stays alive after we - * remove all the threads - */ - proc->tmp_ref++; - - proc->is_dead = true; - proc->is_frozen = false; - proc->sync_recv = false; - proc->async_recv = false; - threads = 0; - active_transactions = 0; - while ((n = rb_first(&proc->threads))) { - struct binder_thread *thread; - - thread = rb_entry(n, struct binder_thread, rb_node); - binder_inner_proc_unlock(proc); - threads++; - active_transactions += binder_thread_release(proc, thread); - binder_inner_proc_lock(proc); - } - - nodes = 0; - incoming_refs = 0; - while ((n = rb_first(&proc->nodes))) { - struct binder_node *node; - - node = rb_entry(n, struct binder_node, rb_node); - nodes++; - /* - * take a temporary ref on the node before - * calling binder_node_release() which will either - * kfree() the node or call binder_put_node() - */ - binder_inc_node_tmpref_ilocked(node); - rb_erase(&node->rb_node, &proc->nodes); - binder_inner_proc_unlock(proc); - incoming_refs = binder_node_release(node, incoming_refs); - binder_inner_proc_lock(proc); - } - binder_inner_proc_unlock(proc); - - outgoing_refs = 0; - binder_proc_lock(proc); - while ((n = rb_first(&proc->refs_by_desc))) { - struct binder_ref *ref; - - ref = rb_entry(n, struct binder_ref, rb_node_desc); - outgoing_refs++; - binder_cleanup_ref_olocked(ref); - binder_proc_unlock(proc); - binder_free_ref(ref); - binder_proc_lock(proc); - } - binder_proc_unlock(proc); - - binder_release_work(proc, &proc->todo); - binder_release_work(proc, &proc->delivered_death); - - binder_debug(BINDER_DEBUG_OPEN_CLOSE, - "%s: %d threads %d, nodes %d (ref %d), refs %d, active transactions %d\n", - __func__, proc->pid, threads, nodes, incoming_refs, - outgoing_refs, active_transactions); - - binder_proc_dec_tmpref(proc); -} - -static void binder_deferred_func(struct work_struct *work) -{ - struct binder_proc *proc; - - int defer; - - do { - mutex_lock(&binder_deferred_lock); - if (!hlist_empty(&binder_deferred_list)) { - proc = hlist_entry(binder_deferred_list.first, - struct binder_proc, deferred_work_node); - hlist_del_init(&proc->deferred_work_node); - defer = proc->deferred_work; - proc->deferred_work = 0; - } else { - proc = NULL; - defer = 0; - } - mutex_unlock(&binder_deferred_lock); - - if (defer & 
BINDER_DEFERRED_FLUSH) - binder_deferred_flush(proc); - - if (defer & BINDER_DEFERRED_RELEASE) - binder_deferred_release(proc); /* frees proc */ - } while (proc); -} -static DECLARE_WORK(binder_deferred_work, binder_deferred_func); - -static void -binder_defer_work(struct binder_proc *proc, enum binder_deferred_state defer) -{ - mutex_lock(&binder_deferred_lock); - proc->deferred_work |= defer; - if (hlist_unhashed(&proc->deferred_work_node)) { - hlist_add_head(&proc->deferred_work_node, - &binder_deferred_list); - schedule_work(&binder_deferred_work); - } - mutex_unlock(&binder_deferred_lock); -} - -static void print_binder_transaction_ilocked(struct seq_file *m, - struct binder_proc *proc, - const char *prefix, - struct binder_transaction *t) -{ - struct binder_proc *to_proc; - struct binder_buffer *buffer = t->buffer; - ktime_t current_time = ktime_get(); - - spin_lock(&t->lock); - to_proc = t->to_proc; - seq_printf(m, - "%s %d: %pK from %d:%d to %d:%d code %x flags %x pri %ld r%d elapsed %lldms", - prefix, t->debug_id, t, - t->from_pid, - t->from_tid, - to_proc ? to_proc->pid : 0, - t->to_thread ? t->to_thread->pid : 0, - t->code, t->flags, t->priority, t->need_reply, - ktime_ms_delta(current_time, t->start_time)); - spin_unlock(&t->lock); - - if (proc != to_proc) { - /* - * Can only safely deref buffer if we are holding the - * correct proc inner lock for this node - */ - seq_puts(m, "\n"); - return; - } - - if (buffer == NULL) { - seq_puts(m, " buffer free\n"); - return; - } - if (buffer->target_node) - seq_printf(m, " node %d", buffer->target_node->debug_id); - seq_printf(m, " size %zd:%zd data %pK\n", - buffer->data_size, buffer->offsets_size, - buffer->user_data); -} - -static void print_binder_work_ilocked(struct seq_file *m, - struct binder_proc *proc, - const char *prefix, - const char *transaction_prefix, - struct binder_work *w) -{ - struct binder_node *node; - struct binder_transaction *t; - - switch (w->type) { - case BINDER_WORK_TRANSACTION: - t = container_of(w, struct binder_transaction, work); - print_binder_transaction_ilocked( - m, proc, transaction_prefix, t); - break; - case BINDER_WORK_RETURN_ERROR: { - struct binder_error *e = container_of( - w, struct binder_error, work); - - seq_printf(m, "%stransaction error: %u\n", - prefix, e->cmd); - } break; - case BINDER_WORK_TRANSACTION_COMPLETE: - seq_printf(m, "%stransaction complete\n", prefix); - break; - case BINDER_WORK_NODE: - node = container_of(w, struct binder_node, work); - seq_printf(m, "%snode work %d: u%016llx c%016llx\n", - prefix, node->debug_id, - (u64)node->ptr, (u64)node->cookie); - break; - case BINDER_WORK_DEAD_BINDER: - seq_printf(m, "%shas dead binder\n", prefix); - break; - case BINDER_WORK_DEAD_BINDER_AND_CLEAR: - seq_printf(m, "%shas cleared dead binder\n", prefix); - break; - case BINDER_WORK_CLEAR_DEATH_NOTIFICATION: - seq_printf(m, "%shas cleared death notification\n", prefix); - break; - default: - seq_printf(m, "%sunknown work: type %d\n", prefix, w->type); - break; - } -} - -static void print_binder_thread_ilocked(struct seq_file *m, - struct binder_thread *thread, - int print_always) -{ - struct binder_transaction *t; - struct binder_work *w; - size_t start_pos = m->count; - size_t header_pos; - - seq_printf(m, " thread %d: l %02x need_return %d tr %d\n", - thread->pid, thread->looper, - thread->looper_need_return, - atomic_read(&thread->tmp_ref)); - header_pos = m->count; - t = thread->transaction_stack; - while (t) { - if (t->from == thread) { - print_binder_transaction_ilocked(m, 
thread->proc, - " outgoing transaction", t); - t = t->from_parent; - } else if (t->to_thread == thread) { - print_binder_transaction_ilocked(m, thread->proc, - " incoming transaction", t); - t = t->to_parent; - } else { - print_binder_transaction_ilocked(m, thread->proc, - " bad transaction", t); - t = NULL; - } - } - list_for_each_entry(w, &thread->todo, entry) { - print_binder_work_ilocked(m, thread->proc, " ", - " pending transaction", w); - } - if (!print_always && m->count == header_pos) - m->count = start_pos; -} - -static void print_binder_node_nilocked(struct seq_file *m, - struct binder_node *node) -{ - struct binder_ref *ref; - struct binder_work *w; - int count; - - count = 0; - hlist_for_each_entry(ref, &node->refs, node_entry) - count++; - - seq_printf(m, " node %d: u%016llx c%016llx hs %d hw %d ls %d lw %d is %d iw %d tr %d", - node->debug_id, (u64)node->ptr, (u64)node->cookie, - node->has_strong_ref, node->has_weak_ref, - node->local_strong_refs, node->local_weak_refs, - node->internal_strong_refs, count, node->tmp_refs); - if (count) { - seq_puts(m, " proc"); - hlist_for_each_entry(ref, &node->refs, node_entry) - seq_printf(m, " %d", ref->proc->pid); - } - seq_puts(m, "\n"); - if (node->proc) { - list_for_each_entry(w, &node->async_todo, entry) - print_binder_work_ilocked(m, node->proc, " ", - " pending async transaction", w); - } -} - -static void print_binder_ref_olocked(struct seq_file *m, - struct binder_ref *ref) -{ - binder_node_lock(ref->node); - seq_printf(m, " ref %d: desc %d %snode %d s %d w %d d %pK\n", - ref->data.debug_id, ref->data.desc, - ref->node->proc ? "" : "dead ", - ref->node->debug_id, ref->data.strong, - ref->data.weak, ref->death); - binder_node_unlock(ref->node); -} - -static void print_binder_proc(struct seq_file *m, - struct binder_proc *proc, int print_all) -{ - struct binder_work *w; - struct rb_node *n; - size_t start_pos = m->count; - size_t header_pos; - struct binder_node *last_node = NULL; - - seq_printf(m, "proc %d\n", proc->pid); - seq_printf(m, "context %s\n", proc->context->name); - header_pos = m->count; - - binder_inner_proc_lock(proc); - for (n = rb_first(&proc->threads); n != NULL; n = rb_next(n)) - print_binder_thread_ilocked(m, rb_entry(n, struct binder_thread, - rb_node), print_all); - - for (n = rb_first(&proc->nodes); n != NULL; n = rb_next(n)) { - struct binder_node *node = rb_entry(n, struct binder_node, - rb_node); - if (!print_all && !node->has_async_transaction) - continue; - - /* - * take a temporary reference on the node so it - * survives and isn't removed from the tree - * while we print it. 
- */ - binder_inc_node_tmpref_ilocked(node); - /* Need to drop inner lock to take node lock */ - binder_inner_proc_unlock(proc); - if (last_node) - binder_put_node(last_node); - binder_node_inner_lock(node); - print_binder_node_nilocked(m, node); - binder_node_inner_unlock(node); - last_node = node; - binder_inner_proc_lock(proc); - } - binder_inner_proc_unlock(proc); - if (last_node) - binder_put_node(last_node); - - if (print_all) { - binder_proc_lock(proc); - for (n = rb_first(&proc->refs_by_desc); - n != NULL; - n = rb_next(n)) - print_binder_ref_olocked(m, rb_entry(n, - struct binder_ref, - rb_node_desc)); - binder_proc_unlock(proc); - } - binder_alloc_print_allocated(m, &proc->alloc); - binder_inner_proc_lock(proc); - list_for_each_entry(w, &proc->todo, entry) - print_binder_work_ilocked(m, proc, " ", - " pending transaction", w); - list_for_each_entry(w, &proc->delivered_death, entry) { - seq_puts(m, " has delivered dead binder\n"); - break; - } - binder_inner_proc_unlock(proc); - if (!print_all && m->count == header_pos) - m->count = start_pos; -} - -static const char * const binder_return_strings[] = { - "BR_ERROR", - "BR_OK", - "BR_TRANSACTION", - "BR_REPLY", - "BR_ACQUIRE_RESULT", - "BR_DEAD_REPLY", - "BR_TRANSACTION_COMPLETE", - "BR_INCREFS", - "BR_ACQUIRE", - "BR_RELEASE", - "BR_DECREFS", - "BR_ATTEMPT_ACQUIRE", - "BR_NOOP", - "BR_SPAWN_LOOPER", - "BR_FINISHED", - "BR_DEAD_BINDER", - "BR_CLEAR_DEATH_NOTIFICATION_DONE", - "BR_FAILED_REPLY", - "BR_FROZEN_REPLY", - "BR_ONEWAY_SPAM_SUSPECT", - "BR_TRANSACTION_PENDING_FROZEN" -}; - -static const char * const binder_command_strings[] = { - "BC_TRANSACTION", - "BC_REPLY", - "BC_ACQUIRE_RESULT", - "BC_FREE_BUFFER", - "BC_INCREFS", - "BC_ACQUIRE", - "BC_RELEASE", - "BC_DECREFS", - "BC_INCREFS_DONE", - "BC_ACQUIRE_DONE", - "BC_ATTEMPT_ACQUIRE", - "BC_REGISTER_LOOPER", - "BC_ENTER_LOOPER", - "BC_EXIT_LOOPER", - "BC_REQUEST_DEATH_NOTIFICATION", - "BC_CLEAR_DEATH_NOTIFICATION", - "BC_DEAD_BINDER_DONE", - "BC_TRANSACTION_SG", - "BC_REPLY_SG", -}; - -static const char * const binder_objstat_strings[] = { - "proc", - "thread", - "node", - "ref", - "death", - "transaction", - "transaction_complete" -}; - -static void print_binder_stats(struct seq_file *m, const char *prefix, - struct binder_stats *stats) -{ - int i; - - BUILD_BUG_ON(ARRAY_SIZE(stats->bc) != - ARRAY_SIZE(binder_command_strings)); - for (i = 0; i < ARRAY_SIZE(stats->bc); i++) { - int temp = atomic_read(&stats->bc[i]); - - if (temp) - seq_printf(m, "%s%s: %d\n", prefix, - binder_command_strings[i], temp); - } - - BUILD_BUG_ON(ARRAY_SIZE(stats->br) != - ARRAY_SIZE(binder_return_strings)); - for (i = 0; i < ARRAY_SIZE(stats->br); i++) { - int temp = atomic_read(&stats->br[i]); - - if (temp) - seq_printf(m, "%s%s: %d\n", prefix, - binder_return_strings[i], temp); - } - - BUILD_BUG_ON(ARRAY_SIZE(stats->obj_created) != - ARRAY_SIZE(binder_objstat_strings)); - BUILD_BUG_ON(ARRAY_SIZE(stats->obj_created) != - ARRAY_SIZE(stats->obj_deleted)); - for (i = 0; i < ARRAY_SIZE(stats->obj_created); i++) { - int created = atomic_read(&stats->obj_created[i]); - int deleted = atomic_read(&stats->obj_deleted[i]); - - if (created || deleted) - seq_printf(m, "%s%s: active %d total %d\n", - prefix, - binder_objstat_strings[i], - created - deleted, - created); - } -} - -static void print_binder_proc_stats(struct seq_file *m, - struct binder_proc *proc) -{ - struct binder_work *w; - struct binder_thread *thread; - struct rb_node *n; - int count, strong, weak, ready_threads; - size_t free_async_space = - 
binder_alloc_get_free_async_space(&proc->alloc); - - seq_printf(m, "proc %d\n", proc->pid); - seq_printf(m, "context %s\n", proc->context->name); - count = 0; - ready_threads = 0; - binder_inner_proc_lock(proc); - for (n = rb_first(&proc->threads); n != NULL; n = rb_next(n)) - count++; - - list_for_each_entry(thread, &proc->waiting_threads, waiting_thread_node) - ready_threads++; - - seq_printf(m, " threads: %d\n", count); - seq_printf(m, " requested threads: %d+%d/%d\n" - " ready threads %d\n" - " free async space %zd\n", proc->requested_threads, - proc->requested_threads_started, proc->max_threads, - ready_threads, - free_async_space); - count = 0; - for (n = rb_first(&proc->nodes); n != NULL; n = rb_next(n)) - count++; - binder_inner_proc_unlock(proc); - seq_printf(m, " nodes: %d\n", count); - count = 0; - strong = 0; - weak = 0; - binder_proc_lock(proc); - for (n = rb_first(&proc->refs_by_desc); n != NULL; n = rb_next(n)) { - struct binder_ref *ref = rb_entry(n, struct binder_ref, - rb_node_desc); - count++; - strong += ref->data.strong; - weak += ref->data.weak; - } - binder_proc_unlock(proc); - seq_printf(m, " refs: %d s %d w %d\n", count, strong, weak); - - count = binder_alloc_get_allocated_count(&proc->alloc); - seq_printf(m, " buffers: %d\n", count); - - binder_alloc_print_pages(m, &proc->alloc); - - count = 0; - binder_inner_proc_lock(proc); - list_for_each_entry(w, &proc->todo, entry) { - if (w->type == BINDER_WORK_TRANSACTION) - count++; - } - binder_inner_proc_unlock(proc); - seq_printf(m, " pending transactions: %d\n", count); - - print_binder_stats(m, " ", &proc->stats); -} - -static int state_show(struct seq_file *m, void *unused) -{ - struct binder_proc *proc; - struct binder_node *node; - struct binder_node *last_node = NULL; - - seq_puts(m, "binder state:\n"); - - spin_lock(&binder_dead_nodes_lock); - if (!hlist_empty(&binder_dead_nodes)) - seq_puts(m, "dead nodes:\n"); - hlist_for_each_entry(node, &binder_dead_nodes, dead_node) { - /* - * take a temporary reference on the node so it - * survives and isn't removed from the list - * while we print it. 
- */ - node->tmp_refs++; - spin_unlock(&binder_dead_nodes_lock); - if (last_node) - binder_put_node(last_node); - binder_node_lock(node); - print_binder_node_nilocked(m, node); - binder_node_unlock(node); - last_node = node; - spin_lock(&binder_dead_nodes_lock); - } - spin_unlock(&binder_dead_nodes_lock); - if (last_node) - binder_put_node(last_node); - - mutex_lock(&binder_procs_lock); - hlist_for_each_entry(proc, &binder_procs, proc_node) - print_binder_proc(m, proc, 1); - mutex_unlock(&binder_procs_lock); - - return 0; -} - -static int stats_show(struct seq_file *m, void *unused) -{ - struct binder_proc *proc; - - seq_puts(m, "binder stats:\n"); - - print_binder_stats(m, "", &binder_stats); - - mutex_lock(&binder_procs_lock); - hlist_for_each_entry(proc, &binder_procs, proc_node) - print_binder_proc_stats(m, proc); - mutex_unlock(&binder_procs_lock); - - return 0; -} - -static int transactions_show(struct seq_file *m, void *unused) -{ - struct binder_proc *proc; - - seq_puts(m, "binder transactions:\n"); - mutex_lock(&binder_procs_lock); - hlist_for_each_entry(proc, &binder_procs, proc_node) - print_binder_proc(m, proc, 0); - mutex_unlock(&binder_procs_lock); - - return 0; -} - -static int proc_show(struct seq_file *m, void *unused) -{ - struct binder_proc *itr; - int pid = (unsigned long)m->private; - - mutex_lock(&binder_procs_lock); - hlist_for_each_entry(itr, &binder_procs, proc_node) { - if (itr->pid == pid) { - seq_puts(m, "binder proc state:\n"); - print_binder_proc(m, itr, 1); - } - } - mutex_unlock(&binder_procs_lock); - - return 0; -} - -static void print_binder_transaction_log_entry(struct seq_file *m, - struct binder_transaction_log_entry *e) -{ - int debug_id = READ_ONCE(e->debug_id_done); - /* - * read barrier to guarantee debug_id_done read before - * we print the log values - */ - smp_rmb(); - seq_printf(m, - "%d: %s from %d:%d to %d:%d context %s node %d handle %d size %d:%d ret %d/%d l=%d", - e->debug_id, (e->call_type == 2) ? "reply" : - ((e->call_type == 1) ? "async" : "call "), e->from_proc, - e->from_thread, e->to_proc, e->to_thread, e->context_name, - e->to_node, e->target_handle, e->data_size, e->offsets_size, - e->return_error, e->return_error_param, - e->return_error_line); - /* - * read-barrier to guarantee read of debug_id_done after - * done printing the fields of the entry - */ - smp_rmb(); - seq_printf(m, debug_id && debug_id == READ_ONCE(e->debug_id_done) ? - "\n" : " (incomplete)\n"); -} - -static int transaction_log_show(struct seq_file *m, void *unused) -{ - struct binder_transaction_log *log = m->private; - unsigned int log_cur = atomic_read(&log->cur); - unsigned int count; - unsigned int cur; - int i; - - count = log_cur + 1; - cur = count < ARRAY_SIZE(log->entry) && !log->full ? 
- 0 : count % ARRAY_SIZE(log->entry); - if (count > ARRAY_SIZE(log->entry) || log->full) - count = ARRAY_SIZE(log->entry); - for (i = 0; i < count; i++) { - unsigned int index = cur++ % ARRAY_SIZE(log->entry); - - print_binder_transaction_log_entry(m, &log->entry[index]); - } - return 0; -} - -const struct file_operations binder_fops = { - .owner = THIS_MODULE, - .poll = binder_poll, - .unlocked_ioctl = binder_ioctl, - .compat_ioctl = compat_ptr_ioctl, - .mmap = binder_mmap, - .open = binder_open, - .flush = binder_flush, - .release = binder_release, -}; - -DEFINE_SHOW_ATTRIBUTE(state); -DEFINE_SHOW_ATTRIBUTE(stats); -DEFINE_SHOW_ATTRIBUTE(transactions); -DEFINE_SHOW_ATTRIBUTE(transaction_log); - -const struct binder_debugfs_entry binder_debugfs_entries[] = { - { - .name = "state", - .mode = 0444, - .fops = &state_fops, - .data = NULL, - }, - { - .name = "stats", - .mode = 0444, - .fops = &stats_fops, - .data = NULL, - }, - { - .name = "transactions", - .mode = 0444, - .fops = &transactions_fops, - .data = NULL, - }, - { - .name = "transaction_log", - .mode = 0444, - .fops = &transaction_log_fops, - .data = &binder_transaction_log, - }, - { - .name = "failed_transaction_log", - .mode = 0444, - .fops = &transaction_log_fops, - .data = &binder_transaction_log_failed, - }, - {} /* terminator */ -}; - -static int __init init_binder_device(const char *name) -{ - int ret; - struct binder_device *binder_device; - - binder_device = kzalloc(sizeof(*binder_device), GFP_KERNEL); - if (!binder_device) - return -ENOMEM; - - binder_device->miscdev.fops = &binder_fops; - binder_device->miscdev.minor = MISC_DYNAMIC_MINOR; - binder_device->miscdev.name = name; - - refcount_set(&binder_device->ref, 1); - binder_device->context.binder_context_mgr_uid = INVALID_UID; - binder_device->context.name = name; - mutex_init(&binder_device->context.context_mgr_node_lock); - - ret = misc_register(&binder_device->miscdev); - if (ret < 0) { - kfree(binder_device); - return ret; - } - - hlist_add_head(&binder_device->hlist, &binder_devices); - - return ret; -} - -static int __init binder_init(void) -{ - int ret; - char *device_name, *device_tmp; - struct binder_device *device; - struct hlist_node *tmp; - char *device_names = NULL; - const struct binder_debugfs_entry *db_entry; - - ret = binder_alloc_shrinker_init(); - if (ret) - return ret; - - atomic_set(&binder_transaction_log.cur, ~0U); - atomic_set(&binder_transaction_log_failed.cur, ~0U); - - binder_debugfs_dir_entry_root = debugfs_create_dir("binder", NULL); - - binder_for_each_debugfs_entry(db_entry) - debugfs_create_file(db_entry->name, - db_entry->mode, - binder_debugfs_dir_entry_root, - db_entry->data, - db_entry->fops); - - binder_debugfs_dir_entry_proc = debugfs_create_dir("proc", - binder_debugfs_dir_entry_root); - - if (!IS_ENABLED(CONFIG_ANDROID_BINDERFS) && - strcmp(binder_devices_param, "") != 0) { - /* - * Copy the module_parameter string, because we don't want to - * tokenize it in-place. 
- */ - device_names = kstrdup(binder_devices_param, GFP_KERNEL); - if (!device_names) { - ret = -ENOMEM; - goto err_alloc_device_names_failed; - } - - device_tmp = device_names; - while ((device_name = strsep(&device_tmp, ","))) { - ret = init_binder_device(device_name); - if (ret) - goto err_init_binder_device_failed; - } - } - - ret = init_binderfs(); - if (ret) - goto err_init_binder_device_failed; - - return ret; - -err_init_binder_device_failed: - hlist_for_each_entry_safe(device, tmp, &binder_devices, hlist) { - misc_deregister(&device->miscdev); - hlist_del(&device->hlist); - kfree(device); - } - - kfree(device_names); - -err_alloc_device_names_failed: - debugfs_remove_recursive(binder_debugfs_dir_entry_root); - binder_alloc_shrinker_exit(); - - return ret; -} - -device_initcall(binder_init); - -#define CREATE_TRACE_POINTS -#include "binder_trace.h" - -MODULE_LICENSE("GPL v2"); diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c deleted file mode 100644 index e3db8297095a..000000000000 --- a/drivers/android/binder_alloc.c +++ /dev/null @@ -1,1284 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-only -/* binder_alloc.c - * - * Android IPC Subsystem - * - * Copyright (C) 2007-2017 Google, Inc. - */ - -#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include "binder_alloc.h" -#include "binder_trace.h" - -struct list_lru binder_alloc_lru; - -static DEFINE_MUTEX(binder_alloc_mmap_lock); - -enum { - BINDER_DEBUG_USER_ERROR = 1U << 0, - BINDER_DEBUG_OPEN_CLOSE = 1U << 1, - BINDER_DEBUG_BUFFER_ALLOC = 1U << 2, - BINDER_DEBUG_BUFFER_ALLOC_ASYNC = 1U << 3, -}; -static uint32_t binder_alloc_debug_mask = BINDER_DEBUG_USER_ERROR; - -module_param_named(debug_mask, binder_alloc_debug_mask, - uint, 0644); - -#define binder_alloc_debug(mask, x...) 
\ - do { \ - if (binder_alloc_debug_mask & mask) \ - pr_info_ratelimited(x); \ - } while (0) - -static struct binder_buffer *binder_buffer_next(struct binder_buffer *buffer) -{ - return list_entry(buffer->entry.next, struct binder_buffer, entry); -} - -static struct binder_buffer *binder_buffer_prev(struct binder_buffer *buffer) -{ - return list_entry(buffer->entry.prev, struct binder_buffer, entry); -} - -static size_t binder_alloc_buffer_size(struct binder_alloc *alloc, - struct binder_buffer *buffer) -{ - if (list_is_last(&buffer->entry, &alloc->buffers)) - return alloc->buffer + alloc->buffer_size - buffer->user_data; - return binder_buffer_next(buffer)->user_data - buffer->user_data; -} - -static void binder_insert_free_buffer(struct binder_alloc *alloc, - struct binder_buffer *new_buffer) -{ - struct rb_node **p = &alloc->free_buffers.rb_node; - struct rb_node *parent = NULL; - struct binder_buffer *buffer; - size_t buffer_size; - size_t new_buffer_size; - - BUG_ON(!new_buffer->free); - - new_buffer_size = binder_alloc_buffer_size(alloc, new_buffer); - - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: add free buffer, size %zd, at %pK\n", - alloc->pid, new_buffer_size, new_buffer); - - while (*p) { - parent = *p; - buffer = rb_entry(parent, struct binder_buffer, rb_node); - BUG_ON(!buffer->free); - - buffer_size = binder_alloc_buffer_size(alloc, buffer); - - if (new_buffer_size < buffer_size) - p = &parent->rb_left; - else - p = &parent->rb_right; - } - rb_link_node(&new_buffer->rb_node, parent, p); - rb_insert_color(&new_buffer->rb_node, &alloc->free_buffers); -} - -static void binder_insert_allocated_buffer_locked( - struct binder_alloc *alloc, struct binder_buffer *new_buffer) -{ - struct rb_node **p = &alloc->allocated_buffers.rb_node; - struct rb_node *parent = NULL; - struct binder_buffer *buffer; - - BUG_ON(new_buffer->free); - - while (*p) { - parent = *p; - buffer = rb_entry(parent, struct binder_buffer, rb_node); - BUG_ON(buffer->free); - - if (new_buffer->user_data < buffer->user_data) - p = &parent->rb_left; - else if (new_buffer->user_data > buffer->user_data) - p = &parent->rb_right; - else - BUG(); - } - rb_link_node(&new_buffer->rb_node, parent, p); - rb_insert_color(&new_buffer->rb_node, &alloc->allocated_buffers); -} - -static struct binder_buffer *binder_alloc_prepare_to_free_locked( - struct binder_alloc *alloc, - uintptr_t user_ptr) -{ - struct rb_node *n = alloc->allocated_buffers.rb_node; - struct binder_buffer *buffer; - void __user *uptr; - - uptr = (void __user *)user_ptr; - - while (n) { - buffer = rb_entry(n, struct binder_buffer, rb_node); - BUG_ON(buffer->free); - - if (uptr < buffer->user_data) - n = n->rb_left; - else if (uptr > buffer->user_data) - n = n->rb_right; - else { - /* - * Guard against user threads attempting to - * free the buffer when in use by kernel or - * after it's already been freed. - */ - if (!buffer->allow_user_free) - return ERR_PTR(-EPERM); - buffer->allow_user_free = 0; - return buffer; - } - } - return NULL; -} - -/** - * binder_alloc_prepare_to_free() - get buffer given user ptr - * @alloc: binder_alloc for this proc - * @user_ptr: User pointer to buffer data - * - * Validate userspace pointer to buffer data and return buffer corresponding to - * that user pointer. Search the rb tree for buffer that matches user data - * pointer. 
- * - * Return: Pointer to buffer or NULL - */ -struct binder_buffer *binder_alloc_prepare_to_free(struct binder_alloc *alloc, - uintptr_t user_ptr) -{ - struct binder_buffer *buffer; - - mutex_lock(&alloc->mutex); - buffer = binder_alloc_prepare_to_free_locked(alloc, user_ptr); - mutex_unlock(&alloc->mutex); - return buffer; -} - -static int binder_update_page_range(struct binder_alloc *alloc, int allocate, - void __user *start, void __user *end) -{ - void __user *page_addr; - unsigned long user_page_addr; - struct binder_lru_page *page; - struct vm_area_struct *vma = NULL; - struct mm_struct *mm = NULL; - bool need_mm = false; - - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: %s pages %pK-%pK\n", alloc->pid, - allocate ? "allocate" : "free", start, end); - - if (end <= start) - return 0; - - trace_binder_update_page_range(alloc, allocate, start, end); - - if (allocate == 0) - goto free_range; - - for (page_addr = start; page_addr < end; page_addr += PAGE_SIZE) { - page = &alloc->pages[(page_addr - alloc->buffer) / PAGE_SIZE]; - if (!page->page_ptr) { - need_mm = true; - break; - } - } - - if (need_mm && mmget_not_zero(alloc->mm)) - mm = alloc->mm; - - if (mm) { - mmap_write_lock(mm); - vma = alloc->vma; - } - - if (!vma && need_mm) { - binder_alloc_debug(BINDER_DEBUG_USER_ERROR, - "%d: binder_alloc_buf failed to map pages in userspace, no vma\n", - alloc->pid); - goto err_no_vma; - } - - for (page_addr = start; page_addr < end; page_addr += PAGE_SIZE) { - int ret; - bool on_lru; - size_t index; - - index = (page_addr - alloc->buffer) / PAGE_SIZE; - page = &alloc->pages[index]; - - if (page->page_ptr) { - trace_binder_alloc_lru_start(alloc, index); - - on_lru = list_lru_del(&binder_alloc_lru, &page->lru); - WARN_ON(!on_lru); - - trace_binder_alloc_lru_end(alloc, index); - continue; - } - - if (WARN_ON(!vma)) - goto err_page_ptr_cleared; - - trace_binder_alloc_page_start(alloc, index); - page->page_ptr = alloc_page(GFP_KERNEL | - __GFP_HIGHMEM | - __GFP_ZERO); - if (!page->page_ptr) { - pr_err("%d: binder_alloc_buf failed for page at %pK\n", - alloc->pid, page_addr); - goto err_alloc_page_failed; - } - page->alloc = alloc; - INIT_LIST_HEAD(&page->lru); - - user_page_addr = (uintptr_t)page_addr; - ret = vm_insert_page(vma, user_page_addr, page[0].page_ptr); - if (ret) { - pr_err("%d: binder_alloc_buf failed to map page at %lx in userspace\n", - alloc->pid, user_page_addr); - goto err_vm_insert_page_failed; - } - - if (index + 1 > alloc->pages_high) - alloc->pages_high = index + 1; - - trace_binder_alloc_page_end(alloc, index); - } - if (mm) { - mmap_write_unlock(mm); - mmput(mm); - } - return 0; - -free_range: - for (page_addr = end - PAGE_SIZE; 1; page_addr -= PAGE_SIZE) { - bool ret; - size_t index; - - index = (page_addr - alloc->buffer) / PAGE_SIZE; - page = &alloc->pages[index]; - - trace_binder_free_lru_start(alloc, index); - - ret = list_lru_add(&binder_alloc_lru, &page->lru); - WARN_ON(!ret); - - trace_binder_free_lru_end(alloc, index); - if (page_addr == start) - break; - continue; - -err_vm_insert_page_failed: - __free_page(page->page_ptr); - page->page_ptr = NULL; -err_alloc_page_failed: -err_page_ptr_cleared: - if (page_addr == start) - break; - } -err_no_vma: - if (mm) { - mmap_write_unlock(mm); - mmput(mm); - } - return vma ? 
-ENOMEM : -ESRCH; -} - -static inline void binder_alloc_set_vma(struct binder_alloc *alloc, - struct vm_area_struct *vma) -{ - /* pairs with smp_load_acquire in binder_alloc_get_vma() */ - smp_store_release(&alloc->vma, vma); -} - -static inline struct vm_area_struct *binder_alloc_get_vma( - struct binder_alloc *alloc) -{ - /* pairs with smp_store_release in binder_alloc_set_vma() */ - return smp_load_acquire(&alloc->vma); -} - -static bool debug_low_async_space_locked(struct binder_alloc *alloc, int pid) -{ - /* - * Find the amount and size of buffers allocated by the current caller; - * The idea is that once we cross the threshold, whoever is responsible - * for the low async space is likely to try to send another async txn, - * and at some point we'll catch them in the act. This is more efficient - * than keeping a map per pid. - */ - struct rb_node *n; - struct binder_buffer *buffer; - size_t total_alloc_size = 0; - size_t num_buffers = 0; - - for (n = rb_first(&alloc->allocated_buffers); n != NULL; - n = rb_next(n)) { - buffer = rb_entry(n, struct binder_buffer, rb_node); - if (buffer->pid != pid) - continue; - if (!buffer->async_transaction) - continue; - total_alloc_size += binder_alloc_buffer_size(alloc, buffer) - + sizeof(struct binder_buffer); - num_buffers++; - } - - /* - * Warn if this pid has more than 50 transactions, or more than 50% of - * async space (which is 25% of total buffer size). Oneway spam is only - * detected when the threshold is exceeded. - */ - if (num_buffers > 50 || total_alloc_size > alloc->buffer_size / 4) { - binder_alloc_debug(BINDER_DEBUG_USER_ERROR, - "%d: pid %d spamming oneway? %zd buffers allocated for a total size of %zd\n", - alloc->pid, pid, num_buffers, total_alloc_size); - if (!alloc->oneway_spam_detected) { - alloc->oneway_spam_detected = true; - return true; - } - } - return false; -} - -static struct binder_buffer *binder_alloc_new_buf_locked( - struct binder_alloc *alloc, - size_t data_size, - size_t offsets_size, - size_t extra_buffers_size, - int is_async, - int pid) -{ - struct rb_node *n = alloc->free_buffers.rb_node; - struct binder_buffer *buffer; - size_t buffer_size; - struct rb_node *best_fit = NULL; - void __user *has_page_addr; - void __user *end_page_addr; - size_t size, data_offsets_size; - int ret; - - /* Check binder_alloc is fully initialized */ - if (!binder_alloc_get_vma(alloc)) { - binder_alloc_debug(BINDER_DEBUG_USER_ERROR, - "%d: binder_alloc_buf, no vma\n", - alloc->pid); - return ERR_PTR(-ESRCH); - } - - data_offsets_size = ALIGN(data_size, sizeof(void *)) + - ALIGN(offsets_size, sizeof(void *)); - - if (data_offsets_size < data_size || data_offsets_size < offsets_size) { - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: got transaction with invalid size %zd-%zd\n", - alloc->pid, data_size, offsets_size); - return ERR_PTR(-EINVAL); - } - size = data_offsets_size + ALIGN(extra_buffers_size, sizeof(void *)); - if (size < data_offsets_size || size < extra_buffers_size) { - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: got transaction with invalid extra_buffers_size %zd\n", - alloc->pid, extra_buffers_size); - return ERR_PTR(-EINVAL); - } - if (is_async && - alloc->free_async_space < size + sizeof(struct binder_buffer)) { - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: binder_alloc_buf size %zd failed, no async space left\n", - alloc->pid, size); - return ERR_PTR(-ENOSPC); - } - - /* Pad 0-size buffers so they get assigned unique addresses */ - size = max(size, sizeof(void *)); - - while (n) { - buffer 
= rb_entry(n, struct binder_buffer, rb_node); - BUG_ON(!buffer->free); - buffer_size = binder_alloc_buffer_size(alloc, buffer); - - if (size < buffer_size) { - best_fit = n; - n = n->rb_left; - } else if (size > buffer_size) - n = n->rb_right; - else { - best_fit = n; - break; - } - } - if (best_fit == NULL) { - size_t allocated_buffers = 0; - size_t largest_alloc_size = 0; - size_t total_alloc_size = 0; - size_t free_buffers = 0; - size_t largest_free_size = 0; - size_t total_free_size = 0; - - for (n = rb_first(&alloc->allocated_buffers); n != NULL; - n = rb_next(n)) { - buffer = rb_entry(n, struct binder_buffer, rb_node); - buffer_size = binder_alloc_buffer_size(alloc, buffer); - allocated_buffers++; - total_alloc_size += buffer_size; - if (buffer_size > largest_alloc_size) - largest_alloc_size = buffer_size; - } - for (n = rb_first(&alloc->free_buffers); n != NULL; - n = rb_next(n)) { - buffer = rb_entry(n, struct binder_buffer, rb_node); - buffer_size = binder_alloc_buffer_size(alloc, buffer); - free_buffers++; - total_free_size += buffer_size; - if (buffer_size > largest_free_size) - largest_free_size = buffer_size; - } - binder_alloc_debug(BINDER_DEBUG_USER_ERROR, - "%d: binder_alloc_buf size %zd failed, no address space\n", - alloc->pid, size); - binder_alloc_debug(BINDER_DEBUG_USER_ERROR, - "allocated: %zd (num: %zd largest: %zd), free: %zd (num: %zd largest: %zd)\n", - total_alloc_size, allocated_buffers, - largest_alloc_size, total_free_size, - free_buffers, largest_free_size); - return ERR_PTR(-ENOSPC); - } - if (n == NULL) { - buffer = rb_entry(best_fit, struct binder_buffer, rb_node); - buffer_size = binder_alloc_buffer_size(alloc, buffer); - } - - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: binder_alloc_buf size %zd got buffer %pK size %zd\n", - alloc->pid, size, buffer, buffer_size); - - has_page_addr = (void __user *) - (((uintptr_t)buffer->user_data + buffer_size) & PAGE_MASK); - WARN_ON(n && buffer_size != size); - end_page_addr = - (void __user *)PAGE_ALIGN((uintptr_t)buffer->user_data + size); - if (end_page_addr > has_page_addr) - end_page_addr = has_page_addr; - ret = binder_update_page_range(alloc, 1, (void __user *) - PAGE_ALIGN((uintptr_t)buffer->user_data), end_page_addr); - if (ret) - return ERR_PTR(ret); - - if (buffer_size != size) { - struct binder_buffer *new_buffer; - - new_buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); - if (!new_buffer) { - pr_err("%s: %d failed to alloc new buffer struct\n", - __func__, alloc->pid); - goto err_alloc_buf_struct_failed; - } - new_buffer->user_data = (u8 __user *)buffer->user_data + size; - list_add(&new_buffer->entry, &buffer->entry); - new_buffer->free = 1; - binder_insert_free_buffer(alloc, new_buffer); - } - - rb_erase(best_fit, &alloc->free_buffers); - buffer->free = 0; - buffer->allow_user_free = 0; - binder_insert_allocated_buffer_locked(alloc, buffer); - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: binder_alloc_buf size %zd got %pK\n", - alloc->pid, size, buffer); - buffer->data_size = data_size; - buffer->offsets_size = offsets_size; - buffer->async_transaction = is_async; - buffer->extra_buffers_size = extra_buffers_size; - buffer->pid = pid; - buffer->oneway_spam_suspect = false; - if (is_async) { - alloc->free_async_space -= size + sizeof(struct binder_buffer); - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC_ASYNC, - "%d: binder_alloc_buf size %zd async free %zd\n", - alloc->pid, size, alloc->free_async_space); - if (alloc->free_async_space < alloc->buffer_size / 10) { - /* - * Start detecting 
spammers once we have less than 20% - * of async space left (which is less than 10% of total - * buffer size). - */ - buffer->oneway_spam_suspect = debug_low_async_space_locked(alloc, pid); - } else { - alloc->oneway_spam_detected = false; - } - } - return buffer; - -err_alloc_buf_struct_failed: - binder_update_page_range(alloc, 0, (void __user *) - PAGE_ALIGN((uintptr_t)buffer->user_data), - end_page_addr); - return ERR_PTR(-ENOMEM); -} - -/** - * binder_alloc_new_buf() - Allocate a new binder buffer - * @alloc: binder_alloc for this proc - * @data_size: size of user data buffer - * @offsets_size: user specified buffer offset - * @extra_buffers_size: size of extra space for meta-data (eg, security context) - * @is_async: buffer for async transaction - * @pid: pid to attribute allocation to (used for debugging) - * - * Allocate a new buffer given the requested sizes. Returns - * the kernel version of the buffer pointer. The size allocated - * is the sum of the three given sizes (each rounded up to - * pointer-sized boundary) - * - * Return: The allocated buffer or %NULL if error - */ -struct binder_buffer *binder_alloc_new_buf(struct binder_alloc *alloc, - size_t data_size, - size_t offsets_size, - size_t extra_buffers_size, - int is_async, - int pid) -{ - struct binder_buffer *buffer; - - mutex_lock(&alloc->mutex); - buffer = binder_alloc_new_buf_locked(alloc, data_size, offsets_size, - extra_buffers_size, is_async, pid); - mutex_unlock(&alloc->mutex); - return buffer; -} - -static void __user *buffer_start_page(struct binder_buffer *buffer) -{ - return (void __user *)((uintptr_t)buffer->user_data & PAGE_MASK); -} - -static void __user *prev_buffer_end_page(struct binder_buffer *buffer) -{ - return (void __user *) - (((uintptr_t)(buffer->user_data) - 1) & PAGE_MASK); -} - -static void binder_delete_free_buffer(struct binder_alloc *alloc, - struct binder_buffer *buffer) -{ - struct binder_buffer *prev, *next = NULL; - bool to_free = true; - - BUG_ON(alloc->buffers.next == &buffer->entry); - prev = binder_buffer_prev(buffer); - BUG_ON(!prev->free); - if (prev_buffer_end_page(prev) == buffer_start_page(buffer)) { - to_free = false; - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: merge free, buffer %pK share page with %pK\n", - alloc->pid, buffer->user_data, - prev->user_data); - } - - if (!list_is_last(&buffer->entry, &alloc->buffers)) { - next = binder_buffer_next(buffer); - if (buffer_start_page(next) == buffer_start_page(buffer)) { - to_free = false; - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: merge free, buffer %pK share page with %pK\n", - alloc->pid, - buffer->user_data, - next->user_data); - } - } - - if (PAGE_ALIGNED(buffer->user_data)) { - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: merge free, buffer start %pK is page aligned\n", - alloc->pid, buffer->user_data); - to_free = false; - } - - if (to_free) { - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: merge free, buffer %pK do not share page with %pK or %pK\n", - alloc->pid, buffer->user_data, - prev->user_data, - next ? 
next->user_data : NULL); - binder_update_page_range(alloc, 0, buffer_start_page(buffer), - buffer_start_page(buffer) + PAGE_SIZE); - } - list_del(&buffer->entry); - kfree(buffer); -} - -static void binder_free_buf_locked(struct binder_alloc *alloc, - struct binder_buffer *buffer) -{ - size_t size, buffer_size; - - buffer_size = binder_alloc_buffer_size(alloc, buffer); - - size = ALIGN(buffer->data_size, sizeof(void *)) + - ALIGN(buffer->offsets_size, sizeof(void *)) + - ALIGN(buffer->extra_buffers_size, sizeof(void *)); - - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, - "%d: binder_free_buf %pK size %zd buffer_size %zd\n", - alloc->pid, buffer, size, buffer_size); - - BUG_ON(buffer->free); - BUG_ON(size > buffer_size); - BUG_ON(buffer->transaction != NULL); - BUG_ON(buffer->user_data < alloc->buffer); - BUG_ON(buffer->user_data > alloc->buffer + alloc->buffer_size); - - if (buffer->async_transaction) { - alloc->free_async_space += buffer_size + sizeof(struct binder_buffer); - - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC_ASYNC, - "%d: binder_free_buf size %zd async free %zd\n", - alloc->pid, size, alloc->free_async_space); - } - - binder_update_page_range(alloc, 0, - (void __user *)PAGE_ALIGN((uintptr_t)buffer->user_data), - (void __user *)(((uintptr_t) - buffer->user_data + buffer_size) & PAGE_MASK)); - - rb_erase(&buffer->rb_node, &alloc->allocated_buffers); - buffer->free = 1; - if (!list_is_last(&buffer->entry, &alloc->buffers)) { - struct binder_buffer *next = binder_buffer_next(buffer); - - if (next->free) { - rb_erase(&next->rb_node, &alloc->free_buffers); - binder_delete_free_buffer(alloc, next); - } - } - if (alloc->buffers.next != &buffer->entry) { - struct binder_buffer *prev = binder_buffer_prev(buffer); - - if (prev->free) { - binder_delete_free_buffer(alloc, buffer); - rb_erase(&prev->rb_node, &alloc->free_buffers); - buffer = prev; - } - } - binder_insert_free_buffer(alloc, buffer); -} - -static void binder_alloc_clear_buf(struct binder_alloc *alloc, - struct binder_buffer *buffer); -/** - * binder_alloc_free_buf() - free a binder buffer - * @alloc: binder_alloc for this proc - * @buffer: kernel pointer to buffer - * - * Free the buffer allocated via binder_alloc_new_buf() - */ -void binder_alloc_free_buf(struct binder_alloc *alloc, - struct binder_buffer *buffer) -{ - /* - * We could eliminate the call to binder_alloc_clear_buf() - * from binder_alloc_deferred_release() by moving this to - * binder_alloc_free_buf_locked(). However, that could - * increase contention for the alloc mutex if clear_on_free - * is used frequently for large buffers. The mutex is not - * needed for correctness here. 
- */ - if (buffer->clear_on_free) { - binder_alloc_clear_buf(alloc, buffer); - buffer->clear_on_free = false; - } - mutex_lock(&alloc->mutex); - binder_free_buf_locked(alloc, buffer); - mutex_unlock(&alloc->mutex); -} - -/** - * binder_alloc_mmap_handler() - map virtual address space for proc - * @alloc: alloc structure for this proc - * @vma: vma passed to mmap() - * - * Called by binder_mmap() to initialize the space specified in - * vma for allocating binder buffers - * - * Return: - * 0 = success - * -EBUSY = address space already mapped - * -ENOMEM = failed to map memory to given address space - */ -int binder_alloc_mmap_handler(struct binder_alloc *alloc, - struct vm_area_struct *vma) -{ - int ret; - const char *failure_string; - struct binder_buffer *buffer; - - if (unlikely(vma->vm_mm != alloc->mm)) { - ret = -EINVAL; - failure_string = "invalid vma->vm_mm"; - goto err_invalid_mm; - } - - mutex_lock(&binder_alloc_mmap_lock); - if (alloc->buffer_size) { - ret = -EBUSY; - failure_string = "already mapped"; - goto err_already_mapped; - } - alloc->buffer_size = min_t(unsigned long, vma->vm_end - vma->vm_start, - SZ_4M); - mutex_unlock(&binder_alloc_mmap_lock); - - alloc->buffer = (void __user *)vma->vm_start; - - alloc->pages = kcalloc(alloc->buffer_size / PAGE_SIZE, - sizeof(alloc->pages[0]), - GFP_KERNEL); - if (alloc->pages == NULL) { - ret = -ENOMEM; - failure_string = "alloc page array"; - goto err_alloc_pages_failed; - } - - buffer = kzalloc(sizeof(*buffer), GFP_KERNEL); - if (!buffer) { - ret = -ENOMEM; - failure_string = "alloc buffer struct"; - goto err_alloc_buf_struct_failed; - } - - buffer->user_data = alloc->buffer; - list_add(&buffer->entry, &alloc->buffers); - buffer->free = 1; - binder_insert_free_buffer(alloc, buffer); - alloc->free_async_space = alloc->buffer_size / 2; - - /* Signal binder_alloc is fully initialized */ - binder_alloc_set_vma(alloc, vma); - - return 0; - -err_alloc_buf_struct_failed: - kfree(alloc->pages); - alloc->pages = NULL; -err_alloc_pages_failed: - alloc->buffer = NULL; - mutex_lock(&binder_alloc_mmap_lock); - alloc->buffer_size = 0; -err_already_mapped: - mutex_unlock(&binder_alloc_mmap_lock); -err_invalid_mm: - binder_alloc_debug(BINDER_DEBUG_USER_ERROR, - "%s: %d %lx-%lx %s failed %d\n", __func__, - alloc->pid, vma->vm_start, vma->vm_end, - failure_string, ret); - return ret; -} - - -void binder_alloc_deferred_release(struct binder_alloc *alloc) -{ - struct rb_node *n; - int buffers, page_count; - struct binder_buffer *buffer; - - buffers = 0; - mutex_lock(&alloc->mutex); - BUG_ON(alloc->vma); - - while ((n = rb_first(&alloc->allocated_buffers))) { - buffer = rb_entry(n, struct binder_buffer, rb_node); - - /* Transaction should already have been freed */ - BUG_ON(buffer->transaction); - - if (buffer->clear_on_free) { - binder_alloc_clear_buf(alloc, buffer); - buffer->clear_on_free = false; - } - binder_free_buf_locked(alloc, buffer); - buffers++; - } - - while (!list_empty(&alloc->buffers)) { - buffer = list_first_entry(&alloc->buffers, - struct binder_buffer, entry); - WARN_ON(!buffer->free); - - list_del(&buffer->entry); - WARN_ON_ONCE(!list_empty(&alloc->buffers)); - kfree(buffer); - } - - page_count = 0; - if (alloc->pages) { - int i; - - for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) { - void __user *page_addr; - bool on_lru; - - if (!alloc->pages[i].page_ptr) - continue; - - on_lru = list_lru_del(&binder_alloc_lru, - &alloc->pages[i].lru); - page_addr = alloc->buffer + i * PAGE_SIZE; - binder_alloc_debug(BINDER_DEBUG_BUFFER_ALLOC, 
- "%s: %d: page %d at %pK %s\n", - __func__, alloc->pid, i, page_addr, - on_lru ? "on lru" : "active"); - __free_page(alloc->pages[i].page_ptr); - page_count++; - } - kfree(alloc->pages); - } - mutex_unlock(&alloc->mutex); - if (alloc->mm) - mmdrop(alloc->mm); - - binder_alloc_debug(BINDER_DEBUG_OPEN_CLOSE, - "%s: %d buffers %d, pages %d\n", - __func__, alloc->pid, buffers, page_count); -} - -static void print_binder_buffer(struct seq_file *m, const char *prefix, - struct binder_buffer *buffer) -{ - seq_printf(m, "%s %d: %pK size %zd:%zd:%zd %s\n", - prefix, buffer->debug_id, buffer->user_data, - buffer->data_size, buffer->offsets_size, - buffer->extra_buffers_size, - buffer->transaction ? "active" : "delivered"); -} - -/** - * binder_alloc_print_allocated() - print buffer info - * @m: seq_file for output via seq_printf() - * @alloc: binder_alloc for this proc - * - * Prints information about every buffer associated with - * the binder_alloc state to the given seq_file - */ -void binder_alloc_print_allocated(struct seq_file *m, - struct binder_alloc *alloc) -{ - struct rb_node *n; - - mutex_lock(&alloc->mutex); - for (n = rb_first(&alloc->allocated_buffers); n != NULL; n = rb_next(n)) - print_binder_buffer(m, " buffer", - rb_entry(n, struct binder_buffer, rb_node)); - mutex_unlock(&alloc->mutex); -} - -/** - * binder_alloc_print_pages() - print page usage - * @m: seq_file for output via seq_printf() - * @alloc: binder_alloc for this proc - */ -void binder_alloc_print_pages(struct seq_file *m, - struct binder_alloc *alloc) -{ - struct binder_lru_page *page; - int i; - int active = 0; - int lru = 0; - int free = 0; - - mutex_lock(&alloc->mutex); - /* - * Make sure the binder_alloc is fully initialized, otherwise we might - * read inconsistent state. - */ - if (binder_alloc_get_vma(alloc) != NULL) { - for (i = 0; i < alloc->buffer_size / PAGE_SIZE; i++) { - page = &alloc->pages[i]; - if (!page->page_ptr) - free++; - else if (list_empty(&page->lru)) - active++; - else - lru++; - } - } - mutex_unlock(&alloc->mutex); - seq_printf(m, " pages: %d:%d:%d\n", active, lru, free); - seq_printf(m, " pages high watermark: %zu\n", alloc->pages_high); -} - -/** - * binder_alloc_get_allocated_count() - return count of buffers - * @alloc: binder_alloc for this proc - * - * Return: count of allocated buffers - */ -int binder_alloc_get_allocated_count(struct binder_alloc *alloc) -{ - struct rb_node *n; - int count = 0; - - mutex_lock(&alloc->mutex); - for (n = rb_first(&alloc->allocated_buffers); n != NULL; n = rb_next(n)) - count++; - mutex_unlock(&alloc->mutex); - return count; -} - - -/** - * binder_alloc_vma_close() - invalidate address space - * @alloc: binder_alloc for this proc - * - * Called from binder_vma_close() when releasing address space. - * Clears alloc->vma to prevent new incoming transactions from - * allocating more buffers. - */ -void binder_alloc_vma_close(struct binder_alloc *alloc) -{ - binder_alloc_set_vma(alloc, NULL); -} - -/** - * binder_alloc_free_page() - shrinker callback to free pages - * @item: item to free - * @lock: lock protecting the item - * @cb_arg: callback argument - * - * Called from list_lru_walk() in binder_shrink_scan() to free - * up pages when the system is under memory pressure. 
- */ -enum lru_status binder_alloc_free_page(struct list_head *item, - struct list_lru_one *lru, - spinlock_t *lock, - void *cb_arg) - __must_hold(lock) -{ - struct mm_struct *mm = NULL; - struct binder_lru_page *page = container_of(item, - struct binder_lru_page, - lru); - struct binder_alloc *alloc; - uintptr_t page_addr; - size_t index; - struct vm_area_struct *vma; - - alloc = page->alloc; - if (!mutex_trylock(&alloc->mutex)) - goto err_get_alloc_mutex_failed; - - if (!page->page_ptr) - goto err_page_already_freed; - - index = page - alloc->pages; - page_addr = (uintptr_t)alloc->buffer + index * PAGE_SIZE; - - mm = alloc->mm; - if (!mmget_not_zero(mm)) - goto err_mmget; - if (!mmap_read_trylock(mm)) - goto err_mmap_read_lock_failed; - vma = binder_alloc_get_vma(alloc); - - list_lru_isolate(lru, item); - spin_unlock(lock); - - if (vma) { - trace_binder_unmap_user_start(alloc, index); - - zap_page_range_single(vma, page_addr, PAGE_SIZE, NULL); - - trace_binder_unmap_user_end(alloc, index); - } - mmap_read_unlock(mm); - mmput_async(mm); - - trace_binder_unmap_kernel_start(alloc, index); - - __free_page(page->page_ptr); - page->page_ptr = NULL; - - trace_binder_unmap_kernel_end(alloc, index); - - spin_lock(lock); - mutex_unlock(&alloc->mutex); - return LRU_REMOVED_RETRY; - -err_mmap_read_lock_failed: - mmput_async(mm); -err_mmget: -err_page_already_freed: - mutex_unlock(&alloc->mutex); -err_get_alloc_mutex_failed: - return LRU_SKIP; -} - -static unsigned long -binder_shrink_count(struct shrinker *shrink, struct shrink_control *sc) -{ - return list_lru_count(&binder_alloc_lru); -} - -static unsigned long -binder_shrink_scan(struct shrinker *shrink, struct shrink_control *sc) -{ - return list_lru_walk(&binder_alloc_lru, binder_alloc_free_page, - NULL, sc->nr_to_scan); -} - -static struct shrinker binder_shrinker = { - .count_objects = binder_shrink_count, - .scan_objects = binder_shrink_scan, - .seeks = DEFAULT_SEEKS, -}; - -/** - * binder_alloc_init() - called by binder_open() for per-proc initialization - * @alloc: binder_alloc for this proc - * - * Called from binder_open() to initialize binder_alloc fields for - * new binder proc - */ -void binder_alloc_init(struct binder_alloc *alloc) -{ - alloc->pid = current->group_leader->pid; - alloc->mm = current->mm; - mmgrab(alloc->mm); - mutex_init(&alloc->mutex); - INIT_LIST_HEAD(&alloc->buffers); -} - -int binder_alloc_shrinker_init(void) -{ - int ret = list_lru_init(&binder_alloc_lru); - - if (ret == 0) { - ret = register_shrinker(&binder_shrinker, "android-binder"); - if (ret) - list_lru_destroy(&binder_alloc_lru); - } - return ret; -} - -void binder_alloc_shrinker_exit(void) -{ - unregister_shrinker(&binder_shrinker); - list_lru_destroy(&binder_alloc_lru); -} - -/** - * check_buffer() - verify that buffer/offset is safe to access - * @alloc: binder_alloc for this proc - * @buffer: binder buffer to be accessed - * @offset: offset into @buffer data - * @bytes: bytes to access from offset - * - * Check that the @offset/@bytes are within the size of the given - * @buffer and that the buffer is currently active and not freeable. - * Offsets must also be multiples of sizeof(u32). The kernel is - * allowed to touch the buffer in two cases: - * - * 1) when the buffer is being created: - * (buffer->free == 0 && buffer->allow_user_free == 0) - * 2) when the buffer is being torn down: - * (buffer->free == 0 && buffer->transaction == NULL). 
- * - * Return: true if the buffer is safe to access - */ -static inline bool check_buffer(struct binder_alloc *alloc, - struct binder_buffer *buffer, - binder_size_t offset, size_t bytes) -{ - size_t buffer_size = binder_alloc_buffer_size(alloc, buffer); - - return buffer_size >= bytes && - offset <= buffer_size - bytes && - IS_ALIGNED(offset, sizeof(u32)) && - !buffer->free && - (!buffer->allow_user_free || !buffer->transaction); -} - -/** - * binder_alloc_get_page() - get kernel pointer for given buffer offset - * @alloc: binder_alloc for this proc - * @buffer: binder buffer to be accessed - * @buffer_offset: offset into @buffer data - * @pgoffp: address to copy final page offset to - * - * Lookup the struct page corresponding to the address - * at @buffer_offset into @buffer->user_data. If @pgoffp is not - * NULL, the byte-offset into the page is written there. - * - * The caller is responsible to ensure that the offset points - * to a valid address within the @buffer and that @buffer is - * not freeable by the user. Since it can't be freed, we are - * guaranteed that the corresponding elements of @alloc->pages[] - * cannot change. - * - * Return: struct page - */ -static struct page *binder_alloc_get_page(struct binder_alloc *alloc, - struct binder_buffer *buffer, - binder_size_t buffer_offset, - pgoff_t *pgoffp) -{ - binder_size_t buffer_space_offset = buffer_offset + - (buffer->user_data - alloc->buffer); - pgoff_t pgoff = buffer_space_offset & ~PAGE_MASK; - size_t index = buffer_space_offset >> PAGE_SHIFT; - struct binder_lru_page *lru_page; - - lru_page = &alloc->pages[index]; - *pgoffp = pgoff; - return lru_page->page_ptr; -} - -/** - * binder_alloc_clear_buf() - zero out buffer - * @alloc: binder_alloc for this proc - * @buffer: binder buffer to be cleared - * - * memset the given buffer to 0 - */ -static void binder_alloc_clear_buf(struct binder_alloc *alloc, - struct binder_buffer *buffer) -{ - size_t bytes = binder_alloc_buffer_size(alloc, buffer); - binder_size_t buffer_offset = 0; - - while (bytes) { - unsigned long size; - struct page *page; - pgoff_t pgoff; - - page = binder_alloc_get_page(alloc, buffer, - buffer_offset, &pgoff); - size = min_t(size_t, bytes, PAGE_SIZE - pgoff); - memset_page(page, pgoff, 0, size); - bytes -= size; - buffer_offset += size; - } -} - -/** - * binder_alloc_copy_user_to_buffer() - copy src user to tgt user - * @alloc: binder_alloc for this proc - * @buffer: binder buffer to be accessed - * @buffer_offset: offset into @buffer data - * @from: userspace pointer to source buffer - * @bytes: bytes to copy - * - * Copy bytes from source userspace to target buffer. 
- * - * Return: bytes remaining to be copied - */ -unsigned long -binder_alloc_copy_user_to_buffer(struct binder_alloc *alloc, - struct binder_buffer *buffer, - binder_size_t buffer_offset, - const void __user *from, - size_t bytes) -{ - if (!check_buffer(alloc, buffer, buffer_offset, bytes)) - return bytes; - - while (bytes) { - unsigned long size; - unsigned long ret; - struct page *page; - pgoff_t pgoff; - void *kptr; - - page = binder_alloc_get_page(alloc, buffer, - buffer_offset, &pgoff); - size = min_t(size_t, bytes, PAGE_SIZE - pgoff); - kptr = kmap_local_page(page) + pgoff; - ret = copy_from_user(kptr, from, size); - kunmap_local(kptr); - if (ret) - return bytes - size + ret; - bytes -= size; - from += size; - buffer_offset += size; - } - return 0; -} - -static int binder_alloc_do_buffer_copy(struct binder_alloc *alloc, - bool to_buffer, - struct binder_buffer *buffer, - binder_size_t buffer_offset, - void *ptr, - size_t bytes) -{ - /* All copies must be 32-bit aligned and 32-bit size */ - if (!check_buffer(alloc, buffer, buffer_offset, bytes)) - return -EINVAL; - - while (bytes) { - unsigned long size; - struct page *page; - pgoff_t pgoff; - - page = binder_alloc_get_page(alloc, buffer, - buffer_offset, &pgoff); - size = min_t(size_t, bytes, PAGE_SIZE - pgoff); - if (to_buffer) - memcpy_to_page(page, pgoff, ptr, size); - else - memcpy_from_page(ptr, page, pgoff, size); - bytes -= size; - pgoff = 0; - ptr = ptr + size; - buffer_offset += size; - } - return 0; -} - -int binder_alloc_copy_to_buffer(struct binder_alloc *alloc, - struct binder_buffer *buffer, - binder_size_t buffer_offset, - void *src, - size_t bytes) -{ - return binder_alloc_do_buffer_copy(alloc, true, buffer, buffer_offset, - src, bytes); -} - -int binder_alloc_copy_from_buffer(struct binder_alloc *alloc, - void *dest, - struct binder_buffer *buffer, - binder_size_t buffer_offset, - size_t bytes) -{ - return binder_alloc_do_buffer_copy(alloc, false, buffer, buffer_offset, - dest, bytes); -} - diff --git a/drivers/android/binderfs.c b/drivers/android/binderfs.c deleted file mode 100644 index 420dc9cbf774..000000000000 --- a/drivers/android/binderfs.c +++ /dev/null @@ -1,827 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 - -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include -#include - -#include "binder_internal.h" - -#define FIRST_INODE 1 -#define SECOND_INODE 2 -#define INODE_OFFSET 3 -#define BINDERFS_MAX_MINOR (1U << MINORBITS) -/* Ensure that the initial ipc namespace always has devices available. 
*/ -#define BINDERFS_MAX_MINOR_CAPPED (BINDERFS_MAX_MINOR - 4) - -static dev_t binderfs_dev; -static DEFINE_MUTEX(binderfs_minors_mutex); -static DEFINE_IDA(binderfs_minors); - -enum binderfs_param { - Opt_max, - Opt_stats_mode, -}; - -enum binderfs_stats_mode { - binderfs_stats_mode_unset, - binderfs_stats_mode_global, -}; - -struct binder_features { - bool oneway_spam_detection; - bool extended_error; -}; - -static const struct constant_table binderfs_param_stats[] = { - { "global", binderfs_stats_mode_global }, - {} -}; - -static const struct fs_parameter_spec binderfs_fs_parameters[] = { - fsparam_u32("max", Opt_max), - fsparam_enum("stats", Opt_stats_mode, binderfs_param_stats), - {} -}; - -static struct binder_features binder_features = { - .oneway_spam_detection = true, - .extended_error = true, -}; - -static inline struct binderfs_info *BINDERFS_SB(const struct super_block *sb) -{ - return sb->s_fs_info; -} - -bool is_binderfs_device(const struct inode *inode) -{ - if (inode->i_sb->s_magic == BINDERFS_SUPER_MAGIC) - return true; - - return false; -} - -/** - * binderfs_binder_device_create - allocate inode from super block of a - * binderfs mount - * @ref_inode: inode from wich the super block will be taken - * @userp: buffer to copy information about new device for userspace to - * @req: struct binderfs_device as copied from userspace - * - * This function allocates a new binder_device and reserves a new minor - * number for it. - * Minor numbers are limited and tracked globally in binderfs_minors. The - * function will stash a struct binder_device for the specific binder - * device in i_private of the inode. - * It will go on to allocate a new inode from the super block of the - * filesystem mount, stash a struct binder_device in its i_private field - * and attach a dentry to that inode. - * - * Return: 0 on success, negative errno on failure - */ -static int binderfs_binder_device_create(struct inode *ref_inode, - struct binderfs_device __user *userp, - struct binderfs_device *req) -{ - int minor, ret; - struct dentry *dentry, *root; - struct binder_device *device; - char *name = NULL; - size_t name_len; - struct inode *inode = NULL; - struct super_block *sb = ref_inode->i_sb; - struct binderfs_info *info = sb->s_fs_info; -#if defined(CONFIG_IPC_NS) - bool use_reserve = (info->ipc_ns == &init_ipc_ns); -#else - bool use_reserve = true; -#endif - - /* Reserve new minor number for the new device. */ - mutex_lock(&binderfs_minors_mutex); - if (++info->device_count <= info->mount_opts.max) - minor = ida_alloc_max(&binderfs_minors, - use_reserve ? 
BINDERFS_MAX_MINOR : - BINDERFS_MAX_MINOR_CAPPED, - GFP_KERNEL); - else - minor = -ENOSPC; - if (minor < 0) { - --info->device_count; - mutex_unlock(&binderfs_minors_mutex); - return minor; - } - mutex_unlock(&binderfs_minors_mutex); - - ret = -ENOMEM; - device = kzalloc(sizeof(*device), GFP_KERNEL); - if (!device) - goto err; - - inode = new_inode(sb); - if (!inode) - goto err; - - inode->i_ino = minor + INODE_OFFSET; - simple_inode_init_ts(inode); - init_special_inode(inode, S_IFCHR | 0600, - MKDEV(MAJOR(binderfs_dev), minor)); - inode->i_fop = &binder_fops; - inode->i_uid = info->root_uid; - inode->i_gid = info->root_gid; - - req->name[BINDERFS_MAX_NAME] = '\0'; /* NUL-terminate */ - name_len = strlen(req->name); - /* Make sure to include terminating NUL byte */ - name = kmemdup(req->name, name_len + 1, GFP_KERNEL); - if (!name) - goto err; - - refcount_set(&device->ref, 1); - device->binderfs_inode = inode; - device->context.binder_context_mgr_uid = INVALID_UID; - device->context.name = name; - device->miscdev.name = name; - device->miscdev.minor = minor; - mutex_init(&device->context.context_mgr_node_lock); - - req->major = MAJOR(binderfs_dev); - req->minor = minor; - - if (userp && copy_to_user(userp, req, sizeof(*req))) { - ret = -EFAULT; - goto err; - } - - root = sb->s_root; - inode_lock(d_inode(root)); - - /* look it up */ - dentry = lookup_one_len(name, root, name_len); - if (IS_ERR(dentry)) { - inode_unlock(d_inode(root)); - ret = PTR_ERR(dentry); - goto err; - } - - if (d_really_is_positive(dentry)) { - /* already exists */ - dput(dentry); - inode_unlock(d_inode(root)); - ret = -EEXIST; - goto err; - } - - inode->i_private = device; - d_instantiate(dentry, inode); - fsnotify_create(root->d_inode, dentry); - inode_unlock(d_inode(root)); - - return 0; - -err: - kfree(name); - kfree(device); - mutex_lock(&binderfs_minors_mutex); - --info->device_count; - ida_free(&binderfs_minors, minor); - mutex_unlock(&binderfs_minors_mutex); - iput(inode); - - return ret; -} - -/** - * binder_ctl_ioctl - handle binder device node allocation requests - * - * The request handler for the binder-control device. All requests operate on - * the binderfs mount the binder-control device resides in: - * - BINDER_CTL_ADD - * Allocate a new binder device. - * - * Return: %0 on success, negative errno on failure. 
- */ -static long binder_ctl_ioctl(struct file *file, unsigned int cmd, - unsigned long arg) -{ - int ret = -EINVAL; - struct inode *inode = file_inode(file); - struct binderfs_device __user *device = (struct binderfs_device __user *)arg; - struct binderfs_device device_req; - - switch (cmd) { - case BINDER_CTL_ADD: - ret = copy_from_user(&device_req, device, sizeof(device_req)); - if (ret) { - ret = -EFAULT; - break; - } - - ret = binderfs_binder_device_create(inode, device, &device_req); - break; - default: - break; - } - - return ret; -} - -static void binderfs_evict_inode(struct inode *inode) -{ - struct binder_device *device = inode->i_private; - struct binderfs_info *info = BINDERFS_SB(inode->i_sb); - - clear_inode(inode); - - if (!S_ISCHR(inode->i_mode) || !device) - return; - - mutex_lock(&binderfs_minors_mutex); - --info->device_count; - ida_free(&binderfs_minors, device->miscdev.minor); - mutex_unlock(&binderfs_minors_mutex); - - if (refcount_dec_and_test(&device->ref)) { - kfree(device->context.name); - kfree(device); - } -} - -static int binderfs_fs_context_parse_param(struct fs_context *fc, - struct fs_parameter *param) -{ - int opt; - struct binderfs_mount_opts *ctx = fc->fs_private; - struct fs_parse_result result; - - opt = fs_parse(fc, binderfs_fs_parameters, param, &result); - if (opt < 0) - return opt; - - switch (opt) { - case Opt_max: - if (result.uint_32 > BINDERFS_MAX_MINOR) - return invalfc(fc, "Bad value for '%s'", param->key); - - ctx->max = result.uint_32; - break; - case Opt_stats_mode: - if (!capable(CAP_SYS_ADMIN)) - return -EPERM; - - ctx->stats_mode = result.uint_32; - break; - default: - return invalfc(fc, "Unsupported parameter '%s'", param->key); - } - - return 0; -} - -static int binderfs_fs_context_reconfigure(struct fs_context *fc) -{ - struct binderfs_mount_opts *ctx = fc->fs_private; - struct binderfs_info *info = BINDERFS_SB(fc->root->d_sb); - - if (info->mount_opts.stats_mode != ctx->stats_mode) - return invalfc(fc, "Binderfs stats mode cannot be changed during a remount"); - - info->mount_opts.stats_mode = ctx->stats_mode; - info->mount_opts.max = ctx->max; - return 0; -} - -static int binderfs_show_options(struct seq_file *seq, struct dentry *root) -{ - struct binderfs_info *info = BINDERFS_SB(root->d_sb); - - if (info->mount_opts.max <= BINDERFS_MAX_MINOR) - seq_printf(seq, ",max=%d", info->mount_opts.max); - - switch (info->mount_opts.stats_mode) { - case binderfs_stats_mode_unset: - break; - case binderfs_stats_mode_global: - seq_printf(seq, ",stats=global"); - break; - } - - return 0; -} - -static const struct super_operations binderfs_super_ops = { - .evict_inode = binderfs_evict_inode, - .show_options = binderfs_show_options, - .statfs = simple_statfs, -}; - -static inline bool is_binderfs_control_device(const struct dentry *dentry) -{ - struct binderfs_info *info = dentry->d_sb->s_fs_info; - - return info->control_dentry == dentry; -} - -static int binderfs_rename(struct mnt_idmap *idmap, - struct inode *old_dir, struct dentry *old_dentry, - struct inode *new_dir, struct dentry *new_dentry, - unsigned int flags) -{ - if (is_binderfs_control_device(old_dentry) || - is_binderfs_control_device(new_dentry)) - return -EPERM; - - return simple_rename(idmap, old_dir, old_dentry, new_dir, - new_dentry, flags); -} - -static int binderfs_unlink(struct inode *dir, struct dentry *dentry) -{ - if (is_binderfs_control_device(dentry)) - return -EPERM; - - return simple_unlink(dir, dentry); -} - -static const struct file_operations binder_ctl_fops = { - 
.owner = THIS_MODULE, - .open = nonseekable_open, - .unlocked_ioctl = binder_ctl_ioctl, - .compat_ioctl = binder_ctl_ioctl, - .llseek = noop_llseek, -}; - -/** - * binderfs_binder_ctl_create - create a new binder-control device - * @sb: super block of the binderfs mount - * - * This function creates a new binder-control device node in the binderfs mount - * referred to by @sb. - * - * Return: 0 on success, negative errno on failure - */ -static int binderfs_binder_ctl_create(struct super_block *sb) -{ - int minor, ret; - struct dentry *dentry; - struct binder_device *device; - struct inode *inode = NULL; - struct dentry *root = sb->s_root; - struct binderfs_info *info = sb->s_fs_info; -#if defined(CONFIG_IPC_NS) - bool use_reserve = (info->ipc_ns == &init_ipc_ns); -#else - bool use_reserve = true; -#endif - - device = kzalloc(sizeof(*device), GFP_KERNEL); - if (!device) - return -ENOMEM; - - /* If we have already created a binder-control node, return. */ - if (info->control_dentry) { - ret = 0; - goto out; - } - - ret = -ENOMEM; - inode = new_inode(sb); - if (!inode) - goto out; - - /* Reserve a new minor number for the new device. */ - mutex_lock(&binderfs_minors_mutex); - minor = ida_alloc_max(&binderfs_minors, - use_reserve ? BINDERFS_MAX_MINOR : - BINDERFS_MAX_MINOR_CAPPED, - GFP_KERNEL); - mutex_unlock(&binderfs_minors_mutex); - if (minor < 0) { - ret = minor; - goto out; - } - - inode->i_ino = SECOND_INODE; - simple_inode_init_ts(inode); - init_special_inode(inode, S_IFCHR | 0600, - MKDEV(MAJOR(binderfs_dev), minor)); - inode->i_fop = &binder_ctl_fops; - inode->i_uid = info->root_uid; - inode->i_gid = info->root_gid; - - refcount_set(&device->ref, 1); - device->binderfs_inode = inode; - device->miscdev.minor = minor; - - dentry = d_alloc_name(root, "binder-control"); - if (!dentry) - goto out; - - inode->i_private = device; - info->control_dentry = dentry; - d_add(dentry, inode); - - return 0; - -out: - kfree(device); - iput(inode); - - return ret; -} - -static const struct inode_operations binderfs_dir_inode_operations = { - .lookup = simple_lookup, - .rename = binderfs_rename, - .unlink = binderfs_unlink, -}; - -static struct inode *binderfs_make_inode(struct super_block *sb, int mode) -{ - struct inode *ret; - - ret = new_inode(sb); - if (ret) { - ret->i_ino = iunique(sb, BINDERFS_MAX_MINOR + INODE_OFFSET); - ret->i_mode = mode; - simple_inode_init_ts(ret); - } - return ret; -} - -static struct dentry *binderfs_create_dentry(struct dentry *parent, - const char *name) -{ - struct dentry *dentry; - - dentry = lookup_one_len(name, parent, strlen(name)); - if (IS_ERR(dentry)) - return dentry; - - /* Return error if the file/dir already exists. 
*/ - if (d_really_is_positive(dentry)) { - dput(dentry); - return ERR_PTR(-EEXIST); - } - - return dentry; -} - -void binderfs_remove_file(struct dentry *dentry) -{ - struct inode *parent_inode; - - parent_inode = d_inode(dentry->d_parent); - inode_lock(parent_inode); - if (simple_positive(dentry)) { - dget(dentry); - simple_unlink(parent_inode, dentry); - d_delete(dentry); - dput(dentry); - } - inode_unlock(parent_inode); -} - -struct dentry *binderfs_create_file(struct dentry *parent, const char *name, - const struct file_operations *fops, - void *data) -{ - struct dentry *dentry; - struct inode *new_inode, *parent_inode; - struct super_block *sb; - - parent_inode = d_inode(parent); - inode_lock(parent_inode); - - dentry = binderfs_create_dentry(parent, name); - if (IS_ERR(dentry)) - goto out; - - sb = parent_inode->i_sb; - new_inode = binderfs_make_inode(sb, S_IFREG | 0444); - if (!new_inode) { - dput(dentry); - dentry = ERR_PTR(-ENOMEM); - goto out; - } - - new_inode->i_fop = fops; - new_inode->i_private = data; - d_instantiate(dentry, new_inode); - fsnotify_create(parent_inode, dentry); - -out: - inode_unlock(parent_inode); - return dentry; -} - -static struct dentry *binderfs_create_dir(struct dentry *parent, - const char *name) -{ - struct dentry *dentry; - struct inode *new_inode, *parent_inode; - struct super_block *sb; - - parent_inode = d_inode(parent); - inode_lock(parent_inode); - - dentry = binderfs_create_dentry(parent, name); - if (IS_ERR(dentry)) - goto out; - - sb = parent_inode->i_sb; - new_inode = binderfs_make_inode(sb, S_IFDIR | 0755); - if (!new_inode) { - dput(dentry); - dentry = ERR_PTR(-ENOMEM); - goto out; - } - - new_inode->i_fop = &simple_dir_operations; - new_inode->i_op = &simple_dir_inode_operations; - - set_nlink(new_inode, 2); - d_instantiate(dentry, new_inode); - inc_nlink(parent_inode); - fsnotify_mkdir(parent_inode, dentry); - -out: - inode_unlock(parent_inode); - return dentry; -} - -static int binder_features_show(struct seq_file *m, void *unused) -{ - bool *feature = m->private; - - seq_printf(m, "%d\n", *feature); - - return 0; -} -DEFINE_SHOW_ATTRIBUTE(binder_features); - -static int init_binder_features(struct super_block *sb) -{ - struct dentry *dentry, *dir; - - dir = binderfs_create_dir(sb->s_root, "features"); - if (IS_ERR(dir)) - return PTR_ERR(dir); - - dentry = binderfs_create_file(dir, "oneway_spam_detection", - &binder_features_fops, - &binder_features.oneway_spam_detection); - if (IS_ERR(dentry)) - return PTR_ERR(dentry); - - dentry = binderfs_create_file(dir, "extended_error", - &binder_features_fops, - &binder_features.extended_error); - if (IS_ERR(dentry)) - return PTR_ERR(dentry); - - return 0; -} - -static int init_binder_logs(struct super_block *sb) -{ - struct dentry *binder_logs_root_dir, *dentry, *proc_log_dir; - const struct binder_debugfs_entry *db_entry; - struct binderfs_info *info; - int ret = 0; - - binder_logs_root_dir = binderfs_create_dir(sb->s_root, - "binder_logs"); - if (IS_ERR(binder_logs_root_dir)) { - ret = PTR_ERR(binder_logs_root_dir); - goto out; - } - - binder_for_each_debugfs_entry(db_entry) { - dentry = binderfs_create_file(binder_logs_root_dir, - db_entry->name, - db_entry->fops, - db_entry->data); - if (IS_ERR(dentry)) { - ret = PTR_ERR(dentry); - goto out; - } - } - - proc_log_dir = binderfs_create_dir(binder_logs_root_dir, "proc"); - if (IS_ERR(proc_log_dir)) { - ret = PTR_ERR(proc_log_dir); - goto out; - } - info = sb->s_fs_info; - info->proc_log_dir = proc_log_dir; - -out: - return ret; -} - -static 
int binderfs_fill_super(struct super_block *sb, struct fs_context *fc) -{ - int ret; - struct binderfs_info *info; - struct binderfs_mount_opts *ctx = fc->fs_private; - struct inode *inode = NULL; - struct binderfs_device device_info = {}; - const char *name; - size_t len; - - sb->s_blocksize = PAGE_SIZE; - sb->s_blocksize_bits = PAGE_SHIFT; - - /* - * The binderfs filesystem can be mounted by userns root in a - * non-initial userns. By default such mounts have the SB_I_NODEV flag - * set in s_iflags to prevent security issues where userns root can - * just create random device nodes via mknod() since it owns the - * filesystem mount. But binderfs does not allow to create any files - * including devices nodes. The only way to create binder devices nodes - * is through the binder-control device which userns root is explicitly - * allowed to do. So removing the SB_I_NODEV flag from s_iflags is both - * necessary and safe. - */ - sb->s_iflags &= ~SB_I_NODEV; - sb->s_iflags |= SB_I_NOEXEC; - sb->s_magic = BINDERFS_SUPER_MAGIC; - sb->s_op = &binderfs_super_ops; - sb->s_time_gran = 1; - - sb->s_fs_info = kzalloc(sizeof(struct binderfs_info), GFP_KERNEL); - if (!sb->s_fs_info) - return -ENOMEM; - info = sb->s_fs_info; - - info->ipc_ns = get_ipc_ns(current->nsproxy->ipc_ns); - - info->root_gid = make_kgid(sb->s_user_ns, 0); - if (!gid_valid(info->root_gid)) - info->root_gid = GLOBAL_ROOT_GID; - info->root_uid = make_kuid(sb->s_user_ns, 0); - if (!uid_valid(info->root_uid)) - info->root_uid = GLOBAL_ROOT_UID; - info->mount_opts.max = ctx->max; - info->mount_opts.stats_mode = ctx->stats_mode; - - inode = new_inode(sb); - if (!inode) - return -ENOMEM; - - inode->i_ino = FIRST_INODE; - inode->i_fop = &simple_dir_operations; - inode->i_mode = S_IFDIR | 0755; - simple_inode_init_ts(inode); - inode->i_op = &binderfs_dir_inode_operations; - set_nlink(inode, 2); - - sb->s_root = d_make_root(inode); - if (!sb->s_root) - return -ENOMEM; - - ret = binderfs_binder_ctl_create(sb); - if (ret) - return ret; - - name = binder_devices_param; - for (len = strcspn(name, ","); len > 0; len = strcspn(name, ",")) { - strscpy(device_info.name, name, len + 1); - ret = binderfs_binder_device_create(inode, NULL, &device_info); - if (ret) - return ret; - name += len; - if (*name == ',') - name++; - } - - ret = init_binder_features(sb); - if (ret) - return ret; - - if (info->mount_opts.stats_mode == binderfs_stats_mode_global) - return init_binder_logs(sb); - - return 0; -} - -static int binderfs_fs_context_get_tree(struct fs_context *fc) -{ - return get_tree_nodev(fc, binderfs_fill_super); -} - -static void binderfs_fs_context_free(struct fs_context *fc) -{ - struct binderfs_mount_opts *ctx = fc->fs_private; - - kfree(ctx); -} - -static const struct fs_context_operations binderfs_fs_context_ops = { - .free = binderfs_fs_context_free, - .get_tree = binderfs_fs_context_get_tree, - .parse_param = binderfs_fs_context_parse_param, - .reconfigure = binderfs_fs_context_reconfigure, -}; - -static int binderfs_init_fs_context(struct fs_context *fc) -{ - struct binderfs_mount_opts *ctx; - - ctx = kzalloc(sizeof(struct binderfs_mount_opts), GFP_KERNEL); - if (!ctx) - return -ENOMEM; - - ctx->max = BINDERFS_MAX_MINOR; - ctx->stats_mode = binderfs_stats_mode_unset; - - fc->fs_private = ctx; - fc->ops = &binderfs_fs_context_ops; - - return 0; -} - -static void binderfs_kill_super(struct super_block *sb) -{ - struct binderfs_info *info = sb->s_fs_info; - - /* - * During inode eviction struct binderfs_info is needed. 
- * So first wipe the super_block then free struct binderfs_info. - */ - kill_litter_super(sb); - - if (info && info->ipc_ns) - put_ipc_ns(info->ipc_ns); - - kfree(info); -} - -static struct file_system_type binder_fs_type = { - .name = "binder", - .init_fs_context = binderfs_init_fs_context, - .parameters = binderfs_fs_parameters, - .kill_sb = binderfs_kill_super, - .fs_flags = FS_USERNS_MOUNT, -}; - -int __init init_binderfs(void) -{ - int ret; - const char *name; - size_t len; - - /* Verify that the default binderfs device names are valid. */ - name = binder_devices_param; - for (len = strcspn(name, ","); len > 0; len = strcspn(name, ",")) { - if (len > BINDERFS_MAX_NAME) - return -E2BIG; - name += len; - if (*name == ',') - name++; - } - - /* Allocate new major number for binderfs. */ - ret = alloc_chrdev_region(&binderfs_dev, 0, BINDERFS_MAX_MINOR, - "binder"); - if (ret) - return ret; - - ret = register_filesystem(&binder_fs_type); - if (ret) { - unregister_chrdev_region(binderfs_dev, BINDERFS_MAX_MINOR); - return ret; - } - - return ret; -}
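
As context for the binder-control interface whose implementation is removed above, here is a minimal userspace sketch of how a new binder device is requested through BINDER_CTL_ADD. Only struct binderfs_device and BINDER_CTL_ADD come from the UAPI header <linux/android/binderfs.h>; the mount point /dev/binderfs and the device name "my-binder" are illustrative assumptions, not part of this patch.

	#include <fcntl.h>
	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>
	#include <sys/ioctl.h>
	#include <linux/android/binderfs.h>

	int main(void)
	{
		struct binderfs_device req = { 0 };
		int ctl;

		/* binder-control lives in the binderfs mount; /dev/binderfs is
		 * only an example mount point. */
		ctl = open("/dev/binderfs/binder-control", O_RDONLY | O_CLOEXEC);
		if (ctl < 0) {
			perror("open binder-control");
			return 1;
		}

		strncpy(req.name, "my-binder", sizeof(req.name) - 1);

		/* Handled by binder_ctl_ioctl() above: it reserves a minor
		 * number and instantiates the new device node in the same
		 * binderfs mount. */
		if (ioctl(ctl, BINDER_CTL_ADD, &req) < 0) {
			perror("BINDER_CTL_ADD");
			close(ctl);
			return 1;
		}

		printf("created my-binder as %u:%u\n", req.major, req.minor);
		close(ctl);
		return 0;
	}

On success the new node appears in the binderfs mount, with its minor number tracked in binderfs_minors, mirroring what binderfs_binder_device_create() in the removed code does on the kernel side.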