From patchwork Fri Sep 15 10:59:20 2023
X-Patchwork-Submitter: Matteo Rizzo
X-Patchwork-Id: 140446
Subject: [RFC PATCH 01/14] mm/slub: don't try to dereference invalid freepointers
From: Matteo Rizzo <matteorizzo@google.com>
Date: Fri, 15 Sep 2023 10:59:20 +0000
Message-ID: <20230915105933.495735-2-matteorizzo@google.com>
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>
To: cl@linux.com, penberg@kernel.org, rientjes@google.com,
    iamjoonsoo.kim@lge.com, akpm@linux-foundation.org, vbabka@suse.cz,
    roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, keescook@chromium.org,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-hardening@vger.kernel.org, tglx@linutronix.de, mingo@redhat.com,
    bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
    corbet@lwn.net, luto@kernel.org, peterz@infradead.org
Cc: jannh@google.com, matteorizzo@google.com, evn@google.com,
    poprdi@google.com, jordyzomer@google.com

slab_free_freelist_hook tries to read a freelist pointer from the
current object even when freeing a single object. This is invalid
because single objects don't actually contain a freelist pointer when
they're freed and the memory contains other data. This causes problems
when checking the integrity of the freelist in get_freepointer.

Signed-off-by: Matteo Rizzo <matteorizzo@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
 mm/slub.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/mm/slub.c b/mm/slub.c
index f7940048138c..a7dae207c2d2 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1820,7 +1820,9 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,

 	do {
 		object = next;
-		next = get_freepointer(s, object);
+		/* Single objects don't actually contain a freepointer */
+		if (object != old_tail)
+			next = get_freepointer(s, object);

 		/* If object's reuse doesn't have to be delayed */
 		if (!slab_free_hook(s, object, slab_want_init_on_free(s))) {
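
For context on why the read removed above is invalid: SLUB keeps the
freelist link *inside* each free object, at offset s->offset. A
simplified sketch of the lookup (illustrative only -- the real
get_freepointer in mm/slub.c additionally handles
CONFIG_SLAB_FREELIST_HARDENED pointer obfuscation and KASAN):

	static inline void *get_freepointer(struct kmem_cache *s, void *object)
	{
		/* The "next free object" link lives inside the object itself. */
		return *(void **)(object + s->offset);
	}

When a single live object is passed in, the bytes at object + s->offset
are still caller data rather than a freelist link, so any integrity
check on the decoded pointer would be checking garbage.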
From patchwork Fri Sep 15 10:59:21 2023
X-Patchwork-Submitter: Matteo Rizzo
X-Patchwork-Id: 140556
Subject: [RFC PATCH 02/14] mm/slub: add is_slab_addr/is_slab_page helpers
From: Matteo Rizzo <matteorizzo@google.com>
Date: Fri, 15 Sep 2023 10:59:21 +0000
Message-ID: <20230915105933.495735-3-matteorizzo@google.com>
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>

From: Jann Horn <jannh@google.com>

This is refactoring in preparation for adding two different
implementations (for SLAB_VIRTUAL enabled and disabled).

virt_to_folio(x) expands to _compound_head(virt_to_page(x)) and
virt_to_head_page(x) also expands to _compound_head(virt_to_page(x)),
so PageSlab(virt_to_head_page(res)) should be equivalent to
is_slab_addr(res).
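
Spelled out with the definitions above (an illustrative expansion, not
text from the patch):

	PageSlab(virt_to_head_page(res))
	  == PageSlab(_compound_head(virt_to_page(res)))
	  == folio_test_slab(virt_to_folio(res))   /* same compound-head lookup */
	  == is_slab_addr(res)                     /* new macro in this patch  */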
Signed-off-by: Jann Horn <jannh@google.com>
Co-developed-by: Matteo Rizzo <matteorizzo@google.com>
Signed-off-by: Matteo Rizzo <matteorizzo@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 include/linux/slab.h | 1 +
 kernel/resource.c    | 2 +-
 mm/slab.h            | 9 +++++++++
 mm/slab_common.c     | 5 ++---
 mm/slub.c            | 6 +++---
 5 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/include/linux/slab.h b/include/linux/slab.h
index 8228d1276a2f..a2d82010d269 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -793,4 +793,5 @@ int slab_dead_cpu(unsigned int cpu);
 #define slab_dead_cpu	NULL
 #endif

+#define is_slab_addr(addr) folio_test_slab(virt_to_folio(addr))
 #endif	/* _LINUX_SLAB_H */
diff --git a/kernel/resource.c b/kernel/resource.c
index b1763b2fd7ef..c829e5f97292 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -158,7 +158,7 @@ static void free_resource(struct resource *res)
	 * buddy and trying to be smart and reusing them eventually in
	 * alloc_resource() overcomplicates resource handling.
	 */
-	if (res && PageSlab(virt_to_head_page(res)))
+	if (res && is_slab_addr(res))
		kfree(res);
 }
diff --git a/mm/slab.h b/mm/slab.h
index 799a315695c6..25e41dd6087e 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -169,6 +169,15 @@ static_assert(IS_ALIGNED(offsetof(struct slab, freelist), sizeof(freelist_aba_t)
  */
 #define slab_page(s) folio_page(slab_folio(s), 0)

+/**
+ * is_slab_page - Checks if a page is really a slab page
+ * @s: The slab
+ *
+ * Checks if s points to a slab page.
+ *
+ * Return: true if s points to a slab and false otherwise.
+ */
+#define is_slab_page(s) folio_test_slab(slab_folio(s))
 /*
  * If network-based swap is enabled, sl*b must keep track of whether pages
  * were allocated from pfmemalloc reserves.
diff --git a/mm/slab_common.c b/mm/slab_common.c
index e99e821065c3..79102d24f099 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1063,7 +1063,7 @@ void kfree(const void *object)
		return;

	folio = virt_to_folio(object);
-	if (unlikely(!folio_test_slab(folio))) {
+	if (unlikely(!is_slab_addr(object))) {
		free_large_kmalloc(folio, (void *)object);
		return;
	}
@@ -1094,8 +1094,7 @@ size_t __ksize(const void *object)
		return 0;

	folio = virt_to_folio(object);
-
-	if (unlikely(!folio_test_slab(folio))) {
+	if (unlikely(!is_slab_addr(object))) {
		if (WARN_ON(folio_size(folio) <= KMALLOC_MAX_CACHE_SIZE))
			return 0;
		if (WARN_ON(object != folio_address(folio)))
diff --git a/mm/slub.c b/mm/slub.c
index a7dae207c2d2..b69916ab7aa8 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1259,7 +1259,7 @@ static int check_slab(struct kmem_cache *s, struct slab *slab)
 {
	int maxobj;

-	if (!folio_test_slab(slab_folio(slab))) {
+	if (!is_slab_page(slab)) {
		slab_err(s, slab, "Not a valid slab page");
		return 0;
	}
@@ -1454,7 +1454,7 @@ static noinline bool alloc_debug_processing(struct kmem_cache *s,
		return true;

 bad:
-	if (folio_test_slab(slab_folio(slab))) {
+	if (is_slab_page(slab)) {
		/*
		 * If this is a slab page then lets do the best we can
		 * to avoid issues in the future. Marking all objects
@@ -1484,7 +1484,7 @@ static inline int free_consistency_checks(struct kmem_cache *s,
		return 0;

	if (unlikely(s != slab->slab_cache)) {
-		if (!folio_test_slab(slab_folio(slab))) {
+		if (!is_slab_page(slab)) {
			slab_err(s, slab, "Attempt to free object(0x%p) outside of slab",
				 object);
		} else if (!slab->slab_cache) {
From patchwork Fri Sep 15 10:59:22 2023
X-Patchwork-Submitter: Matteo Rizzo
X-Patchwork-Id: 140365
Subject: [RFC PATCH 03/14] mm/slub: move kmem_cache_order_objects to slab.h
From: Matteo Rizzo <matteorizzo@google.com>
Date: Fri, 15 Sep 2023 10:59:22 +0000
Message-ID: <20230915105933.495735-4-matteorizzo@google.com>
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>

From: Jann Horn <jannh@google.com>

This is refactoring for SLAB_VIRTUAL. The implementation needs to know
the order of the virtual memory region allocated to each slab to know
how much physical memory to allocate when the slab is reused. We reuse
kmem_cache_order_objects for this, so we have to move it before struct
slab.

Signed-off-by: Jann Horn <jannh@google.com>
Co-developed-by: Matteo Rizzo <matteorizzo@google.com>
Signed-off-by: Matteo Rizzo <matteorizzo@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 include/linux/slub_def.h |  9 ---------
 mm/slab.h                | 22 ++++++++++++++++++++++
 mm/slub.c                | 12 ------------
 3 files changed, 22 insertions(+), 21 deletions(-)

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index deb90cf4bffb..0adf5ba8241b 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -83,15 +83,6 @@ struct kmem_cache_cpu {
 #define slub_percpu_partial_read_once(c)	NULL
 #endif // CONFIG_SLUB_CPU_PARTIAL

-/*
- * Word size structure that can be atomically updated or read and that
- * contains both the order and the number of objects that a slab of the
- * given order would contain.
- */
-struct kmem_cache_order_objects {
-	unsigned int x;
-};
-
 /*
  * Slab cache management.
  */
diff --git a/mm/slab.h b/mm/slab.h
index 25e41dd6087e..3fe0d1e26e26 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -38,6 +38,15 @@ typedef union {
	freelist_full_t full;
 } freelist_aba_t;

+/*
+ * Word size structure that can be atomically updated or read and that
+ * contains both the order and the number of objects that a slab of the
+ * given order would contain.
+ */
+struct kmem_cache_order_objects {
+	unsigned int x;
+};
+
 /* Reuses the bits in struct page */
 struct slab {
	unsigned long __page_flags;
@@ -227,6 +236,19 @@ static inline struct slab *virt_to_slab(const void *addr)
	return folio_slab(folio);
 }

+#define OO_SHIFT	16
+#define OO_MASK		((1 << OO_SHIFT) - 1)
+
+static inline unsigned int oo_order(struct kmem_cache_order_objects x)
+{
+	return x.x >> OO_SHIFT;
+}
+
+static inline unsigned int oo_objects(struct kmem_cache_order_objects x)
+{
+	return x.x & OO_MASK;
+}
+
 static inline int slab_order(const struct slab *slab)
 {
	return folio_order((struct folio *)slab_folio(slab));
diff --git a/mm/slub.c b/mm/slub.c
index b69916ab7aa8..df2529c03bd3 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -284,8 +284,6 @@ static inline bool kmem_cache_has_cpu_partial(struct kmem_cache *s)
  */
 #define DEBUG_METADATA_FLAGS (SLAB_RED_ZONE | SLAB_POISON | SLAB_STORE_USER)

-#define OO_SHIFT	16
-#define OO_MASK		((1 << OO_SHIFT) - 1)
 #define MAX_OBJS_PER_PAGE	32767 /* since slab.objects is u15 */

 /* Internal SLUB flags */
@@ -473,16 +471,6 @@ static inline struct kmem_cache_order_objects oo_make(unsigned int order,
	return x;
 }

-static inline unsigned int oo_order(struct kmem_cache_order_objects x)
-{
-	return x.x >> OO_SHIFT;
-}
-
-static inline unsigned int oo_objects(struct kmem_cache_order_objects x)
-{
-	return x.x & OO_MASK;
-}
-
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 static void slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects)
 {
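
A worked example of the packing that oo_order()/oo_objects() decode
(numbers are hypothetical, not from the patch): an order-1 slab (two
pages, 8 KiB) holding 128 objects of 64 bytes would be encoded by
oo_make() as

	x.x = (1 << OO_SHIFT) | 128 = 0x00010080

so oo_order(x) recovers 1 from the bits above OO_SHIFT and
oo_objects(x) recovers 128 from the low 16 bits selected by OO_MASK.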
From patchwork Fri Sep 15 10:59:23 2023
X-Patchwork-Submitter: Matteo Rizzo
X-Patchwork-Id: 140364
Subject: [RFC PATCH 04/14] mm: use virt_to_slab instead of folio_slab
From: Matteo Rizzo <matteorizzo@google.com>
Date: Fri, 15 Sep 2023 10:59:23 +0000
Message-ID: <20230915105933.495735-5-matteorizzo@google.com>
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>

From: Jann Horn <jannh@google.com>

This is refactoring in preparation for the introduction of SLAB_VIRTUAL,
which does not implement folio_slab.

With SLAB_VIRTUAL there is no longer a 1:1 correspondence between slabs
and pages of physical memory used by the slab allocator. There is no
way to look up the slab which corresponds to a specific page of
physical memory without iterating over all slabs or over the page
tables. Instead of doing that, we can look up the slab starting from
its virtual address, which can still be performed cheaply with
SLAB_VIRTUAL both enabled and disabled.
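
For reference, a sketch of the virtual-address lookup this patch
switches callers to (reconstructed; the mm/slab.h hunk context in the
previous patch shows its signature and tail, so treat the middle as an
assumption). The NULL return for non-slab addresses is what lets
build_detached_freelist below test "slab == NULL" instead of
folio_test_slab():

	static inline struct slab *virt_to_slab(const void *addr)
	{
		struct folio *folio = virt_to_folio(addr);

		if (!folio_test_slab(folio))
			return NULL;

		return folio_slab(folio);
	}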
Signed-off-by: Jann Horn <jannh@google.com>
Co-developed-by: Matteo Rizzo <matteorizzo@google.com>
Signed-off-by: Matteo Rizzo <matteorizzo@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 mm/memcontrol.c  |  2 +-
 mm/slab_common.c | 12 +++++++-----
 mm/slub.c        | 14 ++++++--------
 3 files changed, 14 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e8ca4bdcb03c..0ab9f5323db7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2936,7 +2936,7 @@ struct mem_cgroup *mem_cgroup_from_obj_folio(struct folio *folio, void *p)
		struct slab *slab;
		unsigned int off;

-		slab = folio_slab(folio);
+		slab = virt_to_slab(p);
		objcgs = slab_objcgs(slab);
		if (!objcgs)
			return NULL;
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 79102d24f099..42ceaf7e9f47 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1062,13 +1062,13 @@ void kfree(const void *object)
	if (unlikely(ZERO_OR_NULL_PTR(object)))
		return;

-	folio = virt_to_folio(object);
	if (unlikely(!is_slab_addr(object))) {
+		folio = virt_to_folio(object);
		free_large_kmalloc(folio, (void *)object);
		return;
	}

-	slab = folio_slab(folio);
+	slab = virt_to_slab(object);
	s = slab->slab_cache;
	__kmem_cache_free(s, (void *)object, _RET_IP_);
 }
@@ -1089,12 +1089,13 @@ EXPORT_SYMBOL(kfree);
 size_t __ksize(const void *object)
 {
	struct folio *folio;
+	struct kmem_cache *s;

	if (unlikely(object == ZERO_SIZE_PTR))
		return 0;

-	folio = virt_to_folio(object);
	if (unlikely(!is_slab_addr(object))) {
+		folio = virt_to_folio(object);
		if (WARN_ON(folio_size(folio) <= KMALLOC_MAX_CACHE_SIZE))
			return 0;
		if (WARN_ON(object != folio_address(folio)))
@@ -1102,11 +1103,12 @@ size_t __ksize(const void *object)
		return folio_size(folio);
	}

+	s = virt_to_slab(object)->slab_cache;
 #ifdef CONFIG_SLUB_DEBUG
-	skip_orig_size_check(folio_slab(folio)->slab_cache, object);
+	skip_orig_size_check(s, object);
 #endif

-	return slab_ksize(folio_slab(folio)->slab_cache);
+	return slab_ksize(s);
 }
diff --git a/mm/slub.c b/mm/slub.c
index df2529c03bd3..ad33d9e1601d 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -3848,25 +3848,23 @@ int build_detached_freelist(struct kmem_cache *s, size_t size,
 {
	int lookahead = 3;
	void *object;
-	struct folio *folio;
+	struct slab *slab;
	size_t same;

	object = p[--size];
-	folio = virt_to_folio(object);
+	slab = virt_to_slab(object);
	if (!s) {
		/* Handle kalloc'ed objects */
-		if (unlikely(!folio_test_slab(folio))) {
-			free_large_kmalloc(folio, object);
+		if (unlikely(slab == NULL)) {
+			free_large_kmalloc(virt_to_folio(object), object);
			df->slab = NULL;
			return size;
		}
-		/* Derive kmem_cache from object */
-		df->slab = folio_slab(folio);
-		df->s = df->slab->slab_cache;
+		df->s = slab->slab_cache;
	} else {
-		df->slab = folio_slab(folio);
		df->s = cache_from_obj(s, object);	/* Support for memcg */
	}
+	df->slab = slab;

	/* Start new detached freelist */
	df->tail = object;

From patchwork Fri Sep 15 10:59:24 2023
X-Patchwork-Submitter: Matteo Rizzo
X-Patchwork-Id: 140435
Subject: [RFC PATCH 05/14] mm/slub: create folio_set/clear_slab helpers
From: Matteo Rizzo <matteorizzo@google.com>
Date: Fri, 15 Sep 2023 10:59:24 +0000
Message-ID: <20230915105933.495735-6-matteorizzo@google.com>
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>

From: Jann Horn <jannh@google.com>

This is refactoring in preparation for SLAB_VIRTUAL. Extract this code
into separate functions so that it's not duplicated in the code that
allocates and frees pages with SLAB_VIRTUAL enabled.
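
One detail worth noting in the helpers extracted below: each contains
an smp_wmb() so that the store to the slab page flag and the stores to
folio->mapping stay ordered for any concurrent observer. As a generic
sketch of that store-ordering pattern (illustrative only, not code that
exists in this form in the patch), paired with a read barrier on the
reader side:

	/* writer */              /* reader */
	WRITE_ONCE(a, 1);         r1 = READ_ONCE(b);
	smp_wmb();                smp_rmb();
	WRITE_ONCE(b, 1);         r2 = READ_ONCE(a);

	/*
	 * If the reader observes b == 1 (r1 == 1), the barrier pairing
	 * guarantees it also observes a == 1 (r2 == 1).
	 */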
Signed-off-by: Jann Horn <jannh@google.com>
Co-developed-by: Matteo Rizzo <matteorizzo@google.com>
Signed-off-by: Matteo Rizzo <matteorizzo@google.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 mm/slub.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index ad33d9e1601d..9b87afade125 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1849,6 +1849,26 @@ static void *setup_object(struct kmem_cache *s, void *object)
 /*
  * Slab allocation and freeing
  */
+
+static void folio_set_slab(struct folio *folio, struct slab *slab)
+{
+	__folio_set_slab(folio);
+	/* Make the flag visible before any changes to folio->mapping */
+	smp_wmb();
+
+	if (folio_is_pfmemalloc(folio))
+		slab_set_pfmemalloc(slab);
+}
+
+static void folio_clear_slab(struct folio *folio, struct slab *slab)
+{
+	__slab_clear_pfmemalloc(slab);
+	folio->mapping = NULL;
+	/* Make the mapping reset visible before clearing the flag */
+	smp_wmb();
+	__folio_clear_slab(folio);
+}
+
 static inline struct slab *alloc_slab_page(gfp_t flags, int node,
		struct kmem_cache_order_objects oo)
 {
@@ -1865,11 +1885,7 @@ static inline struct slab *alloc_slab_page(gfp_t flags, int node,
		return NULL;

	slab = folio_slab(folio);
-	__folio_set_slab(folio);
-	/* Make the flag visible before any changes to folio->mapping */
-	smp_wmb();
-	if (folio_is_pfmemalloc(folio))
-		slab_set_pfmemalloc(slab);
+	folio_set_slab(folio, slab);

	return slab;
 }
@@ -2067,11 +2083,7 @@ static void __free_slab(struct kmem_cache *s, struct slab *slab)
	int order = folio_order(folio);
	int pages = 1 << order;

-	__slab_clear_pfmemalloc(slab);
-	folio->mapping = NULL;
-	/* Make the mapping reset visible before clearing the flag */
-	smp_wmb();
-	__folio_clear_slab(folio);
+	folio_clear_slab(folio, slab);
	mm_account_reclaimed_pages(pages);
	unaccount_slab(slab, order, s);
	__free_pages(&folio->page, order);
Date: Fri, 15 Sep 2023 10:59:25 +0000
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>
Message-ID: <20230915105933.495735-7-matteorizzo@google.com>
Subject: [RFC PATCH 06/14] mm/slub: pass additional args to alloc_slab_page
From: Matteo Rizzo

From: Jann Horn

This is refactoring in preparation for SLAB_VIRTUAL.

The implementation of SLAB_VIRTUAL needs access to struct kmem_cache in
alloc_slab_page in order to take unused slabs from the slab freelist,
which is per-cache. In addition to that, it passes two different sets of
GFP flags: meta_gfp_flags is used for the memory backing the metadata
region and page tables, and gfp_flags for the data memory.

Signed-off-by: Jann Horn
Co-developed-by: Matteo Rizzo
Signed-off-by: Matteo Rizzo
Reviewed-by: Kees Cook
---
 mm/slub.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 9b87afade125..eaa1256aff89 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1869,7 +1869,8 @@ static void folio_clear_slab(struct folio *folio, struct slab *slab)
 	__folio_clear_slab(folio);
 }
 
-static inline struct slab *alloc_slab_page(gfp_t flags, int node,
+static inline struct slab *alloc_slab_page(struct kmem_cache *s,
+		gfp_t meta_flags, gfp_t flags, int node,
 		struct kmem_cache_order_objects oo)
 {
 	struct folio *folio;
@@ -2020,7 +2021,7 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 	if ((alloc_gfp & __GFP_DIRECT_RECLAIM) && oo_order(oo) > oo_order(s->min))
 		alloc_gfp = (alloc_gfp | __GFP_NOMEMALLOC) & ~__GFP_RECLAIM;
 
-	slab = alloc_slab_page(alloc_gfp, node, oo);
+	slab = alloc_slab_page(s, flags, alloc_gfp, node, oo);
 	if (unlikely(!slab)) {
 		oo = s->min;
 		alloc_gfp = flags;
@@ -2028,7 +2029,7 @@ static struct slab *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
 		 * Allocation may have failed due to fragmentation.
 		 * Try a lower order alloc if possible
 		 */
-		slab = alloc_slab_page(alloc_gfp, node, oo);
+		slab = alloc_slab_page(s, flags, alloc_gfp, node, oo);
 		if (unlikely(!slab))
 			return NULL;
 		stat(s, ORDER_FALLBACK);
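For context on the call sites touched above: allocate_slab() first tries the
preferred order opportunistically, with reclaim stripped out, and only falls
back to the minimum order with the caller's original flags. A standalone
sketch of that GFP adjustment, using illustrative flag values rather than the
real constants from gfp_types.h, might look like this:

#include <stdio.h>

typedef unsigned int gfp_t;

/* Illustrative values only; the real masks live in include/linux/gfp_types.h. */
#define __GFP_DIRECT_RECLAIM	0x400u
#define __GFP_KSWAPD_RECLAIM	0x800u
#define __GFP_RECLAIM		(__GFP_DIRECT_RECLAIM | __GFP_KSWAPD_RECLAIM)
#define __GFP_NOMEMALLOC	0x10000u

int main(void)
{
	gfp_t flags = __GFP_RECLAIM;	/* caller allows sleeping/reclaim */

	/* First attempt at the higher order is opportunistic: forbid
	 * reclaim and memory reserves, and retry at s->min with the
	 * original flags if it fails. */
	gfp_t alloc_gfp = (flags | __GFP_NOMEMALLOC) & ~__GFP_RECLAIM;

	printf("first-try gfp: %#x, fallback gfp: %#x\n", alloc_gfp, flags);
	return 0;
}

With this patch both attempts additionally carry the untouched caller flags so
that a SLAB_VIRTUAL implementation can allocate metadata and page tables with
a separate policy from the data pages.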
From patchwork Fri Sep 15 10:59:26 2023
Date: Fri, 15 Sep 2023 10:59:26 +0000
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>
Message-ID: <20230915105933.495735-8-matteorizzo@google.com>
Subject: [RFC PATCH 07/14] mm/slub: pass slab pointer to the freeptr decode helper
From: Matteo Rizzo

From: Jann Horn

This is refactoring in preparation for checking freeptrs for corruption
inside freelist_ptr_decode().

Signed-off-by: Jann Horn
Co-developed-by: Matteo Rizzo
Signed-off-by: Matteo Rizzo
---
 mm/slub.c | 43 +++++++++++++++++++++++--------------------
 1 file changed, 23 insertions(+), 20 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index eaa1256aff89..42e7cc0b4452 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -383,7 +383,8 @@ static inline freeptr_t freelist_ptr_encode(const struct kmem_cache *s,
 }
 
 static inline void *freelist_ptr_decode(const struct kmem_cache *s,
-					freeptr_t ptr, unsigned long ptr_addr)
+					freeptr_t ptr, unsigned long ptr_addr,
+					struct slab *slab)
 {
 	void *decoded;
 
@@ -395,7 +396,8 @@ static inline void *freelist_ptr_decode(const struct kmem_cache *s,
 	return decoded;
 }
 
-static inline void *get_freepointer(struct kmem_cache *s, void *object)
+static inline void *get_freepointer(struct kmem_cache *s, void *object,
+				    struct slab *slab)
 {
 	unsigned long ptr_addr;
 	freeptr_t p;
@@ -403,7 +405,7 @@ static inline void *get_freepointer(struct kmem_cache *s, void *object)
 	object = kasan_reset_tag(object);
 	ptr_addr = (unsigned long)object + s->offset;
 	p = *(freeptr_t *)(ptr_addr);
-	return freelist_ptr_decode(s, p, ptr_addr);
+	return freelist_ptr_decode(s, p, ptr_addr, slab);
 }
 
 #ifndef CONFIG_SLUB_TINY
@@ -424,18 +426,19 @@ static void prefetch_freepointer(const struct kmem_cache *s, void *object)
 * get_freepointer_safe() returns initialized memory.
 */
 __no_kmsan_checks
-static inline void *get_freepointer_safe(struct kmem_cache *s, void *object)
+static inline void *get_freepointer_safe(struct kmem_cache *s, void *object,
+					 struct slab *slab)
 {
 	unsigned long freepointer_addr;
 	freeptr_t p;
 
 	if (!debug_pagealloc_enabled_static())
-		return get_freepointer(s, object);
+		return get_freepointer(s, object, slab);
 
 	object = kasan_reset_tag(object);
 	freepointer_addr = (unsigned long)object + s->offset;
 	copy_from_kernel_nofault(&p, (freeptr_t *)freepointer_addr, sizeof(p));
-	return freelist_ptr_decode(s, p, freepointer_addr);
+	return freelist_ptr_decode(s, p, freepointer_addr, slab);
 }
 
 static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
@@ -627,7 +630,7 @@ static void __fill_map(unsigned long *obj_map, struct kmem_cache *s,
 
 	bitmap_zero(obj_map, slab->objects);
 
-	for (p = slab->freelist; p; p = get_freepointer(s, p))
+	for (p = slab->freelist; p; p = get_freepointer(s, p, slab))
 		set_bit(__obj_to_index(s, addr, p), obj_map);
 }
 
@@ -937,7 +940,7 @@ static void print_trailer(struct kmem_cache *s, struct slab *slab, u8 *p)
 	print_slab_info(slab);
 
 	pr_err("Object 0x%p @offset=%tu fp=0x%p\n\n",
-	       p, p - addr, get_freepointer(s, p));
+	       p, p - addr, get_freepointer(s, p, slab));
 
 	if (s->flags & SLAB_RED_ZONE)
 		print_section(KERN_ERR, "Redzone ", p - s->red_left_pad,
@@ -1230,7 +1233,7 @@ static int check_object(struct kmem_cache *s, struct slab *slab,
 		return 1;
 
 	/* Check free pointer validity */
-	if (!check_valid_pointer(s, slab, get_freepointer(s, p))) {
+	if (!check_valid_pointer(s, slab, get_freepointer(s, p, slab))) {
 		object_err(s, slab, p, "Freepointer corrupt");
 		/*
 		 * No choice but to zap it and thus lose the remainder
@@ -1298,7 +1301,7 @@ static int on_freelist(struct kmem_cache *s, struct slab *slab, void *search)
 			break;
 		}
 		object = fp;
-		fp = get_freepointer(s, object);
+		fp = get_freepointer(s, object, slab);
 		nr++;
 	}
 
@@ -1810,7 +1813,7 @@ static inline bool slab_free_freelist_hook(struct kmem_cache *s,
 		object = next;
 		/* Single objects don't actually contain a freepointer */
 		if (object != old_tail)
-			next = get_freepointer(s, object);
+			next = get_freepointer(s, object, virt_to_slab(object));
 
 		/* If object's reuse doesn't have to be delayed */
 		if (!slab_free_hook(s, object, slab_want_init_on_free(s))) {
@@ -2161,7 +2164,7 @@ static void *alloc_single_from_partial(struct kmem_cache *s,
 	lockdep_assert_held(&n->list_lock);
 
 	object = slab->freelist;
-	slab->freelist = get_freepointer(s, object);
+	slab->freelist = get_freepointer(s, object, slab);
 	slab->inuse++;
 
 	if (!alloc_debug_processing(s, slab, object, orig_size)) {
@@ -2192,7 +2195,7 @@ static void *alloc_single_from_new_slab(struct kmem_cache *s,
 
 	object = slab->freelist;
-	slab->freelist = get_freepointer(s, object);
+	slab->freelist = get_freepointer(s, object, slab);
 	slab->inuse = 1;
 
 	if (!alloc_debug_processing(s, slab, object, orig_size))
@@ -2517,7 +2520,7 @@ static void deactivate_slab(struct kmem_cache *s, struct slab *slab,
 	freelist_tail = NULL;
 	freelist_iter = freelist;
 	while (freelist_iter) {
-		nextfree = get_freepointer(s, freelist_iter);
+		nextfree = get_freepointer(s, freelist_iter, slab);
 
 		/*
 		 * If 'nextfree' is invalid, it is possible that the object at
@@ -2944,7 +2947,7 @@ static inline bool free_debug_processing(struct kmem_cache *s,
 	/* Reached end of constructed freelist yet? */
 	if (object != tail) {
-		object = get_freepointer(s, object);
+		object = get_freepointer(s, object, slab);
 		goto next_object;
 	}
 	checks_ok = true;
@@ -3173,7 +3176,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 	 * That slab must be frozen for per cpu allocations to work.
 	 */
 	VM_BUG_ON(!c->slab->frozen);
-	c->freelist = get_freepointer(s, freelist);
+	c->freelist = get_freepointer(s, freelist, c->slab);
 	c->tid = next_tid(c->tid);
 	local_unlock_irqrestore(&s->cpu_slab->lock, flags);
 	return freelist;
@@ -3275,7 +3278,7 @@ static void *___slab_alloc(struct kmem_cache *s, gfp_t gfpflags, int node,
 	 * For !pfmemalloc_match() case we don't load freelist so that
 	 * we don't make further mismatched allocations easier.
 	 */
-	deactivate_slab(s, slab, get_freepointer(s, freelist));
+	deactivate_slab(s, slab, get_freepointer(s, freelist, slab));
 	return freelist;
 }
 
@@ -3377,7 +3380,7 @@ static __always_inline void *__slab_alloc_node(struct kmem_cache *s,
 	    unlikely(!object || !slab || !node_match(slab, node))) {
 		object = __slab_alloc(s, gfpflags, node, addr, c, orig_size);
 	} else {
-		void *next_object = get_freepointer_safe(s, object);
+		void *next_object = get_freepointer_safe(s, object, slab);
 
 		/*
 		 * The cmpxchg will only match if there was no additional
@@ -3984,7 +3987,7 @@ static inline int __kmem_cache_alloc_bulk(struct kmem_cache *s, gfp_t flags,
 			continue; /* goto for-loop */
 		}
 
-		c->freelist = get_freepointer(s, object);
+		c->freelist = get_freepointer(s, object, c->slab);
 		p[i] = object;
 		maybe_wipe_obj_freeptr(s, p[i]);
 	}
@@ -4275,7 +4278,7 @@ static void early_kmem_cache_node_alloc(int node)
 	init_tracking(kmem_cache_node, n);
 #endif
 	n = kasan_slab_alloc(kmem_cache_node, n, GFP_KERNEL, false);
-	slab->freelist = get_freepointer(kmem_cache_node, n);
+	slab->freelist = get_freepointer(kmem_cache_node, n, slab);
 	slab->inuse = 1;
 	kmem_cache_node->node[node] = n;
 	init_kmem_cache_node(n);
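The slab argument threaded through above is what makes a containment check
possible in freelist_ptr_decode(): the decoder can ask whether a decoded
pointer still lands inside the slab it was read from. The series adds the
actual check in a later patch; the following is only a hedged, userspace-only
sketch of the idea, with every name invented for illustration:

#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct toy_slab {
	uintptr_t base;		/* address of the first object */
	size_t object_size;	/* like s->size */
	unsigned int objects;	/* objects per slab */
};

/* A decoded freelist pointer must be NULL (end of list) or point exactly
 * at an object slot inside the same slab; anything else is corruption. */
static bool freeptr_plausible(const struct toy_slab *slab, uintptr_t decoded)
{
	if (!decoded)
		return true;
	if (decoded < slab->base ||
	    decoded >= slab->base + (uintptr_t)slab->objects * slab->object_size)
		return false;
	return (decoded - slab->base) % slab->object_size == 0;
}

int main(void)
{
	struct toy_slab slab = { .base = 0x1000, .object_size = 64, .objects = 32 };

	assert(freeptr_plausible(&slab, 0));		/* end of freelist */
	assert(freeptr_plausible(&slab, 0x1040));	/* second object: ok */
	assert(!freeptr_plausible(&slab, 0x1041));	/* misaligned: corrupt */
	assert(!freeptr_plausible(&slab, 0x9000));	/* escaped the slab */
	return 0;
}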
From patchwork Fri Sep 15 10:59:27 2023
Date: Fri, 15 Sep 2023 10:59:27 +0000
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>
Message-ID: <20230915105933.495735-9-matteorizzo@google.com>
Subject: [RFC PATCH 08/14] security: introduce CONFIG_SLAB_VIRTUAL
From: Matteo Rizzo

From: Jann Horn

SLAB_VIRTUAL is a mitigation for the SLUB allocator which prevents reuse
of virtual addresses across different slab caches and therefore makes
some types of use-after-free bugs unexploitable.

SLAB_VIRTUAL is incompatible with KASAN, and we believe it's not worth
adding support for it. This is because SLAB_VIRTUAL and KASAN are aimed
at two different use cases: KASAN is meant for catching bugs as early as
possible in debug/fuzz/testing builds, and it's not meant to be used in
production. SLAB_VIRTUAL on the other hand is an exploit mitigation that
doesn't attempt to highlight bugs but instead tries to make them
unexploitable. It doesn't make sense to enable it in debugging builds or
during fuzzing; instead we expect that it will be enabled in production
kernels.

SLAB_VIRTUAL is not currently compatible with KFENCE; removing this
limitation is future work.
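To make the dependency constraints concrete (the Kconfig hunk follows after
the sign-offs below), a .config fragment for a build with the mitigation
enabled might look like the following. This is an illustration of how the
dependencies compose, not a fragment shipped with the series:

# SLAB_VIRTUAL needs SLUB without the TINY variant, and excludes the
# KASAN/KFENCE debugging tools it is not compatible with.
CONFIG_SLUB=y
# CONFIG_SLUB_TINY is not set
# CONFIG_KASAN is not set
# CONFIG_KFENCE is not set
CONFIG_SLAB_VIRTUAL=y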
Signed-off-by: Jann Horn
Co-developed-by: Matteo Rizzo
Signed-off-by: Matteo Rizzo
Reviewed-by: Kees Cook
---
 security/Kconfig.hardening | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/security/Kconfig.hardening b/security/Kconfig.hardening
index 0f295961e773..9f4e6e38aa76 100644
--- a/security/Kconfig.hardening
+++ b/security/Kconfig.hardening
@@ -355,4 +355,18 @@ config GCC_PLUGIN_RANDSTRUCT
 	  * https://grsecurity.net/
 	  * https://pax.grsecurity.net/
 
+config SLAB_VIRTUAL
+	bool "Allocate slab objects from virtual memory"
+	depends on SLUB && !SLUB_TINY
+	# If KFENCE support is desired, it could be implemented on top of our
+	# virtual memory allocation facilities
+	depends on !KFENCE
+	# ASAN support will require that shadow memory is allocated
+	# appropriately.
+	depends on !KASAN
+	help
+	  Allocate slab objects from kernel-virtual memory, and ensure that
+	  virtual memory used as a slab cache is never reused to store
+	  objects from other slab caches or non-slab data.
+
 endmenu

From patchwork Fri Sep 15 10:59:28 2023
Date: Fri, 15 Sep 2023 10:59:28 +0000
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>
Message-ID: <20230915105933.495735-10-matteorizzo@google.com>
Subject: [RFC PATCH 09/14] mm/slub: add the slab freelists to kmem_cache
From: Matteo Rizzo

From: Jann Horn

With SLAB_VIRTUAL enabled, unused slabs which still have virtual memory
allocated to them but no physical memory are kept in a per-cache list so
that they can be reused later if the cache needs to grow again.

Signed-off-by: Jann Horn
Co-developed-by: Matteo Rizzo
Signed-off-by: Matteo Rizzo
Reviewed-by: Kees Cook
---
 include/linux/slub_def.h | 16 ++++++++++++++++
 mm/slub.c                | 23 +++++++++++++++++++++++
 2 files changed, 39 insertions(+)

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 0adf5ba8241b..693e9bb34edc 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -86,6 +86,20 @@ struct kmem_cache_cpu {
 /*
  * Slab cache management.
  */
+struct kmem_cache_virtual {
+#ifdef CONFIG_SLAB_VIRTUAL
+	/* Protects freed_slabs and freed_slabs_min */
+	spinlock_t freed_slabs_lock;
+	/*
+	 * Slabs on this list have virtual memory of size oo allocated to them
+	 * but no physical memory
+	 */
+	struct list_head freed_slabs;
+	/* Same as freed_slabs but with memory of size min */
+	struct list_head freed_slabs_min;
+#endif
+};
+
 struct kmem_cache {
 #ifndef CONFIG_SLUB_TINY
 	struct kmem_cache_cpu __percpu *cpu_slab;
@@ -107,6 +121,8 @@ struct kmem_cache {
 
 	/* Allocation and freeing of slabs */
 	struct kmem_cache_order_objects min;
+	struct kmem_cache_virtual virtual;
+
 	gfp_t allocflags;	/* gfp flags to use on each alloc */
 	int refcount;		/* Refcount for slab cache destroy */
 	void (*ctor)(void *);
diff --git a/mm/slub.c b/mm/slub.c
index 42e7cc0b4452..4f77e5d4fe6c 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -4510,8 +4510,20 @@ static int calculate_sizes(struct kmem_cache *s)
 	return !!oo_objects(s->oo);
 }
 
+static inline void slab_virtual_open(struct kmem_cache *s)
+{
+#ifdef CONFIG_SLAB_VIRTUAL
+	/* WARNING: this stuff will be relocated in bootstrap()! */
+	spin_lock_init(&s->virtual.freed_slabs_lock);
+	INIT_LIST_HEAD(&s->virtual.freed_slabs);
+	INIT_LIST_HEAD(&s->virtual.freed_slabs_min);
+#endif
+}
+
 static int kmem_cache_open(struct kmem_cache *s, slab_flags_t flags)
 {
+	slab_virtual_open(s);
+
 	s->flags = kmem_cache_flags(s->size, flags, s->name);
 #ifdef CONFIG_SLAB_FREELIST_HARDENED
 	s->random = get_random_long();
@@ -4994,6 +5006,16 @@ static int slab_memory_callback(struct notifier_block *self,
  * that may be pointing to the wrong kmem_cache structure.
  */
 
+static inline void slab_virtual_bootstrap(struct kmem_cache *s, struct kmem_cache *static_cache)
+{
+	slab_virtual_open(s);
+
+#ifdef CONFIG_SLAB_VIRTUAL
+	list_splice(&static_cache->virtual.freed_slabs, &s->virtual.freed_slabs);
+	list_splice(&static_cache->virtual.freed_slabs_min, &s->virtual.freed_slabs_min);
+#endif
+}
+
 static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
 {
 	int node;
@@ -5001,6 +5023,7 @@ static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
 	struct kmem_cache_node *n;
 
 	memcpy(s, static_cache, kmem_cache->object_size);
+	slab_virtual_bootstrap(s, static_cache);
 
 	/*
 	 * This runs very early, and only the boot processor is supposed to be
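To make the recycling flow concrete, here is a minimal userspace model of the
scheme these lists enable: a cache-local free list of virtual ranges that is
consulted before fresh address space is carved out. All names and the bump
allocator are inventions for illustration; the kernel manages the real lists
under freed_slabs_lock:

#include <stdio.h>
#include <stdlib.h>

struct vrange {
	unsigned long start;
	struct vrange *next;
};

struct toy_cache {
	struct vrange *freed_slabs;	/* ranges with no backing memory */
	unsigned long next_fresh;	/* bump index into the region */
};

/* Prefer recycling a previously freed range from this cache; only carve
 * out fresh address space when the list is empty. Reuse never crosses
 * caches, which is the point of SLAB_VIRTUAL. */
static unsigned long take_range(struct toy_cache *c)
{
	if (c->freed_slabs) {
		struct vrange *v = c->freed_slabs;
		unsigned long start = v->start;

		c->freed_slabs = v->next;
		free(v);
		return start;
	}
	return c->next_fresh++ * 4096ul;
}

static void give_back_range(struct toy_cache *c, unsigned long start)
{
	struct vrange *v = malloc(sizeof(*v));

	v->start = start;
	v->next = c->freed_slabs;
	c->freed_slabs = v;
}

int main(void)
{
	struct toy_cache c = { .next_fresh = 1 };
	unsigned long a = take_range(&c);

	give_back_range(&c, a);
	printf("recycled same range: %d\n", take_range(&c) == a);	/* 1 */
	return 0;
}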
From patchwork Fri Sep 15 10:59:29 2023
Date: Fri, 15 Sep 2023 10:59:29 +0000
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>
Message-ID: <20230915105933.495735-11-matteorizzo@google.com>
Subject: [RFC PATCH 10/14] x86: Create virtual memory region for SLUB
From: Matteo Rizzo

From: Jann Horn

SLAB_VIRTUAL reserves 512 GiB of virtual memory and uses it for both
struct slab and the actual slab memory. The pointers returned by
kmem_cache_alloc will point to this range of memory.

Signed-off-by: Jann Horn
Co-developed-by: Matteo Rizzo
Signed-off-by: Matteo Rizzo
Reviewed-by: Kees Cook
---
 Documentation/arch/x86/x86_64/mm.rst    |  4 ++--
 arch/x86/include/asm/pgtable_64_types.h | 16 ++++++++++++++++
 arch/x86/mm/init_64.c                   | 19 +++++++++++++++----
 arch/x86/mm/kaslr.c                     |  9 +++++++++
 arch/x86/mm/mm_internal.h               |  4 ++++
 mm/slub.c                               |  4 ++++
 security/Kconfig.hardening              |  2 ++
 7 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/Documentation/arch/x86/x86_64/mm.rst b/Documentation/arch/x86/x86_64/mm.rst
index 35e5e18c83d0..121179537175 100644
--- a/Documentation/arch/x86/x86_64/mm.rst
+++ b/Documentation/arch/x86/x86_64/mm.rst
@@ -57,7 +57,7 @@ Complete virtual memory map with 4-level page tables
    fffffc0000000000 |   -4 TB | fffffdffffffffff |   2 TB | ... unused hole
                     |         |                  |        | vaddr_end for KASLR
    fffffe0000000000 |   -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
-   fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
+   fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | SLUB virtual memory
    ffffff0000000000 |   -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
    ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
    ffffffef00000000 |  -68 GB | fffffffeffffffff |  64 GB | EFI region mapping space
@@ -116,7 +116,7 @@ Complete virtual memory map with 5-level page tables
    fffffc0000000000 |   -4 TB | fffffdffffffffff |   2 TB | ... unused hole
                     |         |                  |        | vaddr_end for KASLR
    fffffe0000000000 |   -2 TB | fffffe7fffffffff | 0.5 TB | cpu_entry_area mapping
-   fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | ... unused hole
+   fffffe8000000000 | -1.5 TB | fffffeffffffffff | 0.5 TB | SLUB virtual memory
    ffffff0000000000 |   -1 TB | ffffff7fffffffff | 0.5 TB | %esp fixup stacks
    ffffff8000000000 | -512 GB | ffffffeeffffffff | 444 GB | ... unused hole
    ffffffef00000000 |  -68 GB | fffffffeffffffff |  64 GB | EFI region mapping space
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 38b54b992f32..e1a91eb084c4 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -6,6 +6,7 @@
 #ifndef __ASSEMBLY__
 #include
+#include
 #include
 
 /*
@@ -199,6 +200,21 @@ extern unsigned int ptrs_per_p4d;
 #define ESPFIX_PGD_ENTRY	_AC(-2, UL)
 #define ESPFIX_BASE_ADDR	(ESPFIX_PGD_ENTRY << P4D_SHIFT)
 
+#ifdef CONFIG_SLAB_VIRTUAL
+#define SLAB_PGD_ENTRY		_AC(-3, UL)
+#define SLAB_BASE_ADDR		(SLAB_PGD_ENTRY << P4D_SHIFT)
+#define SLAB_END_ADDR		(SLAB_BASE_ADDR + P4D_SIZE)
+
+/*
+ * We need to define this here because we need it to compute SLAB_META_SIZE
+ * and including slab.h causes a dependency cycle.
+ */
+#define STRUCT_SLAB_SIZE (32 * sizeof(void *))
+#define SLAB_VPAGES ((SLAB_END_ADDR - SLAB_BASE_ADDR) / PAGE_SIZE)
+#define SLAB_META_SIZE ALIGN(SLAB_VPAGES * STRUCT_SLAB_SIZE, PAGE_SIZE)
+#define SLAB_DATA_BASE_ADDR (SLAB_BASE_ADDR + SLAB_META_SIZE)
+#endif /* CONFIG_SLAB_VIRTUAL */
+
 #define CPU_ENTRY_AREA_PGD	_AC(-4, UL)
 #define CPU_ENTRY_AREA_BASE	(CPU_ENTRY_AREA_PGD << P4D_SHIFT)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index a190aae8ceaf..d716ddfd9880 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1279,16 +1279,19 @@ static void __init register_page_bootmem_info(void)
 }
 
 /*
- * Pre-allocates page-table pages for the vmalloc area in the kernel page-table.
+ * Pre-allocates page-table pages for the vmalloc and SLUB areas in the kernel
+ * page-table.
 * Only the level which needs to be synchronized between all page-tables is
 * allocated because the synchronization can be expensive.
 */
-static void __init preallocate_vmalloc_pages(void)
+static void __init preallocate_top_level_entries_range(unsigned long start,
+						       unsigned long end)
 {
 	unsigned long addr;
 	const char *lvl;
 
-	for (addr = VMALLOC_START; addr <= VMEMORY_END; addr = ALIGN(addr + 1, PGDIR_SIZE)) {
+
+	for (addr = start; addr <= end; addr = ALIGN(addr + 1, PGDIR_SIZE)) {
 		pgd_t *pgd = pgd_offset_k(addr);
 		p4d_t *p4d;
 		pud_t *pud;
@@ -1328,6 +1331,14 @@ static void __init preallocate_vmalloc_pages(void)
 		panic("Failed to pre-allocate %s pages for vmalloc area\n", lvl);
 }
 
+static void __init preallocate_top_level_entries(void)
+{
+	preallocate_top_level_entries_range(VMALLOC_START, VMEMORY_END);
+#ifdef CONFIG_SLAB_VIRTUAL
+	preallocate_top_level_entries_range(SLAB_BASE_ADDR, SLAB_END_ADDR - 1);
+#endif
+}
+
 void __init mem_init(void)
 {
 	pci_iommu_alloc();
@@ -1351,7 +1362,7 @@ void __init mem_init(void)
 	if (get_gate_vma(&init_mm))
 		kclist_add(&kcore_vsyscall, (void *)VSYSCALL_ADDR, PAGE_SIZE, KCORE_USER);
 
-	preallocate_vmalloc_pages();
+	preallocate_top_level_entries();
 }
 
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 37db264866b6..7b297d372a8c 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -136,6 +136,15 @@ void __init kernel_randomize_memory(void)
 		vaddr = round_up(vaddr + 1, PUD_SIZE);
 		remain_entropy -= entropy;
 	}
+
+#ifdef CONFIG_SLAB_VIRTUAL
+	/*
+	 * slub_addr_base is initialized separately from the
+	 * kaslr_memory_regions because it comes after CPU_ENTRY_AREA_BASE.
+	 */
+	prandom_bytes_state(&rand_state, &rand, sizeof(rand));
+	slub_addr_base += (rand & ((1UL << 36) - PAGE_SIZE));
+#endif
 }
 
 void __meminit init_trampoline_kaslr(void)
diff --git a/arch/x86/mm/mm_internal.h b/arch/x86/mm/mm_internal.h
index 3f37b5c80bb3..fafb79b7e019 100644
--- a/arch/x86/mm/mm_internal.h
+++ b/arch/x86/mm/mm_internal.h
@@ -25,4 +25,8 @@ void update_cache_mode_entry(unsigned entry, enum page_cache_mode cache);
 
 extern unsigned long tlb_single_page_flush_ceiling;
 
+#ifdef CONFIG_SLAB_VIRTUAL
+extern unsigned long slub_addr_base;
+#endif
+
 #endif	/* __X86_MM_INTERNAL_H */
diff --git a/mm/slub.c b/mm/slub.c
index 4f77e5d4fe6c..a731fdc79bff 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -166,6 +166,10 @@
  * the fast path and disables lockless freelists.
  */
 
+#ifdef CONFIG_SLAB_VIRTUAL
+unsigned long slub_addr_base = SLAB_DATA_BASE_ADDR;
+#endif /* CONFIG_SLAB_VIRTUAL */
+
 /*
  * We could simply use migrate_disable()/enable() but as long as it's a
  * function call even on !PREEMPT_RT, use inline preempt_disable() there.
diff --git a/security/Kconfig.hardening b/security/Kconfig.hardening
index 9f4e6e38aa76..f4a0af424149 100644
--- a/security/Kconfig.hardening
+++ b/security/Kconfig.hardening
@@ -357,6 +357,8 @@ config GCC_PLUGIN_RANDSTRUCT
 
 config SLAB_VIRTUAL
 	bool "Allocate slab objects from virtual memory"
+	# For virtual memory region allocation
+	depends on X86_64
 	depends on SLUB && !SLUB_TINY
 	# If KFENCE support is desired, it could be implemented on top of our
 	# virtual memory allocation facilities
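Plugging numbers into the macros above (4 KiB pages, 64-bit pointers) shows
how the 512 GiB region splits between slab metadata and slab data. This small
program just evaluates the same arithmetic:

#include <stdio.h>

int main(void)
{
	unsigned long long region = 512ull << 30;		/* P4D_SIZE: 512 GiB */
	unsigned long long struct_slab = 32 * sizeof(void *);	/* STRUCT_SLAB_SIZE: 256 */
	unsigned long long vpages = region / 4096;		/* SLAB_VPAGES */
	unsigned long long meta = vpages * struct_slab;		/* SLAB_META_SIZE */

	printf("virtual pages: %llu\n", vpages);		/* 134217728 (2^27) */
	printf("metadata:      %llu GiB\n", meta >> 30);	/* 32 */
	printf("object data:   %llu GiB\n", (region - meta) >> 30); /* 480 */
	return 0;
}

The kaslr.c hunk then slides slub_addr_base by a page-aligned random offset
below 1UL << 36 (up to 64 GiB), so the data area starts at an unpredictable
position inside the reserved region.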
From patchwork Fri Sep 15 10:59:30 2023
Date: Fri, 15 Sep 2023 10:59:30 +0000
In-Reply-To: <20230915105933.495735-1-matteorizzo@google.com>
Message-ID: <20230915105933.495735-12-matteorizzo@google.com>
Subject: [RFC PATCH 11/14] mm/slub: allocate slabs from virtual memory
From: Matteo Rizzo

From: Jann Horn

This is the main implementation of SLAB_VIRTUAL. With SLAB_VIRTUAL
enabled, slab memory is not allocated from the linear map but from a
dedicated region of virtual memory. The code ensures that once a range
of virtual addresses is assigned to a slab cache, that virtual memory is
never reused again except for other slabs in that same cache. This lets
us mitigate some exploits for use-after-free vulnerabilities where the
attacker makes SLUB release a slab page to the page allocator and then
makes it reuse that same page for a different slab cache ("cross-cache
attacks").

With SLAB_VIRTUAL enabled, struct slab no longer overlaps struct page;
instead it is allocated from a dedicated region of virtual memory. This
makes it possible to have references to slabs whose physical memory has
been freed.

SLAB_VIRTUAL has a small performance overhead, about 1-2% on kernel
compilation time. We are using 4 KiB pages to map slab pages and the
slab metadata area, instead of the 2 MiB pages that the kernel uses to
map the physmap. We experimented with a version of the patch that uses
2 MiB pages and we did see some performance improvement, but the code
also became much more complicated and ugly because we would need to
allocate and free multiple slabs at once.

In addition to the TLB contention, SLAB_VIRTUAL also adds new locks to
the slow path of the allocator. Lock contention also contributes to the
performance penalty to some extent, and this is more visible on machines
with many CPUs.
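The address arithmetic at the heart of the patch (see virt_to_slab_raw() and
slab_to_virt() in the diff below) maps every virtual page in the region to a
fixed metadata slot at the start of the region. The following userspace model
borrows the constants from the previous patch and assumes the slot stride
equals the reserved 256 bytes; everything else is invented for illustration:

#include <stdio.h>

#define SLAB_BASE_ADDR		0xfffffe8000000000ull	/* -1.5 TB */
#define PAGE_SIZE		4096ull
#define STRUCT_SLAB_SIZE	256ull			/* 32 * sizeof(void *) */
#define SLAB_META_SIZE		(32ull << 30)		/* 32 GiB, see patch 10/14 */
#define SLAB_DATA_BASE_ADDR	(SLAB_BASE_ADDR + SLAB_META_SIZE)

/* cf. virt_to_slab_raw(): the page index inside the region selects the slot */
static unsigned long long meta_for(unsigned long long addr)
{
	return SLAB_BASE_ADDR +
	       (addr - SLAB_BASE_ADDR) / PAGE_SIZE * STRUCT_SLAB_SIZE;
}

/* cf. slab_to_virt(): the inverse, from slot index back to the page address */
static unsigned long long virt_for(unsigned long long meta)
{
	return SLAB_BASE_ADDR +
	       (meta - SLAB_BASE_ADDR) / STRUCT_SLAB_SIZE * PAGE_SIZE;
}

int main(void)
{
	unsigned long long obj = SLAB_DATA_BASE_ADDR + 5 * PAGE_SIZE + 128;
	unsigned long long meta = meta_for(obj);

	printf("object %#llx -> struct slab at %#llx\n", obj, meta);
	printf("round trip to page start: %#llx\n", virt_for(meta));
	return 0;
}

Because the mapping is purely arithmetic, a struct slab stays reachable even
after its backing folio is released, which is exactly what the commit message
means by references to slabs whose physical memory has been freed.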
Signed-off-by: Jann Horn
Co-developed-by: Matteo Rizzo
Signed-off-by: Matteo Rizzo
---
 arch/x86/include/asm/page_64.h          |  10 +
 arch/x86/include/asm/pgtable_64_types.h |   5 +
 arch/x86/mm/physaddr.c                  |  10 +
 include/linux/slab.h                    |   7 +
 init/main.c                             |   1 +
 mm/slab.h                               | 106 ++++++
 mm/slab_common.c                        |   4 +
 mm/slub.c                               | 439 +++++++++++++++++++++++-
 mm/usercopy.c                           |  12 +-
 9 files changed, 587 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index cc6b8e087192..25fb734a2fe6 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -3,6 +3,7 @@
 #define _ASM_X86_PAGE_64_H

 #include
+#include

 #ifndef __ASSEMBLY__
 #include
@@ -18,10 +19,19 @@ extern unsigned long page_offset_base;
 extern unsigned long vmalloc_base;
 extern unsigned long vmemmap_base;

+#ifdef CONFIG_SLAB_VIRTUAL
+unsigned long slab_virt_to_phys(unsigned long x);
+#endif
+
 static __always_inline unsigned long __phys_addr_nodebug(unsigned long x)
 {
        unsigned long y = x - __START_KERNEL_map;

+#ifdef CONFIG_SLAB_VIRTUAL
+       if (is_slab_addr(x))
+               return slab_virt_to_phys(x);
+#endif
+
        /* use the carry flag to determine if x was < __START_KERNEL_map */
        x = y + ((x > y) ? phys_base : (__START_KERNEL_map - PAGE_OFFSET));

diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index e1a91eb084c4..4aae822a6a96 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -213,6 +213,11 @@ extern unsigned int ptrs_per_p4d;
 #define SLAB_VPAGES ((SLAB_END_ADDR - SLAB_BASE_ADDR) / PAGE_SIZE)
 #define SLAB_META_SIZE ALIGN(SLAB_VPAGES * STRUCT_SLAB_SIZE, PAGE_SIZE)
 #define SLAB_DATA_BASE_ADDR (SLAB_BASE_ADDR + SLAB_META_SIZE)
+
+#define is_slab_addr(ptr) ((unsigned long)(ptr) >= SLAB_DATA_BASE_ADDR && \
+                          (unsigned long)(ptr) < SLAB_END_ADDR)
+#define is_slab_meta(ptr) ((unsigned long)(ptr) >= SLAB_BASE_ADDR && \
+                          (unsigned long)(ptr) < SLAB_DATA_BASE_ADDR)
 #endif /* CONFIG_SLAB_VIRTUAL */

 #define CPU_ENTRY_AREA_PGD _AC(-4, UL)

diff --git a/arch/x86/mm/physaddr.c b/arch/x86/mm/physaddr.c
index fc3f3d3e2ef2..7f1b81c75e4d 100644
--- a/arch/x86/mm/physaddr.c
+++ b/arch/x86/mm/physaddr.c
@@ -16,6 +16,11 @@ unsigned long __phys_addr(unsigned long x)
 {
        unsigned long y = x - __START_KERNEL_map;

+#ifdef CONFIG_SLAB_VIRTUAL
+       if (is_slab_addr(x))
+               return slab_virt_to_phys(x);
+#endif
+
        /* use the carry flag to determine if x was < __START_KERNEL_map */
        if (unlikely(x > y)) {
                x = y + phys_base;
@@ -48,6 +53,11 @@ bool __virt_addr_valid(unsigned long x)
 {
        unsigned long y = x - __START_KERNEL_map;

+#ifdef CONFIG_SLAB_VIRTUAL
+       if (is_slab_addr(x))
+               return true;
+#endif
+
        /* use the carry flag to determine if x was < __START_KERNEL_map */
        if (unlikely(x > y)) {
                x = y + phys_base;

diff --git a/include/linux/slab.h b/include/linux/slab.h
index a2d82010d269..2180d5170995 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -793,5 +793,12 @@ int slab_dead_cpu(unsigned int cpu);
 #define slab_dead_cpu NULL
 #endif

+#ifdef CONFIG_SLAB_VIRTUAL
+void __init init_slub_page_reclaim(void);
+#else
 #define is_slab_addr(addr) folio_test_slab(virt_to_folio(addr))
+static inline void init_slub_page_reclaim(void)
+{
+}
+#endif /* CONFIG_SLAB_VIRTUAL */
 #endif /* _LINUX_SLAB_H */

diff --git a/init/main.c b/init/main.c
index ad920fac325c..72456964417e 100644
--- a/init/main.c
+++ b/init/main.c
@@ -1532,6 +1532,7 @@ static noinline void __init kernel_init_freeable(void)
        workqueue_init();

        init_mm_internals();
+       init_slub_page_reclaim();

        rcu_init_tasks_generic();
        do_pre_smp_initcalls();

diff --git a/mm/slab.h b/mm/slab.h
index 3fe0d1e26e26..460c802924bd 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -1,6 +1,11 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #ifndef MM_SLAB_H
 #define MM_SLAB_H
+
+#include
+#include
+#include
+
 /*
  * Internal slab definitions
  */
@@ -49,7 +54,35 @@ struct kmem_cache_order_objects {

 /* Reuses the bits in struct page */
 struct slab {
+       /*
+        * With CONFIG_SLAB_VIRTUAL enabled, instances of struct slab are not
+        * overlaid on struct page but instead are allocated from a dedicated
+        * virtual memory area.
+        */
+#ifndef CONFIG_SLAB_VIRTUAL
        unsigned long __page_flags;
+#else
+       /*
+        * Used by virt_to_slab to find the actual struct slab for a slab that
+        * spans multiple pages.
+        */
+       struct slab *compound_slab_head;
+
+       /*
+        * Pointer to the folio that the objects are allocated from, or NULL if
+        * the slab is currently unused and no physical memory is allocated to
+        * it. Protected by slub_kworker_lock.
+        */
+       struct folio *backing_folio;
+
+       struct kmem_cache_order_objects oo;
+
+       struct list_head flush_list_elem;
+
+       /* Replaces the page lock */
+       spinlock_t slab_lock;
+
+#endif

 #if defined(CONFIG_SLAB)
@@ -104,12 +137,17 @@ struct slab {
 #error "Unexpected slab allocator configured"
 #endif

+       /* See comment for __page_flags above. */
+#ifndef CONFIG_SLAB_VIRTUAL
        atomic_t __page_refcount;
+#endif
 #ifdef CONFIG_MEMCG
        unsigned long memcg_data;
 #endif
 };

+/* See comment for __page_flags above. */
+#ifndef CONFIG_SLAB_VIRTUAL
 #define SLAB_MATCH(pg, sl) \
        static_assert(offsetof(struct page, pg) == offsetof(struct slab, sl))
 SLAB_MATCH(flags, __page_flags);
@@ -120,10 +158,15 @@ SLAB_MATCH(memcg_data, memcg_data);
 #endif
 #undef SLAB_MATCH
 static_assert(sizeof(struct slab) <= sizeof(struct page));
+#else
+static_assert(sizeof(struct slab) <= STRUCT_SLAB_SIZE);
+#endif
+
 #if defined(system_has_freelist_aba) && defined(CONFIG_SLUB)
 static_assert(IS_ALIGNED(offsetof(struct slab, freelist), sizeof(freelist_aba_t)));
 #endif

+#ifndef CONFIG_SLAB_VIRTUAL
 /**
  * folio_slab - Converts from folio to slab.
  * @folio: The folio.
@@ -187,6 +230,14 @@ static_assert(IS_ALIGNED(offsetof(struct slab, freelist), sizeof(freelist_aba_t)
  * Return: true if s points to a slab and false otherwise.
  */
 #define is_slab_page(s) folio_test_slab(slab_folio(s))
+#else
+#define slab_folio(s) (s->backing_folio)
+#define is_slab_page(s) is_slab_meta(s)
+/* Needed for check_heap_object but never actually used */
+#define folio_slab(folio) NULL
+static void *slab_to_virt(const struct slab *s);
+#endif /* CONFIG_SLAB_VIRTUAL */
+
 /*
  * If network-based swap is enabled, sl*b must keep track of whether pages
  * were allocated from pfmemalloc reserves.
@@ -213,7 +264,11 @@ static inline void __slab_clear_pfmemalloc(struct slab *slab)

 static inline void *slab_address(const struct slab *slab)
 {
+#ifdef CONFIG_SLAB_VIRTUAL
+       return slab_to_virt(slab);
+#else
        return folio_address(slab_folio(slab));
+#endif
 }

 static inline int slab_nid(const struct slab *slab)
@@ -226,6 +281,52 @@ static inline pg_data_t *slab_pgdat(const struct slab *slab)
        return folio_pgdat(slab_folio(slab));
 }

+#ifdef CONFIG_SLAB_VIRTUAL
+/*
+ * Internal helper. Returns the address of the struct slab corresponding to
+ * the virtual memory page containing kaddr. This does a simple arithmetic
+ * mapping and does *not* return the struct slab of the head page!
+ */
+static unsigned long virt_to_slab_raw(unsigned long addr)
+{
+       VM_WARN_ON(!is_slab_addr(addr));
+       return SLAB_BASE_ADDR +
+              ((addr - SLAB_BASE_ADDR) / PAGE_SIZE * sizeof(struct slab));
+}
+
+static struct slab *virt_to_slab(const void *addr)
+{
+       struct slab *slab, *slab_head;
+
+       if (!is_slab_addr(addr))
+               return NULL;
+
+       slab = (struct slab *)virt_to_slab_raw((unsigned long)addr);
+       slab_head = slab->compound_slab_head;
+
+       if (CHECK_DATA_CORRUPTION(!is_slab_meta(slab_head),
+                       "compound slab head out of meta range: %p", slab_head))
+               return NULL;
+
+       return slab_head;
+}
+
+static void *slab_to_virt(const struct slab *s)
+{
+       unsigned long slab_idx;
+       bool unaligned_slab =
+               ((unsigned long)s - SLAB_BASE_ADDR) % sizeof(*s) != 0;
+
+       if (CHECK_DATA_CORRUPTION(!is_slab_meta(s), "slab not in meta range") ||
+           CHECK_DATA_CORRUPTION(unaligned_slab, "unaligned slab pointer") ||
+           CHECK_DATA_CORRUPTION(s->compound_slab_head != s,
+                                 "%s called on non-head slab", __func__))
+               return NULL;
+
+       slab_idx = ((unsigned long)s - SLAB_BASE_ADDR) / sizeof(*s);
+       return (void *)(SLAB_BASE_ADDR + PAGE_SIZE * slab_idx);
+}
+#else
 static inline struct slab *virt_to_slab(const void *addr)
 {
        struct folio *folio = virt_to_folio(addr);
@@ -235,6 +336,7 @@ static inline struct slab *virt_to_slab(const void *addr)

        return folio_slab(folio);
 }
+#endif /* CONFIG_SLAB_VIRTUAL */

 #define OO_SHIFT 16
 #define OO_MASK ((1 << OO_SHIFT) - 1)
@@ -251,7 +353,11 @@ static inline unsigned int oo_objects(struct kmem_cache_order_objects x)

 static inline int slab_order(const struct slab *slab)
 {
+#ifndef CONFIG_SLAB_VIRTUAL
        return folio_order((struct folio *)slab_folio(slab));
+#else
+       return oo_order(slab->oo);
+#endif
 }

 static inline size_t slab_size(const struct slab *slab)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 42ceaf7e9f47..7754fdba07a0 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1064,6 +1064,10 @@ void kfree(const void *object)

        if (unlikely(!is_slab_addr(object))) {
                folio = virt_to_folio(object);
+               if (IS_ENABLED(CONFIG_SLAB_VIRTUAL) &&
+                   CHECK_DATA_CORRUPTION(folio_test_slab(folio),
+                               "unexpected slab page mapped outside slab range"))
+                       return;
                free_large_kmalloc(folio, (void *)object);
                return;
        }

diff --git a/mm/slub.c b/mm/slub.c
index a731fdc79bff..66ae60cdadaf 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -38,6 +38,10 @@
 #include
 #include
 #include
+#include
+#include
+#include
+#include
 #include
 #include
 #include
@@ -168,6 +172,8 @@

 #ifdef CONFIG_SLAB_VIRTUAL
 unsigned long slub_addr_base = SLAB_DATA_BASE_ADDR;
+/* Protects slub_addr_base */
+static DEFINE_SPINLOCK(slub_valloc_lock);
 #endif /* CONFIG_SLAB_VIRTUAL */

 /*
@@ -430,19 +436,18 @@ static void prefetch_freepointer(const struct kmem_cache *s, void *object)
  * get_freepointer_safe() returns initialized memory.
  */
 __no_kmsan_checks
-static inline void *get_freepointer_safe(struct kmem_cache *s, void *object,
+static inline freeptr_t get_freepointer_safe(struct kmem_cache *s, void *object,
                struct slab *slab)
 {
-       unsigned long freepointer_addr;
+       unsigned long freepointer_addr = (unsigned long)object + s->offset;
        freeptr_t p;

        if (!debug_pagealloc_enabled_static())
-               return get_freepointer(s, object, slab);
+               return *(freeptr_t *)freepointer_addr;

        object = kasan_reset_tag(object);
-       freepointer_addr = (unsigned long)object + s->offset;
        copy_from_kernel_nofault(&p, (freeptr_t *)freepointer_addr, sizeof(p));
-       return freelist_ptr_decode(s, p, freepointer_addr, slab);
+       return p;
 }

 static inline void set_freepointer(struct kmem_cache *s, void *object, void *fp)
@@ -478,6 +483,17 @@ static inline struct kmem_cache_order_objects oo_make(unsigned int order,
        return x;
 }

+#ifdef CONFIG_SLAB_VIRTUAL
+unsigned long slab_virt_to_phys(unsigned long x)
+{
+       struct slab *slab = virt_to_slab((void *)x);
+       struct folio *folio = slab_folio(slab);
+
+       return page_to_phys(folio_page(folio, 0)) + offset_in_folio(folio, x);
+}
+EXPORT_SYMBOL(slab_virt_to_phys);
+#endif
+
 #ifdef CONFIG_SLUB_CPU_PARTIAL
 static void slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects)
 {
@@ -506,18 +522,26 @@ slub_set_cpu_partial(struct kmem_cache *s, unsigned int nr_objects)
  */
 static __always_inline void slab_lock(struct slab *slab)
 {
+#ifdef CONFIG_SLAB_VIRTUAL
+       spin_lock(&slab->slab_lock);
+#else
        struct page *page = slab_page(slab);

        VM_BUG_ON_PAGE(PageTail(page), page);
        bit_spin_lock(PG_locked, &page->flags);
+#endif
 }

 static __always_inline void slab_unlock(struct slab *slab)
 {
+#ifdef CONFIG_SLAB_VIRTUAL
+       spin_unlock(&slab->slab_lock);
+#else
        struct page *page = slab_page(slab);

        VM_BUG_ON_PAGE(PageTail(page), page);
        __bit_spin_unlock(PG_locked, &page->flags);
+#endif
 }

 static inline bool
@@ -1863,6 +1887,10 @@ static void folio_set_slab(struct folio *folio, struct slab *slab)
        /* Make the flag visible before any changes to folio->mapping */
        smp_wmb();

+#ifdef CONFIG_SLAB_VIRTUAL
+       slab->backing_folio = folio;
+#endif
+
        if (folio_is_pfmemalloc(folio))
                slab_set_pfmemalloc(slab);
 }
@@ -1874,8 +1902,285 @@ static void folio_clear_slab(struct folio *folio, struct slab *slab)
        /* Make the mapping reset visible before clearing the flag */
        smp_wmb();
        __folio_clear_slab(folio);
+#ifdef CONFIG_SLAB_VIRTUAL
+       slab->backing_folio = NULL;
+#endif
+}
+
+#ifdef CONFIG_SLAB_VIRTUAL
+/*
+ * Make sure we have the necessary page tables for the given address.
+ * Returns a pointer to the PTE, or NULL on allocation failure.
+ *
+ * We're using ugly low-level code here instead of the standard
+ * helpers because the normal code insists on using GFP_KERNEL.
+ *
+ * If may_alloc is false, throw an error if the PTE is not already mapped.
+ */
+static pte_t *slub_get_ptep(unsigned long address, gfp_t gfp_flags,
+                           bool may_alloc)
+{
+       pgd_t *pgd = pgd_offset_k(address);
+       p4d_t *p4d;
+       pud_t *pud;
+       pmd_t *pmd;
+       unsigned long flags;
+       struct page *spare_page = NULL;
+
+retry:
+       spin_lock_irqsave(&slub_valloc_lock, flags);
+       /*
+        * The top-level entry should already be present - see
+        * preallocate_top_level_entries().
+        */
+       BUG_ON(pgd_none(READ_ONCE(*pgd)));
+       p4d = p4d_offset(pgd, address);
+       if (p4d_none(READ_ONCE(*p4d))) {
+               if (!spare_page)
+                       goto need_page;
+               p4d_populate(&init_mm, p4d, (pud_t *)page_to_virt(spare_page));
+               goto need_page;
+       }
+       pud = pud_offset(p4d, address);
+       if (pud_none(READ_ONCE(*pud))) {
+               if (!spare_page)
+                       goto need_page;
+               pud_populate(&init_mm, pud, (pmd_t *)page_to_virt(spare_page));
+               goto need_page;
+       }
+       pmd = pmd_offset(pud, address);
+       if (pmd_none(READ_ONCE(*pmd))) {
+               if (!spare_page)
+                       goto need_page;
+               pmd_populate_kernel(&init_mm, pmd,
+                                   (pte_t *)page_to_virt(spare_page));
+               spare_page = NULL;
+       }
+       spin_unlock_irqrestore(&slub_valloc_lock, flags);
+       if (spare_page)
+               __free_page(spare_page);
+       return pte_offset_kernel(pmd, address);
+
+need_page:
+       spin_unlock_irqrestore(&slub_valloc_lock, flags);
+       VM_WARN_ON(!may_alloc);
+       spare_page = alloc_page(gfp_flags);
+       if (unlikely(!spare_page))
+               return NULL;
+       /* ensure ordering between page zeroing and PTE write */
+       smp_wmb();
+       goto retry;
+}
+
+/*
+ * Reserve a range of virtual address space, ensure that we have page tables
+ * for it, and allocate a corresponding struct slab.
+ * This is cold code, we don't really have to worry about performance here.
+ */
+static struct slab *alloc_slab_meta(unsigned int order, gfp_t gfp_flags)
+{
+       unsigned long alloc_size = PAGE_SIZE << order;
+       unsigned long flags;
+       unsigned long old_base;
+       unsigned long data_range_start, data_range_end;
+       unsigned long meta_range_start, meta_range_end;
+       unsigned long addr;
+       struct slab *slab, *sp;
+       bool valid_start, valid_end;
+
+       gfp_flags &= (__GFP_HIGH | __GFP_RECLAIM | __GFP_IO |
+                     __GFP_FS | __GFP_NOWARN | __GFP_RETRY_MAYFAIL |
+                     __GFP_NOFAIL | __GFP_NORETRY | __GFP_MEMALLOC |
+                     __GFP_NOMEMALLOC);
+       /* New page tables and metadata pages should be zeroed */
+       gfp_flags |= __GFP_ZERO;
+
+       spin_lock_irqsave(&slub_valloc_lock, flags);
+retry_locked:
+       old_base = slub_addr_base;
+
+       /*
+        * We drop the lock. The following code might sleep during
+        * page table allocation. Any mutations we make before rechecking
+        * slub_addr_base are idempotent, so that's fine.
+        */
+       spin_unlock_irqrestore(&slub_valloc_lock, flags);
+
+       /*
+        * [data_range_start, data_range_end) is the virtual address range
+        * where this slab's objects will be mapped.
+        * We want alignment appropriate for the order. Note that this could
+        * be relaxed based on the alignment requirements of the objects being
+        * allocated, but for now, we behave like the page allocator would.
+        */
+       data_range_start = ALIGN(old_base, alloc_size);
+       data_range_end = data_range_start + alloc_size;
+
+       valid_start = data_range_start >= SLAB_BASE_ADDR &&
+                     IS_ALIGNED(data_range_start, PAGE_SIZE);
+       valid_end = data_range_end >= SLAB_BASE_ADDR &&
+                   IS_ALIGNED(data_range_end, PAGE_SIZE);
+       if (CHECK_DATA_CORRUPTION(!valid_start,
+                                 "invalid slab data range start") ||
+           CHECK_DATA_CORRUPTION(!valid_end,
+                                 "invalid slab data range end"))
+               return NULL;
+
+       /* We ran out of virtual memory for slabs */
+       if (WARN_ON_ONCE(data_range_start >= SLAB_END_ADDR ||
+                        data_range_end >= SLAB_END_ADDR))
+               return NULL;
+
+       /*
+        * [meta_range_start, meta_range_end) is the range where the struct
+        * slabs for the current data range are mapped. The first struct slab,
+        * located at meta_range_start, is the head slab that contains the
+        * actual data; all other struct slabs in the range point to the head
+        * slab.
+        */
+       meta_range_start = virt_to_slab_raw(data_range_start);
+       meta_range_end = virt_to_slab_raw(data_range_end);
+
+       /* Ensure the meta range is mapped. */
+       for (addr = ALIGN_DOWN(meta_range_start, PAGE_SIZE);
+            addr < meta_range_end; addr += PAGE_SIZE) {
+               pte_t *ptep = slub_get_ptep(addr, gfp_flags, true);
+
+               if (ptep == NULL)
+                       return NULL;
+
+               spin_lock_irqsave(&slub_valloc_lock, flags);
+               if (pte_none(READ_ONCE(*ptep))) {
+                       struct page *meta_page;
+
+                       spin_unlock_irqrestore(&slub_valloc_lock, flags);
+                       meta_page = alloc_page(gfp_flags);
+                       if (meta_page == NULL)
+                               return NULL;
+                       spin_lock_irqsave(&slub_valloc_lock, flags);
+
+                       /* Make sure that no one else has already mapped that page */
+                       if (pte_none(READ_ONCE(*ptep)))
+                               set_pte_safe(ptep,
+                                            mk_pte(meta_page, PAGE_KERNEL));
+                       else
+                               __free_page(meta_page);
+               }
+               spin_unlock_irqrestore(&slub_valloc_lock, flags);
+       }
+
+       /* Ensure we have page tables for the data range. */
+       for (addr = data_range_start; addr < data_range_end;
+            addr += PAGE_SIZE) {
+               pte_t *ptep = slub_get_ptep(addr, gfp_flags, true);
+
+               if (ptep == NULL)
+                       return NULL;
+       }
+
+       /* Did we race with someone else who made forward progress? */
+       spin_lock_irqsave(&slub_valloc_lock, flags);
+       if (old_base != slub_addr_base)
+               goto retry_locked;
+
+       /* Success! Grab the range for ourselves. */
+       slub_addr_base = data_range_end;
+       spin_unlock_irqrestore(&slub_valloc_lock, flags);
+
+       slab = (struct slab *)meta_range_start;
+       spin_lock_init(&slab->slab_lock);
+
+       /* Initialize basic slub metadata for virt_to_slab() */
+       for (sp = slab; (unsigned long)sp < meta_range_end; sp++)
+               sp->compound_slab_head = slab;
+
+       return slab;
+}
+
+/* Get an unused slab, or allocate a new one */
+static struct slab *get_free_slab(struct kmem_cache *s,
+               struct kmem_cache_order_objects oo, gfp_t meta_gfp_flags,
+               struct list_head *freed_slabs)
+{
+       unsigned long flags;
+       struct slab *slab;
+
+       spin_lock_irqsave(&s->virtual.freed_slabs_lock, flags);
+       slab = list_first_entry_or_null(freed_slabs, struct slab, slab_list);
+
+       if (likely(slab)) {
+               list_del(&slab->slab_list);
+
+               spin_unlock_irqrestore(&s->virtual.freed_slabs_lock, flags);
+               return slab;
+       }
+
+       spin_unlock_irqrestore(&s->virtual.freed_slabs_lock, flags);
+       slab = alloc_slab_meta(oo_order(oo), meta_gfp_flags);
+       if (slab == NULL)
+               return NULL;
+
+       return slab;
+}
+
+static struct slab *alloc_slab_page(struct kmem_cache *s,
+               gfp_t meta_gfp_flags, gfp_t gfp_flags, int node,
+               struct kmem_cache_order_objects oo)
+{
+       struct folio *folio;
+       struct slab *slab;
+       unsigned int order = oo_order(oo);
+       unsigned long flags;
+       void *virt_mapping;
+       pte_t *ptep;
+       struct list_head *freed_slabs;
+
+       if (order == oo_order(s->min))
+               freed_slabs = &s->virtual.freed_slabs_min;
+       else
+               freed_slabs = &s->virtual.freed_slabs;
+
+       slab = get_free_slab(s, oo, meta_gfp_flags, freed_slabs);
+
+       /*
+        * Avoid making UAF reads easily exploitable by repopulating
+        * with pages containing attacker-controlled data - always zero
+        * pages.
+        */
+       gfp_flags |= __GFP_ZERO;
+       if (node == NUMA_NO_NODE)
+               folio = (struct folio *)alloc_pages(gfp_flags, order);
+       else
+               folio = (struct folio *)__alloc_pages_node(node, gfp_flags,
+                                                          order);
+
+       if (!folio) {
+               /* Rollback: put the struct slab back. */
+               spin_lock_irqsave(&s->virtual.freed_slabs_lock, flags);
+               list_add(&slab->slab_list, freed_slabs);
+               spin_unlock_irqrestore(&s->virtual.freed_slabs_lock, flags);
+
+               return NULL;
+       }
+       folio_set_slab(folio, slab);
+
+       slab->oo = oo;
+
+       virt_mapping = slab_to_virt(slab);
+
+       /* Wire up the physical folio */
+       for (unsigned long i = 0; i < (1UL << oo_order(oo)); i++) {
+               ptep = slub_get_ptep(
+                       (unsigned long)virt_mapping + i * PAGE_SIZE, 0, false);
+               if (CHECK_DATA_CORRUPTION(pte_present(*ptep),
+                                         "slab PTE already present"))
+                       return NULL;
+               set_pte_safe(ptep, mk_pte(folio_page(folio, i), PAGE_KERNEL));
+       }
+
+       return slab;
+}
+#else
 static inline struct slab *alloc_slab_page(struct kmem_cache *s,
                gfp_t meta_flags, gfp_t flags, int node,
                struct kmem_cache_order_objects oo)
@@ -1897,6 +2202,7 @@ static inline struct slab *alloc_slab_page(struct kmem_cache *s,

        return slab;
 }
+#endif /* CONFIG_SLAB_VIRTUAL */

 #ifdef CONFIG_SLAB_FREELIST_RANDOM
 /* Pre-initialize the random sequence cache */
@@ -2085,6 +2391,94 @@ static struct slab *new_slab(struct kmem_cache *s, gfp_t flags, int node)
                flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
 }

+#ifdef CONFIG_SLAB_VIRTUAL
+static DEFINE_SPINLOCK(slub_kworker_lock);
+static struct kthread_worker *slub_kworker;
+static LIST_HEAD(slub_tlbflush_queue);
+
+static void slub_tlbflush_worker(struct kthread_work *work)
+{
+       unsigned long irq_flags;
+       LIST_HEAD(local_queue);
+       struct slab *slab, *tmp;
+       unsigned long addr_start = ULONG_MAX;
+       unsigned long addr_end = 0;
+
+       spin_lock_irqsave(&slub_kworker_lock, irq_flags);
+       list_splice_init(&slub_tlbflush_queue, &local_queue);
+       list_for_each_entry(slab, &local_queue, flush_list_elem) {
+               unsigned long start = (unsigned long)slab_to_virt(slab);
+               unsigned long end = start + PAGE_SIZE *
+                       (1UL << oo_order(slab->oo));
+
+               if (start < addr_start)
+                       addr_start = start;
+               if (end > addr_end)
+                       addr_end = end;
+       }
+       spin_unlock_irqrestore(&slub_kworker_lock, irq_flags);
+
+       if (addr_start < addr_end)
+               flush_tlb_kernel_range(addr_start, addr_end);
+
+       spin_lock_irqsave(&slub_kworker_lock, irq_flags);
+       list_for_each_entry_safe(slab, tmp, &local_queue, flush_list_elem) {
+               struct folio *folio = slab_folio(slab);
+               struct kmem_cache *s = slab->slab_cache;
+
+               list_del(&slab->flush_list_elem);
+               folio_clear_slab(folio, slab);
+               __free_pages(folio_page(folio, 0), oo_order(slab->oo));
+
+               /* IRQs are already off */
+               spin_lock(&s->virtual.freed_slabs_lock);
+               if (oo_order(slab->oo) == oo_order(s->oo)) {
+                       list_add(&slab->slab_list, &s->virtual.freed_slabs);
+               } else {
+                       WARN_ON(oo_order(slab->oo) != oo_order(s->min));
+                       list_add(&slab->slab_list, &s->virtual.freed_slabs_min);
+               }
+               spin_unlock(&s->virtual.freed_slabs_lock);
+       }
+       spin_unlock_irqrestore(&slub_kworker_lock, irq_flags);
+}
+static DEFINE_KTHREAD_WORK(slub_tlbflush_work, slub_tlbflush_worker);
+
+static void __free_slab(struct kmem_cache *s, struct slab *slab)
+{
+       int order = oo_order(slab->oo);
+       unsigned long pages = 1UL << order;
+       unsigned long slab_base = (unsigned long)slab_address(slab);
+       unsigned long irq_flags;
+
+       /* Clear the PTEs for the slab we're freeing */
+       for (unsigned long i = 0; i < pages; i++) {
+               unsigned long addr = slab_base + i * PAGE_SIZE;
+               pte_t *ptep = slub_get_ptep(addr, 0, false);
+
+               if (CHECK_DATA_CORRUPTION(!pte_present(*ptep),
+                                         "slab PTE already clear"))
+                       return;
+
+               ptep_clear(&init_mm, addr, ptep);
+       }
+
+       mm_account_reclaimed_pages(pages);
+       unaccount_slab(slab, order, s);
+
+       /*
+        * We might not be able to perform a TLB flush here (e.g. in hardware
+        * interrupt handlers) so instead we give the slab to the TLB flusher
+        * thread which will flush the TLB for us and only then free the
+        * physical memory.
+        */
+       spin_lock_irqsave(&slub_kworker_lock, irq_flags);
+       list_add(&slab->flush_list_elem, &slub_tlbflush_queue);
+       spin_unlock_irqrestore(&slub_kworker_lock, irq_flags);
+       if (READ_ONCE(slub_kworker) != NULL)
+               kthread_queue_work(slub_kworker, &slub_tlbflush_work);
+}
+#else
 static void __free_slab(struct kmem_cache *s, struct slab *slab)
 {
        struct folio *folio = slab_folio(slab);
@@ -2096,6 +2490,7 @@ static void __free_slab(struct kmem_cache *s, struct slab *slab)
        unaccount_slab(slab, order, s);
        __free_pages(&folio->page, order);
 }
+#endif /* CONFIG_SLAB_VIRTUAL */

 static void rcu_free_slab(struct rcu_head *h)
 {
@@ -3384,7 +3779,15 @@ static __always_inline void *__slab_alloc_node(struct kmem_cache *s,
                     unlikely(!object || !slab || !node_match(slab, node))) {
                object = __slab_alloc(s, gfpflags, node, addr, c, orig_size);
        } else {
-               void *next_object = get_freepointer_safe(s, object, slab);
+               void *next_object;
+               freeptr_t next_encoded = get_freepointer_safe(s, object, slab);
+
+               if (unlikely(READ_ONCE(c->tid) != tid))
+                       goto redo;
+
+               next_object = freelist_ptr_decode(s, next_encoded,
+                       (unsigned long)kasan_reset_tag(object) + s->offset,
+                       slab);

                /*
                 * The cmpxchg will only match if there was no additional
@@ -5050,6 +5453,30 @@ static struct kmem_cache * __init bootstrap(struct kmem_cache *static_cache)
        return s;
 }

+#ifdef CONFIG_SLAB_VIRTUAL
+/*
+ * Late initialization of the reclaim kthread.
+ * This has to happen way later than kmem_cache_init() because it depends on
+ * having all the kthread infrastructure ready.
+ */
+void __init init_slub_page_reclaim(void)
+{
+       struct kthread_worker *w;
+
+       w = kthread_create_worker(0, "slub-physmem-reclaim");
+       if (IS_ERR(w))
+               panic("unable to create slub-physmem-reclaim worker");
+
+       /*
+        * Make sure that the kworker is properly initialized before making
+        * the store visible to other CPUs. The free path will check that
+        * slub_kworker is not NULL before attempting to give the TLB flusher
+        * pages to free.
+        */
+       smp_store_release(&slub_kworker, w);
+}
+#endif /* CONFIG_SLAB_VIRTUAL */
+
 void __init kmem_cache_init(void)
 {
        static __initdata struct kmem_cache boot_kmem_cache,

diff --git a/mm/usercopy.c b/mm/usercopy.c
index 83c164aba6e0..8b30906ca7f9 100644
--- a/mm/usercopy.c
+++ b/mm/usercopy.c
@@ -189,9 +189,19 @@ static inline void check_heap_object(const void *ptr, unsigned long n,
        if (!virt_addr_valid(ptr))
                return;

+       /*
+        * We need to check this first because when CONFIG_SLAB_VIRTUAL is
+        * enabled a slab address might not be backed by a folio.
+        */
+       if (IS_ENABLED(CONFIG_SLAB_VIRTUAL) && is_slab_addr(ptr)) {
+               /* Check slab allocator for flags and size. */
+               __check_heap_object(ptr, n, virt_to_slab(ptr), to_user);
+               return;
+       }
+
        folio = virt_to_folio(ptr);

-       if (folio_test_slab(folio)) {
+       if (!IS_ENABLED(CONFIG_SLAB_VIRTUAL) && folio_test_slab(folio)) {
                /* Check slab allocator for flags and size. */
                __check_heap_object(ptr, n, folio_slab(folio), to_user);
        } else if (folio_test_large(folio)) {
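[Editor's illustration] The deferred-free design in __free_slab() above can
be mirrored by a small userspace analogue: the freeing context only enqueues
a range, and a dedicated worker coalesces everything queued, performs one
flush, and only then releases the memory. Every name here is invented, and
the printf merely stands in for flush_tlb_kernel_range(); this is a sketch of
the shape of the kernel code, not the kernel code itself.

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct pending_range {
        unsigned long start, end;
        struct pending_range *next;
};

static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static struct pending_range *queue;

/* Called from contexts where "flushing" is forbidden: just enqueue. */
static void queue_free(unsigned long start, unsigned long end)
{
        struct pending_range *r = malloc(sizeof(*r));

        r->start = start;
        r->end = end;
        pthread_mutex_lock(&queue_lock);
        r->next = queue;
        queue = r;
        pthread_mutex_unlock(&queue_lock);
}

/* Worker: coalesce all queued ranges, flush once, then release memory. */
static void *flush_worker(void *arg)
{
        unsigned long lo = ~0UL, hi = 0;
        struct pending_range *r, *tmp;

        pthread_mutex_lock(&queue_lock);
        r = queue;
        queue = NULL;
        pthread_mutex_unlock(&queue_lock);

        for (tmp = r; tmp; tmp = tmp->next) {
                if (tmp->start < lo)
                        lo = tmp->start;
                if (tmp->end > hi)
                        hi = tmp->end;
        }
        if (lo < hi)
                printf("flush_tlb_kernel_range(%#lx, %#lx)\n", lo, hi);

        while (r) {     /* only now is it safe to free the memory */
                tmp = r->next;
                free(r);
                r = tmp;
        }
        return NULL;
}

int main(void)
{
        pthread_t t;

        queue_free(0x1000, 0x3000);
        queue_free(0x8000, 0x9000);
        pthread_create(&t, NULL, flush_worker, NULL);
        pthread_join(t, NULL);
        return 0;
}

Like the kernel's min/max computation in slub_tlbflush_worker(), the single
coalesced flush may over-approximate the set of queued ranges; that trades a
little extra flushing for doing only one flush per batch.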
From patchwork Fri Sep 15 10:59:31 2023
From: Matteo Rizzo
Date: Fri, 15 Sep 2023 10:59:31 +0000
Message-ID: <20230915105933.495735-13-matteorizzo@google.com>
Subject: [RFC PATCH 12/14] mm/slub: introduce the deallocated_pages sysfs
 attribute
To: cl@linux.com, penberg@kernel.org, rientjes@google.com,
    iamjoonsoo.kim@lge.com, akpm@linux-foundation.org, vbabka@suse.cz,
    roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, keescook@chromium.org,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-hardening@vger.kernel.org,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
    corbet@lwn.net, luto@kernel.org, peterz@infradead.org
Cc: jannh@google.com, matteorizzo@google.com, evn@google.com,
    poprdi@google.com, jordyzomer@google.com
From: Jann Horn

When SLAB_VIRTUAL is enabled, this new sysfs attribute tracks the number
of slab pages whose physical memory has been reclaimed but whose virtual
memory is still allocated to a kmem_cache.

Signed-off-by: Jann Horn
Co-developed-by: Matteo Rizzo
Signed-off-by: Matteo Rizzo
Reviewed-by: Kees Cook
---
 include/linux/slub_def.h |  4 +++-
 mm/slub.c                | 18 ++++++++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

diff --git a/include/linux/slub_def.h b/include/linux/slub_def.h
index 693e9bb34edc..eea402d849da 100644
--- a/include/linux/slub_def.h
+++ b/include/linux/slub_def.h
@@ -88,7 +88,7 @@ struct kmem_cache_cpu {
  */
 struct kmem_cache_virtual {
 #ifdef CONFIG_SLAB_VIRTUAL
-       /* Protects freed_slabs and freed_slabs_min */
+       /* Protects freed_slabs, freed_slabs_min, and nr_freed_pages */
        spinlock_t freed_slabs_lock;
        /*
         * Slabs on this list have virtual memory of size oo allocated to them
@@ -97,6 +97,8 @@ struct kmem_cache_virtual {
        struct list_head freed_slabs;
        /* Same as freed_slabs but with memory of size min */
        struct list_head freed_slabs_min;
+       /* Number of slab pages which got freed */
+       unsigned long nr_freed_pages;
 #endif
 };

diff --git a/mm/slub.c b/mm/slub.c
index 66ae60cdadaf..0f7f5bf0b174 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2110,6 +2110,8 @@ static struct slab *get_free_slab(struct kmem_cache *s,

        if (likely(slab)) {
                list_del(&slab->slab_list);
+               WRITE_ONCE(s->virtual.nr_freed_pages,
+                       s->virtual.nr_freed_pages - (1UL << slab_order(slab)));

                spin_unlock_irqrestore(&s->virtual.freed_slabs_lock, flags);
                return slab;
@@ -2158,6 +2160,8 @@ static struct slab *alloc_slab_page(struct kmem_cache *s,
                /* Rollback: put the struct slab back. */
                spin_lock_irqsave(&s->virtual.freed_slabs_lock, flags);
                list_add(&slab->slab_list, freed_slabs);
+               WRITE_ONCE(s->virtual.nr_freed_pages,
+                       s->virtual.nr_freed_pages + (1UL << slab_order(slab)));
                spin_unlock_irqrestore(&s->virtual.freed_slabs_lock, flags);

                return NULL;
@@ -2438,6 +2442,8 @@ static void slub_tlbflush_worker(struct kthread_work *work)
                        WARN_ON(oo_order(slab->oo) != oo_order(s->min));
                        list_add(&slab->slab_list, &s->virtual.freed_slabs_min);
                }
+               WRITE_ONCE(s->virtual.nr_freed_pages, s->virtual.nr_freed_pages +
+                       (1UL << slab_order(slab)));
                spin_unlock(&s->virtual.freed_slabs_lock);
        }
        spin_unlock_irqrestore(&slub_kworker_lock, irq_flags);
@@ -4924,6 +4930,7 @@ static inline void slab_virtual_open(struct kmem_cache *s)
        spin_lock_init(&s->virtual.freed_slabs_lock);
        INIT_LIST_HEAD(&s->virtual.freed_slabs);
        INIT_LIST_HEAD(&s->virtual.freed_slabs_min);
+       s->virtual.nr_freed_pages = 0;
 #endif
 }

@@ -6098,6 +6105,14 @@ static ssize_t objects_partial_show(struct kmem_cache *s, char *buf)
 }
 SLAB_ATTR_RO(objects_partial);

+#ifdef CONFIG_SLAB_VIRTUAL
+static ssize_t deallocated_pages_show(struct kmem_cache *s, char *buf)
+{
+       return sysfs_emit(buf, "%lu\n", READ_ONCE(s->virtual.nr_freed_pages));
+}
+SLAB_ATTR_RO(deallocated_pages);
+#endif /* CONFIG_SLAB_VIRTUAL */
+
 static ssize_t slabs_cpu_partial_show(struct kmem_cache *s, char *buf)
 {
        int objects = 0;
@@ -6424,6 +6439,9 @@ static struct attribute *slab_attrs[] = {
        &min_partial_attr.attr,
        &cpu_partial_attr.attr,
        &objects_partial_attr.attr,
+#ifdef CONFIG_SLAB_VIRTUAL
+       &deallocated_pages_attr.attr,
+#endif
        &partial_attr.attr,
        &cpu_slabs_attr.attr,
        &ctor_attr.attr,
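[Editor's illustration] The accounting in this patch is simple enough to show
in a few lines: every slab contributes (1UL << order) pages to the counter
when its backing memory is reclaimed, and the same amount is subtracted when
its virtual range is reused. The sketch below is not kernel code; the
function names are invented, and only the arithmetic mirrors the patch. The
resulting value is what a read of /sys/kernel/slab/<cache>/deallocated_pages
would report.

#include <stdio.h>

static unsigned long nr_freed_pages;

static void slab_physmem_reclaimed(unsigned int order)
{
        nr_freed_pages += 1UL << order; /* slab's backing pages freed */
}

static void slab_range_reused(unsigned int order)
{
        nr_freed_pages -= 1UL << order; /* range gets fresh backing pages */
}

int main(void)
{
        slab_physmem_reclaimed(3);      /* one order-3 slab: 8 pages */
        slab_physmem_reclaimed(0);      /* one order-0 slab: 1 page */
        slab_range_reused(0);           /* the order-0 range is reused */
        printf("%lu\n", nr_freed_pages); /* prints 8 */
        return 0;
}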
From patchwork Fri Sep 15 10:59:32 2023
From: Matteo Rizzo
Date: Fri, 15 Sep 2023 10:59:32 +0000
Message-ID: <20230915105933.495735-14-matteorizzo@google.com>
Subject: [RFC PATCH 13/14] mm/slub: sanity-check freepointers
To: cl@linux.com, penberg@kernel.org, rientjes@google.com,
    iamjoonsoo.kim@lge.com, akpm@linux-foundation.org, vbabka@suse.cz,
    roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, keescook@chromium.org,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-hardening@vger.kernel.org,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
    corbet@lwn.net, luto@kernel.org, peterz@infradead.org
Cc: jannh@google.com, matteorizzo@google.com, evn@google.com,
    poprdi@google.com, jordyzomer@google.com
From: Jann Horn

Sanity-check that:
- non-NULL freepointers point into the slab
- freepointers look plausibly aligned

Signed-off-by: Jann Horn
Co-developed-by: Matteo Rizzo
Signed-off-by: Matteo Rizzo
Reviewed-by: Kees Cook
---
 lib/slub_kunit.c |  4 ++++
 mm/slab.h        |  8 +++++++
 mm/slub.c        | 57 ++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 69 insertions(+)

diff --git a/lib/slub_kunit.c b/lib/slub_kunit.c
index d4a3730b08fa..acf8600bd1fd 100644
--- a/lib/slub_kunit.c
+++ b/lib/slub_kunit.c
@@ -45,6 +45,10 @@ static void test_clobber_zone(struct kunit *test)
 #ifndef CONFIG_KASAN
 static void test_next_pointer(struct kunit *test)
 {
+       if (IS_ENABLED(CONFIG_SLAB_VIRTUAL))
+               kunit_skip(test,
+                       "incompatible with freepointer corruption detection in CONFIG_SLAB_VIRTUAL");
+
        struct kmem_cache *s = test_kmem_cache_create("TestSlub_next_ptr_free",
                                                      64, SLAB_POISON);
        u8 *p = kmem_cache_alloc(s, GFP_KERNEL);

diff --git a/mm/slab.h b/mm/slab.h
index 460c802924bd..8d10a011bdf0 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -79,6 +79,14 @@ struct slab {

        struct list_head flush_list_elem;

+       /*
+        * Not in kmem_cache because it depends on whether the allocation is
+        * normal order or fallback order.
+        * An alternative might be to over-allocate virtual memory for
+        * fallback-order pages.
+        */
+       unsigned long align_mask;
+
        /* Replaces the page lock */
        spinlock_t slab_lock;

diff --git a/mm/slub.c b/mm/slub.c
index 0f7f5bf0b174..57474c8a6569 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -392,6 +392,44 @@ static inline freeptr_t freelist_ptr_encode(const struct kmem_cache *s,
        return (freeptr_t){.v = encoded};
 }

+/*
+ * Does some validation of freelist pointers. Without SLAB_VIRTUAL this is
+ * currently a no-op.
+ */
+static inline bool freelist_pointer_corrupted(struct slab *slab, freeptr_t ptr,
+               void *decoded)
+{
+#ifdef CONFIG_SLAB_VIRTUAL
+       /*
+        * If the freepointer decodes to 0, use 0 as the slab_base so that
+        * the check below always passes ((0 & slab->align_mask) == 0).
+        */
+       unsigned long slab_base = decoded ? (unsigned long)slab_to_virt(slab)
+               : 0;
+
+       /*
+        * This verifies that the SLUB freepointer does not point outside the
+        * slab. Since at that point we can basically do it for free, it also
+        * checks that the pointer alignment looks vaguely sane.
+        * However, we probably don't want the cost of a proper division here,
+        * so instead we just do a cheap check whether the bottom bits that are
+        * clear in the size are also clear in the pointer.
+        * So for kmalloc-32, it does a perfect alignment check, but for
+        * kmalloc-192, it just checks that the pointer is a multiple of 32.
+        * This should probably be reconsidered - is this a good tradeoff, or
+        * should that part be thrown out, or do we want a proper accurate
+        * alignment check (and can we make it work with acceptable performance
+        * cost compared to the security improvement - probably not)?
+        */
+       return CHECK_DATA_CORRUPTION(
+               ((unsigned long)decoded & slab->align_mask) != slab_base,
+               "bad freeptr (encoded %lx, ptr %p, base %lx, mask %lx)",
+               ptr.v, decoded, slab_base, slab->align_mask);
+#else
+       return false;
+#endif
+}
+
 static inline void *freelist_ptr_decode(const struct kmem_cache *s,
                freeptr_t ptr, unsigned long ptr_addr,
                struct slab *slab)
@@ -403,6 +441,10 @@ static inline void *freelist_ptr_decode(const struct kmem_cache *s,
 #else
        decoded = (void *)ptr.v;
 #endif
+
+       if (unlikely(freelist_pointer_corrupted(slab, ptr, decoded)))
+               return NULL;
+
        return decoded;
 }

@@ -2122,6 +2164,21 @@ static struct slab *get_free_slab(struct kmem_cache *s,
        if (slab == NULL)
                return NULL;

+       /*
+        * Bits that must be equal to start-of-slab address for all
+        * objects inside the slab.
+        * For compatibility with pointer tagging (like in HWASAN), this would
+        * need to clear the pointer tag bits from the mask.
+        */
+       slab->align_mask = ~((PAGE_SIZE << oo_order(oo)) - 1);
+
+       /*
+        * Object alignment bits (must be zero, which is equal to the bits in
+        * the start-of-slab address)
+        */
+       if (s->red_left_pad == 0)
+               slab->align_mask |= (1 << (ffs(s->size) - 1)) - 1;
+
        return slab;
 }
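[Editor's illustration] The align_mask construction above is compact, so here
is a standalone sketch of the same computation with made-up base and order
values. It uses kmalloc-96-like objects: since 96 = 3 * 32, the lowest set
bit of the size is 32, and the cheap check only enforces 32-byte alignment
rather than a true multiple-of-96 check, exactly the tradeoff the patch
comment describes.

#include <stdio.h>
#include <strings.h>    /* ffs() */

#define PAGE_SIZE 4096UL

static unsigned long make_align_mask(unsigned int order, unsigned int size)
{
        /* High bits that must match the start-of-slab address... */
        unsigned long mask = ~((PAGE_SIZE << order) - 1);

        /* ...plus low bits that must be zero for a plausibly aligned
         * object: all bits below the lowest set bit of the size. */
        mask |= (1UL << (ffs(size) - 1)) - 1;
        return mask;
}

static int freeptr_plausible(unsigned long ptr, unsigned long slab_base,
                             unsigned long mask)
{
        return (ptr & mask) == slab_base;
}

int main(void)
{
        unsigned long base = 0xffffeb0000042000UL;      /* hypothetical slab */
        unsigned long mask = make_align_mask(0, 96);    /* order-0, size 96 */

        printf("%d\n", freeptr_plausible(base + 96, base, mask));     /* 1 */
        printf("%d\n", freeptr_plausible(base + 100, base, mask));    /* 0: misaligned */
        printf("%d\n", freeptr_plausible(base + 0x5000, base, mask)); /* 0: outside slab */
        return 0;
}

A single AND and compare thus rejects both out-of-slab pointers and grossly
misaligned ones, without the division a byte-accurate object-index check
would need.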
From patchwork Fri Sep 15 10:59:33 2023
From: Matteo Rizzo
Date: Fri, 15 Sep 2023 10:59:33 +0000
Message-ID: <20230915105933.495735-15-matteorizzo@google.com>
Subject: [RFC PATCH 14/14] security: add documentation for SLAB_VIRTUAL
To: cl@linux.com, penberg@kernel.org, rientjes@google.com,
    iamjoonsoo.kim@lge.com, akpm@linux-foundation.org, vbabka@suse.cz,
    roman.gushchin@linux.dev, 42.hyeyoo@gmail.com, keescook@chromium.org,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-hardening@vger.kernel.org,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
    dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com,
    corbet@lwn.net, luto@kernel.org, peterz@infradead.org
Cc: jannh@google.com, matteorizzo@google.com, evn@google.com,
    poprdi@google.com, jordyzomer@google.com
From: Jann Horn

Document what SLAB_VIRTUAL is trying to do, how it's implemented, and why.

Signed-off-by: Jann Horn
Co-developed-by: Matteo Rizzo
Signed-off-by: Matteo Rizzo
---
 Documentation/security/self-protection.rst | 102 +++++++++++++++++++++
 1 file changed, 102 insertions(+)

diff --git a/Documentation/security/self-protection.rst b/Documentation/security/self-protection.rst
index 910668e665cb..5a5e99e3f244 100644
--- a/Documentation/security/self-protection.rst
+++ b/Documentation/security/self-protection.rst
@@ -314,3 +314,105 @@ To help kill classes of bugs that result in kernel addresses being
 written to userspace, the destination of writes needs to be tracked. If the
 buffer is destined for userspace (e.g. seq_file backed ``/proc`` files), it
 should automatically censor sensitive values.
+
+
+Memory Allocator Mitigations
+============================
+
+Protection against cross-cache attacks (SLAB_VIRTUAL)
+-----------------------------------------------------
+
+SLAB_VIRTUAL is a mitigation that deterministically prevents cross-cache
+attacks.
+
+Linux kernel use-after-free vulnerabilities are commonly exploited by turning
+them into an object type confusion (having two active pointers of different
+types to the same memory location) using one of the following techniques:
+
+1. Direct object reuse: make the kernel give the victim object back to the
+   slab allocator, then allocate the object again from the same slab cache as
+   a different type. This is only possible if the victim object resides in a
+   slab cache which can contain objects of different types - for example one
+   of the kmalloc caches.
+2. "Cross-cache attack": make the kernel give the victim object back to the
+   slab allocator, then make the slab allocator give the page containing the
+   object back to the page allocator, then either allocate the page directly
+   as some other type of page or make the slab allocator allocate it again
+   for a different slab cache and allocate an object from there.
+
+In either case, the important part is that the same virtual address is reused
+for two objects of different types.
+
+The first case can be addressed by separating objects of different types
+into different slab caches. If a slab cache only contains objects of the
+same type then directly turning a use-after-free into a type confusion is
+impossible as long as the slab page that contains the victim object remains
+assigned to that slab cache. This type of mitigation is easily bypassable
+by cross-cache attacks: if the attacker can make the slab allocator return
+the page containing the victim object to the page allocator and then make
+it use the same page for a different slab cache, type confusion becomes
+possible again. Addressing the first case is therefore only worthwhile if
+cross-cache attacks are also addressed. AUTOSLAB uses a combination of
+probabilistic mitigations for this. SLAB_VIRTUAL addresses the second case
+deterministically by changing the way the slab allocator allocates memory.
+
+Preventing slab virtual address reuse
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+In theory there is an easy fix against cross-cache attacks: modify the slab
+allocator so that it never gives memory back to the page allocator. In
+practice this would be problematic because physical memory would remain
+permanently assigned to a slab cache even if it didn't contain any active
+objects. A viable cross-cache mitigation must allow the system to reclaim
+unused physical memory. In the current design of the slab allocator there is
+no way to keep a region of virtual memory permanently assigned to a slab
+cache without also permanently reserving physical memory, because the
+virtual addresses that the slab allocator uses come from the linear map
+region, where there is a 1:1 correspondence between virtual and physical
+addresses.
+
+SLAB_VIRTUAL's solution is to create a dedicated virtual memory region that
+is only used for slab memory, and to enforce that once a range of virtual
+addresses is used for a slab cache, it is never reused for any other cache.
+Using a dedicated region of virtual memory lets us reserve ranges of virtual
+addresses to prevent cross-cache attacks and at the same time release
+physical memory back to the system when it's no longer needed. This is what
+Chromium's PartitionAlloc does in userspace
+(https://chromium.googlesource.com/chromium/src/+/354da2514b31df2aa14291199a567e10a7671621/base/allocator/partition_allocator/PartitionAlloc.md).
+
+Implementation
+~~~~~~~~~~~~~~
+
+SLAB_VIRTUAL reserves a region of virtual memory for the slab allocator. All
+pointers returned by the slab allocator point into this region. The region
+is statically partitioned in two sub-regions: the metadata region and the
+data region. The data region is where the actual objects are allocated from.
+The metadata region is an array of struct slab objects, one for each
+PAGE_SIZE bytes in the data region.
+
+Without SLAB_VIRTUAL, struct slab is overlaid on top of the struct
+page/struct folio that corresponds to the physical memory page backing the
+slab instead of using a dedicated memory region. This doesn't work for
+SLAB_VIRTUAL, which needs to store metadata for slabs even when no physical
+memory is allocated to them. Having an array of struct slab lets us
+implement virt_to_slab efficiently purely with arithmetic. In order to
+support high-order slabs, the struct slabs corresponding to tail pages
+contain a pointer to the head slab, which corresponds to the slab's head
+page.
+
+TLB flushing
+~~~~~~~~~~~~
+
+Before it can release a page of physical memory back to the page allocator,
+the slab allocator must flush the TLB entries for that page on all CPUs.
+This is not only necessary for the mitigation to work reliably but it's also
+required for correctness. Without a TLB flush some CPUs might continue using
+the old mapping if the virtual address range is reused for a new slab and
+cause memory corruption even in the absence of other bugs. The slab
+allocator can release pages in contexts where TLB flushes can't be performed
+(e.g. in hardware interrupt handlers). Pages to free are not freed directly;
+instead they are put on a queue and freed from a workqueue context which
+also flushes the TLB.
+
+Performance
+~~~~~~~~~~~
+
+SLAB_VIRTUAL's performance impact depends on the workload. On kernel
+compilation (kernbench) the slowdown is about 1-2% depending on the machine
+type and is slightly worse on machines with more cores.
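[Editor's illustration] As a closing note on the "Implementation" section
above: the head/tail metadata scheme can be demonstrated in a few lines of
standalone C. The struct and PAGES_PER_SLAB value are invented; the point is
only that every per-page entry of a high-order slab resolves to the head
entry, which is how virt_to_slab() works for any object in the slab.

#include <assert.h>
#include <stdio.h>

#define PAGES_PER_SLAB 4        /* an order-2 slab */

struct slab_meta {
        struct slab_meta *compound_slab_head;
};

int main(void)
{
        struct slab_meta meta[PAGES_PER_SLAB];

        /* As in alloc_slab_meta(): every entry points at the head entry. */
        for (int i = 0; i < PAGES_PER_SLAB; i++)
                meta[i].compound_slab_head = &meta[0];

        /* As in virt_to_slab(): page index -> metadata entry -> head slab. */
        for (int i = 0; i < PAGES_PER_SLAB; i++)
                assert(meta[i].compound_slab_head == &meta[0]);

        printf("all %d pages resolve to the head slab\n", PAGES_PER_SLAB);
        return 0;
}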