From: Chuyi Zhou
To: hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev,
    ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org,
    muchun.song@linux.dev
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
    wuyun.abel@bytedance.com, robin.lu@bytedance.com, Chuyi Zhou
Subject: [RFC PATCH v2 0/5] mm: Select victim using bpf_oom_evaluate_task
Date: Thu, 10 Aug 2023 16:13:14 +0800
Message-Id: <20230810081319.65668-1-zhouchuyi@bytedance.com>

Changes
-------

This is v2 of the BPF OOM policy patchset.
v1: https://lore.kernel.org/lkml/20230804093804.47039-1-zhouchuyi@bytedance.com/

v1 -> v2 changes:

- rename bpf_select_task to bpf_oom_evaluate_task and bypass the
  tsk_is_oom_victim (and MMF_OOM_SKIP) logic. (Michal)

- add a new hook to set the policy's name, so dump_header() can report
  which selection policy was in effect. (Michal)

- add a tracepoint for when select_bad_process() finds nothing. (Alan)

- add a doc to describe how it is all supposed to work. (Alan)

================

This patchset adds a new interface and uses it to select a victim when OOM
is invoked. The main motivation is the need for customizable OOM victim
selection.

The new interface is a BPF hook plugged into oom_evaluate_task. It takes oc
and the current task as parameters and returns a result indicating which
one is selected by the attached BPF program.

Several concerns, suggested by Michal, shaped the design of this interface:

1. Hooking into oom_evaluate_task keeps the global and memcg OOM interfaces
   consistent. It also seems the least disruptive to the existing OOM
   killer implementation.

2. Userspace can handle a lot on its own and provide the input to the BPF
   program to make a decision. Since the OOM scope iteration is already
   implemented in the kernel, all the BPF program has to do is rank
   processes or memcgs.

3. The new interface should bypass the current heuristic rules (e.g.,
   tsk_is_oom_victim and MMF_OOM_SKIP) to meet the needs of an arbitrary
   OOM policy.

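To make the shape of the hook described above concrete, here is a minimal
userspace sketch in plain C. It is not the code from this series: every name
in it (struct task, struct oom_ctx, enum policy_verdict, evaluate_task,
select_victim, lowest_priority_first) is hypothetical, and the "priority"
field merely stands in for whatever input userspace feeds to the policy. It
only illustrates the idea that the kernel keeps iterating the OOM scope while
the attached policy ranks the task being evaluated against the one chosen so
far, with the default heuristic used when no policy is attached.

```c
#include <stddef.h>

/* Hypothetical userspace model; none of these names come from the patchset. */

struct task {			/* stand-in for struct task_struct */
	int pid;
	long rss_pages;		/* footprint used by the fallback heuristic */
	int priority;		/* assumed ranking input provided by userspace */
};

struct oom_ctx {		/* stand-in for struct oom_control ("oc") */
	const struct task *chosen;
};

enum policy_verdict {		/* hypothetical result codes */
	POLICY_NONE,		/* no policy attached: use the fallback heuristic */
	POLICY_KEEP_CHOSEN,	/* keep oc->chosen as the candidate */
	POLICY_SELECT_TASK,	/* prefer the task being evaluated */
};

typedef enum policy_verdict (*oom_policy_fn)(const struct oom_ctx *oc,
					     const struct task *t);

/* Called once per task by the scope iteration, where the hook would sit. */
static void evaluate_task(struct oom_ctx *oc, const struct task *t,
			  oom_policy_fn policy)
{
	enum policy_verdict v = policy ? policy(oc, t) : POLICY_NONE;

	switch (v) {
	case POLICY_SELECT_TASK:
		oc->chosen = t;
		break;
	case POLICY_KEEP_CHOSEN:
		break;
	case POLICY_NONE:
		/* Fallback in this model: the largest memory consumer wins. */
		if (!oc->chosen || t->rss_pages > oc->chosen->rss_pages)
			oc->chosen = t;
		break;
	}
}

/* Example policy: rank tasks so that the lowest priority value is chosen. */
static enum policy_verdict lowest_priority_first(const struct oom_ctx *oc,
						 const struct task *t)
{
	if (!oc->chosen || t->priority < oc->chosen->priority)
		return POLICY_SELECT_TASK;
	return POLICY_KEEP_CHOSEN;
}

/* Iterate the OOM scope; returns the victim's pid, or -1 if none was found. */
static int select_victim(const struct task *tasks, size_t n,
			 oom_policy_fn policy)
{
	struct oom_ctx oc = { .chosen = NULL };

	for (size_t i = 0; i < n; i++)
		evaluate_task(&oc, &tasks[i], policy);
	return oc.chosen ? oc.chosen->pid : -1;
}
```

In this model, passing NULL as the policy exercises the fallback path, while
passing lowest_priority_first lets the policy alone decide the ranking. The
-1 return corresponds loosely to the "select_bad_process() finds nothing"
case for which the series adds a tracepoint.
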
Chuyi Zhou (5):
  mm, oom: Introduce bpf_oom_evaluate_task
  mm: Add policy_name to identify OOM policies
  mm: Add a tracepoint when OOM victim selection is failed
  bpf: Add a OOM policy test
  bpf: Add a BPF OOM policy Doc

 Documentation/bpf/oom.rst                     |  70 +++++++++
 include/linux/oom.h                           |   7 +
 include/trace/events/oom.h                    |  18 +++
 mm/oom_kill.c                                 | 100 +++++++++++--
 .../bpf/prog_tests/test_oom_policy.c          | 140 ++++++++++++++++++
 .../testing/selftests/bpf/progs/oom_policy.c  | 104 +++++++++++++
 6 files changed, 428 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/bpf/oom.rst
 create mode 100644 tools/testing/selftests/bpf/prog_tests/test_oom_policy.c
 create mode 100644 tools/testing/selftests/bpf/progs/oom_policy.c