From patchwork Thu Apr 6 21:06:11 2023
X-Patchwork-Submitter: Namhyung Kim
X-Patchwork-Id: 80534
From: Namhyung Kim
To: Arnaldo Carvalho de Melo, Jiri Olsa
Cc: Ian Rogers, Adrian Hunter, Peter Zijlstra, Ingo Molnar, LKML,
 linux-perf-users@vger.kernel.org, Song Liu, Hao Luo, bpf@vger.kernel.org,
 Juri Lelli
Subject: [PATCH 7/7] perf lock contention: Do not try to update if hash map is full
Date: Thu, 6 Apr 2023 14:06:11 -0700
Message-Id: <20230406210611.1622492-8-namhyung@kernel.org>
In-Reply-To: <20230406210611.1622492-1-namhyung@kernel.org>
References: <20230406210611.1622492-1-namhyung@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

The BPF program doesn't delete data in the task_data and lock_stat maps.
The data is kept there until it's consumed by userspace at the end.  But
it calls bpf_map_update_elem() again and again, and the new data is
discarded if the map is full.  This is not good.

Worse, bpf_map_update_elem() keeps trying to get a new node even if the
map is full.  I guess that makes sense if the program deletes some nodes,
as it does in the tstamp map (that's why I didn't make the change there).

In a pre-allocated hash map, that means it iterates over all CPUs to
check the freelist, which has a bad performance impact on large machines.

I've checked it on my 64 CPU machine with this.

  $ perf bench sched messaging -g 1000
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 1000 groups == 40000 processes run

       Total time: 2.825 [sec]

And I used the task mode, so that it can guarantee the map is full.
The default map size is 16K entries and this workload has 40K tasks.

Before:
  $ sudo ./perf lock con -abt -E3 -- perf bench sched messaging -g 1000
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 1000 groups == 40000 processes run

       Total time: 11.299 [sec]
   contended   total wait     max wait     avg wait          pid   comm

       19284       3.51 s      3.70 ms    181.91 us      1305863   sched-messaging
         243     84.09 ms    466.67 us    346.04 us      1336608   sched-messaging
         177     66.35 ms     12.08 ms    374.88 us      1220416   node

For some reason, it didn't report the data failures.
But you can see the total time in the workload is increased a lot
(2.8 -> 11.3).  If it fails early when the map is full, it goes back
to normal.

After:
  $ sudo ./perf lock con -abt -E3 -- perf bench sched messaging -g 1000
  # Running 'sched/messaging' benchmark:
  # 20 sender and receiver processes per group
  # 1000 groups == 40000 processes run

       Total time: 3.044 [sec]
   contended   total wait     max wait     avg wait          pid   comm

       18743    591.92 ms    442.96 us     31.58 us      1431454   sched-messaging
          51    210.64 ms    207.45 ms      4.13 ms      1468724   sched-messaging
          81     68.61 ms     65.79 ms    847.07 us      1463183   sched-messaging

  === output for debug ===

  bad: 1164137, total: 2253341
  bad rate: 51.66 %

  histogram of failure reasons
         task: 0
        stack: 0
         time: 0
         data: 1164137

Signed-off-by: Namhyung Kim
---
 .../perf/util/bpf_skel/lock_contention.bpf.c  | 22 ++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
index cb87c98e5340..23f6e63544ed 100644
--- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
+++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
@@ -4,6 +4,7 @@
 #include
 #include
 #include
+#include
 
 #include "lock_data.h"
 
@@ -126,6 +127,9 @@ int stack_fail;
 int time_fail;
 int data_fail;
 
+int task_map_full;
+int data_map_full;
+
 static inline int can_record(u64 *ctx)
 {
 	if (has_cpu) {
@@ -177,11 +181,12 @@ static inline int update_task_data(struct task_struct *task)
 		return -1;
 
 	p = bpf_map_lookup_elem(&task_data, &pid);
-	if (p == NULL) {
+	if (p == NULL && !task_map_full) {
 		struct contention_task_data data = {};
 
 		BPF_CORE_READ_STR_INTO(&data.comm, task, comm);
-		bpf_map_update_elem(&task_data, &pid, &data, BPF_NOEXIST);
+		if (bpf_map_update_elem(&task_data, &pid, &data, BPF_NOEXIST) == -E2BIG)
+			task_map_full = 1;
 	}
 
 	return 0;
@@ -370,6 +375,12 @@ int contention_end(u64 *ctx)
 
 	data = bpf_map_lookup_elem(&lock_stat, &key);
 	if (!data) {
+		if (data_map_full) {
+			bpf_map_delete_elem(&tstamp, &pid);
+			__sync_fetch_and_add(&data_fail, 1);
+			return 0;
+		}
+
 		struct contention_data first = {
 			.total_time = duration,
 			.max_time = duration,
@@ -377,12 +388,17 @@ int contention_end(u64 *ctx)
 			.count = 1,
 			.flags = pelem->flags,
 		};
+		int err;
 
 		if (aggr_mode == LOCK_AGGR_ADDR)
 			first.flags |= check_lock_type(pelem->lock, pelem->flags);
 
-		if (bpf_map_update_elem(&lock_stat, &key, &first, BPF_NOEXIST) < 0)
+		err = bpf_map_update_elem(&lock_stat, &key, &first, BPF_NOEXIST);
+		if (err < 0) {
+			if (err == -E2BIG)
+				data_map_full = 1;
 			__sync_fetch_and_add(&data_fail, 1);
+		}
 		bpf_map_delete_elem(&tstamp, &pid);
 		return 0;
 	}
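
For readers who want to see the fail-early idea outside of BPF, here is
a rough user-space sketch of the pattern used in the patch above.  It is
not the perf code: toy_map, recorder and record() are hypothetical names
made up for this illustration, and the toy map only imitates the
BPF_NOEXIST return codes (-EEXIST, -E2BIG).  It does not model the
per-CPU freelist scan, so insert_attempts merely stands in for the
number of times that slow path would have been entered.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a pre-allocated hash map (assumption, not a BPF API). */
#define TOY_MAP_CAPACITY 4

struct toy_map {
	int keys[TOY_MAP_CAPACITY];
	size_t used;
	unsigned long insert_attempts;	/* how often the "slow path" was entered */
};

/* Returns 1 if the key is already stored, 0 otherwise. */
static int toy_map_lookup(const struct toy_map *map, int key)
{
	for (size_t i = 0; i < map->used; i++) {
		if (map->keys[i] == key)
			return 1;
	}
	return 0;
}

/* Imitates a BPF_NOEXIST update: -EEXIST if present, -E2BIG if no room. */
static int toy_map_insert(struct toy_map *map, int key)
{
	assert(map->used <= TOY_MAP_CAPACITY);
	map->insert_attempts++;

	if (toy_map_lookup(map, key))
		return -EEXIST;
	if (map->used == TOY_MAP_CAPACITY)
		return -E2BIG;

	map->keys[map->used++] = key;
	return 0;
}

struct recorder {
	struct toy_map map;
	int map_full;	/* latched once an insert reports -E2BIG */
	int fail;	/* number of dropped records */
};

/* Same shape as the patched contention_end(): look up, bail out early if full. */
static int record(struct recorder *r, int key)
{
	int err;

	if (toy_map_lookup(&r->map, key))
		return 0;	/* existing entry, would be updated in place */

	if (r->map_full) {
		r->fail++;	/* fail early, skip the insert attempt */
		return -E2BIG;
	}

	err = toy_map_insert(&r->map, key);
	if (err < 0) {
		if (err == -E2BIG)
			r->map_full = 1;
		r->fail++;
	}
	return err;
}
```

Feeding ten distinct keys into a zero-initialized struct recorder fills
the four slots, hits -E2BIG once on the fifth key, and then drops the
rest without calling toy_map_insert() again.  The flag is never cleared
here, which only makes sense because nothing is ever deleted from the
map, the same reasoning the commit message gives for leaving the tstamp
map alone.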