Message ID | 20230928235018.2136-1-namhyung@kernel.org |
---|---|
State | New |
Headers |
From: Namhyung Kim <namhyung@kernel.org>
To: Arnaldo Carvalho de Melo <acme@kernel.org>, Jiri Olsa <jolsa@kernel.org>
Cc: Ian Rogers <irogers@google.com>, Adrian Hunter <adrian.hunter@intel.com>, Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@kernel.org>, LKML <linux-kernel@vger.kernel.org>, linux-perf-users@vger.kernel.org, Song Liu <song@kernel.org>, Hao Luo <haoluo@google.com>, bpf@vger.kernel.org
Subject: [PATCH] perf lock contention: Clear lock addr after use
Date: Thu, 28 Sep 2023 16:50:18 -0700
Message-ID: <20230928235018.2136-1-namhyung@kernel.org>
X-Mailer: git-send-email 2.42.0.582.g8ccd20d70d-goog |
Series |
perf lock contention: Clear lock addr after use
Commit Message
Namhyung Kim
Sept. 28, 2023, 11:50 p.m. UTC
It checks the current lock to calculate the delta of the contention time.
The address is saved in the tstamp map, which is allocated at the beginning
of contention and released at the end of contention.

But it's possible for bpf_map_delete_elem() to fail. In that case, the
element in the tstamp map is kept for the current lock and it makes the
next contention for the same lock tracked incorrectly. Specifically,
the next contention begin will see the existing element for the task and
just return. Then the next contention end will see the element and
calculate the time using the timestamp of the previous begin.

This can result in a large contention time being reported for two small
contentions that happen from time to time. Let's clear the lock address
so that it can be updated next time even if bpf_map_delete_elem() failed.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
---
tools/perf/util/bpf_skel/lock_contention.bpf.c | 4 ++++
1 file changed, 4 insertions(+)
Comments
Em Thu, Sep 28, 2023 at 04:50:18PM -0700, Namhyung Kim escreveu:
> It checks the current lock to calculated the delta of contention time.
> The address is saved in the tstamp map which is allocated at begining of
> contention and released at end of contention.
>
> But it's possible for bpf_map_delete_elem() to fail. In that case, the

How can it fail? You do:

	pelem = bpf_map_lookup_elem(&tstamp, &pid);
	if (!pelem || pelem->lock != ctx[0])
		return 0;

So it is there, why would the removal using the same key fail?

The patch should work as-is, I'm just curious about what would make the
removal of a map entry, that was successfully looked up in the same
contention_end prog, fail when being removed...

- Arnaldo

> element in the tstamp map kept for the current lock and it makes the
> next contention for the same lock tracked incorrectly. Specificially
> the next contention begin will see the existing element for the task and
> it'd just return. Then the next contention end will see the element and
> calculate the time using the timestamp for the previous begin.
>
> This can result in a large value for two small contentions happened from
> time to time. Let's clear the lock address so that it can be updated
> next time even if the bpf_map_delete_elem() failed.
>
> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
> ---
>  tools/perf/util/bpf_skel/lock_contention.bpf.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
> index 4900a5dfb4a4..b11179452e19 100644
> --- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
> +++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
> @@ -389,6 +389,7 @@ int contention_end(u64 *ctx)
>
>  	duration = bpf_ktime_get_ns() - pelem->timestamp;
>  	if ((__s64)duration < 0) {
> +		pelem->lock = 0;
>  		bpf_map_delete_elem(&tstamp, &pid);
>  		__sync_fetch_and_add(&time_fail, 1);
>  		return 0;
> @@ -422,6 +423,7 @@ int contention_end(u64 *ctx)
>
>  	data = bpf_map_lookup_elem(&lock_stat, &key);
>  	if (!data) {
>  		if (data_map_full) {
> +			pelem->lock = 0;
>  			bpf_map_delete_elem(&tstamp, &pid);
>  			__sync_fetch_and_add(&data_fail, 1);
>  			return 0;
> @@ -445,6 +447,7 @@ int contention_end(u64 *ctx)
>  			data_map_full = 1;
>  			__sync_fetch_and_add(&data_fail, 1);
>  		}
> +		pelem->lock = 0;
>  		bpf_map_delete_elem(&tstamp, &pid);
>  		return 0;
>  	}
> @@ -458,6 +461,7 @@ int contention_end(u64 *ctx)
>  	if (data->min_time > duration)
>  		data->min_time = duration;
>
> +	pelem->lock = 0;
>  	bpf_map_delete_elem(&tstamp, &pid);
>  	return 0;
>  }
> --
> 2.42.0.582.g8ccd20d70d-goog
>
On Fri, Oct 13, 2023 at 4:00 PM Arnaldo Carvalho de Melo <acme@kernel.org> wrote:
>
> Em Thu, Sep 28, 2023 at 04:50:18PM -0700, Namhyung Kim escreveu:
> > It checks the current lock to calculated the delta of contention time.
> >
> > The address is saved in the tstamp map which is allocated at begining of
> > contention and released at end of contention.
> >
> > But it's possible for bpf_map_delete_elem() to fail. In that case, the
>
> How can it fail?
>
> You do:
>
> 	pelem = bpf_map_lookup_elem(&tstamp, &pid);
> 	if (!pelem || pelem->lock != ctx[0])
> 		return 0;
>
> So it is there, why would the removal using the same key fail?

It can fail when it doesn't get a lock for the internal bucket.
See kernel/bpf/hashtab.c::htab_map_delete_elem(). But I'm not sure
whether that's actually possible in this case.

>
> The patch should work as-is, I'm just curious about what would make
> there removal of a map entry that was successfully looked up on the same
> contention_end prog to fail when being removed...

Now I'm seeing some rare error cases like a spinlock wait that is longer
than a minute. I suspect a bug in this code and try to be more defensive.

Thanks,
Namhyung
diff --git a/tools/perf/util/bpf_skel/lock_contention.bpf.c b/tools/perf/util/bpf_skel/lock_contention.bpf.c
index 4900a5dfb4a4..b11179452e19 100644
--- a/tools/perf/util/bpf_skel/lock_contention.bpf.c
+++ b/tools/perf/util/bpf_skel/lock_contention.bpf.c
@@ -389,6 +389,7 @@ int contention_end(u64 *ctx)
 
 	duration = bpf_ktime_get_ns() - pelem->timestamp;
 	if ((__s64)duration < 0) {
+		pelem->lock = 0;
 		bpf_map_delete_elem(&tstamp, &pid);
 		__sync_fetch_and_add(&time_fail, 1);
 		return 0;
@@ -422,6 +423,7 @@ int contention_end(u64 *ctx)
 
 	data = bpf_map_lookup_elem(&lock_stat, &key);
 	if (!data) {
 		if (data_map_full) {
+			pelem->lock = 0;
 			bpf_map_delete_elem(&tstamp, &pid);
 			__sync_fetch_and_add(&data_fail, 1);
 			return 0;
@@ -445,6 +447,7 @@ int contention_end(u64 *ctx)
 			data_map_full = 1;
 			__sync_fetch_and_add(&data_fail, 1);
 		}
+		pelem->lock = 0;
 		bpf_map_delete_elem(&tstamp, &pid);
 		return 0;
 	}
@@ -458,6 +461,7 @@ int contention_end(u64 *ctx)
 	if (data->min_time > duration)
 		data->min_time = duration;
 
+	pelem->lock = 0;
 	bpf_map_delete_elem(&tstamp, &pid);
 	return 0;
 }