From patchwork Mon Dec 11 12:18:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Ilpo_J=C3=A4rvinen?= X-Patchwork-Id: 176672 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp7003805vqy; Mon, 11 Dec 2023 04:22:59 -0800 (PST) X-Google-Smtp-Source: AGHT+IGTXUTo/+USRvi3A5/Sk9adAmQPMyjcwUoWUNFTUSNzpTSSIJWhI5EM/+1J0hcFsMizcUVk X-Received: by 2002:a05:6a00:1412:b0:6ce:bb12:cc78 with SMTP id l18-20020a056a00141200b006cebb12cc78mr6181226pfu.5.1702297379492; Mon, 11 Dec 2023 04:22:59 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702297379; cv=none; d=google.com; s=arc-20160816; b=U+69EUKOKhm2PYhujHx0BrIi0HpJqzzXucPz4+cw9aYWlPMfCMElejH0PjeDKq87Mb LvC4edNVFdL7AyRqbMb8xziIf0IsD2BXbd8nX+v3kIIQGMuNK4PWnPsf3/fkm9o8t8IT RA6Nm4prhZatdOzIE9No6xaRgkGw+FJQTlhzkrRhJAjtMzy1r9z4Xge3/yzDG7DMMDv2 5eiwob4X1DZAeBXwGkhsNAMDuJfPhiMGgYskuYh2+fWqTpYx0Pv4bKzZDjiEsu6TPHjL NitqRgkHL7GS7UfB9TFQIOh/X7YgmQRLKI6ThAE7vuwFYLKNoMIRuZg8VOTMWxa3S5yQ uHAA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=n14K3vpsWiDVqYnMHD9QXAkYa683LqFs6d58henc3YI=; fh=gPQ6jqLSfsDb5bE3yrtO+AlT5R4d75RXkjC5xckz7Dk=; b=0K3nnKbM0IK87GB6FPQdyeH2DKw9mBu7ueTS76HTZ5xN7ld0k6vhv32Q0fmLflkTA+ zxXBN+vfNXs/GvSVYaE9Va504C0OqPM6UVDU2DE1vKQXiU9ILR/FGDor8WO5ne/cb69S xG3GSPZRtguCJCXxrXj2y9H6QhoLoyAfY2eygRzEc4SYb2H6cdbvv6hnBlAIjNlwsI6g igpQF0IAeteZpKBLVQLCQoIhNzkToDWGRFRkW2wrQGI4zdF0egDy4xlT+LSlvDMCKTgQ n2yUISTO6BhdT++3YDPs7xSq7tdFWKnIE3JWd0BF9wLEFPCFTkBBICNztJQF9yy73BRF a8TA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=JkgJac4V; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from pete.vger.email (pete.vger.email. [23.128.96.36]) by mx.google.com with ESMTPS id a17-20020a056a000c9100b006cdd1393a2fsi6130654pfv.96.2023.12.11.04.22.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 11 Dec 2023 04:22:59 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) client-ip=23.128.96.36; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=JkgJac4V; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.36 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by pete.vger.email (Postfix) with ESMTP id 63611804ADAC; Mon, 11 Dec 2023 04:22:40 -0800 (PST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at pete.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234333AbjLKMW2 (ORCPT + 99 others); Mon, 11 Dec 2023 07:22:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51520 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1343593AbjLKMWJ (ORCPT ); Mon, 11 Dec 2023 07:22:09 -0500 Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DE8521BB; Mon, 11 Dec 2023 04:21:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1702297308; x=1733833308; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Vf+YhSbbVKKBX9ioybzBZ0uA7UfvYFjvulm8AvtaAIE=; b=JkgJac4Vnh3rWB6jmCBtKSliKKL3drwMItoLVI0B63gav0tyA9ix8n+r aLrfr0lWjU20OcF1v9wwutFRFcLhdCiUk0O19GzuTBDwQzGjsln1NmwOh 8Vgd0oK8qSihKlDRNNhDIynhyv7n7s/ylIeJHfyLrhLkj7wJkBvw3rRxw XDHoICLk8KMOflMXXKPI1+zYwRuYegB7hFHNGC3ceCMDJDb1eCKiu2UCx LXzWTvrfKKraiAaEZBlez95MKcTLp6GLrSt83Mc32KM/cBnM0drSohYXU sIIty1GmdfC4Q0wvGb+9AYgFo+o6rmsMk6DM5fHlxeriSE23T/Kb3l/O/ Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10920"; a="393509771" X-IronPort-AV: E=Sophos;i="6.04,267,1695711600"; d="scan'208";a="393509771" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Dec 2023 04:21:48 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10920"; a="863755997" X-IronPort-AV: E=Sophos;i="6.04,267,1695711600"; d="scan'208";a="863755997" Received: from ijarvine-desk1.ger.corp.intel.com (HELO localhost) ([10.246.50.188]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Dec 2023 04:21:45 -0800 From: =?utf-8?q?Ilpo_J=C3=A4rvinen?= To: linux-kselftest@vger.kernel.org, Reinette Chatre , Shuah Khan , Shaopeng Tan , =?utf-8?q?Maciej_Wiecz=C3=B3r-R?= =?utf-8?q?etman?= , Fenghua Yu Cc: linux-kernel@vger.kernel.org, =?utf-8?q?Ilpo_J=C3=A4rvinen?= Subject: [PATCH v3 21/29] selftests/resctrl: Read in less obvious order to defeat prefetch optimizations Date: Mon, 11 Dec 2023 14:18:18 +0200 Message-Id: <20231211121826.14392-22-ilpo.jarvinen@linux.intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231211121826.14392-1-ilpo.jarvinen@linux.intel.com> References: <20231211121826.14392-1-ilpo.jarvinen@linux.intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-0.8 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on pete.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (pete.vger.email [0.0.0.0]); Mon, 11 Dec 2023 04:22:40 -0800 (PST) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1784988177225586143 X-GMAIL-MSGID: 1784988177225586143 When reading memory in order, HW prefetching optimizations will interfere with measuring how caches and memory are being accessed. This adds noise into the results. Change the fill_buf reading loop to not use an obvious in-order access using multiply by a prime and modulo. Using a prime multiplier with modulo ensures the entire buffer is eventually read. 23 is small enough that the reads are spread out but wrapping does not occur very frequently (wrapping too often can trigger L2 hits more frequently which causes noise to the test because getting the data from LLC is not required). It was discovered that not all primes work equally well and some can cause wildly unstable results (e.g., in an earlier version of this patch, the reads were done in reversed order and 59 was used as the prime resulting in unacceptably high and unstable results in MBA and MBM test on some architectures). Link: https://lore.kernel.org/linux-kselftest/TYAPR01MB6330025B5E6537F94DA49ACB8B499@TYAPR01MB6330.jpnprd01.prod.outlook.com/ Signed-off-by: Ilpo Järvinen Reviewed-by: Reinette Chatre --- tools/testing/selftests/resctrl/fill_buf.c | 38 +++++++++++++++++----- 1 file changed, 30 insertions(+), 8 deletions(-) diff --git a/tools/testing/selftests/resctrl/fill_buf.c b/tools/testing/selftests/resctrl/fill_buf.c index 8fe9574db9d8..93a3d408339c 100644 --- a/tools/testing/selftests/resctrl/fill_buf.c +++ b/tools/testing/selftests/resctrl/fill_buf.c @@ -51,16 +51,38 @@ static void mem_flush(unsigned char *buf, size_t buf_size) sb(); } +/* + * Buffer index step advance to workaround HW prefetching interfering with + * the measurements. + * + * Must be a prime to step through all indexes of the buffer. + * + * Some primes work better than others on some architectures (from MBA/MBM + * result stability point of view). + */ +#define FILL_IDX_MULT 23 + static int fill_one_span_read(unsigned char *buf, size_t buf_size) { - unsigned char *end_ptr = buf + buf_size; - unsigned char sum, *p; - - sum = 0; - p = buf; - while (p < end_ptr) { - sum += *p; - p += (CL_SIZE / 2); + unsigned int size = buf_size / (CL_SIZE / 2); + unsigned int i, idx = 0; + unsigned char sum = 0; + + /* + * Read the buffer in an order that is unexpected by HW prefetching + * optimizations to prevent them interfering with the caching pattern. + * + * The read order is (in terms of halves of cachelines): + * i * FILL_IDX_MULT % size + * The formula is open-coded below to avoiding modulo inside the loop + * as it improves MBA/MBM result stability on some architectures. + */ + for (i = 0; i < size; i++) { + sum += buf[idx * (CL_SIZE / 2)]; + + idx += FILL_IDX_MULT; + while (idx >= size) + idx -= size; } return sum;