From patchwork Thu Feb 23 09:32:40 2023
X-Patchwork-Submitter: Juergen Gross <jgross@suse.com>
X-Patchwork-Id: 60876
From: Juergen Gross <jgross@suse.com>
To: linux-kernel@vger.kernel.org, x86@kernel.org
Cc: Juergen Gross <jgross@suse.com>, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, Dave Hansen, "H. Peter Anvin"
Peter Anvin" Subject: [PATCH v3 09/12] x86/mtrr: construct a memory map with cache modes Date: Thu, 23 Feb 2023 10:32:40 +0100 Message-Id: <20230223093243.1180-10-jgross@suse.com> X-Mailer: git-send-email 2.35.3 In-Reply-To: <20230223093243.1180-1-jgross@suse.com> References: <20230223093243.1180-1-jgross@suse.com> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1758615350343113097?= X-GMAIL-MSGID: =?utf-8?q?1758615350343113097?= After MTRR initialization construct a memory map with cache modes from MTRR values. This will speed up lookups via mtrr_lookup_type() especially in case of overlapping MTRRs. This will be needed when switching the semantics of the "uniform" parameter of mtrr_lookup_type() from "only covered by one MTRR" to "memory range has a uniform cache mode", which is the data the callers really want to know. Today this information is not easily available, in case MTRRs are not well sorted regarding base address. The map will be built in __initdata. When memory management is up, the map will be moved to dynamically allocated memory, in order to avoid the need of an overly large array. The size of this array is calculated using the number of variable MTRR registers and the needed size for fixed entries. Only add the map creation and expansion for now. The lookup will be added later. When writing new MTRR entries in the running system rebuild the map inside the call from mtrr_rendezvous_handler() in order to avoid nasty race conditions with concurrent lookups. Signed-off-by: Juergen Gross --- V3: - new patch --- arch/x86/kernel/cpu/mtrr/generic.c | 254 +++++++++++++++++++++++++++++ arch/x86/kernel/cpu/mtrr/mtrr.c | 6 +- arch/x86/kernel/cpu/mtrr/mtrr.h | 3 + 3 files changed, 262 insertions(+), 1 deletion(-) diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c index 1ad08a96989c..ca9b8cec81a0 100644 --- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -33,6 +33,37 @@ static struct fixed_range_block fixed_range_blocks[] = { {} }; +struct cache_map { + u64 start; + u64 end; + u8 type; + bool fixed; +}; + +/* + * CACHE_MAP_MAX is the maximum number of memory ranges in cache_map, where + * no 2 adjacent ranges have the same cache mode (those would be merged). + * The number is based on the worst case: + * - no two adjacent fixed MTRRs share the same cache mode + * - one variable MTRR is spanning a huge area with mode WB + * - 255 variable MTRRs with mode UC all overlap with the WB MTRR, creating 2 + * additional ranges each (result like "ababababa...aba" with a = WB, b = UC), + * accounting for MTRR_MAX_VAR_RANGES * 2 - 1 range entries + * - a TOM2 area (even with overlapping an UC MTRR can't add 2 range entries + * to the possible maximum, as it always starts at 4GB, thus it can't be in + * the middle of that MTRR, unless that MTRR starts at 0, which would remove + * the initial "a" from the "abababa" pattern above) + * The map won't contain ranges with no matching MTRR (those fall back to the + * default cache mode). 
diff --git a/arch/x86/kernel/cpu/mtrr/generic.c b/arch/x86/kernel/cpu/mtrr/generic.c
index 1ad08a96989c..ca9b8cec81a0 100644
--- a/arch/x86/kernel/cpu/mtrr/generic.c
+++ b/arch/x86/kernel/cpu/mtrr/generic.c
@@ -33,6 +33,37 @@ static struct fixed_range_block fixed_range_blocks[] = {
 	{}
 };
 
+struct cache_map {
+	u64 start;
+	u64 end;
+	u8 type;
+	bool fixed;
+};
+
+/*
+ * CACHE_MAP_MAX is the maximum number of memory ranges in cache_map, where
+ * no 2 adjacent ranges have the same cache mode (those would be merged).
+ * The number is based on the worst case:
+ * - no two adjacent fixed MTRRs share the same cache mode
+ * - one variable MTRR is spanning a huge area with mode WB
+ * - 255 variable MTRRs with mode UC all overlap with the WB MTRR, creating 2
+ *   additional ranges each (result like "ababababa...aba" with a = WB, b = UC),
+ *   accounting for MTRR_MAX_VAR_RANGES * 2 - 1 range entries
+ * - a TOM2 area (even when overlapping an UC MTRR it can't add 2 range entries
+ *   to the possible maximum, as it always starts at 4GB, thus it can't be in
+ *   the middle of that MTRR, unless that MTRR starts at 0, which would remove
+ *   the initial "a" from the "abababa" pattern above)
+ * The map won't contain ranges with no matching MTRR (those fall back to the
+ * default cache mode).
+ */
+#define CACHE_MAP_MAX	(MTRR_NUM_FIXED_RANGES + MTRR_MAX_VAR_RANGES * 2)
+
+static struct cache_map init_cache_map[CACHE_MAP_MAX] __initdata;
+static struct cache_map *cache_map __refdata = init_cache_map;
+static unsigned int cache_map_size = CACHE_MAP_MAX;
+static unsigned int cache_map_n;
+static unsigned int cache_map_fixed;
+
 static unsigned long smp_changes_mask;
 static int mtrr_state_set;
 u64 mtrr_tom2;
@@ -78,6 +109,20 @@ static u64 get_mtrr_size(u64 mask)
 	return size;
 }
 
+static u8 get_var_mtrr_state(unsigned int reg, u64 *start, u64 *size)
+{
+	struct mtrr_var_range *mtrr = mtrr_state.var_ranges + reg;
+
+	if (!(mtrr->mask_lo & (1 << 11)))
+		return MTRR_TYPE_INVALID;
+
+	*start = (((u64)mtrr->base_hi) << 32) + (mtrr->base_lo & PAGE_MASK);
+	*size = get_mtrr_size((((u64)mtrr->mask_hi) << 32) +
+			      (mtrr->mask_lo & PAGE_MASK));
+
+	return mtrr->base_lo & 0xff;
+}
+
 static u8 get_effective_type(u8 type1, u8 type2)
 {
 	if (type1 == MTRR_TYPE_UNCACHABLE || type2 == MTRR_TYPE_UNCACHABLE)
@@ -241,6 +286,211 @@ static u8 mtrr_type_lookup_variable(u64 start, u64 end, u64 *partial_end,
 	return mtrr_state.def_type;
 }
 
+static void rm_map_entry_at(int idx)
+{
+	int i;
+
+	for (i = idx; i < cache_map_n - 1; i++)
+		cache_map[i] = cache_map[i + 1];
+
+	cache_map_n--;
+}
+
+/*
+ * Add an entry into cache_map at a specific index.
+ * Merges adjacent entries if appropriate.
+ * Return the number of merges for correcting the scan index.
+ */
+static int add_map_entry_at(u64 start, u64 end, u8 type, int idx)
+{
+	bool merge_prev, merge_next;
+	int i;
+
+	if (start >= end)
+		return 0;
+
+	merge_prev = (idx > 0 && !cache_map[idx - 1].fixed &&
+		      start == cache_map[idx - 1].end &&
+		      type == cache_map[idx - 1].type);
+	merge_next = (idx < cache_map_n && !cache_map[idx].fixed &&
+		      end == cache_map[idx].start &&
+		      type == cache_map[idx].type);
+
+	if (merge_prev && merge_next) {
+		cache_map[idx - 1].end = cache_map[idx].end;
+		rm_map_entry_at(idx);
+		return 2;
+	}
+	if (merge_prev) {
+		cache_map[idx - 1].end = end;
+		return 1;
+	}
+	if (merge_next) {
+		cache_map[idx].start = start;
+		return 1;
+	}
+
+	/* Sanity check: the array should NEVER be too small! */
+	if (cache_map_n == cache_map_size) {
+		WARN(1, "MTRR cache mode memory map exhausted!\n");
+		cache_map_n = cache_map_fixed;
+		return 0;
+	}
+
+	for (i = cache_map_n; i > idx; i--)
+		cache_map[i] = cache_map[i - 1];
+
+	cache_map[idx].start = start;
+	cache_map[idx].end = end;
+	cache_map[idx].type = type;
+	cache_map[idx].fixed = false;
+	cache_map_n++;
+
+	return 0;
+}
+
+/* Clear a part of an entry. Return 1 if start of entry is still valid. */
+static int clr_map_range_at(u64 start, u64 end, int idx)
+{
+	int ret = start != cache_map[idx].start;
+	u64 tmp;
+
+	if (start == cache_map[idx].start && end == cache_map[idx].end) {
+		rm_map_entry_at(idx);
+	} else if (start == cache_map[idx].start) {
+		cache_map[idx].start = end;
+	} else if (end == cache_map[idx].end) {
+		cache_map[idx].end = start;
+	} else {
+		tmp = cache_map[idx].end;
+		cache_map[idx].end = start;
+		add_map_entry_at(end, tmp, cache_map[idx].type, idx + 1);
+	}
+
+	return ret;
+}
+
+static void add_map_entry(u64 start, u64 end, u8 type)
+{
+	int i;
+	u8 new_type, old_type;
+	u64 tmp;
+
+	for (i = 0; i < cache_map_n && start < end; i++) {
+		if (start >= cache_map[i].end)
+			continue;
+
+		if (start < cache_map[i].start) {
+			/* Region start has no overlap. */
+			tmp = min(end, cache_map[i].start);
+			i -= add_map_entry_at(start, tmp, type, i);
+			start = tmp;
+			continue;
+		}
+
+		new_type = get_effective_type(type, cache_map[i].type);
+		old_type = cache_map[i].type;
+
+		if (cache_map[i].fixed || new_type == old_type) {
+			/* Cut off start of new entry. */
+			start = cache_map[i].end;
+			continue;
+		}
+
+		tmp = min(end, cache_map[i].end);
+		i += clr_map_range_at(start, tmp, i);
+		i -= add_map_entry_at(start, tmp, new_type, i);
+		start = tmp;
+	}
+
+	add_map_entry_at(start, end, type, i);
+}
+
+/* Add variable MTRRs to cache map. */
+static void map_add_var(void)
+{
+	unsigned int i;
+	u64 start, size;
+	u8 type;
+
+	/* Add AMD magic MTRR. */
+	if (mtrr_tom2) {
+		add_map_entry(1ULL << 32, mtrr_tom2 - 1, MTRR_TYPE_WRBACK);
+		cache_map[cache_map_n - 1].fixed = true;
+	}
+
+	for (i = 0; i < num_var_ranges; i++) {
+		type = get_var_mtrr_state(i, &start, &size);
+		if (type != MTRR_TYPE_INVALID)
+			add_map_entry(start, start + size, type);
+	}
+}
+
+/* Rebuild map by replacing variable entries. */
+static void rebuild_map(void)
+{
+	cache_map_n = cache_map_fixed;
+
+	map_add_var();
+}
+
+/* Build the cache_map containing the cache modes per memory range. */
+void mtrr_build_map(void)
+{
+	unsigned int i;
+	u64 start, end, size;
+	u8 type;
+
+	if (!mtrr_state.enabled)
+		return;
+
+	/* Add fixed MTRRs, optimize for adjacent entries with same type. */
+	if (mtrr_state.enabled & MTRR_STATE_MTRR_FIXED_ENABLED) {
+		start = 0;
+		end = size = 0x10000;
+		type = mtrr_state.fixed_ranges[0];
+
+		for (i = 1; i < MTRR_NUM_FIXED_RANGES; i++) {
+			if (i == 8 || i == 24)
+				size >>= 2;
+
+			if (mtrr_state.fixed_ranges[i] != type) {
+				add_map_entry(start, end, type);
+				start = end;
+				type = mtrr_state.fixed_ranges[i];
+			}
+			end += size;
+		}
+		add_map_entry(start, end, type);
+	}
+
+	/* Mark fixed and magic MTRR as fixed, they take precedence. */
+	for (i = 0; i < cache_map_n; i++)
+		cache_map[i].fixed = true;
+	cache_map_fixed = cache_map_n;
+
+	map_add_var();
+}
+
+/* Copy the cache_map from __initdata memory to dynamically allocated one. */
+void __init mtrr_copy_map(void)
+{
+	unsigned int new_size = cache_map_fixed + 2 * num_var_ranges;
+
+	if (!mtrr_state.enabled || !new_size) {
+		cache_map = NULL;
+		return;
+	}
+
+	mutex_lock(&mtrr_mutex);
+
+	cache_map = kcalloc(new_size, sizeof(*cache_map), GFP_KERNEL);
+	memmove(cache_map, init_cache_map, cache_map_n * sizeof(*cache_map));
+	cache_map_size = new_size;
+
+	mutex_unlock(&mtrr_mutex);
+}
+
 /**
  * mtrr_overwrite_state - set static MTRR state
  *
@@ -814,6 +1064,10 @@ static void generic_set_mtrr(unsigned int reg, unsigned long base,
 
 	cache_enable();
 	local_irq_restore(flags);
+
+	/* On the first cpu rebuild the cache mode memory map. */
+	if (smp_processor_id() == cpumask_first(cpu_online_mask))
+		rebuild_map();
 }
 
 int generic_validate_add_page(unsigned long base, unsigned long size,
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.c b/arch/x86/kernel/cpu/mtrr/mtrr.c
index 50cd2287b6e1..1dbb9fdfd87b 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.c
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.c
@@ -65,7 +65,7 @@ static bool mtrr_enabled(void)
 }
 
 unsigned int mtrr_usage_table[MTRR_MAX_VAR_RANGES];
-static DEFINE_MUTEX(mtrr_mutex);
+DEFINE_MUTEX(mtrr_mutex);
 
 u64 size_or_mask, size_and_mask;
 
@@ -668,6 +668,7 @@ void __init mtrr_bp_init(void)
 		/* Software overwrite of MTRR state, only for generic case. */
 		mtrr_calc_physbits(true);
 		init_table();
+		mtrr_build_map();
 		pr_info("MTRRs set to read-only\n");
 
 		return;
@@ -705,6 +706,7 @@ void __init mtrr_bp_init(void)
 	if (get_mtrr_state()) {
 		memory_caching_control |= CACHE_MTRR;
 		changed_by_mtrr_cleanup = mtrr_cleanup(phys_addr);
+		mtrr_build_map();
 	} else {
 		mtrr_if = NULL;
 		why = "by BIOS";
@@ -733,6 +735,8 @@ void mtrr_save_state(void)
 
 static int __init mtrr_init_finialize(void)
 {
+	mtrr_copy_map();
+
 	if (!mtrr_enabled())
 		return 0;
 
diff --git a/arch/x86/kernel/cpu/mtrr/mtrr.h b/arch/x86/kernel/cpu/mtrr/mtrr.h
index a3c362d3d5bf..6246a1d8650b 100644
--- a/arch/x86/kernel/cpu/mtrr/mtrr.h
+++ b/arch/x86/kernel/cpu/mtrr/mtrr.h
@@ -53,6 +53,7 @@ bool get_mtrr_state(void);
 extern u64 size_or_mask, size_and_mask;
 
 extern const struct mtrr_ops *mtrr_if;
+extern struct mutex mtrr_mutex;
 
 extern unsigned int num_var_ranges;
 extern u64 mtrr_tom2;
@@ -61,6 +62,8 @@ extern struct mtrr_state_type mtrr_state;
 void mtrr_state_warn(void);
 const char *mtrr_attrib_to_str(int x);
 void mtrr_wrmsr(unsigned, unsigned, unsigned);
+void mtrr_build_map(void);
+void mtrr_copy_map(void);
 
 /* CPU specific mtrr_ops vectors. */
 extern const struct mtrr_ops amd_mtrr_ops;
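
Two illustrations for reviewers follow; neither is part of the patch.

First, the fixed-range walk in mtrr_build_map(): the 88 fixed MTRR ranges
cover the low 1MB as 8 x 64K, 16 x 16K and 64 x 4K ranges, which is why the
step size is shifted down by two bits when the loop reaches entry 8 and
entry 24. A minimal user-space sketch of that geometry (illustrative only):

#include <stdio.h>

int main(void)
{
        unsigned long long start = 0, size = 0x10000;
        int i;

        for (i = 0; i < 88; i++) {
                /* 64K steps up to entry 8, 16K up to entry 24, then 4K. */
                if (i == 8 || i == 24)
                        size >>= 2;
                printf("fixed range %2d: %#10llx-%#10llx\n",
                       i, start, start + size - 1);
                start += size;
        }
        /* start == 0x100000 (1MB) here, matching the fixed MTRR coverage. */
        return 0;
}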
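
Second, the "abab..." worst case in the CACHE_MAP_MAX comment relies on how
overlapping cache modes combine: UC always wins, and WB combined with WT
yields WT. The sketch below is modeled on get_effective_type() in generic.c
(only partially visible as diff context above, so treat it as an assumption);
the type constants match arch/x86/include/uapi/asm/mtrr.h, the rest is
illustrative:

#include <stdio.h>

#define MTRR_TYPE_UNCACHABLE 0
#define MTRR_TYPE_WRTHROUGH  4
#define MTRR_TYPE_WRBACK     6

static unsigned char get_effective_type(unsigned char type1, unsigned char type2)
{
        /* UC dominates every other mode. */
        if (type1 == MTRR_TYPE_UNCACHABLE || type2 == MTRR_TYPE_UNCACHABLE)
                return MTRR_TYPE_UNCACHABLE;

        /* WB + WT degrades to the weaker WT. */
        if ((type1 == MTRR_TYPE_WRBACK && type2 == MTRR_TYPE_WRTHROUGH) ||
            (type1 == MTRR_TYPE_WRTHROUGH && type2 == MTRR_TYPE_WRBACK))
                return MTRR_TYPE_WRTHROUGH;

        /* Any other conflicting combination is treated as UC. */
        if (type1 != type2)
                return MTRR_TYPE_UNCACHABLE;

        return type1;
}

int main(void)
{
        printf("WB + UC -> %d (UC)\n",
               get_effective_type(MTRR_TYPE_WRBACK, MTRR_TYPE_UNCACHABLE));
        printf("WB + WT -> %d (WT)\n",
               get_effective_type(MTRR_TYPE_WRBACK, MTRR_TYPE_WRTHROUGH));
        return 0;
}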