Message ID | 20240202231831.354848-1-seanjc@google.com |
---|---|
State | New |
Series | KVM: selftests: Fix a semaphore imbalance in the dirty ring logging test |
Commit Message
Sean Christopherson
Feb. 2, 2024, 11:18 p.m. UTC
When finishing the final iteration of dirty_log_test testcase, set
host_quit _before_ the final "continue" so that the vCPU worker doesn't
run an extra iteration, and delete the hack-a-fix of an extra "continue"
from the dirty ring testcase. This fixes a bug where the extra post to
sem_vcpu_cont may not be consumed, which results in failures in subsequent
runs of the testcases. The bug likely was missed during development as
x86 supports only a single "guest mode", i.e. there aren't any subsequent
testcases after the dirty ring test, because for_each_guest_mode() only
runs a single iteration.
For the regular dirty log testcases, letting the vCPU run one extra
iteration is a non-issue as the vCPU worker waits on sem_vcpu_cont if and
only if the worker is explicitly told to stop (vcpu_sync_stop_requested).
But for the dirty ring test, which needs to periodically stop the vCPU to
reap the dirty ring, letting the vCPU resume the guest _after_ the last
iteration means the vCPU will get stuck without an extra "continue".
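To make the handshake concrete, the standalone sketch below (an illustration only, not the selftest code; the guest-running and dirty-ring details are reduced to comments) models the stop/continue protocol and shows why setting the quit flag _before_ the final "continue" leaves both semaphores balanced:

/* Illustration only: a worker that periodically stops and waits to be continued. */
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define ITERATIONS 4

static sem_t sem_stop;		/* worker -> main: "I have stopped"   */
static sem_t sem_cont;		/* main -> worker: "resume the guest" */
static atomic_bool quit;

static void *worker(void *arg)
{
	(void)arg;
	while (!atomic_load(&quit)) {
		/* ...run the guest until it needs attention... */
		sem_post(&sem_stop);	/* tell the main task we stopped */
		sem_wait(&sem_cont);	/* block until told to continue  */
	}
	return NULL;
}

int main(void)
{
	pthread_t thread;
	int i, val;

	sem_init(&sem_stop, 0, 0);
	sem_init(&sem_cont, 0, 0);
	pthread_create(&thread, NULL, worker, NULL);

	for (i = 1; i <= ITERATIONS; i++) {
		sem_wait(&sem_stop);		/* wait for the worker to stop */
		/* ...collect and verify dirty pages here... */
		if (i == ITERATIONS)		/* set quit _before_ the final */
			atomic_store(&quit, true);	/* "continue"          */
		sem_post(&sem_cont);
	}

	pthread_join(thread, NULL);
	sem_getvalue(&sem_cont, &val);
	printf("leftover continues: %d\n", val);	/* 0 => balanced */
	return 0;
}

With the old ordering, the worker would post one more "stop" after the last verified iteration and then block, which is why the extra, potentially unconsumed "continue" was needed.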
However, blindly firing off a post to sem_vcpu_cont isn't guaranteed to
be consumed, e.g. if the vCPU worker sees host_quit==true before resuming
the guest. This results in a dangling sem_vcpu_cont, which leads to
subsequent iterations getting out of sync, as the vCPU worker will
continue on before the main task is ready for it to resume the guest,
leading to a variety of asserts, e.g.
==== Test Assertion Failure ====
dirty_log_test.c:384: dirty_ring_vcpu_ring_full
pid=14854 tid=14854 errno=22 - Invalid argument
1 0x00000000004033eb: dirty_ring_collect_dirty_pages at dirty_log_test.c:384
2 0x0000000000402d27: log_mode_collect_dirty_pages at dirty_log_test.c:505
3 (inlined by) run_test at dirty_log_test.c:802
4 0x0000000000403dc7: for_each_guest_mode at guest_modes.c:100
5 0x0000000000401dff: main at dirty_log_test.c:941 (discriminator 3)
6 0x0000ffff9be173c7: ?? ??:0
7 0x0000ffff9be1749f: ?? ??:0
8 0x000000000040206f: _start at ??:?
Didn't continue vcpu even without ring full
Alternatively, the test could simply reset the semaphores before each
testcase, but papering over hacks with more hacks usually ends in tears.
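For completeness, the reason a leftover post is so troublesome can be seen with plain POSIX semaphores; the snippet below (an illustration, not part of the patch) shows that an unconsumed post simply remains pending, so the first wait of the next testcase returns immediately, and that sem_getvalue() exposes the imbalance, which is what the TEST_ASSERT_EQ() checks added by this patch rely on:

/* Illustration only: a dangling post survives into the next "testcase". */
#include <assert.h>
#include <semaphore.h>
#include <stdio.h>

int main(void)
{
	sem_t sem_vcpu_cont;
	int val;

	sem_init(&sem_vcpu_cont, 0, 0);

	/* The blind extra "continue" that nobody consumes. */
	sem_post(&sem_vcpu_cont);

	/* Next testcase begins: the semaphore is no longer at zero... */
	sem_getvalue(&sem_vcpu_cont, &val);
	printf("sem_vcpu_cont value at testcase start: %d\n", val);	/* prints 1 */

	/* ...so the vCPU worker's first wait would not block at all. */
	assert(sem_trywait(&sem_vcpu_cont) == 0);

	sem_destroy(&sem_vcpu_cont);
	return 0;
}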
Reported-by: Shaoqin Huang <shahuang@redhat.com>
Fixes: 84292e565951 ("KVM: selftests: Add dirty ring buffer test")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
tools/testing/selftests/kvm/dirty_log_test.c | 50 +++++++++++---------
1 file changed, 27 insertions(+), 23 deletions(-)
base-commit: d79e70e8cc9ee9b0901a93aef391929236ed0f39
Comments
Hi Sean,

Thanks for your better solution to fix this issue. I've tested your patch, and it does fix the problem I encountered. Thanks.

Reviewed-by: Shaoqin Huang <shahuang@redhat.com>

On 2/3/24 07:18, Sean Christopherson wrote:
> When finishing the final iteration of dirty_log_test testcase, set
> host_quit _before_ the final "continue" so that the vCPU worker doesn't
> run an extra iteration, and delete the hack-a-fix of an extra "continue"
> from the dirty ring testcase.
> [...]
Perhaps this issue is more difficult to reproduce on x86 than on arm; I cannot
reproduce it on x86. I tried to emulate the extra guest mode on x86 by running
the same code again, but unfortunately I still could not reproduce it after
several runs.

@@ -940,6 +946,11 @@ int main(int argc, char *argv[])
 			host_log_mode = i;
 			for_each_guest_mode(run_test, &p);
 		}
+		for (i = 0; i < LOG_MODE_NUM; i++) {
+			pr_info("Testing Log Mode '%s'\n", log_modes[i].name);
+			host_log_mode = i;
+			for_each_guest_mode(run_test, &p);
+		}
 	} else {
 		host_log_mode = host_log_mode_option;
 		for_each_guest_mode(run_test, &p);

The good thing is that the check below is able to detect the non-zero semaphore.

@@ -719,6 +719,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	struct kvm_vm *vm;
 	unsigned long *bmap;
 	uint32_t ring_buf_idx = 0;
+	int sem_val;
 
 	if (!log_mode_supported()) {
 		print_skip("Log mode '%s' not supported",
@@ -794,6 +795,11 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	host_track_next_count = 0;
 	WRITE_ONCE(dirty_ring_vcpu_ring_full, false);
 
+	sem_getvalue(&sem_vcpu_stop, &sem_val);
+	TEST_ASSERT_EQ(sem_val, 0);
+	sem_getvalue(&sem_vcpu_cont, &sem_val);
+	TEST_ASSERT_EQ(sem_val, 0);
+
 	pthread_create(&vcpu_thread, NULL, vcpu_worker, vcpu);
 
 	while (iteration < p->iterations) {

Thank you very much!

Dongli Zhang

On 2/3/24 20:22, Shaoqin Huang wrote:
> Hi Sean,
>
> Thanks for your better solution to fix this issue, I've tested your patch, it
> actually fix the problem I encounter. Thanks.
>
> Reviewed-by: Shaoqin Huang <shahuang@redhat.com>
> [...]
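The drained-semaphore check used above (and, via TEST_ASSERT_EQ(), by the applied patch) can also be wrapped in a small helper; the sketch below is hypothetical (assert_sem_drained() does not exist in the selftest) and uses plain assert() for illustration:

/* Hypothetical helper, illustration only: assert a semaphore is fully drained. */
#include <assert.h>
#include <semaphore.h>

static void assert_sem_drained(sem_t *sem)
{
	int val;

	sem_getvalue(sem, &val);
	assert(val == 0);	/* non-zero means a post was never consumed */
}

Calling it for both sem_vcpu_stop and sem_vcpu_cont at the start of run_test() is equivalent to the two sem_getvalue()/TEST_ASSERT_EQ() pairs the patch adds.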
On Fri, Feb 02, 2024 at 03:18:31PM -0800, Sean Christopherson wrote:
> When finishing the final iteration of dirty_log_test testcase, set
> host_quit _before_ the final "continue" so that the vCPU worker doesn't
> run an extra iteration, and delete the hack-a-fix of an extra "continue"
> from the dirty ring testcase.
> [...]
>
> Alternatively, the test could simply reset the semaphores before each
> testcase, but papering over hacks with more hacks usually ends in tears.

IMHO this is fine; we don't have a hard requirement on the semaphores in this
test anyway, and we're pretty sure where the extra sem count came from (rather
than it being an unknown cause). However, since the patch can drop
before_vcpu_join() by properly ordering the setting of host_quit vs.
sem_post(), I think it's at least equally nice indeed.

> Reported-by: Shaoqin Huang <shahuang@redhat.com>
> Fixes: 84292e565951 ("KVM: selftests: Add dirty ring buffer test")
> Signed-off-by: Sean Christopherson <seanjc@google.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

Thanks,
On Fri, 02 Feb 2024 15:18:31 -0800, Sean Christopherson wrote:
> When finishing the final iteration of dirty_log_test testcase, set
> host_quit _before_ the final "continue" so that the vCPU worker doesn't
> run an extra iteration, and delete the hack-a-fix of an extra "continue"
> from the dirty ring testcase. This fixes a bug where the extra post to
> sem_vcpu_cont may not be consumed, which results in failures in subsequent
> runs of the testcases. The bug likely was missed during development as
> x86 supports only a single "guest mode", i.e. there aren't any subsequent
> testcases after the dirty ring test, because for_each_guest_mode() only
> runs a single iteration.
>
> [...]

Applied to kvm-x86 selftests, thanks!

[1/1] KVM: selftests: Fix a semaphore imbalance in the dirty ring logging test
      https://github.com/kvm-x86/linux/commit/ba58f873cdee

--
https://github.com/kvm-x86/linux/tree/next
Hi Sean,

On 2/6/24 22:36, Sean Christopherson wrote:
> On Fri, 02 Feb 2024 15:18:31 -0800, Sean Christopherson wrote:
>> When finishing the final iteration of dirty_log_test testcase, set
>> host_quit _before_ the final "continue" so that the vCPU worker doesn't
>> run an extra iteration, and delete the hack-a-fix of an extra "continue"
>> from the dirty ring testcase.
>> [...]
>
> Applied to kvm-x86 selftests, thanks!

Do you plan to send this branch to Paolo for v6.8?

Thanks

Eric
On Wed, Feb 07, 2024, Eric Auger wrote:
> Hi Sean,
>
> On 2/6/24 22:36, Sean Christopherson wrote:
> > Applied to kvm-x86 selftests, thanks!
>
> Do you plan to send this branch to Paolo for v6.8?

That wasn't my initial plan, but looking at what's in there, the only thing
that isn't a fix is Andrew's series to remove redundant newlines.  So yeah,
I'll send this along for 6.8.
On 2/7/24 15:44, Sean Christopherson wrote:
> On Wed, Feb 07, 2024, Eric Auger wrote:
>> Do you plan to send this branch to Paolo for v6.8?
>
> That wasn't my initial plan, but looking at what's in there, the only thing
> that isn't a fix is Andrew's series to remove redundant newlines.  So yeah,
> I'll send this along for 6.8.

OK. Many thanks!

Eric
diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c
index babea97b31a4..eaad5b20854c 100644
--- a/tools/testing/selftests/kvm/dirty_log_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_test.c
@@ -376,7 +376,10 @@ static void dirty_ring_collect_dirty_pages(struct kvm_vcpu *vcpu, int slot,
 
 	cleared = kvm_vm_reset_dirty_ring(vcpu->vm);
 
-	/* Cleared pages should be the same as collected */
+	/*
+	 * Cleared pages should be the same as collected, as KVM is supposed to
+	 * clear only the entries that have been harvested.
+	 */
 	TEST_ASSERT(cleared == count, "Reset dirty pages (%u) mismatch "
 		    "with collected (%u)", cleared, count);
 
@@ -415,12 +418,6 @@ static void dirty_ring_after_vcpu_run(struct kvm_vcpu *vcpu, int ret, int err)
 	}
 }
 
-static void dirty_ring_before_vcpu_join(void)
-{
-	/* Kick another round of vcpu just to make sure it will quit */
-	sem_post(&sem_vcpu_cont);
-}
-
 struct log_mode {
 	const char *name;
 	/* Return true if this mode is supported, otherwise false */
@@ -433,7 +430,6 @@ struct log_mode {
 				     uint32_t *ring_buf_idx);
 	/* Hook to call when after each vcpu run */
 	void (*after_vcpu_run)(struct kvm_vcpu *vcpu, int ret, int err);
-	void (*before_vcpu_join) (void);
 } log_modes[LOG_MODE_NUM] = {
 	{
 		.name = "dirty-log",
@@ -452,7 +448,6 @@ struct log_mode {
 		.supported = dirty_ring_supported,
 		.create_vm_done = dirty_ring_create_vm_done,
 		.collect_dirty_pages = dirty_ring_collect_dirty_pages,
-		.before_vcpu_join = dirty_ring_before_vcpu_join,
 		.after_vcpu_run = dirty_ring_after_vcpu_run,
 	},
 };
@@ -513,14 +508,6 @@ static void log_mode_after_vcpu_run(struct kvm_vcpu *vcpu, int ret, int err)
 		mode->after_vcpu_run(vcpu, ret, err);
 }
 
-static void log_mode_before_vcpu_join(void)
-{
-	struct log_mode *mode = &log_modes[host_log_mode];
-
-	if (mode->before_vcpu_join)
-		mode->before_vcpu_join();
-}
-
 static void generate_random_array(uint64_t *guest_array, uint64_t size)
 {
 	uint64_t i;
@@ -719,6 +706,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	struct kvm_vm *vm;
 	unsigned long *bmap;
 	uint32_t ring_buf_idx = 0;
+	int sem_val;
 
 	if (!log_mode_supported()) {
 		print_skip("Log mode '%s' not supported",
@@ -788,12 +776,22 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 	/* Start the iterations */
 	iteration = 1;
 	sync_global_to_guest(vm, iteration);
-	host_quit = false;
+	WRITE_ONCE(host_quit, false);
 	host_dirty_count = 0;
 	host_clear_count = 0;
 	host_track_next_count = 0;
 	WRITE_ONCE(dirty_ring_vcpu_ring_full, false);
 
+	/*
+	 * Ensure the previous iteration didn't leave a dangling semaphore, i.e.
+	 * that the main task and vCPU worker were synchronized and completed
+	 * verification of all iterations.
+	 */
+	sem_getvalue(&sem_vcpu_stop, &sem_val);
+	TEST_ASSERT_EQ(sem_val, 0);
+	sem_getvalue(&sem_vcpu_cont, &sem_val);
+	TEST_ASSERT_EQ(sem_val, 0);
+
 	pthread_create(&vcpu_thread, NULL, vcpu_worker, vcpu);
 
 	while (iteration < p->iterations) {
@@ -819,15 +817,21 @@ static void run_test(enum vm_guest_mode mode, void *arg)
 		assert(host_log_mode == LOG_MODE_DIRTY_RING ||
 		       atomic_read(&vcpu_sync_stop_requested) == false);
 		vm_dirty_log_verify(mode, bmap);
+
+		/*
+		 * Set host_quit before sem_vcpu_cont in the final iteration to
+		 * ensure that the vCPU worker doesn't resume the guest.  As
+		 * above, the dirty ring test may stop and wait even when not
+		 * explicitly request to do so, i.e. would hang waiting for a
+		 * "continue" if it's allowed to resume the guest.
+		 */
+		if (++iteration == p->iterations)
+			WRITE_ONCE(host_quit, true);
+
 		sem_post(&sem_vcpu_cont);
-
-		iteration++;
 		sync_global_to_guest(vm, iteration);
 	}
 
-	/* Tell the vcpu thread to quit */
-	host_quit = true;
-	log_mode_before_vcpu_join();
 	pthread_join(vcpu_thread, NULL);
 
 	pr_info("Total bits checked: dirty (%"PRIu64"), clear (%"PRIu64"), "