Message ID | 1707758174-142161-4-git-send-email-steven.sistare@oracle.com |
---|---|
State | New |
Headers |
From: Steve Sistare <steven.sistare@oracle.com>
To: virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org
Cc: "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, Si-Wei Liu <si-wei.liu@oracle.com>, Eugenio Perez Martin <eperezma@redhat.com>, Xie Yongji <xieyongji@bytedance.com>, Stefano Garzarella <sgarzare@redhat.com>, Steve Sistare <steven.sistare@oracle.com>
Subject: [PATCH V2 3/3] vdpa_sim: flush workers on suspend
Date: Mon, 12 Feb 2024 09:16:14 -0800
Message-Id: <1707758174-142161-4-git-send-email-steven.sistare@oracle.com>
In-Reply-To: <1707758174-142161-1-git-send-email-steven.sistare@oracle.com>
References: <1707758174-142161-1-git-send-email-steven.sistare@oracle.com> |
Series | flush workers on suspend |
Commit Message
Steven Sistare
Feb. 12, 2024, 5:16 p.m. UTC
Flush to guarantee no workers are running when suspend returns.
Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
Comments
On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <steven.sistare@oracle.com> wrote:
>
> Flush to guarantee no workers are running when suspend returns.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
>
> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> index be2925d0d283..a662b90357c3 100644
> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
>         kthread_flush_work(work);
>  }
>
> +static void flush_work_fn(struct kthread_work *work) {}
> +
> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
> +{
> +       struct kthread_work work;
> +
> +       kthread_init_work(&work, flush_work_fn);

If the work is already queued, doesn't it break the linked list
because of the memset in kthread_init_work?

> +       kthread_queue_work(vdpasim->worker, &work);
> +       kthread_flush_work(&work);
> +}
> +
>  static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
>  {
>         return container_of(vdpa, struct vdpasim, vdpa);
> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
>         vdpasim->running = false;
>         mutex_unlock(&vdpasim->mutex);
>
> +       vdpasim_flush_work(vdpasim);

Do we need to protect the case where vdpasim_kick_vq and
vdpasim_suspend are called "at the same time"? Correct userland should
not be doing it but buggy or malicious could be. Just calling
vdpasim_flush_work with the mutex acquired would solve the issue,
doesn't it?

Thanks!

> +
>         return 0;
>  }
>
> --
> 2.39.3
>
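The barrier trick the patch uses — queue a no-op work item behind everything pending, then wait for it — can be sketched in userspace with POSIX threads. This is a hypothetical analogue for illustration only, not the kernel's kthread_worker API; all names here (struct worker, flush_worker, etc.) are invented. Because the worker runs items in FIFO order, completion of the no-op item implies every earlier item has finished:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* A minimal FIFO work queue served by one worker thread. */
struct work {
    void (*fn)(struct work *);
    struct work *next;
};

struct worker {
    pthread_mutex_t lock;
    pthread_cond_t cond;          /* signaled when work arrives or stop is set */
    struct work *head, *tail;
    int stop;
    pthread_t thread;
};

static void *worker_main(void *arg)
{
    struct worker *w = arg;

    pthread_mutex_lock(&w->lock);
    for (;;) {
        while (!w->head && !w->stop)
            pthread_cond_wait(&w->cond, &w->lock);
        if (!w->head && w->stop)
            break;
        struct work *item = w->head;   /* dequeue in FIFO order */
        w->head = item->next;
        if (!w->head)
            w->tail = NULL;
        pthread_mutex_unlock(&w->lock);
        item->fn(item);                /* run outside the queue lock */
        pthread_mutex_lock(&w->lock);
    }
    pthread_mutex_unlock(&w->lock);
    return NULL;
}

static void queue_work(struct worker *w, struct work *item)
{
    item->next = NULL;
    pthread_mutex_lock(&w->lock);
    if (w->tail)
        w->tail->next = item;
    else
        w->head = item;
    w->tail = item;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* A no-op "barrier" work item that signals its own completion. */
struct flush_work {
    struct work work;              /* must be first: cast in flush_fn */
    pthread_mutex_t done_lock;
    pthread_cond_t done_cond;
    int done;
};

static void flush_fn(struct work *work)
{
    struct flush_work *fw = (struct flush_work *)work;

    pthread_mutex_lock(&fw->done_lock);
    fw->done = 1;
    pthread_cond_signal(&fw->done_cond);
    pthread_mutex_unlock(&fw->done_lock);
}

/* Analogue of vdpasim_flush_work(): all work queued before this call
 * has finished by the time it returns. */
static void flush_worker(struct worker *w)
{
    struct flush_work fw = { .work = { .fn = flush_fn } };

    pthread_mutex_init(&fw.done_lock, NULL);
    pthread_cond_init(&fw.done_cond, NULL);
    queue_work(w, &fw.work);
    pthread_mutex_lock(&fw.done_lock);
    while (!fw.done)
        pthread_cond_wait(&fw.done_cond, &fw.done_lock);
    pthread_mutex_unlock(&fw.done_lock);
}
```

As in the patch, the barrier item lives on the flusher's stack: that is safe because flush_worker does not return until the worker has finished with it.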
On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <steven.sistare@oracle.com> wrote:
>>
>> Flush to guarantee no workers are running when suspend returns.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> index be2925d0d283..a662b90357c3 100644
>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
>>         kthread_flush_work(work);
>>  }
>>
>> +static void flush_work_fn(struct kthread_work *work) {}
>> +
>> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
>> +{
>> +       struct kthread_work work;
>> +
>> +       kthread_init_work(&work, flush_work_fn);
>
> If the work is already queued, doesn't it break the linked list
> because of the memset in kthread_init_work?

work is a local variable. It completes before vdpasim_flush_work returns,
thus is never already queued on entry to vdpasim_flush_work.
Am I missing your point?

>> +       kthread_queue_work(vdpasim->worker, &work);
>> +       kthread_flush_work(&work);
>> +}
>> +
>>  static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
>>  {
>>         return container_of(vdpa, struct vdpasim, vdpa);
>> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
>>         vdpasim->running = false;
>>         mutex_unlock(&vdpasim->mutex);
>>
>> +       vdpasim_flush_work(vdpasim);
>
> Do we need to protect the case where vdpasim_kick_vq and
> vdpasim_suspend are called "at the same time"? Correct userland should
> not be doing it but buggy or malicious could be. Just calling
> vdpasim_flush_work with the mutex acquired would solve the issue,
> doesn't it?

Good catch. I need to serialize access to vdpasim->running plus the worker queue
in these two functions. vdpasim_kick_vq currently takes no locks. In case it is
called from non-task contexts, I should define a new spinlock to be acquired in
both functions.

- Steve
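The race Steve describes can be illustrated in a small userspace sketch (hypothetical names, pthread mutex standing in for the proposed spinlock): the kick path's "test running, then enqueue" must be atomic with respect to the suspend path's "clear running, then flush", otherwise a kick can slip work into the queue after the flush:

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/*
 * Illustrative analogue of the proposed serialization, not the driver's
 * actual code.  "queued" stands in for kthread_queue_work() on the
 * simulator's worker.
 */
struct sim {
    pthread_mutex_t lock;   /* stands in for the proposed spinlock */
    bool running;
    int queued;
};

static void sim_kick_vq(struct sim *s)
{
    pthread_mutex_lock(&s->lock);
    if (s->running)
        s->queued++;        /* enqueue work only while running */
    pthread_mutex_unlock(&s->lock);
}

static void sim_suspend(struct sim *s)
{
    pthread_mutex_lock(&s->lock);
    s->running = false;     /* any kick after this point is a no-op */
    pthread_mutex_unlock(&s->lock);
    /* flush the worker here: no new work can arrive once the lock is
     * dropped, because every kick re-checks running under the lock */
}
```

With both paths taking the same lock, a kick either lands before suspend clears running (and is covered by the subsequent flush) or observes running == false and queues nothing.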
On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare <steven.sistare@oracle.com> wrote:
>
> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
> > On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <steven.sistare@oracle.com> wrote:
> >>
> >> Flush to guarantee no workers are running when suspend returns.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> ---
> >>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
> >>  1 file changed, 13 insertions(+)
> >>
> >> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> index be2925d0d283..a662b90357c3 100644
> >> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
> >>         kthread_flush_work(work);
> >>  }
> >>
> >> +static void flush_work_fn(struct kthread_work *work) {}
> >> +
> >> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
> >> +{
> >> +       struct kthread_work work;
> >> +
> >> +       kthread_init_work(&work, flush_work_fn);
> >
> > If the work is already queued, doesn't it break the linked list
> > because of the memset in kthread_init_work?
>
> work is a local variable. It completes before vdpasim_flush_work returns,
> thus is never already queued on entry to vdpasim_flush_work.
> Am I missing your point?
>

No, sorry, I was the one missing that. Thanks for explaining it :)!

I'm not so used to the kthread queue, but why not calling
kthread_flush_work on vdpasim->work directly?

> >> +       kthread_queue_work(vdpasim->worker, &work);
> >> +       kthread_flush_work(&work);
> >> +}
> >> +
> >>  static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
> >>  {
> >>         return container_of(vdpa, struct vdpasim, vdpa);
> >> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
> >>         vdpasim->running = false;
> >>         mutex_unlock(&vdpasim->mutex);
> >>
> >> +       vdpasim_flush_work(vdpasim);
> >
> > Do we need to protect the case where vdpasim_kick_vq and
> > vdpasim_suspend are called "at the same time"? Correct userland should
> > not be doing it but buggy or malicious could be. Just calling
> > vdpasim_flush_work with the mutex acquired would solve the issue,
> > doesn't it?
>
> Good catch. I need to serialize access to vdpasim->running plus the worker queue
> in these two functions. vdpasim_kick_vq currently takes no locks. In case it is
> called from non-task contexts, I should define a new spinlock to be acquired in
> both functions.
>
> - Steve
>
On 2/14/2024 2:39 PM, Eugenio Perez Martin wrote:
> On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare
> <steven.sistare@oracle.com> wrote:
>>
>> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
>>> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <steven.sistare@oracle.com> wrote:
>>>>
>>>> Flush to guarantee no workers are running when suspend returns.
>>>>
>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>> ---
>>>>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
>>>>  1 file changed, 13 insertions(+)
>>>>
>>>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>> index be2925d0d283..a662b90357c3 100644
>>>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
>>>>         kthread_flush_work(work);
>>>>  }
>>>>
>>>> +static void flush_work_fn(struct kthread_work *work) {}
>>>> +
>>>> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
>>>> +{
>>>> +       struct kthread_work work;
>>>> +
>>>> +       kthread_init_work(&work, flush_work_fn);
>>>
>>> If the work is already queued, doesn't it break the linked list
>>> because of the memset in kthread_init_work?
>>
>> work is a local variable. It completes before vdpasim_flush_work returns,
>> thus is never already queued on entry to vdpasim_flush_work.
>> Am I missing your point?
>
> No, sorry, I was the one missing that. Thanks for explaining it :)!
>
> I'm not so used to the kthread queue, but why not calling
> kthread_flush_work on vdpasim->work directly?

vdpasim->work is not the only work posted to vdpasim->worker; see
vdpasim_worker_change_mm_sync. Posting a new no-op work guarantees
they are all flushed.

- Steve

>>>> +       kthread_queue_work(vdpasim->worker, &work);
>>>> +       kthread_flush_work(&work);
>>>> +}
>>>> +
>>>>  static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
>>>>  {
>>>>         return container_of(vdpa, struct vdpasim, vdpa);
>>>> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
>>>>         vdpasim->running = false;
>>>>         mutex_unlock(&vdpasim->mutex);
>>>>
>>>> +       vdpasim_flush_work(vdpasim);
>>>
>>> Do we need to protect the case where vdpasim_kick_vq and
>>> vdpasim_suspend are called "at the same time"? Correct userland should
>>> not be doing it but buggy or malicious could be. Just calling
>>> vdpasim_flush_work with the mutex acquired would solve the issue,
>>> doesn't it?
>>
>> Good catch. I need to serialize access to vdpasim->running plus the worker queue
>> in these two functions. vdpasim_kick_vq currently takes no locks. In case it is
>> called from non-task contexts, I should define a new spinlock to be acquired in
>> both functions.
>>
>> - Steve
>>
On Wed, Feb 14, 2024 at 8:52 PM Steven Sistare <steven.sistare@oracle.com> wrote:
>
> On 2/14/2024 2:39 PM, Eugenio Perez Martin wrote:
> > On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare
> > <steven.sistare@oracle.com> wrote:
> >>
> >> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
> >>> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <steven.sistare@oracle.com> wrote:
> >>>>
> >>>> Flush to guarantee no workers are running when suspend returns.
> >>>>
> >>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >>>> ---
> >>>>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
> >>>>  1 file changed, 13 insertions(+)
> >>>>
> >>>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >>>> index be2925d0d283..a662b90357c3 100644
> >>>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >>>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
> >>>> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
> >>>>         kthread_flush_work(work);
> >>>>  }
> >>>>
> >>>> +static void flush_work_fn(struct kthread_work *work) {}
> >>>> +
> >>>> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
> >>>> +{
> >>>> +       struct kthread_work work;
> >>>> +
> >>>> +       kthread_init_work(&work, flush_work_fn);
> >>>
> >>> If the work is already queued, doesn't it break the linked list
> >>> because of the memset in kthread_init_work?
> >>
> >> work is a local variable. It completes before vdpasim_flush_work returns,
> >> thus is never already queued on entry to vdpasim_flush_work.
> >> Am I missing your point?
> >
> > No, sorry, I was the one missing that. Thanks for explaining it :)!
> >
> > I'm not so used to the kthread queue, but why not calling
> > kthread_flush_work on vdpasim->work directly?
>
> vdpasim->work is not the only work posted to vdpasim->worker; see
> vdpasim_worker_change_mm_sync. Posting a new no-op work guarantees
> they are all flushed.
>

But it is ok to have concurrent mm updates, isn't it? Moreover, they
can be enqueued immediately after the kthread_flush_work already, as
there is no lock protecting it.

> - Steve
>
> >>>> +       kthread_queue_work(vdpasim->worker, &work);
> >>>> +       kthread_flush_work(&work);
> >>>> +}
> >>>> +
> >>>>  static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
> >>>>  {
> >>>>         return container_of(vdpa, struct vdpasim, vdpa);
> >>>> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
> >>>>         vdpasim->running = false;
> >>>>         mutex_unlock(&vdpasim->mutex);
> >>>>
> >>>> +       vdpasim_flush_work(vdpasim);
> >>>
> >>> Do we need to protect the case where vdpasim_kick_vq and
> >>> vdpasim_suspend are called "at the same time"? Correct userland should
> >>> not be doing it but buggy or malicious could be. Just calling
> >>> vdpasim_flush_work with the mutex acquired would solve the issue,
> >>> doesn't it?
> >>
> >> Good catch. I need to serialize access to vdpasim->running plus the worker queue
> >> in these two functions. vdpasim_kick_vq currently takes no locks. In case it is
> >> called from non-task contexts, I should define a new spinlock to be acquired in
> >> both functions.
> >>
> >> - Steve
> >>
> >
>
On 2/15/2024 10:44 AM, Eugenio Perez Martin wrote:
> On Wed, Feb 14, 2024 at 8:52 PM Steven Sistare
> <steven.sistare@oracle.com> wrote:
>>
>> On 2/14/2024 2:39 PM, Eugenio Perez Martin wrote:
>>> On Wed, Feb 14, 2024 at 6:50 PM Steven Sistare
>>> <steven.sistare@oracle.com> wrote:
>>>>
>>>> On 2/13/2024 11:10 AM, Eugenio Perez Martin wrote:
>>>>> On Mon, Feb 12, 2024 at 6:16 PM Steve Sistare <steven.sistare@oracle.com> wrote:
>>>>>>
>>>>>> Flush to guarantee no workers are running when suspend returns.
>>>>>>
>>>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>>>> ---
>>>>>>  drivers/vdpa/vdpa_sim/vdpa_sim.c | 13 +++++++++++++
>>>>>>  1 file changed, 13 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> index be2925d0d283..a662b90357c3 100644
>>>>>> --- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> +++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
>>>>>> @@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
>>>>>>         kthread_flush_work(work);
>>>>>>  }
>>>>>>
>>>>>> +static void flush_work_fn(struct kthread_work *work) {}
>>>>>> +
>>>>>> +static void vdpasim_flush_work(struct vdpasim *vdpasim)
>>>>>> +{
>>>>>> +       struct kthread_work work;
>>>>>> +
>>>>>> +       kthread_init_work(&work, flush_work_fn);
>>>>>
>>>>> If the work is already queued, doesn't it break the linked list
>>>>> because of the memset in kthread_init_work?
>>>>
>>>> work is a local variable. It completes before vdpasim_flush_work returns,
>>>> thus is never already queued on entry to vdpasim_flush_work.
>>>> Am I missing your point?
>>>
>>> No, sorry, I was the one missing that. Thanks for explaining it :)!
>>>
>>> I'm not so used to the kthread queue, but why not calling
>>> kthread_flush_work on vdpasim->work directly?
>>
>> vdpasim->work is not the only work posted to vdpasim->worker; see
>> vdpasim_worker_change_mm_sync. Posting a new no-op work guarantees
>> they are all flushed.
>
> But it is ok to have concurrent mm updates, isn't it? Moreover, they
> can be enqueued immediately after the kthread_flush_work already, as
> there is no lock protecting it.

Agreed on both, thanks. I will simplify and only flush vdpasim->work.

- Steve

>>>>>> +       kthread_queue_work(vdpasim->worker, &work);
>>>>>> +       kthread_flush_work(&work);
>>>>>> +}
>>>>>> +
>>>>>>  static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
>>>>>>  {
>>>>>>         return container_of(vdpa, struct vdpasim, vdpa);
>>>>>> @@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
>>>>>>         vdpasim->running = false;
>>>>>>         mutex_unlock(&vdpasim->mutex);
>>>>>>
>>>>>> +       vdpasim_flush_work(vdpasim);
>>>>>
>>>>> Do we need to protect the case where vdpasim_kick_vq and
>>>>> vdpasim_suspend are called "at the same time"? Correct userland should
>>>>> not be doing it but buggy or malicious could be. Just calling
>>>>> vdpasim_flush_work with the mutex acquired would solve the issue,
>>>>> doesn't it?
>>>>
>>>> Good catch. I need to serialize access to vdpasim->running plus the worker queue
>>>> in these two functions. vdpasim_kick_vq currently takes no locks. In case it is
>>>> called from non-task contexts, I should define a new spinlock to be acquired in
>>>> both functions.
>>>>
>>>> - Steve
>>>>
>>>
>>
>
diff --git a/drivers/vdpa/vdpa_sim/vdpa_sim.c b/drivers/vdpa/vdpa_sim/vdpa_sim.c
index be2925d0d283..a662b90357c3 100644
--- a/drivers/vdpa/vdpa_sim/vdpa_sim.c
+++ b/drivers/vdpa/vdpa_sim/vdpa_sim.c
@@ -74,6 +74,17 @@ static void vdpasim_worker_change_mm_sync(struct vdpasim *vdpasim,
 	kthread_flush_work(work);
 }
 
+static void flush_work_fn(struct kthread_work *work) {}
+
+static void vdpasim_flush_work(struct vdpasim *vdpasim)
+{
+	struct kthread_work work;
+
+	kthread_init_work(&work, flush_work_fn);
+	kthread_queue_work(vdpasim->worker, &work);
+	kthread_flush_work(&work);
+}
+
 static struct vdpasim *vdpa_to_sim(struct vdpa_device *vdpa)
 {
 	return container_of(vdpa, struct vdpasim, vdpa);
@@ -511,6 +522,8 @@ static int vdpasim_suspend(struct vdpa_device *vdpa)
 	vdpasim->running = false;
 	mutex_unlock(&vdpasim->mutex);
 
+	vdpasim_flush_work(vdpasim);
+
 	return 0;
 }