Message ID | 20240125130833.1953617-1-devarsht@ti.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-38623-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:2553:b0:103:945f:af90 with SMTP id p19csp1626521dyi; Thu, 25 Jan 2024 05:10:00 -0800 (PST) X-Google-Smtp-Source: AGHT+IGK+3DtCl05PN82GkofBCWmTdbefi7+h/Df1ETnCvxrBFU1AiNjzCtsgZ/SEElKPHZL65OS X-Received: by 2002:a05:622a:d4:b0:42a:5dbc:7e81 with SMTP id p20-20020a05622a00d400b0042a5dbc7e81mr1025539qtw.36.1706188199872; Thu, 25 Jan 2024 05:09:59 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1706188199; cv=pass; d=google.com; s=arc-20160816; b=qRGVXVGS702JpM1gqsAV/H4uoNbtVKlTH3KuHmlbzTeoRwQlWfhQH2KGxiad/fY88c 9zOzrFR/Fu2qTQLR7qNvRFoeobly6p3eiiefDFHwpQSov61g/GH8guf5r0uVZ+eIszNt ZzKeycWjroXHCZPNAQoXx5r2gjMjn7HWW+K+ryWs9Bh/JlUh/noUUqwIU7oJKZDX0ajw XKwE8KMn/yN/6O7+mE86GgWnKDBXI+JzeSlZ/xO/xvNw2Q3qIr7JjN7BOycMurA2sLsk 6h/U0kbrV7KsoIb3K8X63xgDnW3K1dpNRLh1+OxOf0Q1vYUiXYL5e0EXl5+poZXpigut fV6A== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:message-id:date:subject:cc:to :from:dkim-signature; bh=o6y4O/TctWzHPHDw+dZ++x6Lnwxe5cRrOa1AJDjnvVc=; fh=rVlO78dojVFis132BRX7T/RJohouSWa2XniFMHWTWE4=; b=xKkhPJ7wT3JF+gu/sdvVpOzsZV7v8vRPAfD8YsZkse4EdOhPXHYM181c2TW5kX/bms Z4HwBDK+KwoK+ClcUnqwvu8ZFwI7FFt4bbYhVS275E87/xmN9wAjkBzx8hq/BjiB2Hbw z84kX6FpHvcwEi+q9R/ryeTBJLnYP5q4NTRPt3ndYtgcZ65MXDPgeudc5vDXUeYjB7T9 M/KrwC9k2JMAJXuTDd5Ln2Wnqg2o6gRYzo+nCVKxPZgdJtGXY2SScMM+xk3UUgJmGZP5 9p8JE+f0AJfNOoZ9EDCA5fsYSpA6uiC0YyMLy0wUeuxV7RHX6JSSgasEUGXtDVIErz+D DWzg== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@ti.com header.s=ti-com-17Q1 header.b=jitcXo2N; arc=pass (i=1 spf=pass spfdomain=ti.com dkim=pass dkdomain=ti.com dmarc=pass fromdomain=ti.com); spf=pass (google.com: domain of linux-kernel+bounces-38623-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-38623-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=NONE dis=NONE) header.from=ti.com Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id n15-20020a05622a11cf00b0042a7092c7fesi303806qtk.477.2024.01.25.05.09.59 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 25 Jan 2024 05:09:59 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-38623-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@ti.com header.s=ti-com-17Q1 header.b=jitcXo2N; arc=pass (i=1 spf=pass spfdomain=ti.com dkim=pass dkdomain=ti.com dmarc=pass fromdomain=ti.com); spf=pass (google.com: domain of linux-kernel+bounces-38623-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-38623-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=NONE dis=NONE) header.from=ti.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 9A8541C23835 for <ouuuleilei@gmail.com>; Thu, 25 Jan 2024 13:09:59 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id E679F4F8AC; Thu, 25 Jan 2024 13:09:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=ti.com header.i=@ti.com header.b="jitcXo2N" Received: from lelv0143.ext.ti.com (lelv0143.ext.ti.com [198.47.23.248]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4DA0D5100C; Thu, 25 Jan 2024 13:08:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.47.23.248 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706188137; cv=none; b=tIdUcBp9QSOIPpqAiBfPR1iAEBWXxT6vzPugOWw1dsoD8KL39+kZwQXCk4z6LSasjbD29xVQnqQcNIpSJKnojEoZCJLv3nyBlMskfD3g1ZbjhXzkTNDufjG7VZ7FYdHuQgcT1baLvYfrAc7qkQLGIJFbwxWrqCa7zxXLmBrfaiA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1706188137; c=relaxed/simple; bh=9qRS888uxFK3V4wrFKJS6sGqRbjZOvbpaZgrwwEKW90=; h=From:To:CC:Subject:Date:Message-ID:MIME-Version:Content-Type; b=YDAdZhd0I6iCOSD811iA0D1RI07KjCs+7GdQ6zQ6JP4Vgsuim0BwMP5RPBqFdNBsmRDc43nKRjzy8/+wwrhIC4N9cMdSFgnUlEGVlmAV5nUdMg2SxZamiiUq8iGQ53s+gKbJqvUey1jp2d9sJsTeMFxWmivQOj6yhF38Io28XEo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=ti.com; spf=pass smtp.mailfrom=ti.com; dkim=pass (1024-bit key) header.d=ti.com header.i=@ti.com header.b=jitcXo2N; arc=none smtp.client-ip=198.47.23.248 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=ti.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ti.com Received: from fllv0034.itg.ti.com ([10.64.40.246]) by lelv0143.ext.ti.com (8.15.2/8.15.2) with ESMTP id 40PD8ZqN124469; Thu, 25 Jan 2024 07:08:35 -0600 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ti.com; s=ti-com-17Q1; t=1706188115; bh=o6y4O/TctWzHPHDw+dZ++x6Lnwxe5cRrOa1AJDjnvVc=; h=From:To:CC:Subject:Date; b=jitcXo2N2pG3qC8N7gNquLtnlDN731B7JutE2z1nCMgmGutllkL9OT6ulSw27fj5Q 4FYomk+e7tXW2yt7Vq4xKW/LSQbHr9UbKFhGLCa6kocv2Wg+S/itFmt7E4lQ58YYPK rkw5fpqojm5/trOWM79QM4Gbuc6231z6qP/goxgg= Received: from DLEE100.ent.ti.com (dlee100.ent.ti.com [157.170.170.30]) by fllv0034.itg.ti.com (8.15.2/8.15.2) with ESMTPS id 40PD8ZDi093893 (version=TLSv1.2 cipher=AES256-GCM-SHA384 bits=256 verify=FAIL); Thu, 25 Jan 2024 07:08:35 -0600 Received: from DLEE103.ent.ti.com (157.170.170.33) by DLEE100.ent.ti.com (157.170.170.30) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2507.23; Thu, 25 Jan 2024 07:08:34 -0600 Received: from lelvsmtp5.itg.ti.com (10.180.75.250) by DLEE103.ent.ti.com (157.170.170.33) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256_P256) id 15.1.2507.23 via Frontend Transport; Thu, 25 Jan 2024 07:08:35 -0600 Received: from localhost (ti.dhcp.ti.com [172.24.227.95] (may be forged)) by lelvsmtp5.itg.ti.com (8.15.2/8.15.2) with ESMTP id 40PD8YTm095770; Thu, 25 Jan 2024 07:08:34 -0600 From: Devarsh Thakkar <devarsht@ti.com> To: <nas.chung@chipsnmedia.com>, <jackson.lee@chipsnmedia.com>, <mchehab@kernel.org>, <linux-media@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <nm@ti.com> CC: <praneeth@ti.com>, <vigneshr@ti.com>, <a-bhatia1@ti.com>, <j-luthra@ti.com>, <b-brnich@ti.com>, <detheridge@ti.com>, <p-mantena@ti.com>, <vijayp@ti.com>, <devarsht@ti.com> Subject: [PATCH] media: chips-media: wave5: Add hrtimer based polling support Date: Thu, 25 Jan 2024 18:38:33 +0530 Message-ID: <20240125130833.1953617-1-devarsht@ti.com> X-Mailer: git-send-email 2.34.1 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain X-EXCLAIMER-MD-CONFIG: e1e8a2fd-e40a-4ac6-ac9b-f7e9cc9ee180 X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1789067997636030373 X-GMAIL-MSGID: 1789067997636030373 |
Series |
media: chips-media: wave5: Add hrtimer based polling support
|
|
Commit Message
Devarsh Thakkar
Jan. 25, 2024, 1:08 p.m. UTC
Add support for starting a polling timer in case interrupt is not
available. This helps keep the VPU functional in SoC's such as AM62A, where
the hardware interrupt hookup may not be present due to an SoC errata [1].
The timer is shared across all instances of encoder and decoder and is
started when first instance of encoder or decoder is opened and stopped
when last instance is closed, thus avoiding per instance polling and saving
CPU bandwidth.
hrtimer callback is called with 5ms polling interval while any of the
encoder/decoder instances are running to check the interrupt status as
being done in irq handler.
Based on above interrupt status, use a worker thread to iterate over the
interrupt status for each instance and send completion event as being done
in irq thread function.
Parse for irq number before v4l2 device registration and if not available
only then, initialize hrtimer and worker thread.
Move the core functionality of irq thread function to a separate function
wave5_vpu_handle_irq so that it can be used by both the worker thread when
using polling mode and irq thread when using interrupt mode.
Protect hrtimer access and instance list with device specific mutex locks
to avoid race conditions while different instances of encoder and decoder
are started together.
Add module param to change polling interval for debug purpose.
[1] https://www.ti.com/lit/pdf/spruj16
(Ref: Section 4.2.3.3 Resets, Interrupts, and Clocks)
Signed-off-by: Devarsh Thakkar <devarsht@ti.com>
---
Test logs:
https://gist.github.com/devarsht/613bc8aa66e65814c8374ffb6a4f50fb
---
.../platform/chips-media/wave5/wave5-helper.c | 16 ++-
.../chips-media/wave5/wave5-vpu-dec.c | 13 +-
.../chips-media/wave5/wave5-vpu-enc.c | 13 +-
.../platform/chips-media/wave5/wave5-vpu.c | 117 ++++++++++++------
.../platform/chips-media/wave5/wave5-vpuapi.h | 4 +
5 files changed, 122 insertions(+), 41 deletions(-)
Comments
Gentle reminder. On 25/01/24 18:38, Devarsh Thakkar wrote: > Add support for starting a polling timer in case interrupt is not > available. This helps keep the VPU functional in SoC's such as AM62A, where > the hardware interrupt hookup may not be present due to an SoC errata [1]. > > The timer is shared across all instances of encoder and decoder and is > started when first instance of encoder or decoder is opened and stopped > when last instance is closed, thus avoiding per instance polling and saving > CPU bandwidth. > > hrtimer callback is called with 5ms polling interval while any of the > encoder/decoder instances are running to check the interrupt status as > being done in irq handler. > > Based on above interrupt status, use a worker thread to iterate over the > interrupt status for each instance and send completion event as being done > in irq thread function. > > Parse for irq number before v4l2 device registration and if not available > only then, initialize hrtimer and worker thread. > > Move the core functionality of irq thread function to a separate function > wave5_vpu_handle_irq so that it can be used by both the worker thread when > using polling mode and irq thread when using interrupt mode. > > Protect hrtimer access and instance list with device specific mutex locks > to avoid race conditions while different instances of encoder and decoder > are started together. > > Add module param to change polling interval for debug purpose. > > [1] https://www.ti.com/lit/pdf/spruj16 > (Ref: Section 4.2.3.3 Resets, Interrupts, and Clocks) > > Signed-off-by: Devarsh Thakkar <devarsht@ti.com> > --- > Test logs: > https://gist.github.com/devarsht/613bc8aa66e65814c8374ffb6a4f50fb > --- > .../platform/chips-media/wave5/wave5-helper.c | 16 ++- > .../chips-media/wave5/wave5-vpu-dec.c | 13 +- > .../chips-media/wave5/wave5-vpu-enc.c | 13 +- > .../platform/chips-media/wave5/wave5-vpu.c | 117 ++++++++++++------ > .../platform/chips-media/wave5/wave5-vpuapi.h | 4 + > 5 files changed, 122 insertions(+), 41 deletions(-) > > diff --git a/drivers/media/platform/chips-media/wave5/wave5-helper.c b/drivers/media/platform/chips-media/wave5/wave5-helper.c > index 8433ecab230c..475b7628964f 100644 > --- a/drivers/media/platform/chips-media/wave5/wave5-helper.c > +++ b/drivers/media/platform/chips-media/wave5/wave5-helper.c > @@ -52,6 +52,8 @@ int wave5_vpu_release_device(struct file *filp, > char *name) > { > struct vpu_instance *inst = wave5_to_vpu_inst(filp->private_data); > + struct vpu_device *dev = inst->dev; > + int ret = 0; > > v4l2_m2m_ctx_release(inst->v4l2_fh.m2m_ctx); > if (inst->state != VPU_INST_STATE_NONE) { > @@ -71,8 +73,20 @@ int wave5_vpu_release_device(struct file *filp, > } > > wave5_cleanup_instance(inst); > + if (dev->irq < 0) { > + ret = mutex_lock_interruptible(&dev->dev_lock); > + if (ret) > + return ret; > > - return 0; > + if (list_empty(&dev->instances)) { > + dev_dbg(dev->dev, "Disabling the hrtimer\n"); > + hrtimer_cancel(&dev->hrtimer); > + } > + > + mutex_unlock(&dev->dev_lock); > + } > + > + return ret; > } > > int wave5_vpu_queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *dst_vq, > diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c > index ef227af72348..c8624c681fa6 100644 > --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c > +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c > @@ -1810,7 +1810,6 @@ static int wave5_vpu_open_dec(struct file *filp) > v4l2_fh_add(&inst->v4l2_fh); > > INIT_LIST_HEAD(&inst->list); > - list_add_tail(&inst->list, &dev->instances); > > inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_dec_dev; > inst->v4l2_fh.m2m_ctx = > @@ -1867,6 +1866,18 @@ static int wave5_vpu_open_dec(struct file *filp) > > wave5_vdi_allocate_sram(inst->dev); > > + ret = mutex_lock_interruptible(&dev->dev_lock); > + if (ret) > + goto cleanup_inst; > + > + if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && list_empty(&dev->instances)) > + hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * NSEC_PER_MSEC), > + HRTIMER_MODE_REL_PINNED); > + > + list_add_tail(&inst->list, &dev->instances); > + > + mutex_unlock(&dev->dev_lock); > + > return 0; > > cleanup_inst: > diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c > index f29cfa3af94a..9e88424761b6 100644 > --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c > +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c > @@ -1554,7 +1554,6 @@ static int wave5_vpu_open_enc(struct file *filp) > v4l2_fh_add(&inst->v4l2_fh); > > INIT_LIST_HEAD(&inst->list); > - list_add_tail(&inst->list, &dev->instances); > > inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_enc_dev; > inst->v4l2_fh.m2m_ctx = > @@ -1729,6 +1728,18 @@ static int wave5_vpu_open_enc(struct file *filp) > > wave5_vdi_allocate_sram(inst->dev); > > + ret = mutex_lock_interruptible(&dev->dev_lock); > + if (ret) > + goto cleanup_inst; > + > + if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && list_empty(&dev->instances)) > + hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * NSEC_PER_MSEC), > + HRTIMER_MODE_REL_PINNED); > + > + list_add_tail(&inst->list, &dev->instances); > + > + mutex_unlock(&dev->dev_lock); > + > return 0; > > cleanup_inst: > diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c b/drivers/media/platform/chips-media/wave5/wave5-vpu.c > index bfe4caa79cc9..968ec9baf1ef 100644 > --- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c > +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c > @@ -26,6 +26,9 @@ struct wave5_match_data { > const char *fw_name; > }; > > +static int vpu_poll_interval = 5; > +module_param(vpu_poll_interval, int, 0644); > + > int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout) > { > int ret; > @@ -40,7 +43,7 @@ int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout) > return 0; > } > > -static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) > +static void wave5_vpu_handle_irq(void *dev_id) > { > u32 seq_done; > u32 cmd_done; > @@ -48,42 +51,67 @@ static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) > struct vpu_instance *inst; > struct vpu_device *dev = dev_id; > > - if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) { > - irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); > - wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); > - wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); > - > - list_for_each_entry(inst, &dev->instances, list) { > - seq_done = wave5_vdi_read_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO); > - cmd_done = wave5_vdi_read_register(dev, W5_RET_QUEUE_CMD_DONE_INST); > - > - if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || > - irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { > - if (seq_done & BIT(inst->id)) { > - seq_done &= ~BIT(inst->id); > - wave5_vdi_write_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO, > - seq_done); > - complete(&inst->irq_done); > - } > + irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); > + wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); > + wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); > + > + list_for_each_entry(inst, &dev->instances, list) { > + seq_done = wave5_vdi_read_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO); > + cmd_done = wave5_vdi_read_register(dev, W5_RET_QUEUE_CMD_DONE_INST); > + > + if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || > + irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { > + if (seq_done & BIT(inst->id)) { > + seq_done &= ~BIT(inst->id); > + wave5_vdi_write_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO, > + seq_done); > + complete(&inst->irq_done); > } > + } > > - if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || > - irq_reason & BIT(INT_WAVE5_ENC_PIC)) { > - if (cmd_done & BIT(inst->id)) { > - cmd_done &= ~BIT(inst->id); > - wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, > - cmd_done); > - inst->ops->finish_process(inst); > - } > + if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || > + irq_reason & BIT(INT_WAVE5_ENC_PIC)) { > + if (cmd_done & BIT(inst->id)) { > + cmd_done &= ~BIT(inst->id); > + wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, > + cmd_done); > + inst->ops->finish_process(inst); > } > - > - wave5_vpu_clear_interrupt(inst, irq_reason); > } > + > + wave5_vpu_clear_interrupt(inst, irq_reason); > } > +} > + > +static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) > +{ > + struct vpu_device *dev = dev_id; > + > + if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) > + wave5_vpu_handle_irq(dev); > > return IRQ_HANDLED; > } > > +static void wave5_vpu_irq_work_fn(struct kthread_work *work) > +{ > + struct vpu_device *dev = container_of(work, struct vpu_device, work); > + > + if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) > + wave5_vpu_handle_irq(dev); > +} > + > +static enum hrtimer_restart wave5_vpu_timer_callback(struct hrtimer *timer) > +{ > + struct vpu_device *dev = > + container_of(timer, struct vpu_device, hrtimer); > + > + kthread_queue_work(dev->worker, &dev->work); > + hrtimer_forward_now(timer, ns_to_ktime(vpu_poll_interval * NSEC_PER_MSEC)); > + > + return HRTIMER_RESTART; > +} > + > static int wave5_vpu_load_firmware(struct device *dev, const char *fw_name, > u32 *revision) > { > @@ -209,16 +237,24 @@ static int wave5_vpu_probe(struct platform_device *pdev) > > dev->irq = platform_get_irq(pdev, 0); > if (dev->irq < 0) { > - dev_err(&pdev->dev, "failed to get irq resource\n"); > - ret = -ENXIO; > - goto err_enc_unreg; > - } > - > - ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, > - wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); > - if (ret) { > - dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", ret); > - goto err_enc_unreg; > + dev_err(&pdev->dev, "failed to get irq resource, falling back to polling\n"); > + hrtimer_init(&dev->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); > + dev->hrtimer.function = &wave5_vpu_timer_callback; > + dev->worker = kthread_create_worker(0, "vpu_irq_thread"); > + if (IS_ERR(dev->worker)) { > + dev_err(&pdev->dev, "failed to create vpu irq worker\n"); > + ret = PTR_ERR(dev->worker); > + goto err_vdi_release; > + } > + dev->vpu_poll_interval = vpu_poll_interval; > + kthread_init_work(&dev->work, wave5_vpu_irq_work_fn); > + } else { > + ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, > + wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); > + if (ret) { > + dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", ret); > + goto err_enc_unreg; > + } > } > > ret = wave5_vpu_load_firmware(&pdev->dev, match_data->fw_name, &fw_revision); > @@ -254,6 +290,11 @@ static int wave5_vpu_remove(struct platform_device *pdev) > { > struct vpu_device *dev = dev_get_drvdata(&pdev->dev); > > + if (dev->irq < 0) { > + kthread_destroy_worker(dev->worker); > + hrtimer_cancel(&dev->hrtimer); > + } > + > mutex_destroy(&dev->dev_lock); > mutex_destroy(&dev->hw_lock); > clk_bulk_disable_unprepare(dev->num_clks, dev->clks); > diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h > index 352f6e904e50..edc50450ddb8 100644 > --- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h > +++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h > @@ -756,6 +756,10 @@ struct vpu_device { > u32 product_code; > struct ida inst_ida; > struct clk_bulk_data *clks; > + struct hrtimer hrtimer; > + struct kthread_work work; > + struct kthread_worker *worker; > + int vpu_poll_interval; > int num_clks; > }; >
Hey Devarsh, @nas.chung@chipsnmedia.com or @jackson.lee@chipsnmedia.com Could you please run tests on the hardware with this patch and review this patch to make sure that this doesn't break the other hardware with Wave5 chips. Thanks! On 25.01.2024 18:38, Devarsh Thakkar wrote: >Add support for starting a polling timer in case interrupt is not s/interrupt/an interrupt/ >available. This helps keep the VPU functional in SoC's such as AM62A, where s/helps keep/helps to keep/ s/SoC's/SoCs/ >the hardware interrupt hookup may not be present due to an SoC errata [1]. I would prefer if we do not hack the driver to adjust to SoC errata, instead of adding polling functionality you could also implement a fake IRQ chip, then the driver can still assume that an interrupt is present and you create a virtual interrupt source that does polling in the background. You could start by adding the fake irq-chip within the wave5 folder and later on that could be moved to the `drivers/irqchip/` directory (by you or someone else), which would be a nice addition to the Linux kernel to provide machines without IRQ support a way to register an interrupt. Please have a look at that potential solution, you potentially need to share the registers between the codec and the fake irq-chip in the DTS. > >The timer is shared across all instances of encoder and decoder and is s/encoder and decoder/encoders and decoders/ >started when first instance of encoder or decoder is opened and stopped s/first/the first/ s/encoder or decoder/an encoder or a decoder/ >when last instance is closed, thus avoiding per instance polling and saving s/last/the last/ >CPU bandwidth. > >hrtimer callback is called with 5ms polling interval while any of the s/hrtimer/The hrtimer/ >encoder/decoder instances are running to check the interrupt status as >being done in irq handler. s/irq/the irq/ Could you ellaborate why you choose the HRtimer for this job and why does it need to be a 5ms polling interval, is that a requirement of this specific hardware or the best constant you figured out during testing? > >Based on above interrupt status, use a worker thread to iterate over the s/on above interrupt status/on the interrupt status above/ >interrupt status for each instance and send completion event as being done s/completion/a completion/ >in irq thread function. s/irq/the irq/ > >Parse for irq number before v4l2 device registration and if not available >only then, initialize hrtimer and worker thread. Thanks for the effort on the description but you do not have to write out word for word what the code is doing, the code should be descriptive enough on its own. More important is that you clearly describe, for example: - why did you choose a certain design? - why did you choose for the HRtimer? - why is the 5ms interval is important (are other intervals also valid?) - why did you choose to add a polling mechanism instead of possible alternatives (like for example an fake irq-chip), - > >Move the core functionality of irq thread function to a separate function s/irq/the irq/ >wave5_vpu_handle_irq so that it can be used by both the worker thread when >using polling mode and irq thread when using interrupt mode. s/irq/the irq/ > >Protect hrtimer access and instance list with device specific mutex locks s/hrtimer/the hrtimer/ >to avoid race conditions while different instances of encoder and decoder s/encoder and decoder/encoders and decoders/ >are started together. > >Add module param to change polling interval for debug purpose. s/module/a module/ s/polling/the polling/ s/debug purpose/debugging purposes/ Here again is the question, what can the user expect with different polling intervals? Is there a valid range of intervals? The actual implementation seems good, but I would like to first make sure that we take the best possible approach to this issue. Greetings, Sebastian > >[1] https://www.ti.com/lit/pdf/spruj16 >(Ref: Section 4.2.3.3 Resets, Interrupts, and Clocks) > >Signed-off-by: Devarsh Thakkar <devarsht@ti.com> >--- >Test logs: >https://gist.github.com/devarsht/613bc8aa66e65814c8374ffb6a4f50fb >--- > .../platform/chips-media/wave5/wave5-helper.c | 16 ++- > .../chips-media/wave5/wave5-vpu-dec.c | 13 +- > .../chips-media/wave5/wave5-vpu-enc.c | 13 +- > .../platform/chips-media/wave5/wave5-vpu.c | 117 ++++++++++++------ > .../platform/chips-media/wave5/wave5-vpuapi.h | 4 + > 5 files changed, 122 insertions(+), 41 deletions(-) > >diff --git a/drivers/media/platform/chips-media/wave5/wave5-helper.c b/drivers/media/platform/chips-media/wave5/wave5-helper.c >index 8433ecab230c..475b7628964f 100644 >--- a/drivers/media/platform/chips-media/wave5/wave5-helper.c >+++ b/drivers/media/platform/chips-media/wave5/wave5-helper.c >@@ -52,6 +52,8 @@ int wave5_vpu_release_device(struct file *filp, > char *name) > { > struct vpu_instance *inst = wave5_to_vpu_inst(filp->private_data); >+ struct vpu_device *dev = inst->dev; >+ int ret = 0; > > v4l2_m2m_ctx_release(inst->v4l2_fh.m2m_ctx); > if (inst->state != VPU_INST_STATE_NONE) { >@@ -71,8 +73,20 @@ int wave5_vpu_release_device(struct file *filp, > } > > wave5_cleanup_instance(inst); >+ if (dev->irq < 0) { >+ ret = mutex_lock_interruptible(&dev->dev_lock); >+ if (ret) >+ return ret; > >- return 0; >+ if (list_empty(&dev->instances)) { >+ dev_dbg(dev->dev, "Disabling the hrtimer\n"); >+ hrtimer_cancel(&dev->hrtimer); >+ } >+ >+ mutex_unlock(&dev->dev_lock); >+ } >+ >+ return ret; > } > > int wave5_vpu_queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *dst_vq, >diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >index ef227af72348..c8624c681fa6 100644 >--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >@@ -1810,7 +1810,6 @@ static int wave5_vpu_open_dec(struct file *filp) > v4l2_fh_add(&inst->v4l2_fh); > > INIT_LIST_HEAD(&inst->list); >- list_add_tail(&inst->list, &dev->instances); > > inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_dec_dev; > inst->v4l2_fh.m2m_ctx = >@@ -1867,6 +1866,18 @@ static int wave5_vpu_open_dec(struct file *filp) > > wave5_vdi_allocate_sram(inst->dev); > >+ ret = mutex_lock_interruptible(&dev->dev_lock); >+ if (ret) >+ goto cleanup_inst; >+ >+ if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && list_empty(&dev->instances)) >+ hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * NSEC_PER_MSEC), >+ HRTIMER_MODE_REL_PINNED); >+ >+ list_add_tail(&inst->list, &dev->instances); >+ >+ mutex_unlock(&dev->dev_lock); >+ > return 0; > > cleanup_inst: >diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >index f29cfa3af94a..9e88424761b6 100644 >--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >@@ -1554,7 +1554,6 @@ static int wave5_vpu_open_enc(struct file *filp) > v4l2_fh_add(&inst->v4l2_fh); > > INIT_LIST_HEAD(&inst->list); >- list_add_tail(&inst->list, &dev->instances); > > inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_enc_dev; > inst->v4l2_fh.m2m_ctx = >@@ -1729,6 +1728,18 @@ static int wave5_vpu_open_enc(struct file *filp) > > wave5_vdi_allocate_sram(inst->dev); > >+ ret = mutex_lock_interruptible(&dev->dev_lock); >+ if (ret) >+ goto cleanup_inst; >+ >+ if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && list_empty(&dev->instances)) >+ hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * NSEC_PER_MSEC), >+ HRTIMER_MODE_REL_PINNED); >+ >+ list_add_tail(&inst->list, &dev->instances); >+ >+ mutex_unlock(&dev->dev_lock); >+ > return 0; > > cleanup_inst: >diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c b/drivers/media/platform/chips-media/wave5/wave5-vpu.c >index bfe4caa79cc9..968ec9baf1ef 100644 >--- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c >+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c >@@ -26,6 +26,9 @@ struct wave5_match_data { > const char *fw_name; > }; > >+static int vpu_poll_interval = 5; >+module_param(vpu_poll_interval, int, 0644); >+ > int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout) > { > int ret; >@@ -40,7 +43,7 @@ int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout) > return 0; > } > >-static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) >+static void wave5_vpu_handle_irq(void *dev_id) > { > u32 seq_done; > u32 cmd_done; >@@ -48,42 +51,67 @@ static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) > struct vpu_instance *inst; > struct vpu_device *dev = dev_id; > >- if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) { >- irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); >- wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); >- wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); >- >- list_for_each_entry(inst, &dev->instances, list) { >- seq_done = wave5_vdi_read_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO); >- cmd_done = wave5_vdi_read_register(dev, W5_RET_QUEUE_CMD_DONE_INST); >- >- if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || >- irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { >- if (seq_done & BIT(inst->id)) { >- seq_done &= ~BIT(inst->id); >- wave5_vdi_write_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO, >- seq_done); >- complete(&inst->irq_done); >- } >+ irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); >+ wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); >+ wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); >+ >+ list_for_each_entry(inst, &dev->instances, list) { >+ seq_done = wave5_vdi_read_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO); >+ cmd_done = wave5_vdi_read_register(dev, W5_RET_QUEUE_CMD_DONE_INST); >+ >+ if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || >+ irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { >+ if (seq_done & BIT(inst->id)) { >+ seq_done &= ~BIT(inst->id); >+ wave5_vdi_write_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO, >+ seq_done); >+ complete(&inst->irq_done); > } >+ } > >- if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || >- irq_reason & BIT(INT_WAVE5_ENC_PIC)) { >- if (cmd_done & BIT(inst->id)) { >- cmd_done &= ~BIT(inst->id); >- wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, >- cmd_done); >- inst->ops->finish_process(inst); >- } >+ if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || >+ irq_reason & BIT(INT_WAVE5_ENC_PIC)) { >+ if (cmd_done & BIT(inst->id)) { >+ cmd_done &= ~BIT(inst->id); >+ wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, >+ cmd_done); >+ inst->ops->finish_process(inst); > } >- >- wave5_vpu_clear_interrupt(inst, irq_reason); > } >+ >+ wave5_vpu_clear_interrupt(inst, irq_reason); > } >+} >+ >+static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) >+{ >+ struct vpu_device *dev = dev_id; >+ >+ if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) >+ wave5_vpu_handle_irq(dev); > > return IRQ_HANDLED; > } > >+static void wave5_vpu_irq_work_fn(struct kthread_work *work) >+{ >+ struct vpu_device *dev = container_of(work, struct vpu_device, work); >+ >+ if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) >+ wave5_vpu_handle_irq(dev); >+} >+ >+static enum hrtimer_restart wave5_vpu_timer_callback(struct hrtimer *timer) >+{ >+ struct vpu_device *dev = >+ container_of(timer, struct vpu_device, hrtimer); >+ >+ kthread_queue_work(dev->worker, &dev->work); >+ hrtimer_forward_now(timer, ns_to_ktime(vpu_poll_interval * NSEC_PER_MSEC)); >+ >+ return HRTIMER_RESTART; >+} >+ > static int wave5_vpu_load_firmware(struct device *dev, const char *fw_name, > u32 *revision) > { >@@ -209,16 +237,24 @@ static int wave5_vpu_probe(struct platform_device *pdev) > > dev->irq = platform_get_irq(pdev, 0); > if (dev->irq < 0) { >- dev_err(&pdev->dev, "failed to get irq resource\n"); >- ret = -ENXIO; >- goto err_enc_unreg; >- } >- >- ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, >- wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); >- if (ret) { >- dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", ret); >- goto err_enc_unreg; >+ dev_err(&pdev->dev, "failed to get irq resource, falling back to polling\n"); >+ hrtimer_init(&dev->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); >+ dev->hrtimer.function = &wave5_vpu_timer_callback; >+ dev->worker = kthread_create_worker(0, "vpu_irq_thread"); >+ if (IS_ERR(dev->worker)) { >+ dev_err(&pdev->dev, "failed to create vpu irq worker\n"); >+ ret = PTR_ERR(dev->worker); >+ goto err_vdi_release; >+ } >+ dev->vpu_poll_interval = vpu_poll_interval; >+ kthread_init_work(&dev->work, wave5_vpu_irq_work_fn); >+ } else { >+ ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, >+ wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); >+ if (ret) { >+ dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", ret); >+ goto err_enc_unreg; >+ } > } > > ret = wave5_vpu_load_firmware(&pdev->dev, match_data->fw_name, &fw_revision); >@@ -254,6 +290,11 @@ static int wave5_vpu_remove(struct platform_device *pdev) > { > struct vpu_device *dev = dev_get_drvdata(&pdev->dev); > >+ if (dev->irq < 0) { >+ kthread_destroy_worker(dev->worker); >+ hrtimer_cancel(&dev->hrtimer); >+ } >+ > mutex_destroy(&dev->dev_lock); > mutex_destroy(&dev->hw_lock); > clk_bulk_disable_unprepare(dev->num_clks, dev->clks); >diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >index 352f6e904e50..edc50450ddb8 100644 >--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >@@ -756,6 +756,10 @@ struct vpu_device { > u32 product_code; > struct ida inst_ida; > struct clk_bulk_data *clks; >+ struct hrtimer hrtimer; >+ struct kthread_work work; >+ struct kthread_worker *worker; >+ int vpu_poll_interval; > int num_clks; > }; > >-- >2.34.1 > >
Hello Sebastian > @nas.chung@chipsnmedia.com or @jackson.lee@chipsnmedia.com Could you > please run tests on the hardware with this patch and review this patch to > make sure that this doesn't break the other hardware with > Wave5 chips. Thanks! > > On 25.01.2024 18:38, Devarsh Thakkar wrote: > >Add support for starting a polling timer in case interrupt is not > > s/interrupt/an interrupt/ > > >available. This helps keep the VPU functional in SoC's such as AM62A, > >where > > s/helps keep/helps to keep/ > s/SoC's/SoCs/ > > >the hardware interrupt hookup may not be present due to an SoC errata [1]. > > I would prefer if we do not hack the driver to adjust to SoC errata, > instead of adding polling functionality you could also implement a fake > IRQ chip, then the driver can still assume that an interrupt is present > and you create a virtual interrupt source that does polling in the > background. > You could start by adding the fake irq-chip within the wave5 folder and > later on that could be moved to the `drivers/irqchip/` directory (by you > or someone else), which would be a nice addition to the Linux kernel to > provide machines without IRQ support a way to register an interrupt. > Please have a look at that potential solution, you potentially need to > share the registers between the codec and the fake irq-chip in the DTS. > > > > >The timer is shared across all instances of encoder and decoder and is > > s/encoder and decoder/encoders and decoders/ > > >started when first instance of encoder or decoder is opened and stopped > > s/first/the first/ > s/encoder or decoder/an encoder or a decoder/ > > >when last instance is closed, thus avoiding per instance polling and > >saving > > s/last/the last/ > > >CPU bandwidth. > > > >hrtimer callback is called with 5ms polling interval while any of the > > s/hrtimer/The hrtimer/ > > >encoder/decoder instances are running to check the interrupt status as > >being done in irq handler. > > s/irq/the irq/ > > Could you ellaborate why you choose the HRtimer for this job and why does > it need to be a 5ms polling interval, is that a requirement of this > specific hardware or the best constant you figured out during testing? > > > > >Based on above interrupt status, use a worker thread to iterate over > >the > > s/on above interrupt status/on the interrupt status above/ > > >interrupt status for each instance and send completion event as being > >done > > s/completion/a completion/ > > >in irq thread function. > > s/irq/the irq/ > > > > >Parse for irq number before v4l2 device registration and if not > >available only then, initialize hrtimer and worker thread. > > Thanks for the effort on the description but you do not have to write out > word for word what the code is doing, the code should be descriptive > enough on its own. > More important is that you clearly describe, for example: > - why did you choose a certain design? > - why did you choose for the HRtimer? > - why is the 5ms interval is important (are other intervals also valid?) > - why did you choose to add a polling mechanism instead of possible > alternatives (like for example an fake irq-chip), > - > > > >Move the core functionality of irq thread function to a separate > >function > > s/irq/the irq/ > > >wave5_vpu_handle_irq so that it can be used by both the worker thread > >when using polling mode and irq thread when using interrupt mode. > > s/irq/the irq/ > > > > >Protect hrtimer access and instance list with device specific mutex > >locks > > s/hrtimer/the hrtimer/ > > >to avoid race conditions while different instances of encoder and > >decoder > > s/encoder and decoder/encoders and decoders/ > > >are started together. > > > >Add module param to change polling interval for debug purpose. > > s/module/a module/ > s/polling/the polling/ > s/debug purpose/debugging purposes/ > > Here again is the question, what can the user expect with different > polling intervals? Is there a valid range of intervals? > > The actual implementation seems good, but I would like to first make sure > that we take the best possible approach to this issue. > > Greetings, > Sebastian > > > > >[1] https://www.ti.com/lit/pdf/spruj16 > >(Ref: Section 4.2.3.3 Resets, Interrupts, and Clocks) > > > >Signed-off-by: Devarsh Thakkar <devarsht@ti.com> For this patch, Tested-by: Jackson Lee <jackson.lee@chipsnmedia.com> > >--- > >Test logs: > >https://gist.github.com/devarsht/613bc8aa66e65814c8374ffb6a4f50fb > >--- > > .../platform/chips-media/wave5/wave5-helper.c | 16 ++- > > .../chips-media/wave5/wave5-vpu-dec.c | 13 +- > > .../chips-media/wave5/wave5-vpu-enc.c | 13 +- > > .../platform/chips-media/wave5/wave5-vpu.c | 117 ++++++++++++------ > > .../platform/chips-media/wave5/wave5-vpuapi.h | 4 + > > 5 files changed, 122 insertions(+), 41 deletions(-) > > > >diff --git a/drivers/media/platform/chips-media/wave5/wave5-helper.c > >b/drivers/media/platform/chips-media/wave5/wave5-helper.c > >index 8433ecab230c..475b7628964f 100644 > >--- a/drivers/media/platform/chips-media/wave5/wave5-helper.c > >+++ b/drivers/media/platform/chips-media/wave5/wave5-helper.c > >@@ -52,6 +52,8 @@ int wave5_vpu_release_device(struct file *filp, > > char *name) > > { > > struct vpu_instance *inst = wave5_to_vpu_inst(filp->private_data); > >+ struct vpu_device *dev = inst->dev; > >+ int ret = 0; > > > > v4l2_m2m_ctx_release(inst->v4l2_fh.m2m_ctx); > > if (inst->state != VPU_INST_STATE_NONE) { @@ -71,8 +73,20 @@ int > >wave5_vpu_release_device(struct file *filp, > > } > > > > wave5_cleanup_instance(inst); > >+ if (dev->irq < 0) { > >+ ret = mutex_lock_interruptible(&dev->dev_lock); > >+ if (ret) > >+ return ret; > > > >- return 0; > >+ if (list_empty(&dev->instances)) { > >+ dev_dbg(dev->dev, "Disabling the hrtimer\n"); > >+ hrtimer_cancel(&dev->hrtimer); > >+ } > >+ > >+ mutex_unlock(&dev->dev_lock); > >+ } > >+ > >+ return ret; > > } > > > > int wave5_vpu_queue_init(void *priv, struct vb2_queue *src_vq, struct > >vb2_queue *dst_vq, diff --git > >a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c > >b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c > >index ef227af72348..c8624c681fa6 100644 > >--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c > >+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c > >@@ -1810,7 +1810,6 @@ static int wave5_vpu_open_dec(struct file *filp) > > v4l2_fh_add(&inst->v4l2_fh); > > > > INIT_LIST_HEAD(&inst->list); > >- list_add_tail(&inst->list, &dev->instances); > > > > inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_dec_dev; > > inst->v4l2_fh.m2m_ctx = > >@@ -1867,6 +1866,18 @@ static int wave5_vpu_open_dec(struct file *filp) > > > > wave5_vdi_allocate_sram(inst->dev); > > > >+ ret = mutex_lock_interruptible(&dev->dev_lock); > >+ if (ret) > >+ goto cleanup_inst; > >+ > >+ if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && > list_empty(&dev->instances)) > >+ hrtimer_start(&dev->hrtimer, ns_to_ktime(dev- > >vpu_poll_interval * NSEC_PER_MSEC), > >+ HRTIMER_MODE_REL_PINNED); > >+ > >+ list_add_tail(&inst->list, &dev->instances); > >+ > >+ mutex_unlock(&dev->dev_lock); > >+ > > return 0; > > > > cleanup_inst: > >diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c > >b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c > >index f29cfa3af94a..9e88424761b6 100644 > >--- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c > >+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c > >@@ -1554,7 +1554,6 @@ static int wave5_vpu_open_enc(struct file *filp) > > v4l2_fh_add(&inst->v4l2_fh); > > > > INIT_LIST_HEAD(&inst->list); > >- list_add_tail(&inst->list, &dev->instances); > > > > inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_enc_dev; > > inst->v4l2_fh.m2m_ctx = > >@@ -1729,6 +1728,18 @@ static int wave5_vpu_open_enc(struct file *filp) > > > > wave5_vdi_allocate_sram(inst->dev); > > > >+ ret = mutex_lock_interruptible(&dev->dev_lock); > >+ if (ret) > >+ goto cleanup_inst; > >+ > >+ if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && > list_empty(&dev->instances)) > >+ hrtimer_start(&dev->hrtimer, ns_to_ktime(dev- > >vpu_poll_interval * NSEC_PER_MSEC), > >+ HRTIMER_MODE_REL_PINNED); > >+ > >+ list_add_tail(&inst->list, &dev->instances); > >+ > >+ mutex_unlock(&dev->dev_lock); > >+ > > return 0; > > > > cleanup_inst: > >diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c > >b/drivers/media/platform/chips-media/wave5/wave5-vpu.c > >index bfe4caa79cc9..968ec9baf1ef 100644 > >--- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c > >+++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c > >@@ -26,6 +26,9 @@ struct wave5_match_data { > > const char *fw_name; > > }; > > > >+static int vpu_poll_interval = 5; > >+module_param(vpu_poll_interval, int, 0644); > >+ > > int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int > >timeout) { > > int ret; > >@@ -40,7 +43,7 @@ int wave5_vpu_wait_interrupt(struct vpu_instance *inst, > unsigned int timeout) > > return 0; > > } > > > >-static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) > >+static void wave5_vpu_handle_irq(void *dev_id) > > { > > u32 seq_done; > > u32 cmd_done; > >@@ -48,42 +51,67 @@ static irqreturn_t wave5_vpu_irq_thread(int irq, void > *dev_id) > > struct vpu_instance *inst; > > struct vpu_device *dev = dev_id; > > > >- if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) { > >- irq_reason = wave5_vdi_read_register(dev, > W5_VPU_VINT_REASON); > >- wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, > irq_reason); > >- wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); > >- > >- list_for_each_entry(inst, &dev->instances, list) { > >- seq_done = wave5_vdi_read_register(dev, > W5_RET_SEQ_DONE_INSTANCE_INFO); > >- cmd_done = wave5_vdi_read_register(dev, > W5_RET_QUEUE_CMD_DONE_INST); > >- > >- if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || > >- irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { > >- if (seq_done & BIT(inst->id)) { > >- seq_done &= ~BIT(inst->id); > >- wave5_vdi_write_register(dev, > W5_RET_SEQ_DONE_INSTANCE_INFO, > >- seq_done); > >- complete(&inst->irq_done); > >- } > >+ irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); > >+ wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); > >+ wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); > >+ > >+ list_for_each_entry(inst, &dev->instances, list) { > >+ seq_done = wave5_vdi_read_register(dev, > W5_RET_SEQ_DONE_INSTANCE_INFO); > >+ cmd_done = wave5_vdi_read_register(dev, > W5_RET_QUEUE_CMD_DONE_INST); > >+ > >+ if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || > >+ irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { > >+ if (seq_done & BIT(inst->id)) { > >+ seq_done &= ~BIT(inst->id); > >+ wave5_vdi_write_register(dev, > W5_RET_SEQ_DONE_INSTANCE_INFO, > >+ seq_done); > >+ complete(&inst->irq_done); > > } > >+ } > > > >- if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || > >- irq_reason & BIT(INT_WAVE5_ENC_PIC)) { > >- if (cmd_done & BIT(inst->id)) { > >- cmd_done &= ~BIT(inst->id); > >- wave5_vdi_write_register(dev, > W5_RET_QUEUE_CMD_DONE_INST, > >- cmd_done); > >- inst->ops->finish_process(inst); > >- } > >+ if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || > >+ irq_reason & BIT(INT_WAVE5_ENC_PIC)) { > >+ if (cmd_done & BIT(inst->id)) { > >+ cmd_done &= ~BIT(inst->id); > >+ wave5_vdi_write_register(dev, > W5_RET_QUEUE_CMD_DONE_INST, > >+ cmd_done); > >+ inst->ops->finish_process(inst); > > } > >- > >- wave5_vpu_clear_interrupt(inst, irq_reason); > > } > >+ > >+ wave5_vpu_clear_interrupt(inst, irq_reason); > > } > >+} > >+ > >+static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) { > >+ struct vpu_device *dev = dev_id; > >+ > >+ if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) > >+ wave5_vpu_handle_irq(dev); > > > > return IRQ_HANDLED; > > } > > > >+static void wave5_vpu_irq_work_fn(struct kthread_work *work) { > >+ struct vpu_device *dev = container_of(work, struct vpu_device, > work); > >+ > >+ if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) > >+ wave5_vpu_handle_irq(dev); > >+} > >+ > >+static enum hrtimer_restart wave5_vpu_timer_callback(struct hrtimer > >+*timer) { > >+ struct vpu_device *dev = > >+ container_of(timer, struct vpu_device, hrtimer); > >+ > >+ kthread_queue_work(dev->worker, &dev->work); > >+ hrtimer_forward_now(timer, ns_to_ktime(vpu_poll_interval * > >+NSEC_PER_MSEC)); > >+ > >+ return HRTIMER_RESTART; > >+} > >+ > > static int wave5_vpu_load_firmware(struct device *dev, const char > *fw_name, > > u32 *revision) > > { > >@@ -209,16 +237,24 @@ static int wave5_vpu_probe(struct platform_device > >*pdev) > > > > dev->irq = platform_get_irq(pdev, 0); > > if (dev->irq < 0) { > >- dev_err(&pdev->dev, "failed to get irq resource\n"); > >- ret = -ENXIO; > >- goto err_enc_unreg; > >- } > >- > >- ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, > >- wave5_vpu_irq_thread, IRQF_ONESHOT, > "vpu_irq", dev); > >- if (ret) { > >- dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", > ret); > >- goto err_enc_unreg; > >+ dev_err(&pdev->dev, "failed to get irq resource, falling > back to polling\n"); > >+ hrtimer_init(&dev->hrtimer, CLOCK_MONOTONIC, > HRTIMER_MODE_REL_PINNED); > >+ dev->hrtimer.function = &wave5_vpu_timer_callback; > >+ dev->worker = kthread_create_worker(0, "vpu_irq_thread"); > >+ if (IS_ERR(dev->worker)) { > >+ dev_err(&pdev->dev, "failed to create vpu irq > worker\n"); > >+ ret = PTR_ERR(dev->worker); > >+ goto err_vdi_release; > >+ } > >+ dev->vpu_poll_interval = vpu_poll_interval; > >+ kthread_init_work(&dev->work, wave5_vpu_irq_work_fn); > >+ } else { > >+ ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, > >+ wave5_vpu_irq_thread, IRQF_ONESHOT, > "vpu_irq", dev); > >+ if (ret) { > >+ dev_err(&pdev->dev, "Register interrupt handler, > fail: %d\n", ret); > >+ goto err_enc_unreg; > >+ } > > } > > > > ret = wave5_vpu_load_firmware(&pdev->dev, match_data->fw_name, > >&fw_revision); @@ -254,6 +290,11 @@ static int wave5_vpu_remove(struct > >platform_device *pdev) { > > struct vpu_device *dev = dev_get_drvdata(&pdev->dev); > > > >+ if (dev->irq < 0) { > >+ kthread_destroy_worker(dev->worker); > >+ hrtimer_cancel(&dev->hrtimer); > >+ } > >+ > > mutex_destroy(&dev->dev_lock); > > mutex_destroy(&dev->hw_lock); > > clk_bulk_disable_unprepare(dev->num_clks, dev->clks); diff --git > >a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h > >b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h > >index 352f6e904e50..edc50450ddb8 100644 > >--- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h > >+++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h > >@@ -756,6 +756,10 @@ struct vpu_device { > > u32 product_code; > > struct ida inst_ida; > > struct clk_bulk_data *clks; > >+ struct hrtimer hrtimer; > >+ struct kthread_work work; > >+ struct kthread_worker *worker; > >+ int vpu_poll_interval; > > int num_clks; > > }; > > > >-- > >2.34.1 > > > >
Hi Sebastian, Thanks for the review. On 16/02/24 15:15, Sebastian Fricke wrote: > Hey Devarsh, > > @nas.chung@chipsnmedia.com or @jackson.lee@chipsnmedia.com > Could you please run tests on the hardware with this patch and review > this patch to make sure that this doesn't break the other hardware with > Wave5 chips. Thanks! > Yes, I think Jackson verified it and provided a Tested-By, at my end too I had tested different scenarios on different platforms, I can attach some test logs too in cover-letter in V2. Also we have been using a version of this patch in our local SDK for over a year, running different kind of tests (multistream, file based e.t.c) so it pretty much looks stable to me. > On 25.01.2024 18:38, Devarsh Thakkar wrote: >> Add support for starting a polling timer in case interrupt is not > > s/interrupt/an interrupt/ > >> available. This helps keep the VPU functional in SoC's such as AM62A, where > > s/helps keep/helps to keep/ > s/SoC's/SoCs/ > >> the hardware interrupt hookup may not be present due to an SoC errata [1]. > > I would prefer if we do not hack the driver to adjust to SoC errata, > instead of adding polling functionality you could also implement a fake > IRQ chip, then the driver can still assume that an interrupt is present > and you create a virtual interrupt source that does polling in the > background. > You could start by adding the fake irq-chip within the wave5 folder and > later on that could be moved to the `drivers/irqchip/` directory (by you > or someone else), which would be a nice addition to the Linux kernel to > provide machines without IRQ support a way to register an interrupt. > Please have a look at that potential solution, you potentially need to > share the registers between the codec and the fake irq-chip in the DTS. > [0] Thanks for the inputs, I tried to have a look at fakeirq approach after you suggested. Just a quick recap, as we want to avoid per instance hrtimer threads to save on CPU, we are running a single hrtimer polling thread across all VPU instances by starting it once first instance is opened and disabling it when last instance is closed Now as I understand with fakeirq irqchip driver we can emulate and trigger an interrupt every Xms (by running an hrtimer underneath) but we would probably still need some hooks in CnM driver similar to and in place of what we have in current hrtimer based implementation as fakeirq driver is unaware of VPU context, For e.g. we need to signal the fakeirq to start the hrtimer when first instance is opened (probably by calling irq_enable) and disable the hrtimer when last instance is closed. Also I am not sure it would be good (and legal) to keep polling interval module param in irqchip driver. Secondly, the current hrtimer based implementation looks compact to me in the sense that code-paths are non-intrusive w.r.t irq flow and we have the ISR specific code refactored into a common function to be used by both flows to avoid duplication. Also with CnM VPU already capable of running without irq support as reflected in dt-bindings I don't see it totally illegal to keep hrtimer polling support in driver as such. Furthermore, I do see some driver(s) switching to hrtimer when irq is not present [1] as shown in below link : https://elixir.bootlin.com/linux/v6.7.5/source/drivers/usb/typec/tipd/core.c#L1299 >> >> The timer is shared across all instances of encoder and decoder and is > > s/encoder and decoder/encoders and decoders/ > >> started when first instance of encoder or decoder is opened and stopped > > s/first/the first/ > s/encoder or decoder/an encoder or a decoder/ > >> when last instance is closed, thus avoiding per instance polling and saving > > s/last/the last/ > >> CPU bandwidth. >> >> hrtimer callback is called with 5ms polling interval while any of the > > s/hrtimer/The hrtimer/ > >> encoder/decoder instances are running to check the interrupt status as >> being done in irq handler. > > s/irq/the irq/ > > Could you ellaborate why you choose the HRtimer for this job and why > does it need to be a 5ms polling interval, is that a requirement of this > specific hardware or the best constant you figured out during testing? > >> >> Based on above interrupt status, use a worker thread to iterate over the > > s/on above interrupt status/on the interrupt status above/ > >> interrupt status for each instance and send completion event as being done > > s/completion/a completion/ > >> in irq thread function. > > s/irq/the irq/ > >> >> Parse for irq number before v4l2 device registration and if not available >> only then, initialize hrtimer and worker thread. > > Thanks for the effort on the description but you do not have to write > out word for word what the code is doing, the code should be descriptive > enough on its own. Yeah, will update it in v2. > More important is that you clearly describe, for example: > - why did you choose a certain design?> - why did you choose for the HRtimer?> - why is the 5ms interval is important (are other intervals also valid?) hrtimer was mainly chosen as it provides precise timing and scheduling and the API seems better suited for periodic polling task such as this. Yes other intervals are also valid. The trade-off is mainly between CPU usage and the latency. As a general rule of thumb, Worst case latency with hrtimer = Actual latency (which was possible with irq) + Polling interval NOTE: Latency: Time taken to process one frame by VPU) Actual Latency : Time taken by hardware to process frame and the time when it would have triggerd irq if it was present With aggressive polling intervals (i.e. going with even lesser values) the CPU usage increases although worst case latencies get better. The 5ms offered a good balance between the two as we were able to reach close to actual latencies without incurring too much of CPU as shown in below analysis: 1) 1x 640x480@25 Encoding: [2] https://gist.github.com/devarsht/ee9664d3403d1212ef477a027b71896c In above, it can be seen 5ms polling gives better latency 2) 4x 1080p30 Transcode (File→decode→encode→file).csv irq vs polling comparison: [3] https://gist.github.com/devarsht/3a58b4f201430dfc61697c7e224e74c2 3) 1x 1080p transcode (File→decode→encode→file) polling_vs_irq_transcode comparison : [4] https://gist.github.com/devarsht/a6480f1f2cbdf8dd694d698309d81fb0 In both above, it can be seen there is comparable performance achieved with 5ms polling without incurring too much of CPU. 4) We also ran 1080p60 streaming use-case for irq vs polling comparison for CPU usage and Latencies : [5] https://gist.github.com/devarsht/44aaa4322454e85e01a8d65ac47c5edb And here too we found 5ms as apt value to reach closer to actual (irq) latencies without too much of CPU usage. > - why did you choose to add a polling mechanism instead of possible > alternatives (like for example an fake irq-chip), > - I think I covered it above in first paragraph [0]. >>>> Move the core functionality of irq thread function to a separate function > > s/irq/the irq/ > >> wave5_vpu_handle_irq so that it can be used by both the worker thread when >> using polling mode and irq thread when using interrupt mode. > > s/irq/the irq/ > >> >> Protect hrtimer access and instance list with device specific mutex locks > > s/hrtimer/the hrtimer/ > >> to avoid race conditions while different instances of encoder and decoder > > s/encoder and decoder/encoders and decoders/ > >> are started together. >> >> Add module param to change polling interval for debug purpose. > > s/module/a module/ > s/polling/the polling/ > s/debug purpose/debugging purposes/ > > Here again is the question, what can the user expect with different > polling intervals? Is there a valid range of intervals? > As mentioned above, decreasing the polling interval will make us reach better latencies but CPU usage will increase too. It depends on what user wants from this trade-off perspective but 5ms as a default for this hardware seemed appropriate per experiments shared above to reach closer to actual latencies (for e.g. CnM VPU gives ~3ms latency for 640x480). > The actual implementation seems good, but I would like to first make > sure that we take the best possible approach to this issue. > I personally feel this hrtimer based approach as an apt one for the reasons (and analysis) explained above. Kindly let me know your opinions and if any queries or I missed to answer something here. [1] : https://elixir.bootlin.com/linux/v6.7.5/source/drivers/usb/typec/tipd/core.c#L1299 [2] https://gist.github.com/devarsht/ee9664d3403d1212ef477a027b71896c [3] https://gist.github.com/devarsht/3a58b4f201430dfc61697c7e224e74c2 [4] https://gist.github.com/devarsht/a6480f1f2cbdf8dd694d698309d81fb0 [5] https://gist.github.com/devarsht/44aaa4322454e85e01a8d65ac47c5edb Regards Devarsh > Greetings, > Sebastian > >> >> [1] https://www.ti.com/lit/pdf/spruj16 >> (Ref: Section 4.2.3.3 Resets, Interrupts, and Clocks) >> >> Signed-off-by: Devarsh Thakkar <devarsht@ti.com> >> --- >> Test logs: >> https://gist.github.com/devarsht/613bc8aa66e65814c8374ffb6a4f50fb >> --- >> .../platform/chips-media/wave5/wave5-helper.c | 16 ++- >> .../chips-media/wave5/wave5-vpu-dec.c | 13 +- >> .../chips-media/wave5/wave5-vpu-enc.c | 13 +- >> .../platform/chips-media/wave5/wave5-vpu.c | 117 ++++++++++++------ >> .../platform/chips-media/wave5/wave5-vpuapi.h | 4 + >> 5 files changed, 122 insertions(+), 41 deletions(-) >> >> diff --git a/drivers/media/platform/chips-media/wave5/wave5-helper.c >> b/drivers/media/platform/chips-media/wave5/wave5-helper.c >> index 8433ecab230c..475b7628964f 100644 >> --- a/drivers/media/platform/chips-media/wave5/wave5-helper.c >> +++ b/drivers/media/platform/chips-media/wave5/wave5-helper.c >> @@ -52,6 +52,8 @@ int wave5_vpu_release_device(struct file *filp, >> char *name) >> { >> struct vpu_instance *inst = wave5_to_vpu_inst(filp->private_data); >> + struct vpu_device *dev = inst->dev; >> + int ret = 0; >> >> v4l2_m2m_ctx_release(inst->v4l2_fh.m2m_ctx); >> if (inst->state != VPU_INST_STATE_NONE) { >> @@ -71,8 +73,20 @@ int wave5_vpu_release_device(struct file *filp, >> } >> >> wave5_cleanup_instance(inst); >> + if (dev->irq < 0) { >> + ret = mutex_lock_interruptible(&dev->dev_lock); >> + if (ret) >> + return ret; >> >> - return 0; >> + if (list_empty(&dev->instances)) { >> + dev_dbg(dev->dev, "Disabling the hrtimer\n"); >> + hrtimer_cancel(&dev->hrtimer); >> + } >> + >> + mutex_unlock(&dev->dev_lock); >> + } >> + >> + return ret; >> } >> >> int wave5_vpu_queue_init(void *priv, struct vb2_queue *src_vq, struct >> vb2_queue *dst_vq, >> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >> b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >> index ef227af72348..c8624c681fa6 100644 >> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >> @@ -1810,7 +1810,6 @@ static int wave5_vpu_open_dec(struct file *filp) >> v4l2_fh_add(&inst->v4l2_fh); >> >> INIT_LIST_HEAD(&inst->list); >> - list_add_tail(&inst->list, &dev->instances); >> >> inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_dec_dev; >> inst->v4l2_fh.m2m_ctx = >> @@ -1867,6 +1866,18 @@ static int wave5_vpu_open_dec(struct file *filp) >> >> wave5_vdi_allocate_sram(inst->dev); >> >> + ret = mutex_lock_interruptible(&dev->dev_lock); >> + if (ret) >> + goto cleanup_inst; >> + >> + if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && >> list_empty(&dev->instances)) >> + hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * >> NSEC_PER_MSEC), >> + HRTIMER_MODE_REL_PINNED); >> + >> + list_add_tail(&inst->list, &dev->instances); >> + >> + mutex_unlock(&dev->dev_lock); >> + >> return 0; >> >> cleanup_inst: >> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >> b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >> index f29cfa3af94a..9e88424761b6 100644 >> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >> @@ -1554,7 +1554,6 @@ static int wave5_vpu_open_enc(struct file *filp) >> v4l2_fh_add(&inst->v4l2_fh); >> >> INIT_LIST_HEAD(&inst->list); >> - list_add_tail(&inst->list, &dev->instances); >> >> inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_enc_dev; >> inst->v4l2_fh.m2m_ctx = >> @@ -1729,6 +1728,18 @@ static int wave5_vpu_open_enc(struct file *filp) >> >> wave5_vdi_allocate_sram(inst->dev); >> >> + ret = mutex_lock_interruptible(&dev->dev_lock); >> + if (ret) >> + goto cleanup_inst; >> + >> + if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && >> list_empty(&dev->instances)) >> + hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * >> NSEC_PER_MSEC), >> + HRTIMER_MODE_REL_PINNED); >> + >> + list_add_tail(&inst->list, &dev->instances); >> + >> + mutex_unlock(&dev->dev_lock); >> + >> return 0; >> >> cleanup_inst: >> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c >> b/drivers/media/platform/chips-media/wave5/wave5-vpu.c >> index bfe4caa79cc9..968ec9baf1ef 100644 >> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c >> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c >> @@ -26,6 +26,9 @@ struct wave5_match_data { >> const char *fw_name; >> }; >> >> +static int vpu_poll_interval = 5; >> +module_param(vpu_poll_interval, int, 0644); >> + >> int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout) >> { >> int ret; >> @@ -40,7 +43,7 @@ int wave5_vpu_wait_interrupt(struct vpu_instance *inst, >> unsigned int timeout) >> return 0; >> } >> >> -static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) >> +static void wave5_vpu_handle_irq(void *dev_id) >> { >> u32 seq_done; >> u32 cmd_done; >> @@ -48,42 +51,67 @@ static irqreturn_t wave5_vpu_irq_thread(int irq, void >> *dev_id) >> struct vpu_instance *inst; >> struct vpu_device *dev = dev_id; >> >> - if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) { >> - irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); >> - wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); >> - wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); >> - >> - list_for_each_entry(inst, &dev->instances, list) { >> - seq_done = wave5_vdi_read_register(dev, >> W5_RET_SEQ_DONE_INSTANCE_INFO); >> - cmd_done = wave5_vdi_read_register(dev, >> W5_RET_QUEUE_CMD_DONE_INST); >> - >> - if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || >> - irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { >> - if (seq_done & BIT(inst->id)) { >> - seq_done &= ~BIT(inst->id); >> - wave5_vdi_write_register(dev, >> W5_RET_SEQ_DONE_INSTANCE_INFO, >> - seq_done); >> - complete(&inst->irq_done); >> - } >> + irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); >> + wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); >> + wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); >> + >> + list_for_each_entry(inst, &dev->instances, list) { >> + seq_done = wave5_vdi_read_register(dev, >> W5_RET_SEQ_DONE_INSTANCE_INFO); >> + cmd_done = wave5_vdi_read_register(dev, W5_RET_QUEUE_CMD_DONE_INST); >> + >> + if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || >> + irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { >> + if (seq_done & BIT(inst->id)) { >> + seq_done &= ~BIT(inst->id); >> + wave5_vdi_write_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO, >> + seq_done); >> + complete(&inst->irq_done); >> } >> + } >> >> - if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || >> - irq_reason & BIT(INT_WAVE5_ENC_PIC)) { >> - if (cmd_done & BIT(inst->id)) { >> - cmd_done &= ~BIT(inst->id); >> - wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, >> - cmd_done); >> - inst->ops->finish_process(inst); >> - } >> + if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || >> + irq_reason & BIT(INT_WAVE5_ENC_PIC)) { >> + if (cmd_done & BIT(inst->id)) { >> + cmd_done &= ~BIT(inst->id); >> + wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, >> + cmd_done); >> + inst->ops->finish_process(inst); >> } >> - >> - wave5_vpu_clear_interrupt(inst, irq_reason); >> } >> + >> + wave5_vpu_clear_interrupt(inst, irq_reason); >> } >> +} >> + >> +static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) >> +{ >> + struct vpu_device *dev = dev_id; >> + >> + if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) >> + wave5_vpu_handle_irq(dev); >> >> return IRQ_HANDLED; >> } >> >> +static void wave5_vpu_irq_work_fn(struct kthread_work *work) >> +{ >> + struct vpu_device *dev = container_of(work, struct vpu_device, work); >> + >> + if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) >> + wave5_vpu_handle_irq(dev); >> +} >> + >> +static enum hrtimer_restart wave5_vpu_timer_callback(struct hrtimer *timer) >> +{ >> + struct vpu_device *dev = >> + container_of(timer, struct vpu_device, hrtimer); >> + >> + kthread_queue_work(dev->worker, &dev->work); >> + hrtimer_forward_now(timer, ns_to_ktime(vpu_poll_interval * >> NSEC_PER_MSEC)); >> + >> + return HRTIMER_RESTART; >> +} >> + >> static int wave5_vpu_load_firmware(struct device *dev, const char *fw_name, >> u32 *revision) >> { >> @@ -209,16 +237,24 @@ static int wave5_vpu_probe(struct platform_device *pdev) >> >> dev->irq = platform_get_irq(pdev, 0); >> if (dev->irq < 0) { >> - dev_err(&pdev->dev, "failed to get irq resource\n"); >> - ret = -ENXIO; >> - goto err_enc_unreg; >> - } >> - >> - ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, >> - wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); >> - if (ret) { >> - dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", ret); >> - goto err_enc_unreg; >> + dev_err(&pdev->dev, "failed to get irq resource, falling back to >> polling\n"); >> + hrtimer_init(&dev->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); >> + dev->hrtimer.function = &wave5_vpu_timer_callback; >> + dev->worker = kthread_create_worker(0, "vpu_irq_thread"); >> + if (IS_ERR(dev->worker)) { >> + dev_err(&pdev->dev, "failed to create vpu irq worker\n"); >> + ret = PTR_ERR(dev->worker); >> + goto err_vdi_release; >> + } >> + dev->vpu_poll_interval = vpu_poll_interval; >> + kthread_init_work(&dev->work, wave5_vpu_irq_work_fn); >> + } else { >> + ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, >> + wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); >> + if (ret) { >> + dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", >> ret); >> + goto err_enc_unreg; >> + } >> } >> >> ret = wave5_vpu_load_firmware(&pdev->dev, match_data->fw_name, >> &fw_revision); >> @@ -254,6 +290,11 @@ static int wave5_vpu_remove(struct platform_device *pdev) >> { >> struct vpu_device *dev = dev_get_drvdata(&pdev->dev); >> >> + if (dev->irq < 0) { >> + kthread_destroy_worker(dev->worker); >> + hrtimer_cancel(&dev->hrtimer); >> + } >> + >> mutex_destroy(&dev->dev_lock); >> mutex_destroy(&dev->hw_lock); >> clk_bulk_disable_unprepare(dev->num_clks, dev->clks); >> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >> b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >> index 352f6e904e50..edc50450ddb8 100644 >> --- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >> @@ -756,6 +756,10 @@ struct vpu_device { >> u32 product_code; >> struct ida inst_ida; >> struct clk_bulk_data *clks; >> + struct hrtimer hrtimer; >> + struct kthread_work work; >> + struct kthread_worker *worker; >> + int vpu_poll_interval; >> int num_clks; >> }; >> >> -- >> 2.34.1 >> >>
On 2/16/24 3:45 AM, Sebastian Fricke wrote: > Hey Devarsh, > > @nas.chung@chipsnmedia.com or @jackson.lee@chipsnmedia.com > Could you please run tests on the hardware with this patch and review > this patch to make sure that this doesn't break the other hardware with > Wave5 chips. Thanks! > > On 25.01.2024 18:38, Devarsh Thakkar wrote: >> Add support for starting a polling timer in case interrupt is not > > s/interrupt/an interrupt/ > >> available. This helps keep the VPU functional in SoC's such as AM62A, where > > s/helps keep/helps to keep/ > s/SoC's/SoCs/ > >> the hardware interrupt hookup may not be present due to an SoC errata [1]. > > I would prefer if we do not hack the driver to adjust to SoC errata, > instead of adding polling functionality you could also implement a fake > IRQ chip, then the driver can still assume that an interrupt is present > and you create a virtual interrupt source that does polling in the > background. > You could start by adding the fake irq-chip within the wave5 folder and > later on that could be moved to the `drivers/irqchip/` directory (by you > or someone else), which would be a nice addition to the Linux kernel to > provide machines without IRQ support a way to register an interrupt. +1 I'm really surprised that isn't already a thing. A lot of drivers do hacky irq+polling to work around missing status update interrupts. It seems to lead to race conditions and generally messy to follow code in most cases. A simple generic "fire an inteterrupt every X time units" virtual interrupt source would cleanup a lot of these. Andrew > Please have a look at that potential solution, you potentially need to > share the registers between the codec and the fake irq-chip in the DTS. > >> >> The timer is shared across all instances of encoder and decoder and is > > s/encoder and decoder/encoders and decoders/ > >> started when first instance of encoder or decoder is opened and stopped > > s/first/the first/ > s/encoder or decoder/an encoder or a decoder/ > >> when last instance is closed, thus avoiding per instance polling and saving > > s/last/the last/ > >> CPU bandwidth. >> >> hrtimer callback is called with 5ms polling interval while any of the > > s/hrtimer/The hrtimer/ > >> encoder/decoder instances are running to check the interrupt status as >> being done in irq handler. > > s/irq/the irq/ > > Could you ellaborate why you choose the HRtimer for this job and why > does it need to be a 5ms polling interval, is that a requirement of this > specific hardware or the best constant you figured out during testing? > >> >> Based on above interrupt status, use a worker thread to iterate over the > > s/on above interrupt status/on the interrupt status above/ > >> interrupt status for each instance and send completion event as being done > > s/completion/a completion/ > >> in irq thread function. > > s/irq/the irq/ > >> >> Parse for irq number before v4l2 device registration and if not available >> only then, initialize hrtimer and worker thread. > > Thanks for the effort on the description but you do not have to write > out word for word what the code is doing, the code should be descriptive > enough on its own. > More important is that you clearly describe, for example: > - why did you choose a certain design? > - why did you choose for the HRtimer? > - why is the 5ms interval is important (are other intervals also valid?) > - why did you choose to add a polling mechanism instead of possible alternatives (like for example an fake irq-chip), > - >> >> Move the core functionality of irq thread function to a separate function > > s/irq/the irq/ > >> wave5_vpu_handle_irq so that it can be used by both the worker thread when >> using polling mode and irq thread when using interrupt mode. > > s/irq/the irq/ > >> >> Protect hrtimer access and instance list with device specific mutex locks > > s/hrtimer/the hrtimer/ > >> to avoid race conditions while different instances of encoder and decoder > > s/encoder and decoder/encoders and decoders/ > >> are started together. >> >> Add module param to change polling interval for debug purpose. > > s/module/a module/ > s/polling/the polling/ > s/debug purpose/debugging purposes/ > > Here again is the question, what can the user expect with different > polling intervals? Is there a valid range of intervals? > > The actual implementation seems good, but I would like to first make > sure that we take the best possible approach to this issue. > > Greetings, > Sebastian > >> >> [1] https://www.ti.com/lit/pdf/spruj16 >> (Ref: Section 4.2.3.3 Resets, Interrupts, and Clocks) >> >> Signed-off-by: Devarsh Thakkar <devarsht@ti.com> >> --- >> Test logs: >> https://gist.github.com/devarsht/613bc8aa66e65814c8374ffb6a4f50fb >> --- >> .../platform/chips-media/wave5/wave5-helper.c | 16 ++- >> .../chips-media/wave5/wave5-vpu-dec.c | 13 +- >> .../chips-media/wave5/wave5-vpu-enc.c | 13 +- >> .../platform/chips-media/wave5/wave5-vpu.c | 117 ++++++++++++------ >> .../platform/chips-media/wave5/wave5-vpuapi.h | 4 + >> 5 files changed, 122 insertions(+), 41 deletions(-) >> >> diff --git a/drivers/media/platform/chips-media/wave5/wave5-helper.c b/drivers/media/platform/chips-media/wave5/wave5-helper.c >> index 8433ecab230c..475b7628964f 100644 >> --- a/drivers/media/platform/chips-media/wave5/wave5-helper.c >> +++ b/drivers/media/platform/chips-media/wave5/wave5-helper.c >> @@ -52,6 +52,8 @@ int wave5_vpu_release_device(struct file *filp, >> char *name) >> { >> struct vpu_instance *inst = wave5_to_vpu_inst(filp->private_data); >> + struct vpu_device *dev = inst->dev; >> + int ret = 0; >> >> v4l2_m2m_ctx_release(inst->v4l2_fh.m2m_ctx); >> if (inst->state != VPU_INST_STATE_NONE) { >> @@ -71,8 +73,20 @@ int wave5_vpu_release_device(struct file *filp, >> } >> >> wave5_cleanup_instance(inst); >> + if (dev->irq < 0) { >> + ret = mutex_lock_interruptible(&dev->dev_lock); >> + if (ret) >> + return ret; >> >> - return 0; >> + if (list_empty(&dev->instances)) { >> + dev_dbg(dev->dev, "Disabling the hrtimer\n"); >> + hrtimer_cancel(&dev->hrtimer); >> + } >> + >> + mutex_unlock(&dev->dev_lock); >> + } >> + >> + return ret; >> } >> >> int wave5_vpu_queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *dst_vq, >> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >> index ef227af72348..c8624c681fa6 100644 >> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >> @@ -1810,7 +1810,6 @@ static int wave5_vpu_open_dec(struct file *filp) >> v4l2_fh_add(&inst->v4l2_fh); >> >> INIT_LIST_HEAD(&inst->list); >> - list_add_tail(&inst->list, &dev->instances); >> >> inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_dec_dev; >> inst->v4l2_fh.m2m_ctx = >> @@ -1867,6 +1866,18 @@ static int wave5_vpu_open_dec(struct file *filp) >> >> wave5_vdi_allocate_sram(inst->dev); >> >> + ret = mutex_lock_interruptible(&dev->dev_lock); >> + if (ret) >> + goto cleanup_inst; >> + >> + if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && list_empty(&dev->instances)) >> + hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * NSEC_PER_MSEC), >> + HRTIMER_MODE_REL_PINNED); >> + >> + list_add_tail(&inst->list, &dev->instances); >> + >> + mutex_unlock(&dev->dev_lock); >> + >> return 0; >> >> cleanup_inst: >> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >> index f29cfa3af94a..9e88424761b6 100644 >> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >> @@ -1554,7 +1554,6 @@ static int wave5_vpu_open_enc(struct file *filp) >> v4l2_fh_add(&inst->v4l2_fh); >> >> INIT_LIST_HEAD(&inst->list); >> - list_add_tail(&inst->list, &dev->instances); >> >> inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_enc_dev; >> inst->v4l2_fh.m2m_ctx = >> @@ -1729,6 +1728,18 @@ static int wave5_vpu_open_enc(struct file *filp) >> >> wave5_vdi_allocate_sram(inst->dev); >> >> + ret = mutex_lock_interruptible(&dev->dev_lock); >> + if (ret) >> + goto cleanup_inst; >> + >> + if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && list_empty(&dev->instances)) >> + hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * NSEC_PER_MSEC), >> + HRTIMER_MODE_REL_PINNED); >> + >> + list_add_tail(&inst->list, &dev->instances); >> + >> + mutex_unlock(&dev->dev_lock); >> + >> return 0; >> >> cleanup_inst: >> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c b/drivers/media/platform/chips-media/wave5/wave5-vpu.c >> index bfe4caa79cc9..968ec9baf1ef 100644 >> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c >> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c >> @@ -26,6 +26,9 @@ struct wave5_match_data { >> const char *fw_name; >> }; >> >> +static int vpu_poll_interval = 5; >> +module_param(vpu_poll_interval, int, 0644); >> + >> int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout) >> { >> int ret; >> @@ -40,7 +43,7 @@ int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout) >> return 0; >> } >> >> -static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) >> +static void wave5_vpu_handle_irq(void *dev_id) >> { >> u32 seq_done; >> u32 cmd_done; >> @@ -48,42 +51,67 @@ static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) >> struct vpu_instance *inst; >> struct vpu_device *dev = dev_id; >> >> - if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) { >> - irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); >> - wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); >> - wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); >> - >> - list_for_each_entry(inst, &dev->instances, list) { >> - seq_done = wave5_vdi_read_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO); >> - cmd_done = wave5_vdi_read_register(dev, W5_RET_QUEUE_CMD_DONE_INST); >> - >> - if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || >> - irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { >> - if (seq_done & BIT(inst->id)) { >> - seq_done &= ~BIT(inst->id); >> - wave5_vdi_write_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO, >> - seq_done); >> - complete(&inst->irq_done); >> - } >> + irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); >> + wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); >> + wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); >> + >> + list_for_each_entry(inst, &dev->instances, list) { >> + seq_done = wave5_vdi_read_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO); >> + cmd_done = wave5_vdi_read_register(dev, W5_RET_QUEUE_CMD_DONE_INST); >> + >> + if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || >> + irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { >> + if (seq_done & BIT(inst->id)) { >> + seq_done &= ~BIT(inst->id); >> + wave5_vdi_write_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO, >> + seq_done); >> + complete(&inst->irq_done); >> } >> + } >> >> - if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || >> - irq_reason & BIT(INT_WAVE5_ENC_PIC)) { >> - if (cmd_done & BIT(inst->id)) { >> - cmd_done &= ~BIT(inst->id); >> - wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, >> - cmd_done); >> - inst->ops->finish_process(inst); >> - } >> + if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || >> + irq_reason & BIT(INT_WAVE5_ENC_PIC)) { >> + if (cmd_done & BIT(inst->id)) { >> + cmd_done &= ~BIT(inst->id); >> + wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, >> + cmd_done); >> + inst->ops->finish_process(inst); >> } >> - >> - wave5_vpu_clear_interrupt(inst, irq_reason); >> } >> + >> + wave5_vpu_clear_interrupt(inst, irq_reason); >> } >> +} >> + >> +static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) >> +{ >> + struct vpu_device *dev = dev_id; >> + >> + if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) >> + wave5_vpu_handle_irq(dev); >> >> return IRQ_HANDLED; >> } >> >> +static void wave5_vpu_irq_work_fn(struct kthread_work *work) >> +{ >> + struct vpu_device *dev = container_of(work, struct vpu_device, work); >> + >> + if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) >> + wave5_vpu_handle_irq(dev); >> +} >> + >> +static enum hrtimer_restart wave5_vpu_timer_callback(struct hrtimer *timer) >> +{ >> + struct vpu_device *dev = >> + container_of(timer, struct vpu_device, hrtimer); >> + >> + kthread_queue_work(dev->worker, &dev->work); >> + hrtimer_forward_now(timer, ns_to_ktime(vpu_poll_interval * NSEC_PER_MSEC)); >> + >> + return HRTIMER_RESTART; >> +} >> + >> static int wave5_vpu_load_firmware(struct device *dev, const char *fw_name, >> u32 *revision) >> { >> @@ -209,16 +237,24 @@ static int wave5_vpu_probe(struct platform_device *pdev) >> >> dev->irq = platform_get_irq(pdev, 0); >> if (dev->irq < 0) { >> - dev_err(&pdev->dev, "failed to get irq resource\n"); >> - ret = -ENXIO; >> - goto err_enc_unreg; >> - } >> - >> - ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, >> - wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); >> - if (ret) { >> - dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", ret); >> - goto err_enc_unreg; >> + dev_err(&pdev->dev, "failed to get irq resource, falling back to polling\n"); >> + hrtimer_init(&dev->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); >> + dev->hrtimer.function = &wave5_vpu_timer_callback; >> + dev->worker = kthread_create_worker(0, "vpu_irq_thread"); >> + if (IS_ERR(dev->worker)) { >> + dev_err(&pdev->dev, "failed to create vpu irq worker\n"); >> + ret = PTR_ERR(dev->worker); >> + goto err_vdi_release; >> + } >> + dev->vpu_poll_interval = vpu_poll_interval; >> + kthread_init_work(&dev->work, wave5_vpu_irq_work_fn); >> + } else { >> + ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, >> + wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); >> + if (ret) { >> + dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", ret); >> + goto err_enc_unreg; >> + } >> } >> >> ret = wave5_vpu_load_firmware(&pdev->dev, match_data->fw_name, &fw_revision); >> @@ -254,6 +290,11 @@ static int wave5_vpu_remove(struct platform_device *pdev) >> { >> struct vpu_device *dev = dev_get_drvdata(&pdev->dev); >> >> + if (dev->irq < 0) { >> + kthread_destroy_worker(dev->worker); >> + hrtimer_cancel(&dev->hrtimer); >> + } >> + >> mutex_destroy(&dev->dev_lock); >> mutex_destroy(&dev->hw_lock); >> clk_bulk_disable_unprepare(dev->num_clks, dev->clks); >> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >> index 352f6e904e50..edc50450ddb8 100644 >> --- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >> @@ -756,6 +756,10 @@ struct vpu_device { >> u32 product_code; >> struct ida inst_ida; >> struct clk_bulk_data *clks; >> + struct hrtimer hrtimer; >> + struct kthread_work work; >> + struct kthread_worker *worker; >> + int vpu_poll_interval; >> int num_clks; >> }; >> >> -- >> 2.34.1 >> >> >
Hi Andrew, Thanks for the review. On 22/02/24 21:41, Andrew Davis wrote: > On 2/16/24 3:45 AM, Sebastian Fricke wrote: >> Hey Devarsh, >> >> @nas.chung@chipsnmedia.com or @jackson.lee@chipsnmedia.com >> Could you please run tests on the hardware with this patch and review >> this patch to make sure that this doesn't break the other hardware with >> Wave5 chips. Thanks! >> >> On 25.01.2024 18:38, Devarsh Thakkar wrote: >>> Add support for starting a polling timer in case interrupt is not >> >> s/interrupt/an interrupt/ >> >>> available. This helps keep the VPU functional in SoC's such as AM62A, where >> >> s/helps keep/helps to keep/ >> s/SoC's/SoCs/ >> >>> the hardware interrupt hookup may not be present due to an SoC errata [1]. >> >> I would prefer if we do not hack the driver to adjust to SoC errata, >> instead of adding polling functionality you could also implement a fake >> IRQ chip, then the driver can still assume that an interrupt is present >> and you create a virtual interrupt source that does polling in the >> background. >> You could start by adding the fake irq-chip within the wave5 folder and >> later on that could be moved to the `drivers/irqchip/` directory (by you >> or someone else), which would be a nice addition to the Linux kernel to >> provide machines without IRQ support a way to register an interrupt. > > +1 > > I'm really surprised that isn't already a thing. A lot of drivers do > hacky irq+polling to work around missing status update interrupts. > It seems to lead to race conditions and generally messy to follow code > in most cases. A simple generic "fire an inteterrupt every X time units" > virtual interrupt source would cleanup a lot of these. > > Andrew > Well at least for the video codec type of scenarios where multiple instances of encode & decode can run together and we require to achieve balance between optimum CPU utilization, real-time performance and latencies we require an intelligent polling based scheme as implemented in this patchset and I don't think the fakeirq approach would help to replicate an optimum solution given all the meta-data related to instance handling and polling interval is present in video codec driver. I already shared a good amount of reasoning here [0], but to summarize in general the scheme we want to follow (and also the scheme which this patchset implements) is to have a single polling thread for all instances of encoder and decoder so as to keep the CPU utilization in check while achieving real-time latencies and performance as seen in the results [1]. For e.g. we only need to start polling when first instance gets launched from user-space and this information is known only to VPU driver so it would still need to signal to irq chip (possibly using irq_enable hook) to start polling on first instance launch. And again we need to stop polling when last instance gets closed, now here again driver only knows when the last VPU instance gets closed so the VPU driver would need to signal to irqchip again to stop polling. And all these handling requires locking related handling too as multiple instances can be launched anytime and of any type (either for encoder and decoder) and this logic would again fall in VPU driver. Secondly it makes more sense to keep VPU polling interval param inside the VPU driver rather than fakeirq driver (which would ideally be agnostic to VPU and handling other peripherals too) as user may want to change it to a different value than defaults if they have a preference to a subset of fixed use-cases or want to give more preference to CPU utilizations over Performance/latency or vice-versa. Also if we look at it from another angle in some scenario the polling scheme may offer some advantage over the irq too if system is already under high interrupt load. For e.g let's say we want to run a 32 instance use-case for a non real-time processing of video. Now with the polling scheme implemented here, there would be a single polling thread which would trigger a softirq every X ms as compare to 32 interrupts being fired per frame for each instance when irq is being used, and if the system is already running under heavy interrupt load than it may make more sense to switch to polling if the use-case doesn't demand aggressive latencies (as with polling VPU latencies would be slightly higher but that's okay for non real-time video processing scenarios ). So I think polling feature implemented may not be completely use-less in general too for SoC's where VPU irq is present. [0] : https://lore.kernel.org/all/47fe5d24-4331-a88c-5481-d60a3ae96efa@ti.com/ [1] : 1x 640x480@25 Encoding : https://gist.github.com/devarsht/ee9664d3403d1212ef477a027b71896c 4x 1080p30 Transcode (File→decode→encode→file).csv irq vs polling comparison : https://gist.github.com/devarsht/3a58b4f201430dfc61697c7e224e74c2 1x 1080p transcode (File→decode→encode→file) polling_vs_irq_transcode comparison : https://gist.github.com/devarsht/a6480f1f2cbdf8dd694d698309d81fb0 1080p60 streaming use-case for irq vs polling comparison https://gist.github.com/devarsht/44aaa4322454e85e01a8d65ac47c5edb Regards Devarsh >> Please have a look at that potential solution, you potentially need to >> share the registers between the codec and the fake irq-chip in the DTS. >> >>> >>> The timer is shared across all instances of encoder and decoder and is >> >> s/encoder and decoder/encoders and decoders/ >> >>> started when first instance of encoder or decoder is opened and stopped >> >> s/first/the first/ >> s/encoder or decoder/an encoder or a decoder/ >> >>> when last instance is closed, thus avoiding per instance polling and saving >> >> s/last/the last/ >> >>> CPU bandwidth. >>> >>> hrtimer callback is called with 5ms polling interval while any of the >> >> s/hrtimer/The hrtimer/ >> >>> encoder/decoder instances are running to check the interrupt status as >>> being done in irq handler. >> >> s/irq/the irq/ >> >> Could you ellaborate why you choose the HRtimer for this job and why >> does it need to be a 5ms polling interval, is that a requirement of this >> specific hardware or the best constant you figured out during testing? >> >>> >>> Based on above interrupt status, use a worker thread to iterate over the >> >> s/on above interrupt status/on the interrupt status above/ >> >>> interrupt status for each instance and send completion event as being done >> >> s/completion/a completion/ >> >>> in irq thread function. >> >> s/irq/the irq/ >> >>> >>> Parse for irq number before v4l2 device registration and if not available >>> only then, initialize hrtimer and worker thread. >> >> Thanks for the effort on the description but you do not have to write >> out word for word what the code is doing, the code should be descriptive >> enough on its own. >> More important is that you clearly describe, for example: >> - why did you choose a certain design? >> - why did you choose for the HRtimer? >> - why is the 5ms interval is important (are other intervals also valid?) >> - why did you choose to add a polling mechanism instead of possible >> alternatives (like for example an fake irq-chip), >> - >>> >>> Move the core functionality of irq thread function to a separate function >> >> s/irq/the irq/ >> >>> wave5_vpu_handle_irq so that it can be used by both the worker thread when >>> using polling mode and irq thread when using interrupt mode. >> >> s/irq/the irq/ >> >>> >>> Protect hrtimer access and instance list with device specific mutex locks >> >> s/hrtimer/the hrtimer/ >> >>> to avoid race conditions while different instances of encoder and decoder >> >> s/encoder and decoder/encoders and decoders/ >> >>> are started together. >>> >>> Add module param to change polling interval for debug purpose. >> >> s/module/a module/ >> s/polling/the polling/ >> s/debug purpose/debugging purposes/ >> >> Here again is the question, what can the user expect with different >> polling intervals? Is there a valid range of intervals? >> >> The actual implementation seems good, but I would like to first make >> sure that we take the best possible approach to this issue. >> >> Greetings, >> Sebastian >> >>> >>> [1] https://www.ti.com/lit/pdf/spruj16 >>> (Ref: Section 4.2.3.3 Resets, Interrupts, and Clocks) >>> >>> Signed-off-by: Devarsh Thakkar <devarsht@ti.com> >>> --- >>> Test logs: >>> https://gist.github.com/devarsht/613bc8aa66e65814c8374ffb6a4f50fb >>> --- >>> .../platform/chips-media/wave5/wave5-helper.c | 16 ++- >>> .../chips-media/wave5/wave5-vpu-dec.c | 13 +- >>> .../chips-media/wave5/wave5-vpu-enc.c | 13 +- >>> .../platform/chips-media/wave5/wave5-vpu.c | 117 ++++++++++++------ >>> .../platform/chips-media/wave5/wave5-vpuapi.h | 4 + >>> 5 files changed, 122 insertions(+), 41 deletions(-) >>> >>> diff --git a/drivers/media/platform/chips-media/wave5/wave5-helper.c >>> b/drivers/media/platform/chips-media/wave5/wave5-helper.c >>> index 8433ecab230c..475b7628964f 100644 >>> --- a/drivers/media/platform/chips-media/wave5/wave5-helper.c >>> +++ b/drivers/media/platform/chips-media/wave5/wave5-helper.c >>> @@ -52,6 +52,8 @@ int wave5_vpu_release_device(struct file *filp, >>> char *name) >>> { >>> struct vpu_instance *inst = wave5_to_vpu_inst(filp->private_data); >>> + struct vpu_device *dev = inst->dev; >>> + int ret = 0; >>> >>> v4l2_m2m_ctx_release(inst->v4l2_fh.m2m_ctx); >>> if (inst->state != VPU_INST_STATE_NONE) { >>> @@ -71,8 +73,20 @@ int wave5_vpu_release_device(struct file *filp, >>> } >>> >>> wave5_cleanup_instance(inst); >>> + if (dev->irq < 0) { >>> + ret = mutex_lock_interruptible(&dev->dev_lock); >>> + if (ret) >>> + return ret; >>> >>> - return 0; >>> + if (list_empty(&dev->instances)) { >>> + dev_dbg(dev->dev, "Disabling the hrtimer\n"); >>> + hrtimer_cancel(&dev->hrtimer); >>> + } >>> + >>> + mutex_unlock(&dev->dev_lock); >>> + } >>> + >>> + return ret; >>> } >>> >>> int wave5_vpu_queue_init(void *priv, struct vb2_queue *src_vq, struct >>> vb2_queue *dst_vq, >>> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >>> b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >>> index ef227af72348..c8624c681fa6 100644 >>> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >>> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c >>> @@ -1810,7 +1810,6 @@ static int wave5_vpu_open_dec(struct file *filp) >>> v4l2_fh_add(&inst->v4l2_fh); >>> >>> INIT_LIST_HEAD(&inst->list); >>> - list_add_tail(&inst->list, &dev->instances); >>> >>> inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_dec_dev; >>> inst->v4l2_fh.m2m_ctx = >>> @@ -1867,6 +1866,18 @@ static int wave5_vpu_open_dec(struct file *filp) >>> >>> wave5_vdi_allocate_sram(inst->dev); >>> >>> + ret = mutex_lock_interruptible(&dev->dev_lock); >>> + if (ret) >>> + goto cleanup_inst; >>> + >>> + if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && >>> list_empty(&dev->instances)) >>> + hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * >>> NSEC_PER_MSEC), >>> + HRTIMER_MODE_REL_PINNED); >>> + >>> + list_add_tail(&inst->list, &dev->instances); >>> + >>> + mutex_unlock(&dev->dev_lock); >>> + >>> return 0; >>> >>> cleanup_inst: >>> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >>> b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >>> index f29cfa3af94a..9e88424761b6 100644 >>> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >>> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c >>> @@ -1554,7 +1554,6 @@ static int wave5_vpu_open_enc(struct file *filp) >>> v4l2_fh_add(&inst->v4l2_fh); >>> >>> INIT_LIST_HEAD(&inst->list); >>> - list_add_tail(&inst->list, &dev->instances); >>> >>> inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_enc_dev; >>> inst->v4l2_fh.m2m_ctx = >>> @@ -1729,6 +1728,18 @@ static int wave5_vpu_open_enc(struct file *filp) >>> >>> wave5_vdi_allocate_sram(inst->dev); >>> >>> + ret = mutex_lock_interruptible(&dev->dev_lock); >>> + if (ret) >>> + goto cleanup_inst; >>> + >>> + if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && >>> list_empty(&dev->instances)) >>> + hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * >>> NSEC_PER_MSEC), >>> + HRTIMER_MODE_REL_PINNED); >>> + >>> + list_add_tail(&inst->list, &dev->instances); >>> + >>> + mutex_unlock(&dev->dev_lock); >>> + >>> return 0; >>> >>> cleanup_inst: >>> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c >>> b/drivers/media/platform/chips-media/wave5/wave5-vpu.c >>> index bfe4caa79cc9..968ec9baf1ef 100644 >>> --- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c >>> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c >>> @@ -26,6 +26,9 @@ struct wave5_match_data { >>> const char *fw_name; >>> }; >>> >>> +static int vpu_poll_interval = 5; >>> +module_param(vpu_poll_interval, int, 0644); >>> + >>> int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout) >>> { >>> int ret; >>> @@ -40,7 +43,7 @@ int wave5_vpu_wait_interrupt(struct vpu_instance *inst, >>> unsigned int timeout) >>> return 0; >>> } >>> >>> -static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) >>> +static void wave5_vpu_handle_irq(void *dev_id) >>> { >>> u32 seq_done; >>> u32 cmd_done; >>> @@ -48,42 +51,67 @@ static irqreturn_t wave5_vpu_irq_thread(int irq, void >>> *dev_id) >>> struct vpu_instance *inst; >>> struct vpu_device *dev = dev_id; >>> >>> - if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) { >>> - irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); >>> - wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); >>> - wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); >>> - >>> - list_for_each_entry(inst, &dev->instances, list) { >>> - seq_done = wave5_vdi_read_register(dev, >>> W5_RET_SEQ_DONE_INSTANCE_INFO); >>> - cmd_done = wave5_vdi_read_register(dev, >>> W5_RET_QUEUE_CMD_DONE_INST); >>> - >>> - if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || >>> - irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { >>> - if (seq_done & BIT(inst->id)) { >>> - seq_done &= ~BIT(inst->id); >>> - wave5_vdi_write_register(dev, >>> W5_RET_SEQ_DONE_INSTANCE_INFO, >>> - seq_done); >>> - complete(&inst->irq_done); >>> - } >>> + irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); >>> + wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); >>> + wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); >>> + >>> + list_for_each_entry(inst, &dev->instances, list) { >>> + seq_done = wave5_vdi_read_register(dev, >>> W5_RET_SEQ_DONE_INSTANCE_INFO); >>> + cmd_done = wave5_vdi_read_register(dev, W5_RET_QUEUE_CMD_DONE_INST); >>> + >>> + if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || >>> + irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { >>> + if (seq_done & BIT(inst->id)) { >>> + seq_done &= ~BIT(inst->id); >>> + wave5_vdi_write_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO, >>> + seq_done); >>> + complete(&inst->irq_done); >>> } >>> + } >>> >>> - if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || >>> - irq_reason & BIT(INT_WAVE5_ENC_PIC)) { >>> - if (cmd_done & BIT(inst->id)) { >>> - cmd_done &= ~BIT(inst->id); >>> - wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, >>> - cmd_done); >>> - inst->ops->finish_process(inst); >>> - } >>> + if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || >>> + irq_reason & BIT(INT_WAVE5_ENC_PIC)) { >>> + if (cmd_done & BIT(inst->id)) { >>> + cmd_done &= ~BIT(inst->id); >>> + wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, >>> + cmd_done); >>> + inst->ops->finish_process(inst); >>> } >>> - >>> - wave5_vpu_clear_interrupt(inst, irq_reason); >>> } >>> + >>> + wave5_vpu_clear_interrupt(inst, irq_reason); >>> } >>> +} >>> + >>> +static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) >>> +{ >>> + struct vpu_device *dev = dev_id; >>> + >>> + if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) >>> + wave5_vpu_handle_irq(dev); >>> >>> return IRQ_HANDLED; >>> } >>> >>> +static void wave5_vpu_irq_work_fn(struct kthread_work *work) >>> +{ >>> + struct vpu_device *dev = container_of(work, struct vpu_device, work); >>> + >>> + if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) >>> + wave5_vpu_handle_irq(dev); >>> +} >>> + >>> +static enum hrtimer_restart wave5_vpu_timer_callback(struct hrtimer *timer) >>> +{ >>> + struct vpu_device *dev = >>> + container_of(timer, struct vpu_device, hrtimer); >>> + >>> + kthread_queue_work(dev->worker, &dev->work); >>> + hrtimer_forward_now(timer, ns_to_ktime(vpu_poll_interval * >>> NSEC_PER_MSEC)); >>> + >>> + return HRTIMER_RESTART; >>> +} >>> + >>> static int wave5_vpu_load_firmware(struct device *dev, const char *fw_name, >>> u32 *revision) >>> { >>> @@ -209,16 +237,24 @@ static int wave5_vpu_probe(struct platform_device *pdev) >>> >>> dev->irq = platform_get_irq(pdev, 0); >>> if (dev->irq < 0) { >>> - dev_err(&pdev->dev, "failed to get irq resource\n"); >>> - ret = -ENXIO; >>> - goto err_enc_unreg; >>> - } >>> - >>> - ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, >>> - wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); >>> - if (ret) { >>> - dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", ret); >>> - goto err_enc_unreg; >>> + dev_err(&pdev->dev, "failed to get irq resource, falling back to >>> polling\n"); >>> + hrtimer_init(&dev->hrtimer, CLOCK_MONOTONIC, >>> HRTIMER_MODE_REL_PINNED); >>> + dev->hrtimer.function = &wave5_vpu_timer_callback; >>> + dev->worker = kthread_create_worker(0, "vpu_irq_thread"); >>> + if (IS_ERR(dev->worker)) { >>> + dev_err(&pdev->dev, "failed to create vpu irq worker\n"); >>> + ret = PTR_ERR(dev->worker); >>> + goto err_vdi_release; >>> + } >>> + dev->vpu_poll_interval = vpu_poll_interval; >>> + kthread_init_work(&dev->work, wave5_vpu_irq_work_fn); >>> + } else { >>> + ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, >>> + wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); >>> + if (ret) { >>> + dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", >>> ret); >>> + goto err_enc_unreg; >>> + } >>> } >>> >>> ret = wave5_vpu_load_firmware(&pdev->dev, match_data->fw_name, >>> &fw_revision); >>> @@ -254,6 +290,11 @@ static int wave5_vpu_remove(struct platform_device *pdev) >>> { >>> struct vpu_device *dev = dev_get_drvdata(&pdev->dev); >>> >>> + if (dev->irq < 0) { >>> + kthread_destroy_worker(dev->worker); >>> + hrtimer_cancel(&dev->hrtimer); >>> + } >>> + >>> mutex_destroy(&dev->dev_lock); >>> mutex_destroy(&dev->hw_lock); >>> clk_bulk_disable_unprepare(dev->num_clks, dev->clks); >>> diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >>> b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >>> index 352f6e904e50..edc50450ddb8 100644 >>> --- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >>> +++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h >>> @@ -756,6 +756,10 @@ struct vpu_device { >>> u32 product_code; >>> struct ida inst_ida; >>> struct clk_bulk_data *clks; >>> + struct hrtimer hrtimer; >>> + struct kthread_work work; >>> + struct kthread_worker *worker; >>> + int vpu_poll_interval; >>> int num_clks; >>> }; >>> >>> -- >>> 2.34.1 >>> >>> >>
diff --git a/drivers/media/platform/chips-media/wave5/wave5-helper.c b/drivers/media/platform/chips-media/wave5/wave5-helper.c index 8433ecab230c..475b7628964f 100644 --- a/drivers/media/platform/chips-media/wave5/wave5-helper.c +++ b/drivers/media/platform/chips-media/wave5/wave5-helper.c @@ -52,6 +52,8 @@ int wave5_vpu_release_device(struct file *filp, char *name) { struct vpu_instance *inst = wave5_to_vpu_inst(filp->private_data); + struct vpu_device *dev = inst->dev; + int ret = 0; v4l2_m2m_ctx_release(inst->v4l2_fh.m2m_ctx); if (inst->state != VPU_INST_STATE_NONE) { @@ -71,8 +73,20 @@ int wave5_vpu_release_device(struct file *filp, } wave5_cleanup_instance(inst); + if (dev->irq < 0) { + ret = mutex_lock_interruptible(&dev->dev_lock); + if (ret) + return ret; - return 0; + if (list_empty(&dev->instances)) { + dev_dbg(dev->dev, "Disabling the hrtimer\n"); + hrtimer_cancel(&dev->hrtimer); + } + + mutex_unlock(&dev->dev_lock); + } + + return ret; } int wave5_vpu_queue_init(void *priv, struct vb2_queue *src_vq, struct vb2_queue *dst_vq, diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c index ef227af72348..c8624c681fa6 100644 --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-dec.c @@ -1810,7 +1810,6 @@ static int wave5_vpu_open_dec(struct file *filp) v4l2_fh_add(&inst->v4l2_fh); INIT_LIST_HEAD(&inst->list); - list_add_tail(&inst->list, &dev->instances); inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_dec_dev; inst->v4l2_fh.m2m_ctx = @@ -1867,6 +1866,18 @@ static int wave5_vpu_open_dec(struct file *filp) wave5_vdi_allocate_sram(inst->dev); + ret = mutex_lock_interruptible(&dev->dev_lock); + if (ret) + goto cleanup_inst; + + if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && list_empty(&dev->instances)) + hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * NSEC_PER_MSEC), + HRTIMER_MODE_REL_PINNED); + + list_add_tail(&inst->list, &dev->instances); + + mutex_unlock(&dev->dev_lock); + return 0; cleanup_inst: diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c index f29cfa3af94a..9e88424761b6 100644 --- a/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu-enc.c @@ -1554,7 +1554,6 @@ static int wave5_vpu_open_enc(struct file *filp) v4l2_fh_add(&inst->v4l2_fh); INIT_LIST_HEAD(&inst->list); - list_add_tail(&inst->list, &dev->instances); inst->v4l2_m2m_dev = inst->dev->v4l2_m2m_enc_dev; inst->v4l2_fh.m2m_ctx = @@ -1729,6 +1728,18 @@ static int wave5_vpu_open_enc(struct file *filp) wave5_vdi_allocate_sram(inst->dev); + ret = mutex_lock_interruptible(&dev->dev_lock); + if (ret) + goto cleanup_inst; + + if (dev->irq < 0 && !hrtimer_active(&dev->hrtimer) && list_empty(&dev->instances)) + hrtimer_start(&dev->hrtimer, ns_to_ktime(dev->vpu_poll_interval * NSEC_PER_MSEC), + HRTIMER_MODE_REL_PINNED); + + list_add_tail(&inst->list, &dev->instances); + + mutex_unlock(&dev->dev_lock); + return 0; cleanup_inst: diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpu.c b/drivers/media/platform/chips-media/wave5/wave5-vpu.c index bfe4caa79cc9..968ec9baf1ef 100644 --- a/drivers/media/platform/chips-media/wave5/wave5-vpu.c +++ b/drivers/media/platform/chips-media/wave5/wave5-vpu.c @@ -26,6 +26,9 @@ struct wave5_match_data { const char *fw_name; }; +static int vpu_poll_interval = 5; +module_param(vpu_poll_interval, int, 0644); + int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout) { int ret; @@ -40,7 +43,7 @@ int wave5_vpu_wait_interrupt(struct vpu_instance *inst, unsigned int timeout) return 0; } -static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) +static void wave5_vpu_handle_irq(void *dev_id) { u32 seq_done; u32 cmd_done; @@ -48,42 +51,67 @@ static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) struct vpu_instance *inst; struct vpu_device *dev = dev_id; - if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) { - irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); - wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); - wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); - - list_for_each_entry(inst, &dev->instances, list) { - seq_done = wave5_vdi_read_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO); - cmd_done = wave5_vdi_read_register(dev, W5_RET_QUEUE_CMD_DONE_INST); - - if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || - irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { - if (seq_done & BIT(inst->id)) { - seq_done &= ~BIT(inst->id); - wave5_vdi_write_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO, - seq_done); - complete(&inst->irq_done); - } + irq_reason = wave5_vdi_read_register(dev, W5_VPU_VINT_REASON); + wave5_vdi_write_register(dev, W5_VPU_VINT_REASON_CLR, irq_reason); + wave5_vdi_write_register(dev, W5_VPU_VINT_CLEAR, 0x1); + + list_for_each_entry(inst, &dev->instances, list) { + seq_done = wave5_vdi_read_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO); + cmd_done = wave5_vdi_read_register(dev, W5_RET_QUEUE_CMD_DONE_INST); + + if (irq_reason & BIT(INT_WAVE5_INIT_SEQ) || + irq_reason & BIT(INT_WAVE5_ENC_SET_PARAM)) { + if (seq_done & BIT(inst->id)) { + seq_done &= ~BIT(inst->id); + wave5_vdi_write_register(dev, W5_RET_SEQ_DONE_INSTANCE_INFO, + seq_done); + complete(&inst->irq_done); } + } - if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || - irq_reason & BIT(INT_WAVE5_ENC_PIC)) { - if (cmd_done & BIT(inst->id)) { - cmd_done &= ~BIT(inst->id); - wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, - cmd_done); - inst->ops->finish_process(inst); - } + if (irq_reason & BIT(INT_WAVE5_DEC_PIC) || + irq_reason & BIT(INT_WAVE5_ENC_PIC)) { + if (cmd_done & BIT(inst->id)) { + cmd_done &= ~BIT(inst->id); + wave5_vdi_write_register(dev, W5_RET_QUEUE_CMD_DONE_INST, + cmd_done); + inst->ops->finish_process(inst); } - - wave5_vpu_clear_interrupt(inst, irq_reason); } + + wave5_vpu_clear_interrupt(inst, irq_reason); } +} + +static irqreturn_t wave5_vpu_irq_thread(int irq, void *dev_id) +{ + struct vpu_device *dev = dev_id; + + if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) + wave5_vpu_handle_irq(dev); return IRQ_HANDLED; } +static void wave5_vpu_irq_work_fn(struct kthread_work *work) +{ + struct vpu_device *dev = container_of(work, struct vpu_device, work); + + if (wave5_vdi_read_register(dev, W5_VPU_VPU_INT_STS)) + wave5_vpu_handle_irq(dev); +} + +static enum hrtimer_restart wave5_vpu_timer_callback(struct hrtimer *timer) +{ + struct vpu_device *dev = + container_of(timer, struct vpu_device, hrtimer); + + kthread_queue_work(dev->worker, &dev->work); + hrtimer_forward_now(timer, ns_to_ktime(vpu_poll_interval * NSEC_PER_MSEC)); + + return HRTIMER_RESTART; +} + static int wave5_vpu_load_firmware(struct device *dev, const char *fw_name, u32 *revision) { @@ -209,16 +237,24 @@ static int wave5_vpu_probe(struct platform_device *pdev) dev->irq = platform_get_irq(pdev, 0); if (dev->irq < 0) { - dev_err(&pdev->dev, "failed to get irq resource\n"); - ret = -ENXIO; - goto err_enc_unreg; - } - - ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, - wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); - if (ret) { - dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", ret); - goto err_enc_unreg; + dev_err(&pdev->dev, "failed to get irq resource, falling back to polling\n"); + hrtimer_init(&dev->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_PINNED); + dev->hrtimer.function = &wave5_vpu_timer_callback; + dev->worker = kthread_create_worker(0, "vpu_irq_thread"); + if (IS_ERR(dev->worker)) { + dev_err(&pdev->dev, "failed to create vpu irq worker\n"); + ret = PTR_ERR(dev->worker); + goto err_vdi_release; + } + dev->vpu_poll_interval = vpu_poll_interval; + kthread_init_work(&dev->work, wave5_vpu_irq_work_fn); + } else { + ret = devm_request_threaded_irq(&pdev->dev, dev->irq, NULL, + wave5_vpu_irq_thread, IRQF_ONESHOT, "vpu_irq", dev); + if (ret) { + dev_err(&pdev->dev, "Register interrupt handler, fail: %d\n", ret); + goto err_enc_unreg; + } } ret = wave5_vpu_load_firmware(&pdev->dev, match_data->fw_name, &fw_revision); @@ -254,6 +290,11 @@ static int wave5_vpu_remove(struct platform_device *pdev) { struct vpu_device *dev = dev_get_drvdata(&pdev->dev); + if (dev->irq < 0) { + kthread_destroy_worker(dev->worker); + hrtimer_cancel(&dev->hrtimer); + } + mutex_destroy(&dev->dev_lock); mutex_destroy(&dev->hw_lock); clk_bulk_disable_unprepare(dev->num_clks, dev->clks); diff --git a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h index 352f6e904e50..edc50450ddb8 100644 --- a/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h +++ b/drivers/media/platform/chips-media/wave5/wave5-vpuapi.h @@ -756,6 +756,10 @@ struct vpu_device { u32 product_code; struct ida inst_ida; struct clk_bulk_data *clks; + struct hrtimer hrtimer; + struct kthread_work work; + struct kthread_worker *worker; + int vpu_poll_interval; int num_clks; };