Message ID | 20240222092609.31382-2-yunfei.dong@mediatek.com |
---|---|
State | New |
Headers |
From: Yunfei Dong <yunfei.dong@mediatek.com>
To: Nícolas F. R. A. Prado <nfraprado@collabora.com>, Nicolas Dufresne <nicolas.dufresne@collabora.com>, Hans Verkuil <hverkuil-cisco@xs4all.nl>, AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>, Benjamin Gaignard <benjamin.gaignard@collabora.com>, Nathan Hebert <nhebert@chromium.org>, Irui Wang <irui.wang@mediatek.com>
CC: Hsin-Yi Wang <hsinyi@chromium.org>, Fritz Koenig <frkoenig@chromium.org>, Daniel Vetter <daniel@ffwll.ch>, Steve Cho <stevecho@chromium.org>, Yunfei Dong <yunfei.dong@mediatek.com>, <linux-media@vger.kernel.org>, <devicetree@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <linux-arm-kernel@lists.infradead.org>, <linux-mediatek@lists.infradead.org>, <Project_Global_Chrome_Upstream_Group@mediatek.com>
Subject: [PATCH v3,1/2] media: mediatek: vcodec: adding lock to protect decoder context list
Date: Thu, 22 Feb 2024 17:26:08 +0800
Message-ID: <20240222092609.31382-2-yunfei.dong@mediatek.com>
In-Reply-To: <20240222092609.31382-1-yunfei.dong@mediatek.com>
References: <20240222092609.31382-1-yunfei.dong@mediatek.com>
X-Mailer: git-send-email 2.25.1
|
Series |
media: adding lock to protect the context list
|
|
Commit Message
Yunfei Dong (董云飞)
Feb. 22, 2024, 9:26 a.m. UTC
The ctx_list is deleted when the SCP behaves unexpectedly, which leaves
ctx_list->next NULL. The kernel driver may then dereference a NULL pointer
in vpu_dec_ipi_handler() while walking the contexts, and the system reboots.

Add a lock to protect ctx_list so that ctx_list->next cannot be NULL while
the list is being walked.
Hardware name: Google juniper sku16 board (DT)
pstate: 20400005 (nzCv daif +PAN -UAO -TCO BTYPE=--)
pc : vpu_dec_ipi_handler+0x58/0x1f8 [mtk_vcodec_dec]
lr : scp_ipi_handler+0xd0/0x194 [mtk_scp]
sp : ffffffc0131dbbd0
x29: ffffffc0131dbbd0 x28: 0000000000000000
x27: ffffff9bb277f348 x26: ffffff9bb242ad00
x25: ffffffd2d440d3b8 x24: ffffffd2a13ff1d4
x23: ffffff9bb7fe85a0 x22: ffffffc0133fbdb0
x21: 0000000000000010 x20: ffffff9b050ea328
x19: ffffffc0131dbc08 x18: 0000000000001000
x17: 0000000000000000 x16: ffffffd2d461c6e0
x15: 0000000000000242 x14: 000000000000018f
x13: 000000000000004d x12: 0000000000000000
x11: 0000000000000001 x10: fffffffffffffff0
x9 : ffffff9bb6e793a8 x8 : 0000000000000000
x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000000040 x4 : fffffffffffffff0
x3 : 0000000000000020 x2 : ffffff9bb6e79080
x1 : 0000000000000010 x0 : ffffffc0131dbc08
Call trace:
vpu_dec_ipi_handler+0x58/0x1f8 [mtk_vcodec_dec (HASH:6c3f 2)]
scp_ipi_handler+0xd0/0x194 [mtk_scp (HASH:7046 3)]
mt8183_scp_irq_handler+0x44/0x88 [mtk_scp (HASH:7046 3)]
scp_irq_handler+0x48/0x90 [mtk_scp (HASH:7046 3)]
irq_thread_fn+0x38/0x94
irq_thread+0x100/0x1c0
kthread+0x140/0x1fc
ret_from_fork+0x10/0x30
Code: 54000088 f94ca50a eb14015f 54000060 (f9400108)
---[ end trace ace43ce36cbd5c93 ]---
Kernel panic - not syncing: Oops: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: 0x12c4000000 from 0xffffffc010000000
PHYS_OFFSET: 0xffffffe580000000
CPU features: 0x08240002,2188200c
Memory Limit: none
Fixes: 655b86e52eac ("media: mediatek: vcodec: Fix possible invalid memory access for decoder")
Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>
---
.../platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c | 4 ++--
.../platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c | 5 +++++
.../platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h | 2 ++
drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c | 2 ++
4 files changed, 11 insertions(+), 2 deletions(-)
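
The locking pattern described in the commit message can be tried outside the kernel. What follows is only a minimal userspace sketch, assuming POSIX threads in place of kernel mutexes and a hand-rolled singly linked list in place of `list_head`; the names `struct dev`, `struct ctx`, `dev_add_ctx()`, `dev_del_ctx()`, `dev_has_ctx()` and `ctx_list_demo()` are hypothetical and do not exist in the driver. One thread keeps adding and removing a context while another keeps walking the list, and every access to the list goes through the same dedicated mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the driver's context and device. */
struct ctx {
	int id;
	struct ctx *next;
};

struct dev {
	pthread_mutex_t ctx_lock;	/* plays the role of dev_ctx_lock */
	struct ctx *ctx_list;		/* singly linked for brevity */
};

/* Add a context at the head of the list, under the list lock. */
static void dev_add_ctx(struct dev *dev, struct ctx *ctx)
{
	pthread_mutex_lock(&dev->ctx_lock);
	ctx->next = dev->ctx_list;
	dev->ctx_list = ctx;
	pthread_mutex_unlock(&dev->ctx_lock);
}

/* Unlink a context from the list, under the list lock. */
static void dev_del_ctx(struct dev *dev, struct ctx *ctx)
{
	pthread_mutex_lock(&dev->ctx_lock);
	for (struct ctx **pp = &dev->ctx_list; *pp; pp = &(*pp)->next) {
		if (*pp == ctx) {
			*pp = ctx->next;
			ctx->next = NULL;
			break;
		}
	}
	pthread_mutex_unlock(&dev->ctx_lock);
}

/* Walk the list under the lock, in the spirit of vpu_dec_check_ap_inst(). */
static bool dev_has_ctx(struct dev *dev, int id)
{
	bool found = false;

	pthread_mutex_lock(&dev->ctx_lock);
	for (struct ctx *c = dev->ctx_list; c; c = c->next) {
		if (c->id == id) {
			found = true;
			break;
		}
	}
	pthread_mutex_unlock(&dev->ctx_lock);
	return found;
}

/* Writer thread: repeatedly opens and closes a short-lived context. */
static void *churn(void *arg)
{
	struct dev *dev = arg;
	struct ctx tmp = { .id = 2, .next = NULL };

	for (int i = 0; i < 100000; i++) {
		dev_add_ctx(dev, &tmp);
		dev_del_ctx(dev, &tmp);
	}
	return NULL;
}

/* Returns how many lookups of the long-lived context failed, or -1 on error. */
int ctx_list_demo(void)
{
	struct dev dev = { .ctx_list = NULL };
	struct ctx keep = { .id = 1, .next = NULL };
	pthread_t writer;
	int misses = 0;

	if (pthread_mutex_init(&dev.ctx_lock, NULL) != 0)
		return -1;
	dev_add_ctx(&dev, &keep);
	if (pthread_create(&writer, NULL, churn, &dev) != 0)
		return -1;
	for (int i = 0; i < 100000; i++)
		if (!dev_has_ctx(&dev, 1))
			misses++;
	pthread_join(writer, NULL);
	dev_del_ctx(&dev, &keep);
	assert(dev.ctx_list == NULL);	/* both contexts were unlinked */
	pthread_mutex_destroy(&dev.ctx_lock);
	return misses;
}
```

Calling `ctx_list_demo()` from a small `main()` (built with `-pthread`) should report zero misses, because the reader never observes the list in the middle of an update.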
Comments
On 22/02/24 10:26, Yunfei Dong wrote:
> The ctx_list is deleted when the SCP behaves unexpectedly, which leaves
> ctx_list->next NULL. The kernel driver may then dereference a NULL pointer
> in vpu_dec_ipi_handler() while walking the contexts, and the system reboots.
>
> Add a lock to protect ctx_list so that ctx_list->next cannot be NULL while
> the list is being walked.
>
> Hardware name: Google juniper sku16 board (DT)
> pstate: 20400005 (nzCv daif +PAN -UAO -TCO BTYPE=--)
> pc : vpu_dec_ipi_handler+0x58/0x1f8 [mtk_vcodec_dec]
> lr : scp_ipi_handler+0xd0/0x194 [mtk_scp]
> sp : ffffffc0131dbbd0
> x29: ffffffc0131dbbd0 x28: 0000000000000000
> x27: ffffff9bb277f348 x26: ffffff9bb242ad00
> x25: ffffffd2d440d3b8 x24: ffffffd2a13ff1d4
> x23: ffffff9bb7fe85a0 x22: ffffffc0133fbdb0
> x21: 0000000000000010 x20: ffffff9b050ea328
> x19: ffffffc0131dbc08 x18: 0000000000001000
> x17: 0000000000000000 x16: ffffffd2d461c6e0
> x15: 0000000000000242 x14: 000000000000018f
> x13: 000000000000004d x12: 0000000000000000
> x11: 0000000000000001 x10: fffffffffffffff0
> x9 : ffffff9bb6e793a8 x8 : 0000000000000000
> x7 : 0000000000000000 x6 : 000000000000003f
> x5 : 0000000000000040 x4 : fffffffffffffff0
> x3 : 0000000000000020 x2 : ffffff9bb6e79080
> x1 : 0000000000000010 x0 : ffffffc0131dbc08
> Call trace:
> vpu_dec_ipi_handler+0x58/0x1f8 [mtk_vcodec_dec (HASH:6c3f 2)]
> scp_ipi_handler+0xd0/0x194 [mtk_scp (HASH:7046 3)]
> mt8183_scp_irq_handler+0x44/0x88 [mtk_scp (HASH:7046 3)]
> scp_irq_handler+0x48/0x90 [mtk_scp (HASH:7046 3)]
> irq_thread_fn+0x38/0x94
> irq_thread+0x100/0x1c0
> kthread+0x140/0x1fc
> ret_from_fork+0x10/0x30
> Code: 54000088 f94ca50a eb14015f 54000060 (f9400108)
> ---[ end trace ace43ce36cbd5c93 ]---
> Kernel panic - not syncing: Oops: Fatal exception
> SMP: stopping secondary CPUs
> Kernel Offset: 0x12c4000000 from 0xffffffc010000000
> PHYS_OFFSET: 0xffffffe580000000
> CPU features: 0x08240002,2188200c
> Memory Limit: none
>
> Fixes: 655b86e52eac ("media: mediatek: vcodec: Fix possible invalid memory access for decoder")
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
Hi Yunfei,

On Thursday, 22 February 2024 at 17:26 +0800, Yunfei Dong wrote:
> The ctx_list is deleted when the SCP behaves unexpectedly, which leaves
> ctx_list->next NULL. The kernel driver may then dereference a NULL pointer
> in vpu_dec_ipi_handler() while walking the contexts, and the system reboots.
>
> Add a lock to protect ctx_list so that ctx_list->next cannot be NULL while
> the list is being walked.
>
> Hardware name: Google juniper sku16 board (DT)
> pstate: 20400005 (nzCv daif +PAN -UAO -TCO BTYPE=--)
> pc : vpu_dec_ipi_handler+0x58/0x1f8 [mtk_vcodec_dec]
> lr : scp_ipi_handler+0xd0/0x194 [mtk_scp]
> sp : ffffffc0131dbbd0
> x29: ffffffc0131dbbd0 x28: 0000000000000000
> x27: ffffff9bb277f348 x26: ffffff9bb242ad00
> x25: ffffffd2d440d3b8 x24: ffffffd2a13ff1d4
> x23: ffffff9bb7fe85a0 x22: ffffffc0133fbdb0
> x21: 0000000000000010 x20: ffffff9b050ea328
> x19: ffffffc0131dbc08 x18: 0000000000001000
> x17: 0000000000000000 x16: ffffffd2d461c6e0
> x15: 0000000000000242 x14: 000000000000018f
> x13: 000000000000004d x12: 0000000000000000
> x11: 0000000000000001 x10: fffffffffffffff0
> x9 : ffffff9bb6e793a8 x8 : 0000000000000000
> x7 : 0000000000000000 x6 : 000000000000003f
> x5 : 0000000000000040 x4 : fffffffffffffff0
> x3 : 0000000000000020 x2 : ffffff9bb6e79080
> x1 : 0000000000000010 x0 : ffffffc0131dbc08
> Call trace:
> vpu_dec_ipi_handler+0x58/0x1f8 [mtk_vcodec_dec (HASH:6c3f 2)]
> scp_ipi_handler+0xd0/0x194 [mtk_scp (HASH:7046 3)]
> mt8183_scp_irq_handler+0x44/0x88 [mtk_scp (HASH:7046 3)]
> scp_irq_handler+0x48/0x90 [mtk_scp (HASH:7046 3)]
> irq_thread_fn+0x38/0x94
> irq_thread+0x100/0x1c0
> kthread+0x140/0x1fc
> ret_from_fork+0x10/0x30
> Code: 54000088 f94ca50a eb14015f 54000060 (f9400108)
> ---[ end trace ace43ce36cbd5c93 ]---
> Kernel panic - not syncing: Oops: Fatal exception
> SMP: stopping secondary CPUs
> Kernel Offset: 0x12c4000000 from 0xffffffc010000000
> PHYS_OFFSET: 0xffffffe580000000
> CPU features: 0x08240002,2188200c
> Memory Limit: none
>
> Fixes: 655b86e52eac ("media: mediatek: vcodec: Fix possible invalid memory access for decoder")
> Signed-off-by: Yunfei Dong <yunfei.dong@mediatek.com>

I've been experiencing this crasher recently, so nice to see you found the problem.

Reviewed-by: Nicolas Dufresne <nicolas.dufresne@collabora.com>

> ---
> .../platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c | 4 ++--
> .../platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c | 5 +++++
> .../platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h | 2 ++
> drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c | 2 ++
> 4 files changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c b/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c
> index 9f6e4b59455da..9a11a2c248045 100644
> --- a/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c
> +++ b/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c
> @@ -58,12 +58,12 @@ static void mtk_vcodec_vpu_reset_dec_handler(void *priv)
>
>  	dev_err(&dev->plat_dev->dev, "Watchdog timeout!!");
>
> -	mutex_lock(&dev->dev_mutex);
> +	mutex_lock(&dev->dev_ctx_lock);
>  	list_for_each_entry(ctx, &dev->ctx_list, list) {
>  		ctx->state = MTK_STATE_ABORT;
>  		mtk_v4l2_vdec_dbg(0, ctx, "[%d] Change to state MTK_STATE_ABORT", ctx->id);
>  	}
> -	mutex_unlock(&dev->dev_mutex);
> +	mutex_unlock(&dev->dev_ctx_lock);
>  }
>
>  static void mtk_vcodec_vpu_reset_enc_handler(void *priv)
> diff --git a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c
> index ad9b68380692f..d69c9fe2af6f3 100644
> --- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c
> +++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c
> @@ -267,7 +267,9 @@ static int fops_vcodec_open(struct file *file)
>
>  	ctx->dev->vdec_pdata->init_vdec_params(ctx);
>
> +	mutex_lock(&dev->dev_ctx_lock);
>  	list_add(&ctx->list, &dev->ctx_list);
> +	mutex_unlock(&dev->dev_ctx_lock);
>  	mtk_vcodec_dbgfs_create(ctx);
>
>  	mutex_unlock(&dev->dev_mutex);
> @@ -310,7 +312,9 @@ static int fops_vcodec_release(struct file *file)
>  	v4l2_ctrl_handler_free(&ctx->ctrl_hdl);
>
>  	mtk_vcodec_dbgfs_remove(dev, ctx->id);
> +	mutex_lock(&dev->dev_ctx_lock);
>  	list_del_init(&ctx->list);
> +	mutex_unlock(&dev->dev_ctx_lock);
>  	kfree(ctx);
>  	mutex_unlock(&dev->dev_mutex);
>  	return 0;
> @@ -403,6 +407,7 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
>  	for (i = 0; i < MTK_VDEC_HW_MAX; i++)
>  		mutex_init(&dev->dec_mutex[i]);
>  	mutex_init(&dev->dev_mutex);
> +	mutex_init(&dev->dev_ctx_lock);
>  	spin_lock_init(&dev->irqlock);
>
>  	snprintf(dev->v4l2_dev.name, sizeof(dev->v4l2_dev.name), "%s",
> diff --git a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
> index 849b89dd205c2..85b2c0d3d8bcd 100644
> --- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
> +++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
> @@ -241,6 +241,7 @@ struct mtk_vcodec_dec_ctx {
>   *
>   * @dec_mutex: decoder hardware lock
>   * @dev_mutex: video_device lock
> + * @dev_ctx_lock: the lock of context list
>   * @decode_workqueue: decode work queue
>   *
>   * @irqlock: protect data access by irq handler and work thread
> @@ -282,6 +283,7 @@ struct mtk_vcodec_dec_dev {
>  	/* decoder hardware mutex lock */
>  	struct mutex dec_mutex[MTK_VDEC_HW_MAX];
>  	struct mutex dev_mutex;
> +	struct mutex dev_ctx_lock;
>  	struct workqueue_struct *decode_workqueue;
>
>  	spinlock_t irqlock;
> diff --git a/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c b/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
> index 82e57ae983d55..da6be556727bb 100644
> --- a/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
> +++ b/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
> @@ -77,12 +77,14 @@ static bool vpu_dec_check_ap_inst(struct mtk_vcodec_dec_dev *dec_dev, struct vde
>  	struct mtk_vcodec_dec_ctx *ctx;
>  	int ret = false;
>
> +	mutex_lock(&dec_dev->dev_ctx_lock);
>  	list_for_each_entry(ctx, &dec_dev->ctx_list, list) {
>  		if (!IS_ERR_OR_NULL(ctx) && ctx->vpu_inst == vpu) {
>  			ret = true;
>  			break;
>  		}
>  	}
> +	mutex_unlock(&dec_dev->dev_ctx_lock);
>
>  	return ret;
>  }
Hi,

On Thursday, 22 February 2024 at 17:26 +0800, Yunfei Dong wrote:
> The ctx_list is deleted when the SCP behaves unexpectedly, which leaves
> ctx_list->next NULL. The kernel driver may then dereference a NULL pointer
> in vpu_dec_ipi_handler() while walking the contexts, and the system reboots.
>
> Add a lock to protect ctx_list so that ctx_list->next cannot be NULL while
> the list is being walked.

The cited crash no longer occurs for me, but the driver still sometimes crashes
while the SCP is being rebooted. I think this patch can still go in, as it
improves the situation overall.

Meanwhile, here's my stress test using GStreamer and a stream downloaded by
fluster. I call this script a few times this way, as it does not always crash.
The test just keeps starting decode sessions and terminates them after 2
seconds. It is highly parallel. Using too low a number does not reproduce the
crash; using too high a number leads to allocation failures, which wasn't the
goal of this test.

./mtk-vcodec-crash.sh 100

Script code:
***
#!/bin/bash

test() {
  gst-launch-1.0 --no-fault filesrc location=TILES_B_Cisco_1.bin ! h265parse ! v4l2slh265dec ! fakevideosink &
  pid=$!

  sleep 2
  kill $pid
}

for i in $(seq 1 $1)
do
  test &
done

wait
***

The kernel crash:
[ 93.261248] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
[ 93.270056] Mem abort info:
[ 93.272880] ESR = 0x0000000096000004
[ 93.276804] EC = 0x25: DABT (current EL), IL = 32 bits
[ 93.282233] SET = 0, FnV = 0
[ 93.285372] EA = 0, S1PTW = 0
[ 93.288561] FSC = 0x04: level 0 translation fault
[ 93.293493] Data abort info:
[ 93.296424] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
[ 93.301920] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
[ 93.306977] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
[ 93.312321] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000168daf000
[ 93.318790] [0000000000000008] pgd=0000000000000000, p4d=0000000000000000
[ 93.325588] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
[ 93.331842] Modules linked in: mt7921e mt7921_common mt792x_lib mt76_connac_lib mt76 mac80211 btusb btintel mtk_vcodec_dec_hw btmtk btrtl mtk_vcodec_dec btbcm cfg80211 bluetooth snd_sof_mt8195 mtk_vcodec_enc mtk_adsp_common uvcvideo v4l2_vp9 snd_sof_xtensa_dsp v4l2_h264 mtk_vcodec_dbgfs snd_sof_of snd_sof ecdh_generic mtk_vcodec_common ecc uvc elan_i2c videobuf2_vmalloc crct10dif_ce cros_ec_lid_angle cros_ec_sensors snd_sof_utils cros_ec_sensors_core cros_usbpd_logger cros_usbpd_charger fuse ip_tables ipv6
[ 93.376652] CPU: 5 PID: 3210 Comm: h265parse0:sink Tainted: G W 6.8.0-rc4-next-20240212+ #14
[ 93.386463] Hardware name: Acer Tomato (rev3 - 4) board (DT)
[ 93.392107] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 93.399054] pc : vcodec_vpu_send_msg+0x4c/0x190 [mtk_vcodec_dec]
[ 93.405058] lr : vcodec_send_ap_ipi+0x78/0x170 [mtk_vcodec_dec]
[ 93.410968] sp : ffff80008750bc20
[ 93.414269] x29: ffff80008750bc20 x28: ffff1299f6d70000 x27: 0000000000000000
[ 93.421391] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
[ 93.428512] x23: ffff80008750bc98 x22: 000000000000a003 x21: ffffd45c4cfae000
[ 93.435632] x20: 0000000000000010 x19: ffff1299fd668310 x18: 000000000000001a
[ 93.442753] x17: 000000040044ffff x16: ffffd45cb15dc648 x15: 0000000000000000
[ 93.449874] x14: ffff1299c08da1c0 x13: ffffd45cb1f87a10 x12: ffffd45cb2f5fe80
[ 93.456995] x11: 0000000000000001 x10: 0000000000001b30 x9 : ffffd45c4d12b488
[ 93.464116] x8 : 1fffe25339380d81 x7 : 0000000000000001 x6 : ffff1299c9c06c00
[ 93.471236] x5 : 0000000000000132 x4 : 0000000000000000 x3 : 0000000000000000
[ 93.478358] x2 : 0000000000000010 x1 : ffff80008750bc98 x0 : 0000000000000000
[ 93.485479] Call trace:
[ 93.487914] vcodec_vpu_send_msg+0x4c/0x190 [mtk_vcodec_dec]
[ 93.493563] vcodec_send_ap_ipi+0x78/0x170 [mtk_vcodec_dec]
[ 93.499125] vpu_dec_deinit+0x1c/0x30 [mtk_vcodec_dec]
[ 93.504254] vdec_hevc_slice_deinit+0x30/0x98 [mtk_vcodec_dec]
[ 93.510076] vdec_if_deinit+0x38/0x68 [mtk_vcodec_dec]
[ 93.515205] mtk_vcodec_dec_release+0x20/0x40 [mtk_vcodec_dec]
[ 93.521027] fops_vcodec_release+0x64/0x118 [mtk_vcodec_dec]
[ 93.526677] v4l2_release+0x7c/0x100
[ 93.530245] __fput+0x80/0x2d8
[ 93.533292] __fput_sync+0x58/0x70
[ 93.536681] __arm64_sys_close+0x40/0x90
[ 93.540590] invoke_syscall+0x50/0x128
[ 93.544329] el0_svc_common.constprop.0+0x48/0xf0
[ 93.549020] do_el0_svc+0x24/0x38
[ 93.552323] el0_svc+0x38/0xd8
[ 93.555367] el0t_64_sync_handler+0xc0/0xc8
[ 93.559537] el0t_64_sync+0x1a8/0x1b0
[ 93.563189] Code: d503201f f9401660 b900127f b900227f (f9400400)
[ 93.569268] ---[ end trace 0000000000000000 ]---
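
As a reading aid for the two oops reports in this thread, the sketch below decodes the word shown in parentheses on each `Code:` line, assuming (as is usual for arm64 oops output) that it is the faulting instruction and that it uses the A64 64-bit `LDR Xt, [Xn, #imm]` unsigned-offset encoding. The helper `decode_ldr_x_uoff()` and `struct ldr_x` are hypothetical names made up for this illustration; they are not kernel or toolchain APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decoded fields of an A64 "LDR Xt, [Xn, #imm]" (64-bit, unsigned offset). */
struct ldr_x {
	unsigned rt;		/* destination register number */
	unsigned rn;		/* base register number */
	unsigned offset;	/* byte offset, i.e. imm12 scaled by 8 */
};

/*
 * Returns true and fills *out if insn is a 64-bit LDR (immediate,
 * unsigned offset); returns false for any other instruction word.
 */
bool decode_ldr_x_uoff(uint32_t insn, struct ldr_x *out)
{
	/* Bits [31:22] are 1111100101 for this encoding. */
	if ((insn >> 22) != 0x3E5)
		return false;

	out->rt = insn & 0x1F;				/* bits [4:0]   */
	out->rn = (insn >> 5) & 0x1F;			/* bits [9:5]   */
	out->offset = ((insn >> 10) & 0xFFF) * 8;	/* bits [21:10] */
	return true;
}
```

Under those assumptions, `0xf9400400` from the HEVC release crash decodes to `ldr x0, [x0, #8]`; with `x0` equal to zero in the register dump, that matches the reported fault address `0000000000000008`. The word `0xf9400108` from the oops in the commit message decodes to `ldr x8, [x8]`, and `x8` is zero in that dump as well.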
Hi Yunfei,

On Monday, 26 February 2024 at 14:39 -0500, Nicolas Dufresne wrote:
> Hi,
>
> On Thursday, 22 February 2024 at 17:26 +0800, Yunfei Dong wrote:
> > The ctx_list is deleted when the SCP behaves unexpectedly, which leaves
> > ctx_list->next NULL. The kernel driver may then dereference a NULL pointer
> > in vpu_dec_ipi_handler() while walking the contexts, and the system reboots.
> >
> > Add a lock to protect ctx_list so that ctx_list->next cannot be NULL while
> > the list is being walked.
>
> The cited crash no longer occurs for me, but the driver still sometimes crashes
> while the SCP is being rebooted. I think this patch can still go in, as it
> improves the situation overall.
>
> Meanwhile, here's my stress test using GStreamer and a stream downloaded by
> fluster. I call this script a few times this way, as it does not always crash.
> The test just keeps starting decode sessions and terminates them after 2
> seconds. It is highly parallel. Using too low a number does not reproduce the
> crash; using too high a number leads to allocation failures, which wasn't the
> goal of this test.

I just sent a fix for that crash; it was limited to HEVC.

https://lore.kernel.org/all/20240226211954.400891-1-nicolas.dufresne@collabora.com/

With this applied, the kernel no longer crashes, but the SCP gets reset every
time I run the script below. Will you be able to provide a firmware (or a
driver, if that turns out to be the issue) for this?

regards,
Nicolas

>
> ./mtk-vcodec-crash.sh 100
>
> Script code:
> ***
> #!/bin/bash
>
> test() {
>   gst-launch-1.0 --no-fault filesrc location=TILES_B_Cisco_1.bin ! h265parse ! v4l2slh265dec ! fakevideosink &
>   pid=$!
>
>   sleep 2
>   kill $pid
> }
>
> for i in $(seq 1 $1)
> do
>   test &
> done
>
> wait
> ***
>
> The kernel crash:
> [ 93.261248] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> [ 93.270056] Mem abort info:
> [ 93.272880] ESR = 0x0000000096000004
> [ 93.276804] EC = 0x25: DABT (current EL), IL = 32 bits
> [ 93.282233] SET = 0, FnV = 0
> [ 93.285372] EA = 0, S1PTW = 0
> [ 93.288561] FSC = 0x04: level 0 translation fault
> [ 93.293493] Data abort info:
> [ 93.296424] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000
> [ 93.301920] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
> [ 93.306977] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> [ 93.312321] user pgtable: 4k pages, 48-bit VAs, pgdp=0000000168daf000
> [ 93.318790] [0000000000000008] pgd=0000000000000000, p4d=0000000000000000
> [ 93.325588] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP
> [ 93.331842] Modules linked in: mt7921e mt7921_common mt792x_lib mt76_connac_lib mt76 mac80211 btusb btintel mtk_vcodec_dec_hw btmtk btrtl mtk_vcodec_dec btbcm cfg80211 bluetooth snd_sof_mt8195 mtk_vcodec_enc mtk_adsp_common uvcvideo v4l2_vp9 snd_sof_xtensa_dsp v4l2_h264 mtk_vcodec_dbgfs snd_sof_of snd_sof ecdh_generic mtk_vcodec_common ecc uvc elan_i2c videobuf2_vmalloc crct10dif_ce cros_ec_lid_angle cros_ec_sensors snd_sof_utils cros_ec_sensors_core cros_usbpd_logger cros_usbpd_charger fuse ip_tables ipv6
> [ 93.376652] CPU: 5 PID: 3210 Comm: h265parse0:sink Tainted: G W 6.8.0-rc4-next-20240212+ #14
> [ 93.386463] Hardware name: Acer Tomato (rev3 - 4) board (DT)
> [ 93.392107] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [ 93.399054] pc : vcodec_vpu_send_msg+0x4c/0x190 [mtk_vcodec_dec]
> [ 93.405058] lr : vcodec_send_ap_ipi+0x78/0x170 [mtk_vcodec_dec]
> [ 93.410968] sp : ffff80008750bc20
> [ 93.414269] x29: ffff80008750bc20 x28: ffff1299f6d70000 x27: 0000000000000000
> [ 93.421391] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
> [ 93.428512] x23: ffff80008750bc98 x22: 000000000000a003 x21: ffffd45c4cfae000
> [ 93.435632] x20: 0000000000000010 x19: ffff1299fd668310 x18: 000000000000001a
> [ 93.442753] x17: 000000040044ffff x16: ffffd45cb15dc648 x15: 0000000000000000
> [ 93.449874] x14: ffff1299c08da1c0 x13: ffffd45cb1f87a10 x12: ffffd45cb2f5fe80
> [ 93.456995] x11: 0000000000000001 x10: 0000000000001b30 x9 : ffffd45c4d12b488
> [ 93.464116] x8 : 1fffe25339380d81 x7 : 0000000000000001 x6 : ffff1299c9c06c00
> [ 93.471236] x5 : 0000000000000132 x4 : 0000000000000000 x3 : 0000000000000000
> [ 93.478358] x2 : 0000000000000010 x1 : ffff80008750bc98 x0 : 0000000000000000
> [ 93.485479] Call trace:
> [ 93.487914] vcodec_vpu_send_msg+0x4c/0x190 [mtk_vcodec_dec]
> [ 93.493563] vcodec_send_ap_ipi+0x78/0x170 [mtk_vcodec_dec]
> [ 93.499125] vpu_dec_deinit+0x1c/0x30 [mtk_vcodec_dec]
> [ 93.504254] vdec_hevc_slice_deinit+0x30/0x98 [mtk_vcodec_dec]
> [ 93.510076] vdec_if_deinit+0x38/0x68 [mtk_vcodec_dec]
> [ 93.515205] mtk_vcodec_dec_release+0x20/0x40 [mtk_vcodec_dec]
> [ 93.521027] fops_vcodec_release+0x64/0x118 [mtk_vcodec_dec]
> [ 93.526677] v4l2_release+0x7c/0x100
> [ 93.530245] __fput+0x80/0x2d8
> [ 93.533292] __fput_sync+0x58/0x70
> [ 93.536681] __arm64_sys_close+0x40/0x90
> [ 93.540590] invoke_syscall+0x50/0x128
> [ 93.544329] el0_svc_common.constprop.0+0x48/0xf0
> [ 93.549020] do_el0_svc+0x24/0x38
> [ 93.552323] el0_svc+0x38/0xd8
> [ 93.555367] el0t_64_sync_handler+0xc0/0xc8
> [ 93.559537] el0t_64_sync+0x1a8/0x1b0
> [ 93.563189] Code: d503201f f9401660 b900127f b900227f (f9400400)
> [ 93.569268] ---[ end trace 0000000000000000 ]---
diff --git a/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c b/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c
index 9f6e4b59455da..9a11a2c248045 100644
--- a/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c
+++ b/drivers/media/platform/mediatek/vcodec/common/mtk_vcodec_fw_vpu.c
@@ -58,12 +58,12 @@ static void mtk_vcodec_vpu_reset_dec_handler(void *priv)
 
 	dev_err(&dev->plat_dev->dev, "Watchdog timeout!!");
 
-	mutex_lock(&dev->dev_mutex);
+	mutex_lock(&dev->dev_ctx_lock);
 	list_for_each_entry(ctx, &dev->ctx_list, list) {
 		ctx->state = MTK_STATE_ABORT;
 		mtk_v4l2_vdec_dbg(0, ctx, "[%d] Change to state MTK_STATE_ABORT", ctx->id);
 	}
-	mutex_unlock(&dev->dev_mutex);
+	mutex_unlock(&dev->dev_ctx_lock);
 }
 
 static void mtk_vcodec_vpu_reset_enc_handler(void *priv)
diff --git a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c
index ad9b68380692f..d69c9fe2af6f3 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.c
@@ -267,7 +267,9 @@ static int fops_vcodec_open(struct file *file)
 
 	ctx->dev->vdec_pdata->init_vdec_params(ctx);
 
+	mutex_lock(&dev->dev_ctx_lock);
 	list_add(&ctx->list, &dev->ctx_list);
+	mutex_unlock(&dev->dev_ctx_lock);
 	mtk_vcodec_dbgfs_create(ctx);
 
 	mutex_unlock(&dev->dev_mutex);
@@ -310,7 +312,9 @@ static int fops_vcodec_release(struct file *file)
 	v4l2_ctrl_handler_free(&ctx->ctrl_hdl);
 
 	mtk_vcodec_dbgfs_remove(dev, ctx->id);
+	mutex_lock(&dev->dev_ctx_lock);
 	list_del_init(&ctx->list);
+	mutex_unlock(&dev->dev_ctx_lock);
 	kfree(ctx);
 	mutex_unlock(&dev->dev_mutex);
 	return 0;
@@ -403,6 +407,7 @@ static int mtk_vcodec_probe(struct platform_device *pdev)
 	for (i = 0; i < MTK_VDEC_HW_MAX; i++)
 		mutex_init(&dev->dec_mutex[i]);
 	mutex_init(&dev->dev_mutex);
+	mutex_init(&dev->dev_ctx_lock);
 	spin_lock_init(&dev->irqlock);
 
 	snprintf(dev->v4l2_dev.name, sizeof(dev->v4l2_dev.name), "%s",
diff --git a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
index 849b89dd205c2..85b2c0d3d8bcd 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
+++ b/drivers/media/platform/mediatek/vcodec/decoder/mtk_vcodec_dec_drv.h
@@ -241,6 +241,7 @@ struct mtk_vcodec_dec_ctx {
  *
  * @dec_mutex: decoder hardware lock
  * @dev_mutex: video_device lock
+ * @dev_ctx_lock: the lock of context list
  * @decode_workqueue: decode work queue
  *
  * @irqlock: protect data access by irq handler and work thread
@@ -282,6 +283,7 @@ struct mtk_vcodec_dec_dev {
 	/* decoder hardware mutex lock */
 	struct mutex dec_mutex[MTK_VDEC_HW_MAX];
 	struct mutex dev_mutex;
+	struct mutex dev_ctx_lock;
 	struct workqueue_struct *decode_workqueue;
 
 	spinlock_t irqlock;
diff --git a/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c b/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
index 82e57ae983d55..da6be556727bb 100644
--- a/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
+++ b/drivers/media/platform/mediatek/vcodec/decoder/vdec_vpu_if.c
@@ -77,12 +77,14 @@ static bool vpu_dec_check_ap_inst(struct mtk_vcodec_dec_dev *dec_dev, struct vde
 	struct mtk_vcodec_dec_ctx *ctx;
 	int ret = false;
 
+	mutex_lock(&dec_dev->dev_ctx_lock);
 	list_for_each_entry(ctx, &dec_dev->ctx_list, list) {
 		if (!IS_ERR_OR_NULL(ctx) && ctx->vpu_inst == vpu) {
 			ret = true;
 			break;
 		}
 	}
+	mutex_unlock(&dec_dev->dev_ctx_lock);
 
 	return ret;
 }
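
One effect visible in the diff is that the watchdog reset handler now takes only `dev_ctx_lock`, while `fops_vcodec_open()` and `fops_vcodec_release()` take `dev_ctx_lock` nested inside `dev_mutex`. The userspace sketch below illustrates that split with POSIX mutexes; it is an illustration under stated assumptions, not driver code, and `struct dev`, `struct ctx`, `abort_all_ctx()` and `reset_demo()` are hypothetical names. It holds the coarse lock, as a long-running open or release would, and shows that the list walk can still proceed because it depends only on the narrow lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's context states. */
enum ctx_state { STATE_INIT, STATE_ABORT };

struct ctx {
	enum ctx_state state;
	struct ctx *next;
};

struct dev {
	pthread_mutex_t dev_mutex;	/* coarse lock, held across open/release */
	pthread_mutex_t ctx_lock;	/* narrow lock, guards ctx_list only */
	struct ctx *ctx_list;
};

/* Modelled on the reset handler after the patch: only ctx_lock is taken. */
int abort_all_ctx(struct dev *dev)
{
	int n = 0;

	pthread_mutex_lock(&dev->ctx_lock);
	for (struct ctx *c = dev->ctx_list; c; c = c->next) {
		c->state = STATE_ABORT;
		n++;
	}
	pthread_mutex_unlock(&dev->ctx_lock);
	return n;
}

/* Returns the number of contexts aborted while dev_mutex was held, or -1. */
int reset_demo(void)
{
	struct ctx a = { STATE_INIT, NULL };
	struct ctx b = { STATE_INIT, &a };
	struct dev dev;
	int busy, aborted;

	pthread_mutex_init(&dev.dev_mutex, NULL);
	pthread_mutex_init(&dev.ctx_lock, NULL);
	dev.ctx_list = &b;

	/* Pretend an open() or release() is in progress. */
	pthread_mutex_lock(&dev.dev_mutex);

	/* A handler that needed dev_mutex could not make progress now... */
	busy = pthread_mutex_trylock(&dev.dev_mutex);

	/* ...but one that only needs ctx_lock can. */
	aborted = abort_all_ctx(&dev);

	pthread_mutex_unlock(&dev.dev_mutex);

	assert(a.state == STATE_ABORT && b.state == STATE_ABORT);
	pthread_mutex_destroy(&dev.ctx_lock);
	pthread_mutex_destroy(&dev.dev_mutex);
	return busy == EBUSY ? aborted : -1;
}
```

Because the sketch always takes the coarse lock before the narrow one and never the other way round, it keeps a single lock order, which is also what the open and release paths in the diff do.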