Message ID | 20230503083438.85139-1-benjamin.gaignard@collabora.com |
---|---|
Headers |
From: Benjamin Gaignard <benjamin.gaignard@collabora.com>
To: ezequiel@vanguardiasur.com.ar, p.zabel@pengutronix.de, mchehab@kernel.org, robh+dt@kernel.org, krzysztof.kozlowski+dt@linaro.org, heiko@sntech.de, hverkuil-cisco@xs4all.nl, nicolas.dufresne@collabora.com
Cc: linux-media@vger.kernel.org, linux-rockchip@lists.infradead.org, devicetree@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kernel@collabora.com, Benjamin Gaignard <benjamin.gaignard@collabora.com>
Subject: [PATCH v7 00/13] AV1 stateless decoder for RK3588
Date: Wed, 3 May 2023 10:34:25 +0200
Message-Id: <20230503083438.85139-1-benjamin.gaignard@collabora.com>
X-Mailer: git-send-email 2.34.1 |
Series |
AV1 stateless decoder for RK3588
|
|
Message
Benjamin Gaignard
May 3, 2023, 8:34 a.m. UTC
This series implements an AV1 stateless decoder for the RK3588 SoC.
The hardware supports 8-bit and 10-bit bitstreams up to 7680x4320.
AV1 features like film grain or scaling are handled by the post-processor.
The driver can produce the NV12_4L4, NV12_10LE40_4L4, NV12 and P010 pixel formats.
Even though Rockchip has named the hardware VPU981, it looks like a VC9000
with a different register mapping.

The full branch can be found here:
https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v7

Fluster score is 200/239 when testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
The failing tests are:
- the 2 tests with 2 spatial layers: a few errors in luma/chroma values
- tests with a resolution below the hardware minimum (64x64)
- the 10-bit film grain test: bad macroblocks while decoding; the same test
  in 8-bit works fine.

Changes in v7:
- Rebased on media_tree master branch.
- Fix warnings exposed by W=1.
- Address Angelo's comments.

Changes in v6:
- Rename NV12_10LE40_4L4 pixel format to NV15_4L4.
- Add defines for post-proc selection.
- Change patch order as requested by Nicolas.
- Fix frame-larger-than warning.

Changes in v5:
- Add a patch to initialize the bit_depth field of the V4L2_CTRL_TYPE_AV1_SEQUENCE
  ioctl.

Changes in v4:
- Squash "Save bit depth for AV1 decoder" and "Check AV1 bitstreams bit
  depth" patches.
- Double the motion vectors buffer size.
- Fix the various errors reported by Hans.

Changes in v3:
- Fix array loop limits.
- Remove unused field.
- Reset the raw pixel formats list when the bit depth or film grain feature
  values change.
- Enable post-processor P010 support.

Changes in v2:
- Remove useless +1 in sbs computation.
- Describe the NV12_10LE40_4L4 pixel format.
- The post-processor can generate P010.
- Address comments made on v1.
- The last patch makes sure that only post-processed formats are used when the
  film grain feature is enabled.

Benjamin

Benjamin Gaignard (12):
  dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
  media: AV1: Make sure that bit depth in correctly initialize
  media: Add NV15_4L4 pixel format
  media: verisilicon: Get bit depth for V4L2_PIX_FMT_NV15_4L4
  media: verisilicon: Add AV1 decoder mode and controls
  media: verisilicon: Check AV1 bitstreams bit depth
  media: verisilicon: Compute motion vectors size for AV1 frames
  media: verisilicon: Add AV1 entropy helpers
  media: verisilicon: Add Rockchip AV1 decoder
  media: verisilicon: Add film grain feature to AV1 driver
  media: verisilicon: Enable AV1 decoder on rk3588
  media: verisilicon: Conditionally ignore native formats

Nicolas Dufresne (1):
  v4l2-common: Add support for fractional bpp

 .../bindings/media/rockchip-vpu.yaml | 1 +
 .../media/v4l/pixfmt-yuv-planar.rst | 16 +
 drivers/media/platform/verisilicon/Makefile | 3 +
 drivers/media/platform/verisilicon/hantro.h | 8 +
 .../media/platform/verisilicon/hantro_drv.c | 68 +-
 .../media/platform/verisilicon/hantro_hw.h | 102 +
 .../platform/verisilicon/hantro_postproc.c | 9 +-
 .../media/platform/verisilicon/hantro_v4l2.c | 67 +-
 .../media/platform/verisilicon/hantro_v4l2.h | 8 +-
 .../verisilicon/rockchip_av1_entropymode.c | 4424 +++++++++++++++++
 .../verisilicon/rockchip_av1_entropymode.h | 272 +
 .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
 .../verisilicon/rockchip_av1_filmgrain.h | 36 +
 .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2232 +++++++++
 .../verisilicon/rockchip_vpu981_regs.h | 477 ++
 .../platform/verisilicon/rockchip_vpu_hw.c | 134 +
 drivers/media/v4l2-core/v4l2-common.c | 162 +-
 drivers/media/v4l2-core/v4l2-ctrls-core.c | 5 +
 drivers/media/v4l2-core/v4l2-ioctl.c | 1 +
 include/media/v4l2-common.h | 2 +
 include/uapi/linux/videodev2.h | 1 +
 21 files changed, 8326 insertions(+), 103 deletions(-)
 create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
 create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
 create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
 create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
 create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
 create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
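
The series adds a packed 10-bit format (NV12_10LE40_4L4, renamed NV15_4L4 in v6) together with a "fractional bpp" patch for v4l2-common. As a rough illustration of why a fractional bytes-per-pixel value is needed, the sketch below computes the size of one luma row for three sample layouts. It assumes the usual packing of four 10-bit samples into 5 bytes (40 bits) and ignores the 4x4 tiling entirely. The helper names (`row_bytes` and friends) are hypothetical and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Bytes needed for one row of `width` samples when `bpp` bytes cover
 * `bpp_div` samples. Hypothetical helper, for illustration only.
 */
static uint32_t row_bytes(uint32_t width, uint32_t bpp, uint32_t bpp_div)
{
	assert(bpp_div != 0);
	/* Round up so that a partial group of samples still gets whole bytes. */
	return (width * bpp + bpp_div - 1) / bpp_div;
}

/* 8-bit luma (NV12-style): 1 byte per sample. */
static uint32_t luma_row_bytes_8bit(uint32_t width)
{
	return row_bytes(width, 1, 1);
}

/* Packed 10-bit luma (assumed: 4 samples in 5 bytes): 5/4 byte per sample. */
static uint32_t luma_row_bytes_10bit_packed(uint32_t width)
{
	return row_bytes(width, 5, 4);
}

/* P010-style luma: each 10-bit sample stored in a 16-bit word. */
static uint32_t luma_row_bytes_p010(uint32_t width)
{
	return row_bytes(width, 2, 1);
}
```

With these assumptions, a 1920-sample luma row takes 1920 bytes at 8 bits, 2400 bytes in the packed 10-bit layout and 3840 bytes in the P010 layout; an integer-only bytes-per-pixel field cannot express the 5/4 ratio of the packed case.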
Comments
On Wednesday, 3 May 2023 at 10:34 +0200, Benjamin Gaignard wrote:
> This series implement AV1 stateless decoder for RK3588 SoC.
> The hardware support 8 and 10 bits bitstreams up to 7680x4320.
> AV1 feature like film grain or scaling are done by the postprocessor.
> The driver can produce NV12_4L4, NV12_10LE40_4L4, NV12 and P010 pixels formats.
> Even if Rockchip have named the hardware VPU981 it looks like a VC9000 but
> with a different registers mapping.

Just wanted to add a comment about the series. After discussion with Ezequiel, it seems this driver is getting quite big; at some point we should probably split it, so that the newer chips have a driver free from legacy, and also free from having encoders in the same driver. But as usual, whoever does the work tends to make the rules, and this one did not turn out too badly; a lot of the extra work ended up being related to issues with the VP9/G2 integration. G2 cores are much closer to VC8000D/9000D than G1 is, and that might have been the right moment to make the split, but we kind of missed that opportunity. The difference is that G2 had highly limited post-processing support, while VC8000D/9000D have the same post-processor, which is similar in functionality to the G1 post-processor.

**Informative, feel free to skip the rest**

Because I was asked at least twice last week: the reason for having encoders in that driver is that the Hantro H1 and G1 cores share the same cache storage and cannot be run concurrently. The solution that was picked was to place them both in the same driver, sharing the same m2m ctx and leaving it to the kernel scheduler to dispatch on both encoder and decoder cores. Newer VSI cores can run concurrently.

In newer chip generations, notably RKVDEC2 and probably VC9000D (we don't have the spec), the multicore model will not work with such a scheduling model, since an 8K stream needs 2 cores, whereas those two cores can be used concurrently for other streams. The scheduling will have to be fancier to ensure we don't starve the 8K+ streams. Having a clean driver to do so will help. This AV1 core does not have 8K support with this method of binding two cores together, so I see no harm in going forward similarly to what we did for the G2 core (HEVC/VP9).

Perhaps there are going to be challenges in possibly moving the implementation in the future. I'd be very happy to get feedback about this, so that we can help Ezequiel here in making sure this driver does not become a giant blob of unrelated but similar chips all under the same driver.

regards,
Nicolas

>
> The full branch can be found here:
> https://gitlab.collabora.com/linux/for-upstream/-/commits/rk3588_av1_decoder_v7
>
> Fluster score is: 200/239 while testing AV1-TEST-VECTORS with GStreamer-AV1-V4L2SL-Gst1.0.
> The failing tests are:
> - the 2 tests with 2 spatial layers: few errors in luma/chroma values
> - tests with resolution < hardware limit (64x64)
> - 10bits film grain test: bad macroblocks while decoding, the same 8bits
>   test is working fine.
>
> Changes in v7:
> - Rebased on media_tree master branch.
> - Fix warnings exposed by W=1
> - Fix Angelo's comments
>
> Changes in v6:
> - Rename NV12_10LE40_4L4 pixel format into NV15_4L4.
> - Add defines for post-proc selection.
> - Change patch order as requested by Nicolas.
> - Fix frame-larger-than warning.
>
> Changes in v5:
> - Add a patch to initialize bit_depth field of V4L2_CTRL_TYPE_AV1_SEQUENCE
>   ioctl.
>
> Changes in v4:
> - Squash "Save bit depth for AV1 decoder" and "Check AV1 bitstreams bit
>   depth" patches.
> - Double motion vectors buffer size.
> - Fix the various errors reported by Hans.
>
> Changes in v3:
> - Fix arrays loops limites.
> - Remove unused field.
> - Reset raw pixel formats list when bit depth or film grain feature
>   values change.
> - Enable post-processor P010 support
>
> Changes in v2:
> - Remove useless +1 in sbs computation.
> - Describe NV12_10LE40_4L4 pixels format.
> - Post-processor could generate P010.
> - Fix comments done on v1.
> - The last patch make sure that only post-processed formats are used when film
>   grain feature is enabled.
>
> Benjamin
>
>
> Benjamin Gaignard (12):
>   dt-bindings: media: rockchip-vpu: Add rk3588 vpu compatible
>   media: AV1: Make sure that bit depth in correctly initialize
>   media: Add NV15_4L4 pixel format
>   media: verisilicon: Get bit depth for V4L2_PIX_FMT_NV15_4L4
>   media: verisilicon: Add AV1 decoder mode and controls
>   media: verisilicon: Check AV1 bitstreams bit depth
>   media: verisilicon: Compute motion vectors size for AV1 frames
>   media: verisilicon: Add AV1 entropy helpers
>   media: verisilicon: Add Rockchip AV1 decoder
>   media: verisilicon: Add film grain feature to AV1 driver
>   media: verisilicon: Enable AV1 decoder on rk3588
>   media: verisilicon: Conditionally ignore native formats
>
> Nicolas Dufresne (1):
>   v4l2-common: Add support for fractional bpp
>
>  .../bindings/media/rockchip-vpu.yaml | 1 +
>  .../media/v4l/pixfmt-yuv-planar.rst | 16 +
>  drivers/media/platform/verisilicon/Makefile | 3 +
>  drivers/media/platform/verisilicon/hantro.h | 8 +
>  .../media/platform/verisilicon/hantro_drv.c | 68 +-
>  .../media/platform/verisilicon/hantro_hw.h | 102 +
>  .../platform/verisilicon/hantro_postproc.c | 9 +-
>  .../media/platform/verisilicon/hantro_v4l2.c | 67 +-
>  .../media/platform/verisilicon/hantro_v4l2.h | 8 +-
>  .../verisilicon/rockchip_av1_entropymode.c | 4424 +++++++++++++++++
>  .../verisilicon/rockchip_av1_entropymode.h | 272 +
>  .../verisilicon/rockchip_av1_filmgrain.c | 401 ++
>  .../verisilicon/rockchip_av1_filmgrain.h | 36 +
>  .../verisilicon/rockchip_vpu981_hw_av1_dec.c | 2232 +++++++++
>  .../verisilicon/rockchip_vpu981_regs.h | 477 ++
>  .../platform/verisilicon/rockchip_vpu_hw.c | 134 +
>  drivers/media/v4l2-core/v4l2-common.c | 162 +-
>  drivers/media/v4l2-core/v4l2-ctrls-core.c | 5 +
>  drivers/media/v4l2-core/v4l2-ioctl.c | 1 +
>  include/media/v4l2-common.h | 2 +
>  include/uapi/linux/videodev2.h | 1 +
>  21 files changed, 8326 insertions(+), 103 deletions(-)
>  create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.c
>  create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_entropymode.h
>  create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.c
>  create mode 100644 drivers/media/platform/verisilicon/rockchip_av1_filmgrain.h
>  create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_hw_av1_dec.c
>  create mode 100644 drivers/media/platform/verisilicon/rockchip_vpu981_regs.h
>
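
Nicolas's reply notes that on multicore decoders an 8K stream needs two cores while smaller streams can each use one, and that a naive scheduler could starve the 8K streams. The sketch below is one possible way to picture that problem, not anything proposed in the thread or present in the driver: a two-core pool where a pending "wide" job blocks new single-core jobs until both cores drain. All names (`core_pool`, `pool_try_start`, `pool_release`) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_CORES 2

/* Hypothetical two-core pool; not a kernel data structure. */
struct core_pool {
	bool busy[NUM_CORES];
	bool wide_pending;	/* a job needing every core is waiting */
};

static int pool_free_cores(const struct core_pool *p)
{
	int n = 0;

	for (int i = 0; i < NUM_CORES; i++)
		if (!p->busy[i])
			n++;
	return n;
}

/* Try to start a job that needs 1 core or all cores; true on success. */
static bool pool_try_start(struct core_pool *p, int needed)
{
	assert(needed == 1 || needed == NUM_CORES);

	if (needed == NUM_CORES) {
		if (pool_free_cores(p) < NUM_CORES) {
			/* Remember the wide job so narrow jobs stop jumping ahead. */
			p->wide_pending = true;
			return false;
		}
		for (int i = 0; i < NUM_CORES; i++)
			p->busy[i] = true;
		p->wide_pending = false;
		return true;
	}

	/* Refuse narrow jobs while a wide job waits, so the pool can drain. */
	if (p->wide_pending)
		return false;

	for (int i = 0; i < NUM_CORES; i++) {
		if (!p->busy[i]) {
			p->busy[i] = true;
			return true;
		}
	}
	return false;
}

static void pool_release(struct core_pool *p, int core)
{
	assert(core >= 0 && core < NUM_CORES);
	p->busy[core] = false;
}
```

The trade-off in this sketch is that one core may sit idle while the wide job waits for the other to finish; a real scheduler would have to weigh that lost throughput against the latency of the 8K stream.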