From patchwork Wed Jun 21 00:57:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: André Almeida
X-Patchwork-Id: 110725
From: André Almeida
To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com, alexander.deucher@amd.com,
 christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com, Simon Ser,
 Rob Clark, Pekka Paalanen, Daniel Vetter, Daniel Stone, 'Marek Olšák',
 Dave Airlie, Michel Dänzer, Samuel Pitoiset, Timur Kristóf,
 Bas Nieuwenhuizen, André Almeida
Subject: [RFC PATCH v3 1/4] drm/doc: Document DRM device reset expectations
Date: Tue, 20 Jun 2023 21:57:16 -0300
Message-ID: <20230621005719.836857-2-andrealmeid@igalia.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230621005719.836857-1-andrealmeid@igalia.com>
References: <20230621005719.836857-1-andrealmeid@igalia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Create a section that specifies how to deal with DRM device resets for
kernel and userspace drivers.

Signed-off-by: André Almeida
---
 Documentation/gpu/drm-uapi.rst | 65 ++++++++++++++++++++++++++++++++++
 1 file changed, 65 insertions(+)

diff --git a/Documentation/gpu/drm-uapi.rst b/Documentation/gpu/drm-uapi.rst
index 65fb3036a580..da4f8a694d8d 100644
--- a/Documentation/gpu/drm-uapi.rst
+++ b/Documentation/gpu/drm-uapi.rst
@@ -285,6 +285,71 @@ for GPU1 and GPU2 from different vendors, and a third handler for
 mmapped regular files. Threads cause additional pain with signal
 handling as well.
 
+Device reset
+============
+
+The GPU stack is complex and prone to errors, from hardware bugs to faulty
+applications and everything in between across its many layers. To recover
+from such errors, it is sometimes necessary to reset the device. This section
+describes what is expected from DRM and usermode drivers when a device
+resets and how to propagate the reset status.
+
+Kernel Mode Driver
+------------------
+
+The KMD is responsible for checking if the device needs a reset, and for
+performing it as needed. Usually a hang is detected when a job gets stuck
+executing. The KMD then updates its internal reset tracking to be ready when
+userspace asks the kernel about reset information. Drivers should implement
+DRM_IOCTL_GET_RESET for that.
+
+User Mode Driver
+----------------
+
+The UMD should check whether the device has been reset before submitting new
+commands to the KMD, and it may check more often if it needs to. The
+DRM_IOCTL_GET_RESET ioctl is the default interface for this kind of check.
+After detecting a reset, the UMD reports it to the application using the
+appropriate API error code, as explained in the section about robustness
+below.
+
+Robustness
+----------
+
+The only way to try to keep an application working after a reset is for it to
+comply with the robustness aspects of the graphics API that it is using.
+
+Graphics APIs provide ways for applications to deal with device resets.
+However, there is no guarantee that the app uses such features correctly, and
+the UMD can implement policies to close the app if it is a repeat offender,
+likely stuck in a broken loop. This is done to ensure that it does not keep
+blocking the user interface from being displayed correctly.
+
+OpenGL
+~~~~~~
+
+Apps using OpenGL can rely on ``GL_ARB_robustness`` to be robust. This extension
+tells if a reset has happened, and if so, all the context state is considered
+lost and the app proceeds by creating new contexts. If robustness isn't in
+use, the UMD will terminate the app when a reset is detected, given that the
+contexts are lost and the app won't be able to detect this and recreate them.
+
+Vulkan
+~~~~~~
+
+Apps using Vulkan should check for ``VK_ERROR_DEVICE_LOST`` on submissions.
+This error code means, among other things, that a device reset has happened and
+the app needs to recreate the contexts to keep going.
+
+Reporting causes of resets
+--------------------------
+
+Apart from propagating the reset through the stack so apps can recover, it's
+really useful for driver developers to learn more about what caused the reset in
+the first place. DRM devices should make use of devcoredump to store relevant
+information about the reset, so this information can be added to user bug
+reports.
+
 .. _drm_driver_ioctl:
 
 IOCTL Support on Device Nodes

From patchwork Wed Jun 21 00:57:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: André Almeida
X-Patchwork-Id: 110722
From: André Almeida
To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com, alexander.deucher@amd.com,
 christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com, Simon Ser,
 Rob Clark, Pekka Paalanen, Daniel Vetter, Daniel Stone, 'Marek Olšák',
 Dave Airlie, Michel Dänzer, Samuel Pitoiset, Timur Kristóf,
 Bas Nieuwenhuizen, André Almeida
Subject: [RFC PATCH v3 2/4] drm: Create DRM_IOCTL_GET_RESET
Date: Tue, 20 Jun 2023 21:57:17 -0300
Message-ID: <20230621005719.836857-3-andrealmeid@igalia.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230621005719.836857-1-andrealmeid@igalia.com>
References: <20230621005719.836857-1-andrealmeid@igalia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Create a new DRM ioctl operation to get the number of resets for a given
context. The numbers reflect only the resets that happened after the context
was created, not all resets since the machine was booted.

Create a debugfs interface to make it easier to test the API without real
resets.
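
For illustration only, a userspace driver could consume the new ioctl roughly
as sketched below. The struct layout, the flag values and the 0xCF ioctl
number mirror this series; the umd_* names and the guilty/innocent
classification are hypothetical and not part of the patch, and a real UMD
would include the uAPI headers instead of redefining them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of the uAPI proposed in this series (assumption: copied from
 * the patch; a real UMD would take these from the DRM uAPI headers). */
struct drm_get_reset {
	uint32_t ctx_id;
	uint32_t flags;
	uint64_t dev_reset_count;
	uint64_t ctx_reset_count;
};

#define DRM_RESET_IN_PROGRESS	0x1
#define DRM_RESET_VRAM_LOST	0x2
#define DRM_IOCTL_GET_RESET	_IOWR('d', 0xCF, struct drm_get_reset)

/* Hypothetical UMD-side classification, not defined by the series. */
enum umd_reset_status {
	UMD_NO_RESET,
	UMD_RESET_IN_PROGRESS,
	UMD_GUILTY_RESET,	/* this context caused a new reset */
	UMD_INNOCENT_RESET,	/* the device was reset by something else */
};

/* Query the kernel; flags must be zero on input. Returns 0 or -errno. */
static inline int umd_query_reset(int fd, uint32_t ctx_id,
				  struct drm_get_reset *out)
{
	memset(out, 0, sizeof(*out));
	out->ctx_id = ctx_id;
	if (ioctl(fd, DRM_IOCTL_GET_RESET, out) < 0)
		return -errno;
	return 0;
}

/* Compare a fresh result against the counters seen at the previous check. */
static inline enum umd_reset_status
umd_classify_reset(const struct drm_get_reset *now,
		   uint64_t last_dev_count, uint64_t last_ctx_count)
{
	assert(now != NULL);

	if (now->flags & DRM_RESET_IN_PROGRESS)
		return UMD_RESET_IN_PROGRESS;
	if (now->ctx_reset_count > last_ctx_count)
		return UMD_GUILTY_RESET;
	if (now->dev_reset_count > last_dev_count ||
	    (now->flags & DRM_RESET_VRAM_LOST))
		return UMD_INNOCENT_RESET;
	return UMD_NO_RESET;
}
```

A UMD following this sketch would call umd_query_reset() before each
submission, remember the two counters, and map the resulting status onto the
error reporting of the graphics API in use.
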

Signed-off-by: André Almeida
---
 drivers/gpu/drm/drm_debugfs.c |  2 ++
 drivers/gpu/drm/drm_ioctl.c   | 58 +++++++++++++++++++++++++++++++++++
 include/drm/drm_device.h      |  3 ++
 include/drm/drm_drv.h         |  3 ++
 include/uapi/drm/drm.h        | 21 +++++++++++++
 include/uapi/drm/drm_mode.h   | 15 +++++++++
 6 files changed, 102 insertions(+)

diff --git a/drivers/gpu/drm/drm_debugfs.c b/drivers/gpu/drm/drm_debugfs.c
index 4855230ba2c6..316dce60434d 100644
--- a/drivers/gpu/drm/drm_debugfs.c
+++ b/drivers/gpu/drm/drm_debugfs.c
@@ -251,6 +251,8 @@ int drm_debugfs_init(struct drm_minor *minor, int minor_id,
 		list_del(&entry->list);
 	}
 
+	debugfs_create_bool("drm_reset_spoof", 0644, minor->debugfs_root, &dev->reset_spoof);
+
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/drm_ioctl.c b/drivers/gpu/drm/drm_ioctl.c
index 7c9d66ee917d..23c282681ec7 100644
--- a/drivers/gpu/drm/drm_ioctl.c
+++ b/drivers/gpu/drm/drm_ioctl.c
@@ -528,6 +528,63 @@ int drm_version(struct drm_device *dev, void *data,
 	return err;
 }
 
+/**
+ * drm_spoof_reset - Spoof a fake reset
+ *
+ * @reset: reset struct to be spoofed
+ *
+ * Create a fake reset report for testing
+ */
+static void drm_spoof_reset(struct drm_get_reset *reset)
+{
+	reset->dev_reset_count = 1;
+	reset->ctx_reset_count = 0;
+	reset->flags = 0;
+	reset->ctx_id = 0;
+
+	DRM_INFO("[Spoofed] Reporting reset.ctx = %llu .dev = %llu\n",
+		 reset->ctx_reset_count, reset->dev_reset_count);
+}
+
+/**
+ * drm_getreset - Get reset information from a DRM device
+ *
+ * @dev: DRM device
+ * @data: user argument, pointing to a drm_get_reset structure
+ * @file_priv: DRM file private data
+ *
+ * Return: zero on success or a negative error code on failure.
+ *
+ * Fills in the reset information in the data argument.
+ */
+int drm_getreset(struct drm_device *dev, void *data,
+		 struct drm_file *file_priv)
+{
+	struct drm_get_reset *reset = data;
+	int ret = 0;
+
+	if (dev->reset_spoof) {
+		drm_spoof_reset(reset);
+		return 0;
+	}
+
+	if (!dev->driver->get_reset)
+		return -ENOSYS;
+
+	if (reset->flags)
+		return -EINVAL;
+
+	ret = dev->driver->get_reset(file_priv, dev, reset);
+
+	if (!ret)
+		DRM_INFO("Reporting reset.ctx = %llu .dev = %llu\n",
+			 reset->ctx_reset_count, reset->dev_reset_count);
+	else
+		DRM_WARN("%s failed with %d return\n", __func__, ret);
+
+	return ret;
+}
+
 static int drm_ioctl_permit(u32 flags, struct drm_file *file_priv)
 {
 	/* ROOT_ONLY is only for CAP_SYS_ADMIN */
@@ -716,6 +773,7 @@ static const struct drm_ioctl_desc drm_ioctls[] = {
 	DRM_IOCTL_DEF(DRM_IOCTL_MODE_LIST_LESSEES, drm_mode_list_lessees_ioctl, DRM_MASTER),
 	DRM_IOCTL_DEF(DRM_IOCTL_MODE_GET_LEASE, drm_mode_get_lease_ioctl, DRM_MASTER),
 	DRM_IOCTL_DEF(DRM_IOCTL_MODE_REVOKE_LEASE, drm_mode_revoke_lease_ioctl, DRM_MASTER),
+	DRM_IOCTL_DEF(DRM_IOCTL_GET_RESET, drm_getreset, DRM_RENDER_ALLOW),
 };
 
 #define DRM_CORE_IOCTL_COUNT ARRAY_SIZE(drm_ioctls)
diff --git a/include/drm/drm_device.h b/include/drm/drm_device.h
index 7cf4afae2e79..fcd7b5d45cde 100644
--- a/include/drm/drm_device.h
+++ b/include/drm/drm_device.h
@@ -326,6 +326,9 @@ struct drm_device {
 	 */
 	struct list_head debugfs_list;
 
+	/* Spoof device reset for testing */
+	bool reset_spoof;
+
 	/* Everything below here is for legacy driver, never use! */
 	/* private: */
 #if IS_ENABLED(CONFIG_DRM_LEGACY)
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index 89e2706cac56..518a9db157fb 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -401,6 +401,9 @@ struct drm_driver {
 			       struct drm_device *dev, uint32_t handle,
 			       uint64_t *offset);
 
+	int (*get_reset)(struct drm_file *file_priv,
+			 struct drm_device *dev, struct drm_get_reset *reset);
+
 	/**
 	 * @show_fdinfo:
 	 *
diff --git a/include/uapi/drm/drm.h b/include/uapi/drm/drm.h
index a87bbbbca2d4..a84559aa0d77 100644
--- a/include/uapi/drm/drm.h
+++ b/include/uapi/drm/drm.h
@@ -1169,6 +1169,27 @@ extern "C" {
  */
 #define DRM_IOCTL_MODE_GETFB2		DRM_IOWR(0xCE, struct drm_mode_fb_cmd2)
 
+/**
+ * DRM_IOCTL_GET_RESET - Get information about device resets
+ *
+ * This operation requests information about resets from the device. It should
+ * consider only resets that happen after the context is created; therefore,
+ * the counters should be zero at context creation.
+ *
+ * dev_reset_count tells how many resets have happened on this device, and
+ * ctx_reset_count tells how many of such resets were caused by this context.
+ *
+ * Flags can be used to tell if a reset is in progress, in which case userspace
+ * should wait until it is no longer in progress before creating a new context,
+ * and to tell if the VRAM is considered lost. There's no safe way to clear this
+ * flag, so if a context sees this flag set, it stays set until the end of the
+ * context.
+ */
+#define DRM_IOCTL_GET_RESET		DRM_IOWR(0xCF, struct drm_get_reset)
+
+#define DRM_RESET_IN_PROGRESS	0x1
+#define DRM_RESET_VRAM_LOST	0x2
+
 /*
  * Device specific ioctls should only be in their respective headers
  * The device specific ioctl range is from 0x40 to 0x9f.
diff --git a/include/uapi/drm/drm_mode.h b/include/uapi/drm/drm_mode.h index 43691058d28f..c3257bd1af9c 100644 --- a/include/uapi/drm/drm_mode.h +++ b/include/uapi/drm/drm_mode.h @@ -1308,6 +1308,21 @@ struct drm_mode_rect { __s32 y2; }; +/** + * struct drm_get_reset - Get information about a DRM device resets + * @ctx_id: the context id to be queried about resets + * @flags: flags + * @dev_reset_count: global counter of resets for a given DRM device + * @ctx_reset_count: of all the resets counted by this device, how many were + * caused by this context. + */ +struct drm_get_reset { + __u32 ctx_id; + __u32 flags; + __u64 dev_reset_count; + __u64 ctx_reset_count; +}; + #if defined(__cplusplus) } #endif From patchwork Wed Jun 21 00:57:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Andr=C3=A9_Almeida?= X-Patchwork-Id: 110721 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp4051188vqr; Tue, 20 Jun 2023 18:22:47 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5G8Ug2miqocOC/J6FJN8htcacf4x5T/rvoeQrJeRbVz9HmkCN3TzqtG/EDv6EHRAVd8zDG X-Received: by 2002:a17:90a:1a07:b0:259:5494:db4a with SMTP id 7-20020a17090a1a0700b002595494db4amr10200425pjk.30.1687310567678; Tue, 20 Jun 2023 18:22:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687310567; cv=none; d=google.com; s=arc-20160816; b=RSjopGb81VX/BEk2S303IkSKEDc0JVpbBJOiYV+E6snrzhX2AufrnavQ72bfsQXlGe 9VfIsZqDInbbjcJKvwwaq3mWlRROsCjXrQvAUPTW+pJLXeGn0t1OLnpetyVvucuYAASu sOZ3L7cN0Ijo+h5EUrGA4S7b6mCBZ5v0RoDbkrdo0mpyf7ByMEej0va7O6rHKhtl9Cy1 hzzLpHTZ2Co93EibMhUmEZ3gL4Fj9qZHwaNYEecP6TEYLakJFGWK7HNt4A2hE55c3ZyS 8d9Hce+YK0TFxuXySCxtoDuI8irSuYM28AeGk8dhLxt9o4b4TZYd97JKVbZ6z9GH4V3V itRw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from 
:dkim-signature; bh=0e/opeWgqF70uqTQU9A9bAOiZ20K1sbyHX7KO5D5TY8=; b=h6QYJJ9PrYDcHjHAk0nN4vb1JDI+H2crscniavsjns2ZjAYdt7uCEYXdR/R8Z9ZWDT KaCcYU1l/tHkGtnaCduPTbQKl9guxMuOl0UYbzxjAd+sHykL1Xptn1Lgk2EMYrsXUqtc 3wnsPc5UQ5lgULwKQXB+9usp48W/vOHUGZ+WZxLE1In4FaJ8a53QFQGTjsZh/ECgNXdQ 14BzJIxgZtjyuJpyn52a4xUQOV539X8J09D+M0sVfSakeM6ejCGk6K+gOXln6+FlZsrl lwqVPnM7vPiT3UqAcSAvldR3cRNroIGJ1/+tP9NsFrfn6pVZbfjsdBb7tDpPEEYCV4GK CHSQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@igalia.com header.s=20170329 header.b=XTK919HN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t15-20020a17090a3e4f00b0025be6419478si3086400pjm.92.2023.06.20.18.22.33; Tue, 20 Jun 2023 18:22:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=fail header.i=@igalia.com header.s=20170329 header.b=XTK919HN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229994AbjFUA7A (ORCPT + 99 others); Tue, 20 Jun 2023 20:59:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53742 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229948AbjFUA6n (ORCPT ); Tue, 20 Jun 2023 20:58:43 -0400 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AAA181706 for ; Tue, 20 Jun 2023 17:58:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; 
h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=0e/opeWgqF70uqTQU9A9bAOiZ20K1sbyHX7KO5D5TY8=; b=XTK919HN6/AH2/VHbMb/7HlmAu GoW2i5lwonJBrmjCq6YTpHnq/0ev1q446f0JNUkFqCXc5l1PMJdMZHUvMZOmqGbp1VGr045PfbuUg KbaWXF+sLvepux8ODfT8yIXd3LQ4Az1Xlc3pazNjwgcDAeo/EF1xhmz1zwsG4AmXYPnQONQp5EOzF n3qQNaepeR9ljB8izJuBxlFDTWaq00sHSAGsxhc4I5sdXJBYQaaQSrqFDHeRXNTIN8AIQoPC/0/AG D3XI3yirz89u8Wi7wzho6GlJ4Mdli88A1bYeW6JmFqCAcS2WAzah8CNbgth3YWHWUrMnveiJEp4pB bXxnoC8Q==; Received: from [179.113.218.86] (helo=steammachine.lan) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1qBmBD-0011pg-Eo; Wed, 21 Jun 2023 02:58:39 +0200 From: =?utf-8?q?Andr=C3=A9_Almeida?= To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: kernel-dev@igalia.com, alexander.deucher@amd.com, christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com, Simon Ser , Rob Clark , Pekka Paalanen , Daniel Vetter , Daniel Stone , =?utf-8?b?J01hcmVrIE9sxaHDoWsn?= , Dave Airlie , =?utf-8?q?Michel_D=C3=A4nzer?= , Samuel Pitoiset , =?utf-8?q?Timur_Krist=C3=B3f?= , Bas Nieuwenhuizen , =?utf-8?q?Andr=C3=A9_Almeida?= Subject: [RFC PATCH v3 3/4] drm/amdgpu: Implement DRM_IOCTL_GET_RESET Date: Tue, 20 Jun 2023 21:57:18 -0300 Message-ID: <20230621005719.836857-4-andrealmeid@igalia.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230621005719.836857-1-andrealmeid@igalia.com> References: <20230621005719.836857-1-andrealmeid@igalia.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no 
version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769273365879471886?= X-GMAIL-MSGID: =?utf-8?q?1769273365879471886?= Implement get_reset ioctl for amdgpu Signed-off-by: André Almeida --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 4 ++- drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c | 35 +++++++++++++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h | 5 ++++ drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_job.c | 12 +++++++-- drivers/gpu/drm/amd/amdgpu/amdgpu_job.h | 2 ++ 6 files changed, 56 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c index 2eb2c66843a8..0ba26b4b039c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -1262,8 +1262,10 @@ static int amdgpu_cs_submit(struct amdgpu_cs_parser *p, uint64_t seq; int r; - for (i = 0; i < p->gang_size; ++i) + for (i = 0; i < p->gang_size; ++i) { + p->jobs[i]->ctx = p->ctx; drm_sched_job_arm(&p->jobs[i]->base); + } for (i = 0; i < p->gang_size; ++i) { struct dma_fence *fence; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c index d2139ac12159..d3e292382d4a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.c @@ -322,6 +322,9 @@ static int amdgpu_ctx_init(struct amdgpu_ctx_mgr *mgr, int32_t priority, ctx->init_priority = priority; ctx->override_priority = AMDGPU_CTX_PRIORITY_UNSET; + ctx->global_reset_counter = atomic_read(&mgr->adev->gpu_reset_counter); + ctx->local_reset_counter = 0; + r = amdgpu_ctx_get_stable_pstate(ctx, ¤t_stable_pstate); if (r) return r; @@ -963,3 +966,35 @@ void amdgpu_ctx_mgr_usage(struct amdgpu_ctx_mgr *mgr, } mutex_unlock(&mgr->lock); } + +int 
amdgpu_get_reset(struct drm_file *filp, struct drm_device *dev, + struct drm_get_reset *reset) +{ + struct amdgpu_device *adev = drm_to_adev(dev); + struct amdgpu_ctx *ctx; + struct amdgpu_ctx_mgr *mgr; + unsigned int id = reset->ctx_id; + struct amdgpu_fpriv *fpriv = filp->driver_priv; + + mgr = &fpriv->ctx_mgr; + mutex_lock(&mgr->lock); + ctx = idr_find(&mgr->ctx_handles, id); + if (!ctx) { + mutex_unlock(&mgr->lock); + return -EINVAL; + } + + reset->dev_reset_count = + atomic_read(&adev->gpu_reset_counter) - ctx->global_reset_counter; + + reset->ctx_reset_count = ctx->local_reset_counter; + + if (amdgpu_in_reset(adev)) + reset->flags |= DRM_RESET_IN_PROGRESS; + + if (ctx->vram_lost_counter != atomic_read(&adev->vram_lost_counter)) + reset->flags |= DRM_RESET_VRAM_LOST; + + mutex_unlock(&mgr->lock); + return 0; +} diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h index 0fa0e56daf67..0c9815695884 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ctx.h @@ -57,6 +57,9 @@ struct amdgpu_ctx { unsigned long ras_counter_ce; unsigned long ras_counter_ue; uint32_t stable_pstate; + + uint64_t global_reset_counter; + uint64_t local_reset_counter; }; struct amdgpu_ctx_mgr { @@ -97,4 +100,6 @@ void amdgpu_ctx_mgr_fini(struct amdgpu_ctx_mgr *mgr); void amdgpu_ctx_mgr_usage(struct amdgpu_ctx_mgr *mgr, ktime_t usage[AMDGPU_HW_IP_NUM]); +int amdgpu_get_reset(struct drm_file *file_priv, struct drm_device *dev, + struct drm_get_reset *reset); #endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index c9a41c997c6c..431791b2c3cb 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2805,6 +2805,7 @@ static const struct drm_driver amdgpu_kms_driver = { #ifdef CONFIG_PROC_FS .show_fdinfo = amdgpu_show_fdinfo, #endif + .get_reset = amdgpu_get_reset, .prime_handle_to_fd = drm_gem_prime_handle_to_fd, 
.prime_fd_to_handle = drm_gem_prime_fd_to_handle, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c index c3d9d75143f4..1553a2633d46 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.c @@ -35,11 +35,20 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) { struct amdgpu_ring *ring = to_amdgpu_ring(s_job->sched); struct amdgpu_job *job = to_amdgpu_job(s_job); + struct drm_sched_entity *entity = job->base.entity; struct amdgpu_task_info ti; struct amdgpu_device *adev = ring->adev; int idx; int r; + memset(&ti, 0, sizeof(struct amdgpu_task_info)); + amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti); + + if (job->ctx) { + DRM_INFO("Increasing ctx reset count for %s (%d)\n", ti.process_name, ti.pid); + job->ctx->local_reset_counter++; + } + if (!drm_dev_enter(adev_to_drm(adev), &idx)) { DRM_INFO("%s - device unplugged skipping recovery on scheduler:%s", __func__, s_job->sched->name); @@ -48,7 +57,6 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) return DRM_GPU_SCHED_STAT_ENODEV; } - memset(&ti, 0, sizeof(struct amdgpu_task_info)); adev->job_hang = true; if (amdgpu_gpu_recovery && @@ -58,7 +66,6 @@ static enum drm_gpu_sched_stat amdgpu_job_timedout(struct drm_sched_job *s_job) goto exit; } - amdgpu_vm_get_task_info(ring->adev, job->pasid, &ti); DRM_ERROR("ring %s timeout, signaled seq=%u, emitted seq=%u\n", job->base.sched->name, atomic_read(&ring->fence_drv.last_seq), ring->fence_drv.sync_seq); @@ -105,6 +112,7 @@ int amdgpu_job_alloc(struct amdgpu_device *adev, struct amdgpu_vm *vm, */ (*job)->base.sched = &adev->rings[0]->sched; (*job)->vm = vm; + (*job)->ctx = NULL; amdgpu_sync_create(&(*job)->explicit_sync); (*job)->vram_lost_counter = atomic_read(&adev->vram_lost_counter); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h index 52f2e313ea17..0d463babaa60 100644 
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_job.h
@@ -63,6 +63,8 @@ struct amdgpu_job {
 	uint32_t oa_base, oa_size;
 	uint32_t vram_lost_counter;
 
+	struct amdgpu_ctx *ctx;
+
 	/* user fence handling */
 	uint64_t uf_addr;
 	uint64_t uf_sequence;

From patchwork Wed Jun 21 00:57:19 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Andr=C3=A9_Almeida?=
X-Patchwork-Id: 110723
From: =?utf-8?q?Andr=C3=A9_Almeida?=
To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com, alexander.deucher@amd.com, christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com, Simon Ser , Rob Clark , Pekka Paalanen , Daniel Vetter , Daniel Stone , =?utf-8?b?J01hcmVrIE9sxaHDoWsn?= , Dave Airlie , =?utf-8?q?Michel_D=C3=A4nzer?= , Samuel Pitoiset , =?utf-8?q?Timur_Krist=C3=B3f?= , Bas Nieuwenhuizen , =?utf-8?q?Andr=C3=A9_Almeida?=
Subject: [RFC PATCH v3 4/4] drm/i915: Implement DRM_IOCTL_GET_RESET
Date: Tue, 20 Jun 2023 21:57:19 -0300
Message-ID: <20230621005719.836857-5-andrealmeid@igalia.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230621005719.836857-1-andrealmeid@igalia.com>
References: <20230621005719.836857-1-andrealmeid@igalia.com>

Implement get_reset ioctl for i915.

Signed-off-by: André Almeida
---
 drivers/gpu/drm/i915/gem/i915_gem_context.c   | 18 ++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_context.h   |  2 ++
 .../gpu/drm/i915/gem/i915_gem_context_types.h |  2 ++
 drivers/gpu/drm/i915/i915_driver.c            |  2 ++
 4 files changed, 24 insertions(+)

diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c
index 9a9ff84c90d7..fba8c9bbc7e9 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c
@@ -1666,6 +1666,8 @@ i915_gem_create_context(struct drm_i915_private *i915,
 		ctx->uses_protected_content = true;
 	}
 
+	ctx->dev_reset_counter = i915_reset_count(&i915->gpu_error);
+
 	trace_i915_context_create(ctx);
 
 	return ctx;
@@ -2558,6 +2560,22 @@ int i915_gem_context_reset_stats_ioctl(struct drm_device *dev,
 	return 0;
 }
 
+int i915_gem_get_reset(struct drm_file *file, struct drm_device *dev,
+		       struct drm_get_reset *reset)
+{
+	struct i915_gem_context *ctx;
+
+	ctx = i915_gem_context_lookup(file->driver_priv, reset->ctx_id);
+	if (IS_ERR(ctx))
+		return PTR_ERR(ctx);
+
+	reset->dev_reset_count = i915_reset_count(&to_i915(dev)->gpu_error) - ctx->dev_reset_counter;
+	reset->ctx_reset_count = atomic_read(&ctx->guilty_count);
+
+	i915_gem_context_put(ctx);
+	return 0;
+}
+
 /* GEM context-engines iterator: for_each_gem_engine() */
 struct intel_context *
 i915_gem_engines_iter_next(struct i915_gem_engines_iter *it)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.h b/drivers/gpu/drm/i915/gem/i915_gem_context.h
index e5b0f66ea1fe..9ee119d8123f 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context.h
@@ -138,6 +138,8 @@ int i915_gem_context_setparam_ioctl(struct drm_device *dev, void *data,
 				    struct drm_file *file_priv);
 int i915_gem_context_reset_stats_ioctl(struct drm_device *dev, void *data,
 				       struct drm_file *file);
+int i915_gem_get_reset(struct drm_file *file_priv, struct drm_device *dev,
+		       struct drm_get_reset *reset);
 
 struct i915_gem_context *
 i915_gem_context_lookup(struct drm_i915_file_private *file_priv, u32 id);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
index cb78214a7dcd..2e4cf0f0d3dc 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_context_types.h
@@ -414,6 +414,8 @@ struct i915_gem_context {
 		/** @engines: list of stale engines */
 		struct list_head engines;
 	} stale;
+
+	uint64_t dev_reset_counter;
 };
 
 #endif /* __I915_GEM_CONTEXT_TYPES_H__ */
diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
index 97244541ec28..640304141ada 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -1805,6 +1805,8 @@ static const struct drm_driver i915_drm_driver = {
 	.postclose = i915_driver_postclose,
 	.show_fdinfo = i915_drm_client_fdinfo,
 
+	.get_reset = i915_gem_get_reset,
+
 	.prime_handle_to_fd = drm_gem_prime_handle_to_fd,
 	.prime_fd_to_handle = drm_gem_prime_fd_to_handle,
 	.gem_prime_import = i915_gem_prime_import,
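
For readers following the counter logic in this patch, here is a minimal userspace sketch under stated assumptions. Every `sketch_*` name is hypothetical and is not part of the kernel or of this series; the structures are simplified stand-ins. It models what the i915 hunks appear to do: sample the device-wide reset count when a context is created, then report the difference at query time together with the context's own guilty count.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for driver state; these are NOT kernel structures. */
struct sketch_device {
	uint64_t reset_count;		/* resets since the device was probed */
};

struct sketch_ctx {
	uint64_t dev_reset_baseline;	/* device count sampled at creation */
	uint64_t guilty_count;		/* resets blamed on this context */
};

struct sketch_reset_info {
	uint64_t dev_reset_count;	/* device resets seen by this context */
	uint64_t ctx_reset_count;	/* resets caused by this context */
};

/* Record the device-wide count so later queries can report a delta. */
static void sketch_ctx_create(struct sketch_ctx *ctx,
			      const struct sketch_device *dev)
{
	ctx->dev_reset_baseline = dev->reset_count;
	ctx->guilty_count = 0;
}

/* Model one reset; 'guilty' may be NULL when no context is blamed. */
static void sketch_device_reset(struct sketch_device *dev,
				struct sketch_ctx *guilty)
{
	dev->reset_count++;
	if (guilty)
		guilty->guilty_count++;
}

/* Report resets relative to the context's lifetime, not the device's. */
static void sketch_get_reset(const struct sketch_device *dev,
			     const struct sketch_ctx *ctx,
			     struct sketch_reset_info *out)
{
	out->dev_reset_count = dev->reset_count - ctx->dev_reset_baseline;
	out->ctx_reset_count = ctx->guilty_count;
}

#ifdef SKETCH_DEMO_MAIN
/* Optional demo; build with -DSKETCH_DEMO_MAIN to run it. */
int main(void)
{
	struct sketch_device dev = { .reset_count = 3 };
	struct sketch_ctx ctx;
	struct sketch_reset_info info;

	sketch_ctx_create(&ctx, &dev);
	sketch_device_reset(&dev, &ctx);	/* this context hung the GPU */
	sketch_device_reset(&dev, NULL);	/* some other cause */
	sketch_get_reset(&dev, &ctx, &info);
	assert(info.ctx_reset_count <= info.dev_reset_count);
	printf("dev=%llu ctx=%llu\n",
	       (unsigned long long)info.dev_reset_count,
	       (unsigned long long)info.ctx_reset_count);
	return 0;
}
#endif
```

Storing a baseline per context means a context created after earlier resets does not see them, which seems to be the intent of `dev_reset_counter` above. The real driver would also need whatever locking or atomics the underlying counters require, which this sketch leaves out.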