From patchwork Thu Aug 17 18:20:46 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Andr=C3=A9_Almeida?=
X-Patchwork-Id: 136066
From: =?utf-8?q?Andr=C3=A9_Almeida?=
To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com, alexander.deucher@amd.com,
 christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com,
 =?utf-8?b?J01hcmVrIE9sxaHDoWsn?=, Samuel Pitoiset, Bas Nieuwenhuizen,
 =?utf-8?q?Timur_Krist=C3=B3f?=, Shashank Sharma,
 =?utf-8?q?Andr=C3=A9_Almeida?=
Subject: [PATCH v5 1/5] drm/amdgpu: Allocate coredump memory in a nonblocking way
Date: Thu, 17 Aug 2023 15:20:46 -0300
Message-ID: <20230817182050.205925-2-andrealmeid@igalia.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230817182050.205925-1-andrealmeid@igalia.com>
References: <20230817182050.205925-1-andrealmeid@igalia.com>

During a GPU reset, a normal memory allocation could block while
reclaiming memory. Given that coredump is a best-effort mechanism, it
shouldn't disturb the reset path. Change its memory allocation flag to a
nonblocking one.
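
To illustrate the trade-off described above, here is a minimal userspace
sketch. It is a toy model, not kernel code: struct toy_pool, toy_alloc(),
toy_free(), toy_capture_coredump() and the TOY_* modes are hypothetical names
made up for this example, and the "reclaim pass" only stands in for the work a
blocking allocation may have to wait for.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two allocation modes; not the kernel's gfp_t. */
enum toy_alloc_mode { TOY_MAY_BLOCK, TOY_NOWAIT };

/* Toy pool: bytes free right now, plus bytes a slow reclaim pass could recover. */
struct toy_pool {
	size_t free_now;
	size_t reclaimable;
	unsigned int reclaim_passes; /* counts how often the slow path ran */
};

void *toy_alloc(struct toy_pool *pool, size_t size, enum toy_alloc_mode mode)
{
	assert(pool != NULL);

	if (pool->free_now < size) {
		if (mode == TOY_NOWAIT)
			return NULL; /* fail fast instead of waiting for reclaim */

		/* Blocking mode: model a reclaim pass that may take a long time. */
		pool->reclaim_passes++;
		pool->free_now += pool->reclaimable;
		pool->reclaimable = 0;
		if (pool->free_now < size)
			return NULL;
	}

	void *ptr = malloc(size);
	if (ptr)
		pool->free_now -= size;
	return ptr;
}

void toy_free(struct toy_pool *pool, void *ptr, size_t size)
{
	assert(pool != NULL);
	free(ptr);
	pool->free_now += size;
}

/* Best-effort capture: a failed allocation only means no dump; the reset goes on. */
bool toy_capture_coredump(struct toy_pool *pool, size_t dump_size)
{
	void *dump = toy_alloc(pool, dump_size, TOY_NOWAIT);

	if (!dump)
		return false;

	toy_free(pool, dump, dump_size);
	return true;
}
```

In this model a capture under memory pressure returns false without ever
entering the reclaim path, which is the behaviour the flag change is after.
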

Signed-off-by: André Almeida
Reviewed-by: Christian König
---
v5: no change
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index aa171db68639..bf4781551f88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4847,7 +4847,7 @@ static void amdgpu_reset_capture_coredumpm(struct amdgpu_device *adev)
 	struct drm_device *dev = adev_to_drm(adev);
 
 	ktime_get_ts64(&adev->reset_time);
-	dev_coredumpm(dev->dev, THIS_MODULE, adev, 0, GFP_KERNEL,
+	dev_coredumpm(dev->dev, THIS_MODULE, adev, 0, GFP_NOWAIT,
 		      amdgpu_devcoredump_read, amdgpu_devcoredump_free);
 }
 #endif

From patchwork Thu Aug 17 18:20:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Andr=C3=A9_Almeida?=
X-Patchwork-Id: 136173
From: =?utf-8?q?Andr=C3=A9_Almeida?=
To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com, alexander.deucher@amd.com,
 christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com,
 =?utf-8?b?J01hcmVrIE9sxaHDoWsn?=, Samuel Pitoiset, Bas Nieuwenhuizen,
 =?utf-8?q?Timur_Krist=C3=B3f?=, Shashank Sharma,
 =?utf-8?q?Andr=C3=A9_Almeida?=
Subject: [PATCH v5 2/5] drm/amdgpu: Rework coredump to use memory dynamically
Date: Thu, 17 Aug 2023 15:20:47 -0300
Message-ID: <20230817182050.205925-3-andrealmeid@igalia.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230817182050.205925-1-andrealmeid@igalia.com>
References: <20230817182050.205925-1-andrealmeid@igalia.com>

Instead of storing coredump information inside the amdgpu_device struct,
move it to a proper separate struct and allocate it dynamically. This
will make it easier to further expand the logged information.

Signed-off-by: André Almeida
---
v5: no change
v4: change kmalloc to kzalloc
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h        | 14 +++--
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 63 ++++++++++++++--------
 2 files changed, 49 insertions(+), 28 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 9c6a332261ab..0d560b713948 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -1088,11 +1088,6 @@ struct amdgpu_device {
 	uint32_t *reset_dump_reg_list;
 	uint32_t *reset_dump_reg_value;
 	int num_regs;
-#ifdef CONFIG_DEV_COREDUMP
-	struct amdgpu_task_info reset_task_info;
-	bool reset_vram_lost;
-	struct timespec64 reset_time;
-#endif
 
 	bool scpm_enabled;
 	uint32_t scpm_status;
@@ -1105,6 +1100,15 @@ struct amdgpu_device {
 	uint32_t aid_mask;
 };
 
+#ifdef CONFIG_DEV_COREDUMP
+struct amdgpu_coredump_info {
+	struct amdgpu_device *adev;
+	struct amdgpu_task_info reset_task_info;
+	struct timespec64 reset_time;
+	bool reset_vram_lost;
+};
+#endif
+
 static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
 {
 	return container_of(ddev, struct amdgpu_device, ddev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index bf4781551f88..b5b879bcc5c9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4799,12 +4799,17 @@ static int amdgpu_reset_reg_dumps(struct amdgpu_device *adev)
 	return 0;
 }
 
-#ifdef CONFIG_DEV_COREDUMP
+#ifndef CONFIG_DEV_COREDUMP
+static void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost,
+			    struct amdgpu_reset_context *reset_context)
+{
+}
+#else
 static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset,
 		size_t count, void *data, size_t datalen)
 {
 	struct drm_printer p;
-	struct amdgpu_device *adev = data;
+	struct amdgpu_coredump_info *coredump = data;
 	struct drm_print_iterator iter;
 	int i;
 
@@ -4818,21 +4823,21 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset,
 	drm_printf(&p, "**** AMDGPU Device Coredump ****\n");
 	drm_printf(&p, "kernel: " UTS_RELEASE "\n");
 	drm_printf(&p, "module: " KBUILD_MODNAME "\n");
-	drm_printf(&p, "time: %lld.%09ld\n", adev->reset_time.tv_sec, adev->reset_time.tv_nsec);
-	if (adev->reset_task_info.pid)
+	drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec);
+	if (coredump->reset_task_info.pid)
 		drm_printf(&p, "process_name: %s PID: %d\n",
-			   adev->reset_task_info.process_name,
-			   adev->reset_task_info.pid);
+			   coredump->reset_task_info.process_name,
+			   coredump->reset_task_info.pid);
 
-	if (adev->reset_vram_lost)
+	if (coredump->reset_vram_lost)
 		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
-	if (adev->num_regs) {
+	if (coredump->adev->num_regs) {
 		drm_printf(&p, "AMDGPU register dumps:\nOffset: Value:\n");
 
-		for (i = 0; i < adev->num_regs; i++)
+		for (i = 0; i < coredump->adev->num_regs; i++)
 			drm_printf(&p, "0x%08x: 0x%08x\n",
-				   adev->reset_dump_reg_list[i],
-				   adev->reset_dump_reg_value[i]);
+				   coredump->adev->reset_dump_reg_list[i],
+				   coredump->adev->reset_dump_reg_value[i]);
 	}
 
 	return count - iter.remain;
@@ -4840,14 +4845,32 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset,
 
 static void amdgpu_devcoredump_free(void *data)
 {
+	kfree(data);
 }
 
-static void amdgpu_reset_capture_coredumpm(struct amdgpu_device *adev)
+static void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost,
+			    struct amdgpu_reset_context *reset_context)
 {
+	struct amdgpu_coredump_info *coredump;
 	struct drm_device *dev = adev_to_drm(adev);
 
-	ktime_get_ts64(&adev->reset_time);
-	dev_coredumpm(dev->dev, THIS_MODULE, adev, 0, GFP_NOWAIT,
+	coredump = kzalloc(sizeof(*coredump), GFP_NOWAIT);
+
+	if (!coredump) {
+		DRM_ERROR("%s: failed to allocate memory for coredump\n", __func__);
+		return;
+	}
+
+	coredump->reset_vram_lost = vram_lost;
+
+	if (reset_context->job && reset_context->job->vm)
+		coredump->reset_task_info = reset_context->job->vm->task_info;
+
+	coredump->adev = adev;
+
+	ktime_get_ts64(&coredump->reset_time);
+
+	dev_coredumpm(dev->dev, THIS_MODULE, coredump, 0, GFP_NOWAIT,
 		      amdgpu_devcoredump_read, amdgpu_devcoredump_free);
 }
 #endif
@@ -4955,15 +4978,9 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
 					goto out;
 
 				vram_lost = amdgpu_device_check_vram_lost(tmp_adev);
-#ifdef CONFIG_DEV_COREDUMP
-				tmp_adev->reset_vram_lost = vram_lost;
-				memset(&tmp_adev->reset_task_info, 0,
-						sizeof(tmp_adev->reset_task_info));
-				if (reset_context->job && reset_context->job->vm)
-					tmp_adev->reset_task_info =
-						reset_context->job->vm->task_info;
-				amdgpu_reset_capture_coredumpm(tmp_adev);
-#endif
+
+				amdgpu_coredump(tmp_adev, vram_lost, reset_context);
+
 				if (vram_lost) {
 					DRM_INFO("VRAM is lost due to GPU reset!\n");
 					amdgpu_inc_vram_lost(tmp_adev);

From patchwork Thu Aug 17 18:20:48 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Andr=C3=A9_Almeida?=
X-Patchwork-Id: 136042
From: =?utf-8?q?Andr=C3=A9_Almeida?=
To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com, alexander.deucher@amd.com,
 christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com,
 =?utf-8?b?J01hcmVrIE9sxaHDoWsn?=, Samuel Pitoiset, Bas Nieuwenhuizen,
 =?utf-8?q?Timur_Krist=C3=B3f?=, Shashank Sharma,
 =?utf-8?q?Andr=C3=A9_Almeida?=
Subject: [PATCH v5 3/5] drm/amdgpu: Encapsulate all device reset info
Date: Thu, 17 Aug 2023 15:20:48 -0300
Message-ID: <20230817182050.205925-4-andrealmeid@igalia.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230817182050.205925-1-andrealmeid@igalia.com>
References: <20230817182050.205925-1-andrealmeid@igalia.com>

To better organize struct amdgpu_device, keep all reset-related fields
together in a separate struct.
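
As a rough illustration of the grouping described above, the userspace sketch
below embeds a reset-info struct in a device struct and reaches the fields
through it. All names (struct toy_reset_info, struct toy_device,
toy_read_reg(), toy_reset_reg_dumps()) are hypothetical and only mirror the
shape of the change; the register read is a stand-in, not real hardware
access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical miniature of the grouping: every reset-related field in one struct. */
struct toy_reset_info {
	uint32_t *reg_list;  /* register offsets to dump on reset */
	uint32_t *reg_value; /* values captured for those offsets */
	int num_regs;
};

struct toy_device {
	int id;
	struct toy_reset_info reset_info; /* embedded by value in the device */
};

/* Stand-in for a register read; returns a value derived from the offset. */
static uint32_t toy_read_reg(uint32_t offset)
{
	return offset ^ 0xffffffffu;
}

/* Capture every configured register, going through the grouped struct. */
int toy_reset_reg_dumps(struct toy_device *dev)
{
	struct toy_reset_info *info;
	int i;

	assert(dev != NULL);
	info = &dev->reset_info;

	for (i = 0; i < info->num_regs; i++)
		info->reg_value[i] = toy_read_reg(info->reg_list[i]);

	return info->num_regs;
}
```

Callers only need a pointer to the one embedded struct, so adding another
reset-related field later does not touch the device struct itself.
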

Signed-off-by: André Almeida
---
v5: new patch, as requested by Shashank Sharma
---
 drivers/gpu/drm/amd/amdgpu/amdgpu.h         | 34 +++++++++++++--------
 drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 10 +++---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c  | 16 +++++-----
 3 files changed, 34 insertions(+), 26 deletions(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
index 0d560b713948..56d78ca6e917 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h
@@ -781,6 +781,26 @@ struct amdgpu_mqd {
 #define AMDGPU_PRODUCT_NAME_LEN 64
 struct amdgpu_reset_domain;
 
+#ifdef CONFIG_DEV_COREDUMP
+struct amdgpu_coredump_info {
+	struct amdgpu_device *adev;
+	struct amdgpu_task_info reset_task_info;
+	struct timespec64 reset_time;
+	bool reset_vram_lost;
+};
+#endif
+
+struct amdgpu_reset_info {
+	/* reset dump register */
+	u32 *reset_dump_reg_list;
+	u32 *reset_dump_reg_value;
+	int num_regs;
+
+#ifdef CONFIG_DEV_COREDUMP
+	struct amdgpu_coredump_info *coredump_info;
+#endif
+};
+
 /*
  * Non-zero (true) if the GPU has VRAM. Zero (false) otherwise.
  */
@@ -1084,10 +1104,7 @@ struct amdgpu_device {
 
 	struct mutex benchmark_mutex;
 
-	/* reset dump register */
-	uint32_t *reset_dump_reg_list;
-	uint32_t *reset_dump_reg_value;
-	int num_regs;
+	struct amdgpu_reset_info reset_info;
 
 	bool scpm_enabled;
 	uint32_t scpm_status;
@@ -1100,15 +1117,6 @@ struct amdgpu_device {
 	uint32_t aid_mask;
 };
 
-#ifdef CONFIG_DEV_COREDUMP
-struct amdgpu_coredump_info {
-	struct amdgpu_device *adev;
-	struct amdgpu_task_info reset_task_info;
-	struct timespec64 reset_time;
-	bool reset_vram_lost;
-};
-#endif
-
 static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev)
 {
 	return container_of(ddev, struct amdgpu_device, ddev);
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
index a4faea4fa0b5..3136a0774dd9 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c
@@ -2016,8 +2016,8 @@ static ssize_t amdgpu_reset_dump_register_list_read(struct file *f,
 	if (ret)
 		return ret;
 
-	for (i = 0; i < adev->num_regs; i++) {
-		sprintf(reg_offset, "0x%x\n", adev->reset_dump_reg_list[i]);
+	for (i = 0; i < adev->reset_info.num_regs; i++) {
+		sprintf(reg_offset, "0x%x\n", adev->reset_info.reset_dump_reg_list[i]);
 		up_read(&adev->reset_domain->sem);
 		if (copy_to_user(buf + len, reg_offset, strlen(reg_offset)))
 			return -EFAULT;
@@ -2074,9 +2074,9 @@ static ssize_t amdgpu_reset_dump_register_list_write(struct file *f,
 	if (ret)
 		goto error_free;
 
-	swap(adev->reset_dump_reg_list, tmp);
-	swap(adev->reset_dump_reg_value, new);
-	adev->num_regs = i;
+	swap(adev->reset_info.reset_dump_reg_list, tmp);
+	swap(adev->reset_info.reset_dump_reg_value, new);
+	adev->reset_info.num_regs = i;
 	up_write(&adev->reset_domain->sem);
 	ret = size;
 
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index b5b879bcc5c9..96975591841d 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4790,10 +4790,10 @@ static int amdgpu_reset_reg_dumps(struct amdgpu_device *adev)
 
 	lockdep_assert_held(&adev->reset_domain->sem);
 
-	for (i = 0; i < adev->num_regs; i++) {
-		adev->reset_dump_reg_value[i] = RREG32(adev->reset_dump_reg_list[i]);
-		trace_amdgpu_reset_reg_dumps(adev->reset_dump_reg_list[i],
-					     adev->reset_dump_reg_value[i]);
+	for (i = 0; i < adev->reset_info.num_regs; i++) {
+		adev->reset_info.reset_dump_reg_value[i] = RREG32(adev->reset_info.reset_dump_reg_list[i]);
+		trace_amdgpu_reset_reg_dumps(adev->reset_info.reset_dump_reg_list[i],
+					     adev->reset_info.reset_dump_reg_value[i]);
 	}
 
 	return 0;
@@ -4831,13 +4831,13 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset,
 
 	if (coredump->reset_vram_lost)
 		drm_printf(&p, "VRAM is lost due to GPU reset!\n");
-	if (coredump->adev->num_regs) {
+	if (coredump->adev->reset_info.num_regs) {
 		drm_printf(&p, "AMDGPU register dumps:\nOffset: Value:\n");
 
-		for (i = 0; i < coredump->adev->num_regs; i++)
+		for (i = 0; i < coredump->adev->reset_info.num_regs; i++)
 			drm_printf(&p, "0x%08x: 0x%08x\n",
-				   coredump->adev->reset_dump_reg_list[i],
-				   coredump->adev->reset_dump_reg_value[i]);
+				   coredump->adev->reset_info.reset_dump_reg_list[i],
+				   coredump->adev->reset_info.reset_dump_reg_value[i]);
 	}
 
 	return count - iter.remain;

From patchwork Thu Aug 17 18:20:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?q?Andr=C3=A9_Almeida?=
X-Patchwork-Id: 135981
From: =?utf-8?q?Andr=C3=A9_Almeida?=
To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org,
 linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com, alexander.deucher@amd.com,
 christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com,
 =?utf-8?b?J01hcmVrIE9sxaHDoWsn?=, Samuel Pitoiset, Bas Nieuwenhuizen,
 =?utf-8?q?Timur_Krist=C3=B3f?=, Shashank Sharma,
 =?utf-8?q?Andr=C3=A9_Almeida?=
Subject: [PATCH v5 4/5] drm/amdgpu: Move coredump code to amdgpu_reset file
Date: Thu, 17 Aug 2023 15:20:49 -0300
Message-ID: <20230817182050.205925-5-andrealmeid@igalia.com>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20230817182050.205925-1-andrealmeid@igalia.com>
References: <20230817182050.205925-1-andrealmeid@igalia.com>

Given that we use coredump only for device resets, move its functions
and structs to a more appropriate place, amdgpu_reset.{c,h}.
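
The coredump helpers being moved follow a snapshot-plus-callbacks shape:
allocate a zeroed snapshot, fill it in, then hand it over together with a
read callback and a free callback. The userspace sketch below models only
that shape. struct toy_coredump_info, struct toy_dump and the toy_* functions
are hypothetical names for this example and are not the devcoredump API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical snapshot taken at reset time. */
struct toy_coredump_info {
	long reset_time_sec;
	bool vram_lost;
};

typedef size_t (*toy_read_fn)(char *buf, size_t offset, size_t count, void *data);
typedef void (*toy_free_fn)(void *data);

/* What the consumer holds: the snapshot plus the callbacks that own it. */
struct toy_dump {
	void *data;
	toy_read_fn read;
	toy_free_fn free;
};

/* Render the snapshot as text and copy the window [offset, offset + count). */
size_t toy_coredump_read(char *buf, size_t offset, size_t count, void *data)
{
	const struct toy_coredump_info *info = data;
	char text[128];
	int len = snprintf(text, sizeof(text), "time: %ld\nvram_lost: %d\n",
			   info->reset_time_sec, info->vram_lost ? 1 : 0);

	if (len < 0 || (size_t)len <= offset)
		return 0;

	size_t avail = (size_t)len - offset;
	if (count > avail)
		count = avail;
	memcpy(buf, text + offset, count);
	return count;
}

/* The consumer calls this once it is done with the snapshot. */
void toy_coredump_free(void *data)
{
	free(data);
}

/* Allocate and fill a snapshot; on allocation failure, report it and give up. */
bool toy_coredump(struct toy_dump *out, long now_sec, bool vram_lost)
{
	struct toy_coredump_info *info;

	assert(out != NULL);

	info = calloc(1, sizeof(*info));
	if (!info)
		return false;

	info->reset_time_sec = now_sec;
	info->vram_lost = vram_lost;

	out->data = info;
	out->read = toy_coredump_read;
	out->free = toy_coredump_free;
	return true;
}
```

Because the snapshot is released only through the free callback, the producer
can return right after the handover and does not need to track its lifetime.
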
Signed-off-by: André Almeida --- v5: no change --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 9 --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 78 ---------------------- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 76 +++++++++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 10 +++ 4 files changed, 86 insertions(+), 87 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 56d78ca6e917..b11187d153ef 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -781,15 +781,6 @@ struct amdgpu_mqd { #define AMDGPU_PRODUCT_NAME_LEN 64 struct amdgpu_reset_domain; -#ifdef CONFIG_DEV_COREDUMP -struct amdgpu_coredump_info { - struct amdgpu_device *adev; - struct amdgpu_task_info reset_task_info; - struct timespec64 reset_time; - bool reset_vram_lost; -}; -#endif - struct amdgpu_reset_info { /* reset dump register */ u32 *reset_dump_reg_list; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index 96975591841d..883953f2ae53 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -32,8 +32,6 @@ #include #include #include -#include -#include #include #include @@ -4799,82 +4797,6 @@ static int amdgpu_reset_reg_dumps(struct amdgpu_device *adev) return 0; } -#ifndef CONFIG_DEV_COREDUMP -static void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, - struct amdgpu_reset_context *reset_context) -{ -} -#else -static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, - size_t count, void *data, size_t datalen) -{ - struct drm_printer p; - struct amdgpu_coredump_info *coredump = data; - struct drm_print_iterator iter; - int i; - - iter.data = buffer; - iter.offset = 0; - iter.start = offset; - iter.remain = count; - - p = drm_coredump_printer(&iter); - - drm_printf(&p, "**** AMDGPU Device Coredump ****\n"); - drm_printf(&p, "kernel: " UTS_RELEASE "\n"); - drm_printf(&p, 
"module: " KBUILD_MODNAME "\n"); - drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec); - if (coredump->reset_task_info.pid) - drm_printf(&p, "process_name: %s PID: %d\n", - coredump->reset_task_info.process_name, - coredump->reset_task_info.pid); - - if (coredump->reset_vram_lost) - drm_printf(&p, "VRAM is lost due to GPU reset!\n"); - if (coredump->adev->reset_info.num_regs) { - drm_printf(&p, "AMDGPU register dumps:\nOffset: Value:\n"); - - for (i = 0; i < coredump->adev->reset_info.num_regs; i++) - drm_printf(&p, "0x%08x: 0x%08x\n", - coredump->adev->reset_info.reset_dump_reg_list[i], - coredump->adev->reset_info.reset_dump_reg_value[i]); - } - - return count - iter.remain; -} - -static void amdgpu_devcoredump_free(void *data) -{ - kfree(data); -} - -static void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, - struct amdgpu_reset_context *reset_context) -{ - struct amdgpu_coredump_info *coredump; - struct drm_device *dev = adev_to_drm(adev); - - coredump = kzalloc(sizeof(*coredump), GFP_NOWAIT); - - if (!coredump) { - DRM_ERROR("%s: failed to allocate memory for coredump\n", __func__); - return; - } - - coredump->reset_vram_lost = vram_lost; - - if (reset_context->job && reset_context->job->vm) - coredump->reset_task_info = reset_context->job->vm->task_info; - - coredump->adev = adev; - - ktime_get_ts64(&coredump->reset_time); - - dev_coredumpm(dev->dev, THIS_MODULE, coredump, 0, GFP_NOWAIT, - amdgpu_devcoredump_read, amdgpu_devcoredump_free); -} -#endif - int amdgpu_do_asic_reset(struct list_head *device_list_handle, struct amdgpu_reset_context *reset_context) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c index 5fed06ffcc6b..579b70a3cdab 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c @@ -21,6 +21,9 @@ * */ +#include +#include + #include "amdgpu_reset.h" #include "aldebaran.h" #include 
"sienna_cichlid.h" @@ -167,5 +170,78 @@ void amdgpu_device_unlock_reset_domain(struct amdgpu_reset_domain *reset_domain) up_write(&reset_domain->sem); } +#ifndef CONFIG_DEV_COREDUMP +void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, + struct amdgpu_reset_context *reset_context) +{ +} +#else +static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, + size_t count, void *data, size_t datalen) +{ + struct drm_printer p; + struct amdgpu_coredump_info *coredump = data; + struct drm_print_iterator iter; + int i; + + iter.data = buffer; + iter.offset = 0; + iter.start = offset; + iter.remain = count; + + p = drm_coredump_printer(&iter); + + drm_printf(&p, "**** AMDGPU Device Coredump ****\n"); + drm_printf(&p, "kernel: " UTS_RELEASE "\n"); + drm_printf(&p, "module: " KBUILD_MODNAME "\n"); + drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec); + if (coredump->reset_task_info.pid) + drm_printf(&p, "process_name: %s PID: %d\n", + coredump->reset_task_info.process_name, + coredump->reset_task_info.pid); + + if (coredump->reset_vram_lost) + drm_printf(&p, "VRAM is lost due to GPU reset!\n"); + if (coredump->adev->reset_info.num_regs) { + drm_printf(&p, "AMDGPU register dumps:\nOffset: Value:\n"); + + for (i = 0; i < coredump->adev->reset_info.num_regs; i++) + drm_printf(&p, "0x%08x: 0x%08x\n", + coredump->adev->reset_info.reset_dump_reg_list[i], + coredump->adev->reset_info.reset_dump_reg_value[i]); + } + + return count - iter.remain; +} +static void amdgpu_devcoredump_free(void *data) +{ + kfree(data); +} +void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, + struct amdgpu_reset_context *reset_context) +{ + struct amdgpu_coredump_info *coredump; + struct drm_device *dev = adev_to_drm(adev); + + coredump = kzalloc(sizeof(*coredump), GFP_NOWAIT); + + if (!coredump) { + DRM_ERROR("%s: failed to allocate memory for coredump\n", __func__); + return; + } + + coredump->reset_vram_lost = vram_lost; + + 
if (reset_context->job && reset_context->job->vm) + coredump->reset_task_info = reset_context->job->vm->task_info; + + coredump->adev = adev; + + ktime_get_ts64(&coredump->reset_time); + + dev_coredumpm(dev->dev, THIS_MODULE, coredump, 0, GFP_NOWAIT, + amdgpu_devcoredump_read, amdgpu_devcoredump_free); +} +#endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h index f4a501ff87d9..01e8183ade4b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h @@ -87,6 +87,14 @@ struct amdgpu_reset_domain { atomic_t reset_res; }; +#ifdef CONFIG_DEV_COREDUMP +struct amdgpu_coredump_info { + struct amdgpu_device *adev; + struct amdgpu_task_info reset_task_info; + struct timespec64 reset_time; + bool reset_vram_lost; +}; +#endif int amdgpu_reset_init(struct amdgpu_device *adev); int amdgpu_reset_fini(struct amdgpu_device *adev); @@ -126,4 +134,6 @@ void amdgpu_device_lock_reset_domain(struct amdgpu_reset_domain *reset_domain); void amdgpu_device_unlock_reset_domain(struct amdgpu_reset_domain *reset_domain); +void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, + struct amdgpu_reset_context *reset_context); #endif From patchwork Thu Aug 17 18:20:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Andr=C3=A9_Almeida?= X-Patchwork-Id: 136041 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b82d:0:b0:3f2:4152:657d with SMTP id z13csp1544101vqi; Fri, 18 Aug 2023 10:00:42 -0700 (PDT) X-Google-Smtp-Source: AGHT+IE1AnAPoAI53wfo8UFjCAJFcgHtbudIavvwTFpJv8AbJbRdy0KrXsVocBzPGUwEWmLFVsu9 X-Received: by 2002:a17:907:7797:b0:98c:e72c:6b83 with SMTP id ky23-20020a170907779700b0098ce72c6b83mr1942057ejc.45.1692378042371; Fri, 18 Aug 2023 10:00:42 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1692378042; cv=none; d=google.com; s=arc-20160816; 
b=YQeYWmT40FYoaX8fI/2Ax55pe2U4Vbm/ix65OPZ30+LNoFJ0wAjOukQzDBurunm/Sr d2B9Wr2gSFfhGvje3PlSVj/hEzfR0ZeN3jDpneWd8Q4rsEIcei7JLj2DKcGvgPhlizdE TC2zy0OFiNePlmTohwy/+gNhDn4HuI+Hx+yeZBTfDFATGrpIAkBhwiaSIBNpaLUficC9 o3a5Q7oMr+rtwiZCvCrMlr7HqgMwApPSqdEJ5yfx87Z/OVVvLUVGat/iHyMkxoX7DLpD +l5jm5ruAaqsOmCU+6GAJULA1sBbcVg9fgFYc8Ka4tuFQMiXb1x6PeTN/jwNd9bCFeGF jtVw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=vJR/YE4jLhlvafKUQzTk10zMBj33RH6GySsj83iQLmk=; fh=akTS0Wq8jEms5e4CIjRZ2klQmCylMzrjhYuxow3r01E=; b=OjVG3qp41rOWGubW73fq/Un+3f4HeI/VU7vVp/ee6QwjU/BIM0731BDYPx5ib6qqhs qiTsk6g210eietx1HX17K1vIRUGHLNo+BoSTx4rZD0bQky4HlnlRkg4EzuB5nYFuVT88 tJu2snc13HrrDoJOKWXFDjfR/iSPwIl4nMQrYU6AnnxZjqpM8JAUYAgUpCAlpDw53Ynp OWZbwY0iBzCOT86u7eSvRg1Vt2ffeeYLWCynXiScUfRQgG03BksY/RVNFKOrvXrirwts I5Q1qZYTUjzjI1RoG4nzcQwQP8MjDXZUk2lzsXryAFo0i+gueNxEaFSoX+mgV124jBsU x9Mw== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@igalia.com header.s=20170329 header.b=aMhn3nzy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id lx2-20020a170906af0200b00992f309cfe7si1455134ejb.602.2023.08.18.10.00.16; Fri, 18 Aug 2023 10:00:42 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=fail header.i=@igalia.com header.s=20170329 header.b=aMhn3nzy; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1354431AbjHQSXr (ORCPT + 99 others); Thu, 17 Aug 2023 14:23:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46984 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1354443AbjHQSX2 (ORCPT ); Thu, 17 Aug 2023 14:23:28 -0400 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C21CD3C05 for ; Thu, 17 Aug 2023 11:23:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Content-Transfer-Encoding:Content-Type:MIME-Version:References: In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Sender:Reply-To:Content-ID: Content-Description:Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc :Resent-Message-ID:List-Id:List-Help:List-Unsubscribe:List-Subscribe: List-Post:List-Owner:List-Archive; bh=vJR/YE4jLhlvafKUQzTk10zMBj33RH6GySsj83iQLmk=; b=aMhn3nzymXpTHm8PA1f+ApsZxb KxcM8oJThmbLXgVPEIGWMQUsjs7pTM9vUoxTAbd37ZjIsTCEl9e90MnKXlqtC6Xi428DqUnYYCEy3 TlnOMHgqto7NHhJt35qHDWIj91bkJWFRDEshH/gXKYS/OBLqd+r/KRe9Dw8uIv0lx5R31tCP1qZ/v i5Zub+4BJN/30S3LmcpBGztGz4qxTn0kno+ykjQBRnFXeQjFaasZelp5ekqWqk33PLy7x/8AhFt8c gZrAXxKRtdH2NEBetvOGNnXAjO82dkA2IbnFysXfw5qLluUIQRA82Sa/7xSbOt6TqgFXcqkgRGP0r 2SJYF9Hg==; Received: from [191.193.179.209] 
(helo=steammachine.lan) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1qWhcP-0021I9-Tc; Thu, 17 Aug 2023 20:21:14 +0200 From: =?utf-8?q?Andr=C3=A9_Almeida?= To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org Cc: kernel-dev@igalia.com, alexander.deucher@amd.com, christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com, =?utf-8?b?J01hcmVrIE9sxaHDoWsn?= , Samuel Pitoiset , Bas Nieuwenhuizen , =?utf-8?q?Timur_Krist=C3=B3f?= , Shashank Sharma , =?utf-8?q?Andr=C3=A9_Almeida?= Subject: [PATCH v5 5/5] drm/amdgpu: Create version number for coredumps Date: Thu, 17 Aug 2023 15:20:50 -0300 Message-ID: <20230817182050.205925-6-andrealmeid@igalia.com> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20230817182050.205925-1-andrealmeid@igalia.com> References: <20230817182050.205925-1-andrealmeid@igalia.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1774586997839653739 X-GMAIL-MSGID: 1774586997839653739 Even though nothing currently parses amdgpu's coredump files, any tool that eventually does will need a version field to read the file properly. Create a version number, displayed at the top of the coredump file, to be incremented whenever the file format or content changes.
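To illustrate how a userspace tool might consume the version field, here is a minimal sketch in C. It assumes the coredump text has already been read into a NUL-terminated buffer and that the field appears on its own line as "version: <number>", matching the drm_printf() call this patch adds. The helper name amdgpu_coredump_parse_version is hypothetical and not part of this series.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace helper (not part of this patch series).
 * Scans an amdgpu devcoredump text for a line of the form
 * "version: <number>" and returns the number, or -1 if the line is
 * absent or malformed (e.g. a dump from a kernel without the field).
 */
static long amdgpu_coredump_parse_version(const char *text)
{
	static const char key[] = "version: ";
	const size_t key_len = sizeof(key) - 1;
	const char *line = text;

	while (line && *line) {
		if (strncmp(line, key, key_len) == 0) {
			const char *value = line + key_len;
			char *end;
			long version;

			/* Insist on a digit so strtol() cannot skip into the next line. */
			if (!isdigit((unsigned char)*value))
				return -1;

			version = strtol(value, &end, 10);
			/* The number must be the last thing on the line. */
			if (*end != '\n' && *end != '\0')
				return -1;
			return version;
		}

		/* Advance to the start of the next line, if any. */
		line = strchr(line, '\n');
		if (line)
			line++;
	}

	return -1;
}
```

A parser built this way can treat a return value of -1 as "unversioned dump" and fall back to best-effort reading, while rejecting versions newer than it understands.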
Signed-off-by: André Almeida --- v5: new patch drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 3 +++ 2 files changed, 4 insertions(+) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c index 579b70a3cdab..e92c81ff27be 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c @@ -192,6 +192,7 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, p = drm_coredump_printer(&iter); drm_printf(&p, "**** AMDGPU Device Coredump ****\n"); + drm_printf(&p, "version: " AMDGPU_COREDUMP_VERSION "\n"); drm_printf(&p, "kernel: " UTS_RELEASE "\n"); drm_printf(&p, "module: " KBUILD_MODNAME "\n"); drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h index 01e8183ade4b..ec3a409ec509 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h @@ -88,6 +88,9 @@ struct amdgpu_reset_domain { }; #ifdef CONFIG_DEV_COREDUMP + +#define AMDGPU_COREDUMP_VERSION "1" + struct amdgpu_coredump_info { struct amdgpu_device *adev; struct amdgpu_task_info reset_task_info;