From patchwork Tue Aug 15 19:50:57 2023
X-Patchwork-Submitter: André Almeida
X-Patchwork-Id: 136252
From: André Almeida
To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com, alexander.deucher@amd.com, christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com, 'Marek Olšák', Samuel Pitoiset, Bas Nieuwenhuizen, Timur Kristóf, André Almeida
Subject: [PATCH v4 1/4] drm/amdgpu: Allocate coredump memory in a nonblocking way
Date: Tue, 15 Aug 2023 16:50:57 -0300
Message-ID: <20230815195100.294458-2-andrealmeid@igalia.com>
In-Reply-To: <20230815195100.294458-1-andrealmeid@igalia.com>
References: <20230815195100.294458-1-andrealmeid@igalia.com>
During a GPU reset, a normal memory allocation could block while the
kernel reclaims memory. Given that the coredump is a best-effort
mechanism, it shouldn't disturb the reset path. Change its memory
allocation flag to a nonblocking one.

Signed-off-by: André Almeida
Reviewed-by: Christian König
---
 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index aa171db68639..bf4781551f88 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -4847,7 +4847,7 @@ static void amdgpu_reset_capture_coredumpm(struct amdgpu_device *adev)
 	struct drm_device *dev = adev_to_drm(adev);
 
 	ktime_get_ts64(&adev->reset_time);
-	dev_coredumpm(dev->dev, THIS_MODULE, adev, 0, GFP_KERNEL,
+	dev_coredumpm(dev->dev, THIS_MODULE, adev, 0, GFP_NOWAIT,
 		      amdgpu_devcoredump_read, amdgpu_devcoredump_free);
 }
 #endif
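[Aside, not part of the series: the change above comes down to allocation-flag
semantics. GFP_KERNEL may sleep and enter direct reclaim, which is exactly what
a reset path cannot afford; GFP_NOWAIT never sleeps and simply returns NULL
under memory pressure, so the caller treats the dump as optional. A minimal,
hypothetical sketch of that idea, with made-up names:]

/* Hedged sketch, not from the patch: why a reset path prefers GFP_NOWAIT. */
#include <linux/slab.h>
#include <linux/types.h>

struct reset_note {
	u64 reset_count;
	bool vram_lost;
};

static struct reset_note *reset_note_alloc(void)
{
	/*
	 * GFP_KERNEL could block waiting for memory reclaim; GFP_NOWAIT
	 * fails fast instead.  The caller must treat the note as optional
	 * and carry on with the reset if this returns NULL.
	 */
	return kzalloc(sizeof(struct reset_note), GFP_NOWAIT);
}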
From patchwork Tue Aug 15 19:50:58 2023
X-Patchwork-Submitter: André Almeida
X-Patchwork-Id: 135769
From: André Almeida
To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com, alexander.deucher@amd.com, christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com, 'Marek Olšák', Samuel Pitoiset, Bas Nieuwenhuizen, Timur Kristóf, André Almeida
Subject: [PATCH v4 2/4] drm/amdgpu: Rework coredump to use memory dynamically
Date: Tue, 15 Aug 2023 16:50:58 -0300
Message-ID: <20230815195100.294458-3-andrealmeid@igalia.com>
In-Reply-To: <20230815195100.294458-1-andrealmeid@igalia.com>
References: <20230815195100.294458-1-andrealmeid@igalia.com>

Instead of storing coredump information inside the amdgpu_device struct, move it to a separate struct and allocate it dynamically.
This will make it easier to further expand the logged information. Signed-off-by: André Almeida --- v4: change kmalloc to kzalloc --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 14 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 63 ++++++++++++++-------- 2 files changed, 49 insertions(+), 28 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 9c6a332261ab..0d560b713948 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1088,11 +1088,6 @@ struct amdgpu_device { uint32_t *reset_dump_reg_list; uint32_t *reset_dump_reg_value; int num_regs; -#ifdef CONFIG_DEV_COREDUMP - struct amdgpu_task_info reset_task_info; - bool reset_vram_lost; - struct timespec64 reset_time; -#endif bool scpm_enabled; uint32_t scpm_status; @@ -1105,6 +1100,15 @@ struct amdgpu_device { uint32_t aid_mask; }; +#ifdef CONFIG_DEV_COREDUMP +struct amdgpu_coredump_info { + struct amdgpu_device *adev; + struct amdgpu_task_info reset_task_info; + struct timespec64 reset_time; + bool reset_vram_lost; +}; +#endif + static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev) { return container_of(ddev, struct amdgpu_device, ddev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index bf4781551f88..b5b879bcc5c9 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -4799,12 +4799,17 @@ static int amdgpu_reset_reg_dumps(struct amdgpu_device *adev) return 0; } -#ifdef CONFIG_DEV_COREDUMP +#ifndef CONFIG_DEV_COREDUMP +static void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, + struct amdgpu_reset_context *reset_context) +{ +} +#else static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, size_t count, void *data, size_t datalen) { struct drm_printer p; - struct amdgpu_device *adev = data; + struct amdgpu_coredump_info *coredump = data; struct drm_print_iterator iter; int i; @@ -4818,21 +4823,21 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, drm_printf(&p, "**** AMDGPU Device Coredump ****\n"); drm_printf(&p, "kernel: " UTS_RELEASE "\n"); drm_printf(&p, "module: " KBUILD_MODNAME "\n"); - drm_printf(&p, "time: %lld.%09ld\n", adev->reset_time.tv_sec, adev->reset_time.tv_nsec); - if (adev->reset_task_info.pid) + drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec); + if (coredump->reset_task_info.pid) drm_printf(&p, "process_name: %s PID: %d\n", - adev->reset_task_info.process_name, - adev->reset_task_info.pid); + coredump->reset_task_info.process_name, + coredump->reset_task_info.pid); - if (adev->reset_vram_lost) + if (coredump->reset_vram_lost) drm_printf(&p, "VRAM is lost due to GPU reset!\n"); - if (adev->num_regs) { + if (coredump->adev->num_regs) { drm_printf(&p, "AMDGPU register dumps:\nOffset: Value:\n"); - for (i = 0; i < adev->num_regs; i++) + for (i = 0; i < coredump->adev->num_regs; i++) drm_printf(&p, "0x%08x: 0x%08x\n", - adev->reset_dump_reg_list[i], - adev->reset_dump_reg_value[i]); + coredump->adev->reset_dump_reg_list[i], + coredump->adev->reset_dump_reg_value[i]); } return count - iter.remain; @@ -4840,14 +4845,32 @@ static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, static void amdgpu_devcoredump_free(void *data) { + kfree(data); } -static void amdgpu_reset_capture_coredumpm(struct amdgpu_device *adev) +static void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, + struct 
amdgpu_reset_context *reset_context)
 {
+	struct amdgpu_coredump_info *coredump;
 	struct drm_device *dev = adev_to_drm(adev);
 
-	ktime_get_ts64(&adev->reset_time);
-	dev_coredumpm(dev->dev, THIS_MODULE, adev, 0, GFP_NOWAIT,
+	coredump = kzalloc(sizeof(*coredump), GFP_NOWAIT);
+
+	if (!coredump) {
+		DRM_ERROR("%s: failed to allocate memory for coredump\n", __func__);
+		return;
+	}
+
+	coredump->reset_vram_lost = vram_lost;
+
+	if (reset_context->job && reset_context->job->vm)
+		coredump->reset_task_info = reset_context->job->vm->task_info;
+
+	coredump->adev = adev;
+
+	ktime_get_ts64(&coredump->reset_time);
+
+	dev_coredumpm(dev->dev, THIS_MODULE, coredump, 0, GFP_NOWAIT,
 		      amdgpu_devcoredump_read, amdgpu_devcoredump_free);
 }
 #endif
@@ -4955,15 +4978,9 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
 				goto out;
 
 			vram_lost = amdgpu_device_check_vram_lost(tmp_adev);
-#ifdef CONFIG_DEV_COREDUMP
-			tmp_adev->reset_vram_lost = vram_lost;
-			memset(&tmp_adev->reset_task_info, 0,
-			       sizeof(tmp_adev->reset_task_info));
-			if (reset_context->job && reset_context->job->vm)
-				tmp_adev->reset_task_info =
-					reset_context->job->vm->task_info;
-			amdgpu_reset_capture_coredumpm(tmp_adev);
-#endif
+
+			amdgpu_coredump(tmp_adev, vram_lost, reset_context);
+
 			if (vram_lost) {
 				DRM_INFO("VRAM is lost due to GPU reset!\n");
 				amdgpu_inc_vram_lost(tmp_adev);
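[Aside, not part of the series: distilled from the patch above, the devcoredump
ownership pattern is to allocate a private snapshot with a nonblocking flag,
hand it to dev_coredumpm(), and free it only from the free callback once the
devcoredump core is done with it. All names below are hypothetical:]

/*
 * Illustrative sketch of the pattern the rework follows; not driver code.
 */
#include <linux/devcoredump.h>
#include <linux/module.h>
#include <linux/slab.h>

struct my_dump_info {
	bool vram_lost;
};

static ssize_t my_dump_read(char *buffer, loff_t offset, size_t count,
			    void *data, size_t datalen)
{
	struct my_dump_info *info = data;

	/* format 'info' into 'buffer' here, e.g. via drm_coredump_printer() */
	return 0;
}

static void my_dump_free(void *data)
{
	kfree(data);	/* called by devcoredump when the dump is released */
}

static void my_capture(struct device *dev, bool vram_lost)
{
	struct my_dump_info *info = kzalloc(sizeof(*info), GFP_NOWAIT);

	if (!info)
		return;	/* best effort: skip the dump rather than block */

	info->vram_lost = vram_lost;
	/* devcoredump now owns 'info' and will call my_dump_free() later */
	dev_coredumpm(dev, THIS_MODULE, info, 0, GFP_NOWAIT,
		      my_dump_read, my_dump_free);
}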
From patchwork Tue Aug 15 19:50:59 2023
X-Patchwork-Submitter: André Almeida
X-Patchwork-Id: 135700
From: André Almeida
To: dri-devel@lists.freedesktop.org, amd-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org
Cc: kernel-dev@igalia.com, alexander.deucher@amd.com, christian.koenig@amd.com, pierre-eric.pelloux-prayer@amd.com, 'Marek Olšák', Samuel Pitoiset, Bas Nieuwenhuizen, Timur Kristóf, André Almeida
Subject: [PATCH v4 3/4] drm/amdgpu: Move coredump code to amdgpu_reset file
Date: Tue, 15 Aug 2023 16:50:59 -0300
Message-ID: <20230815195100.294458-4-andrealmeid@igalia.com>
In-Reply-To: <20230815195100.294458-1-andrealmeid@igalia.com>
References: <20230815195100.294458-1-andrealmeid@igalia.com>

Given that coredump is used only for device resets, move its functions and structs to a more fitting file, amdgpu_reset.{c,h}.
Signed-off-by: André Almeida --- drivers/gpu/drm/amd/amdgpu/amdgpu.h | 9 --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 78 ---------------------- drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 76 +++++++++++++++++++++ drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h | 11 +++ 4 files changed, 87 insertions(+), 87 deletions(-) diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu.h b/drivers/gpu/drm/amd/amdgpu/amdgpu.h index 0d560b713948..314b06cddc39 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu.h @@ -1100,15 +1100,6 @@ struct amdgpu_device { uint32_t aid_mask; }; -#ifdef CONFIG_DEV_COREDUMP -struct amdgpu_coredump_info { - struct amdgpu_device *adev; - struct amdgpu_task_info reset_task_info; - struct timespec64 reset_time; - bool reset_vram_lost; -}; -#endif - static inline struct amdgpu_device *drm_to_adev(struct drm_device *ddev) { return container_of(ddev, struct amdgpu_device, ddev); diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index b5b879bcc5c9..9706f608723a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -32,8 +32,6 @@ #include #include #include -#include -#include #include #include @@ -4799,82 +4797,6 @@ static int amdgpu_reset_reg_dumps(struct amdgpu_device *adev) return 0; } -#ifndef CONFIG_DEV_COREDUMP -static void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, - struct amdgpu_reset_context *reset_context) -{ -} -#else -static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, - size_t count, void *data, size_t datalen) -{ - struct drm_printer p; - struct amdgpu_coredump_info *coredump = data; - struct drm_print_iterator iter; - int i; - - iter.data = buffer; - iter.offset = 0; - iter.start = offset; - iter.remain = count; - - p = drm_coredump_printer(&iter); - - drm_printf(&p, "**** AMDGPU Device Coredump ****\n"); - drm_printf(&p, "kernel: " UTS_RELEASE "\n"); - drm_printf(&p, "module: " KBUILD_MODNAME "\n"); - drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec); - if (coredump->reset_task_info.pid) - drm_printf(&p, "process_name: %s PID: %d\n", - coredump->reset_task_info.process_name, - coredump->reset_task_info.pid); - - if (coredump->reset_vram_lost) - drm_printf(&p, "VRAM is lost due to GPU reset!\n"); - if (coredump->adev->num_regs) { - drm_printf(&p, "AMDGPU register dumps:\nOffset: Value:\n"); - - for (i = 0; i < coredump->adev->num_regs; i++) - drm_printf(&p, "0x%08x: 0x%08x\n", - coredump->adev->reset_dump_reg_list[i], - coredump->adev->reset_dump_reg_value[i]); - } - - return count - iter.remain; -} - -static void amdgpu_devcoredump_free(void *data) -{ - kfree(data); -} - -static void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, - struct amdgpu_reset_context *reset_context) -{ - struct amdgpu_coredump_info *coredump; - struct drm_device *dev = adev_to_drm(adev); - - coredump = kzalloc(sizeof(*coredump), GFP_NOWAIT); - - if (!coredump) { - DRM_ERROR("%s: failed to allocate memory for coredump\n", __func__); - return; - } - - coredump->reset_vram_lost = vram_lost; - - if (reset_context->job && reset_context->job->vm) - coredump->reset_task_info = reset_context->job->vm->task_info; - - coredump->adev = adev; - - ktime_get_ts64(&coredump->reset_time); - - dev_coredumpm(dev->dev, THIS_MODULE, coredump, 0, GFP_NOWAIT, - amdgpu_devcoredump_read, amdgpu_devcoredump_free); -} -#endif - int amdgpu_do_asic_reset(struct list_head *device_list_handle, 
struct amdgpu_reset_context *reset_context) { diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c index 5fed06ffcc6b..46c8d6ce349c 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c @@ -21,6 +21,9 @@ * */ +#include +#include + #include "amdgpu_reset.h" #include "aldebaran.h" #include "sienna_cichlid.h" @@ -167,5 +170,78 @@ void amdgpu_device_unlock_reset_domain(struct amdgpu_reset_domain *reset_domain) up_write(&reset_domain->sem); } +#ifndef CONFIG_DEV_COREDUMP +void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, + struct amdgpu_reset_context *reset_context) +{ +} +#else +static ssize_t amdgpu_devcoredump_read(char *buffer, loff_t offset, + size_t count, void *data, size_t datalen) +{ + struct drm_printer p; + struct amdgpu_coredump_info *coredump = data; + struct drm_print_iterator iter; + int i; + + iter.data = buffer; + iter.offset = 0; + iter.start = offset; + iter.remain = count; + + p = drm_coredump_printer(&iter); + + drm_printf(&p, "**** AMDGPU Device Coredump ****\n"); + drm_printf(&p, "kernel: " UTS_RELEASE "\n"); + drm_printf(&p, "module: " KBUILD_MODNAME "\n"); + drm_printf(&p, "time: %lld.%09ld\n", coredump->reset_time.tv_sec, coredump->reset_time.tv_nsec); + if (coredump->reset_task_info.pid) + drm_printf(&p, "process_name: %s PID: %d\n", + coredump->reset_task_info.process_name, + coredump->reset_task_info.pid); + + if (coredump->reset_vram_lost) + drm_printf(&p, "VRAM is lost due to GPU reset!\n"); + if (coredump->adev->num_regs) { + drm_printf(&p, "AMDGPU register dumps:\nOffset: Value:\n"); + + for (i = 0; i < coredump->adev->num_regs; i++) + drm_printf(&p, "0x%08x: 0x%08x\n", + coredump->adev->reset_dump_reg_list[i], + coredump->adev->reset_dump_reg_value[i]); + } + + return count - iter.remain; +} +static void amdgpu_devcoredump_free(void *data) +{ + kfree(data); +} +void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, + struct amdgpu_reset_context *reset_context) +{ + struct amdgpu_coredump_info *coredump; + struct drm_device *dev = adev_to_drm(adev); + + coredump = kzalloc(sizeof(*coredump), GFP_NOWAIT); + + if (!coredump) { + DRM_ERROR("%s: failed to allocate memory for coredump\n", __func__); + return; + } + + coredump->reset_vram_lost = vram_lost; + + if (reset_context->job && reset_context->job->vm) + coredump->reset_task_info = reset_context->job->vm->task_info; + + coredump->adev = adev; + + ktime_get_ts64(&coredump->reset_time); + + dev_coredumpm(dev->dev, THIS_MODULE, coredump, 0, GFP_NOWAIT, + amdgpu_devcoredump_read, amdgpu_devcoredump_free); +} +#endif diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h index f4a501ff87d9..362954521721 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.h @@ -87,6 +87,15 @@ struct amdgpu_reset_domain { atomic_t reset_res; }; +#ifdef CONFIG_DEV_COREDUMP +struct amdgpu_coredump_info { + struct amdgpu_device *adev; + struct amdgpu_task_info reset_task_info; + struct timespec64 reset_time; + bool reset_vram_lost; +}; +#endif + int amdgpu_reset_init(struct amdgpu_device *adev); int amdgpu_reset_fini(struct amdgpu_device *adev); @@ -126,4 +135,6 @@ void amdgpu_device_lock_reset_domain(struct amdgpu_reset_domain *reset_domain); void amdgpu_device_unlock_reset_domain(struct amdgpu_reset_domain *reset_domain); +void amdgpu_coredump(struct amdgpu_device *adev, bool vram_lost, + struct amdgpu_reset_context 
*reset_context); #endif
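[Aside, not part of the series: the move keeps the CONFIG_DEV_COREDUMP handling
inside amdgpu_reset.c, with the header always declaring amdgpu_coredump() and
the .c file providing either the real implementation or an empty stub, so the
reset path never needs an #ifdef at the call site. A hedged two-file sketch of
that pattern, with hypothetical names:]

/* my_reset.h -- always declares the hook */
struct my_device;
void my_coredump(struct my_device *mdev, bool vram_lost);

/* my_reset.c -- real code or an empty stub, selected at build time */
#ifdef CONFIG_DEV_COREDUMP
void my_coredump(struct my_device *mdev, bool vram_lost)
{
	/* allocate a snapshot and hand it to dev_coredumpm(), as above */
}
#else
void my_coredump(struct my_device *mdev, bool vram_lost)
{
	/* support compiled out: callers still link, nothing happens */
}
#endif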