[07/15] habanalabs: add an option to control watchdog timeout via debugfs
Commit Message
From: Tomer Tayar <ttayar@habana.ai>
Add an option to control the timeout value for the driver's watchdog
of the reset process. The timeout represents the amount of the user
has to close his process once he gets a device reset notification from
the driver.
Signed-off-by: Tomer Tayar <ttayar@habana.ai>
Reviewed-by: Oded Gabbay <ogabbay@kernel.org>
Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
Documentation/ABI/testing/debugfs-driver-habanalabs | 7 +++++++
drivers/misc/habanalabs/common/debugfs.c | 5 +++++
2 files changed, 12 insertions(+)
@@ -91,6 +91,13 @@ Description: Enables the root user to set the device to specific state.
Valid values are "disable", "enable", "suspend", "resume".
User can read this property to see the valid values
+What: /sys/kernel/debug/habanalabs/hl<n>/device_release_watchdog_timeout
+Date: Oct 2022
+KernelVersion: 6.2
+Contact: ttayar@habana.ai
+Description: The watchdog timeout value in seconds for a device relese upon
+ certain error cases, after which the device is reset.
+
What: /sys/kernel/debug/habanalabs/hl<n>/dma_size
Date: Apr 2021
KernelVersion: 5.13
@@ -1769,6 +1769,11 @@ void hl_debugfs_add_device(struct hl_device *hdev)
dev_entry,
&hl_timeout_locked_fops);
+ debugfs_create_u32("device_release_watchdog_timeout",
+ 0644,
+ dev_entry->root,
+ &hdev->device_release_watchdog_timeout_sec);
+
for (i = 0, entry = dev_entry->entry_arr ; i < count ; i++, entry++) {
debugfs_create_file(hl_debugfs_list[i].name,
0444,