[0/5] iommu/vt-d: debugfs: Enhancements to IOMMU debugfs

Message ID 20230625150442.42197-1-Jingqi.liu@intel.com

Message

Liu, Jingqi June 25, 2023, 3:04 p.m. UTC
The original debugfs only dumps all IOMMU page tables, without PASID
support. It traverses all devices on the PCI bus, then dumps all page
tables based on device domains. This traversal is from a software
perspective.

This series dumps page tables by traversing root tables, context tables,
pasid directories and pasid tables from a hardware perspective. By
specifying a source identifier and PASID, it supports dumping a specified
page table or all page tables in legacy mode or scalable mode.

For a device that only supports legacy mode, specify the source
identifier; the root table and context table are searched to dump its
page table. Specifying a PASID is not supported in this mode.

For a device that supports scalable mode, specify a
{source identifier, PASID} pair; the root table, context table and
pasid table are searched to dump its page table. If the PASID is not
specified, it is set to RID_PASID.

To switch to dumping all page tables, specify "auto".
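
The input format described above can be sketched in shell as follows.
This is only an illustration of the "<dev>[,<pasid>]" and "auto" forms,
not the parser in the series: the function name parse_target is
hypothetical, and the default of 0 for RID_PASID is an assumption taken
from the PASID#0 remark later in this thread.

```sh
# Hypothetical helper: split "<dev>[,<pasid>]" as the cover letter
# describes it. This is not the kernel's parser.
parse_target() {
    input=$1
    if [ "$input" = "auto" ]; then
        echo "all"              # "auto" selects every page table
        return 0
    fi
    dev=${input%%,*}            # text before the first comma
    case $input in
        *,*) pasid=${input#*,} ;;   # text after the first comma
        *)   pasid=0 ;;             # assumed value of RID_PASID
    esac
    echo "dev=$dev pasid=$pasid"
}

parse_target 00:1f.0      # prints: dev=00:1f.0 pasid=0
parse_target 00:0a.0,2    # prints: dev=00:0a.0 pasid=2
parse_target auto         # prints: all
```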

Examples are as follows:
1) Dump the page table of device "00:1f.0" that only supports legacy
mode.

$ sudo sh -c 'echo 00:1f.0 > /sys/kernel/debug/iommu/intel/domain_translation_struct'
$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
Device 0000:00:1f.0 @0x105407000
IOVA_PFN                PML5E                   PML4E
0x0000000000000 |       0x0000000000000000      0x0000000105408003
0x0000000000001 |       0x0000000000000000      0x0000000105408003
0x0000000000002 |       0x0000000000000000      0x0000000105408003
0x0000000000003 |       0x0000000000000000      0x0000000105408003

PDPE                    PDE                     PTE
0x0000000105409003      0x000000010540a003      0x0000000000000003
0x0000000105409003      0x000000010540a003      0x0000000000001003
0x0000000105409003      0x000000010540a003      0x0000000000002003
0x0000000105409003      0x000000010540a003      0x0000000000003003

[...]
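
For reading the entries in a dump like the one above, here is a rough
shell sketch. It assumes a second-stage (legacy mode) entry with 4 KiB
pages, where bit 0 is the read permission, bit 1 is the write
permission, and the bits from 12 upward hold the next-level or page
address. It also assumes the upper flag bits are clear, as they are in
these samples. The helper name decode_pte is hypothetical.

```sh
# Hypothetical helper: decode one entry from the dump.
# Assumes second-stage format, 4 KiB pages, no upper flag bits set.
decode_pte() {
    val=$(( $1 ))
    addr=$(( val & ~0xfff ))    # drop the low 12 flag bits
    r=$(( val & 1 ))            # bit 0: read permission
    w=$(( (val >> 1) & 1 ))     # bit 1: write permission
    printf 'addr=0x%x read=%d write=%d\n' "$addr" "$r" "$w"
}

decode_pte 0x0000000000001003   # prints: addr=0x1000 read=1 write=1
decode_pte 0x0000000105408003   # prints: addr=0x105408000 read=1 write=1
```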

2) Dump the page table of device "00:0a.0" with pasid "2".

$ sudo sh -c 'echo 00:0a.0,2 > /sys/kernel/debug/iommu/intel/domain_translation_struct'
$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
Device 0000:00:0a.0 with pasid 2 @0x1083d7000
IOVA_PFN                PML5E                   PML4E
0x0000000000000 |       0x0000000000000000      0x0000000106aaa003
0x0000000000001 |       0x0000000000000000      0x0000000106aaa003
0x0000000000002 |       0x0000000000000000      0x0000000106aaa003
0x0000000000003 |       0x0000000000000000      0x0000000106aaa003

PDPE                    PDE                     PTE
0x000000010a819003      0x000000010a7aa003      0x0000000129800003
0x000000010a819003      0x000000010a7aa003      0x0000000129801003
0x000000010a819003      0x000000010a7aa003      0x0000000129802003
0x000000010a819003      0x000000010a7aa003      0x0000000129803003

[...]

3) Dump all page tables:
$ sudo sh -c 'echo "auto" > /sys/kernel/debug/iommu/intel/domain_translation_struct'
$ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
[...]

Device 0000:00:02.0 @0x103072000
IOVA_PFN                PML5E                   PML4E
0x000000008d800 |       0x0000000000000000      0x0000000103073003
0x000000008d801 |       0x0000000000000000      0x0000000103073003

PDPE                    PDE                     PTE
0x0000000103074003      0x0000000103075003      0x000000008d800003
0x0000000103074003      0x0000000103075003      0x000000008d801003

[...]

Device 0000:00:0a.0 with pasid 2 @0x10a0b6000
IOVA_PFN                PML5E                   PML4E
0x0000000000000 |       0x0000000000000000      0x00000001072d2003
0x0000000000001 |       0x0000000000000000      0x00000001072d2003

PDPE                    PDE                     PTE
0x0000000107d6e003      0x00000001161d4003      0x00000001bdc00003
0x0000000107d6e003      0x00000001161d4003      0x00000001bdc01003

[...]
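
The IOVA_PFN column can be related to the table levels with a small
shell sketch. It assumes 4 KiB pages and 9 index bits per level, so
each level's index is a 9-bit slice of the PFN. The helper name
pfn_indices and the output labels are hypothetical.

```sh
# Hypothetical helper: split an IOVA_PFN into per-level table indices.
# Assumes 9 index bits per level (512 entries per table).
pfn_indices() {
    pfn=$(( $1 ))
    printf 'pml4=%d pdp=%d pd=%d pt=%d\n' \
        $(( (pfn >> 27) & 0x1ff )) \
        $(( (pfn >> 18) & 0x1ff )) \
        $(( (pfn >> 9) & 0x1ff )) \
        $(( pfn & 0x1ff ))
}

pfn_indices 0x000000008d800   # prints: pml4=0 pdp=2 pd=108 pt=0
pfn_indices 0x000000008d801   # prints: pml4=0 pdp=2 pd=108 pt=1
```

Both sample PFNs share the same upper-level indices, which matches the
dump: the PML4E, PDPE and PDE values are identical and only the PTE
differs.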

Thanks,
Jingqi

Jingqi Liu (5):
  iommu/vt-d: debugfs: Define domain_translation_struct file ops
  iommu/vt-d: debugfs: Support specifying source identifier and PASID
  iommu/vt-d: debugfs: Dump the corresponding page table of a pasid
  iommu/vt-d: debugfs: Support dumping a specified page table
  iommu/vt-d: debugfs: Dump entry pointing to huge page

 drivers/iommu/intel/debugfs.c | 361 ++++++++++++++++++++++++++++++----
 1 file changed, 326 insertions(+), 35 deletions(-)
  

Comments

Tian, Kevin July 3, 2023, 7:15 a.m. UTC | #1
> From: Liu, Jingqi <jingqi.liu@intel.com>
> Sent: Sunday, June 25, 2023 11:05 PM
> 
> 2) Dump the page table of device "00:0a.0" with pasid "2".
> 
> $ sudo echo 00:0a.0,2 >
> /sys/kernel/debug/iommu/intel/domain_translation_struct
> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct

What about creating a directory layout per {dev, pasid} so the user can
easily figure out and dump?

e.g.

/sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
/sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
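
With a layout like this, a user could discover every {dev, pasid} pair
by walking the tree. The sketch below assumes the proposed
<root>/<dev>/<pasid>/domain_translation_struct layout, which is only a
suggestion at this point in the thread and not an existing interface.
The function name list_dumps is hypothetical, and the demo runs against
a mock tree in a temporary directory rather than real debugfs.

```sh
# Hypothetical helper: list every {dev, pasid} pair under a root that
# follows the proposed <root>/<dev>/<pasid>/domain_translation_struct
# layout.
list_dumps() {
    root=$1
    for f in "$root"/*/*/domain_translation_struct; do
        [ -e "$f" ] || continue             # glob matched nothing
        pasid_dir=${f%/domain_translation_struct}
        dev_dir=${pasid_dir%/*}
        echo "dev=${dev_dir##*/} pasid=${pasid_dir##*/}"
    done
}

# Demo against a mock tree (not real debugfs).
tmp=$(mktemp -d)
mkdir -p "$tmp/00:0a.0/0" "$tmp/00:0a.0/2"
: > "$tmp/00:0a.0/0/domain_translation_struct"
: > "$tmp/00:0a.0/2/domain_translation_struct"
list_dumps "$tmp"
rm -rf "$tmp"
```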

  
Liu, Jingqi July 3, 2023, 2:37 p.m. UTC | #2
On 7/3/2023 3:15 PM, Tian, Kevin wrote:
>> From: Liu, Jingqi <jingqi.liu@intel.com>
>> Sent: Sunday, June 25, 2023 11:05 PM
>>
>> 2) Dump the page table of device "00:0a.0" with pasid "2".
>>
>> $ sudo echo 00:0a.0,2 >
>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
> What about creating a directory layout per {dev, pasid} so the user can
> easily figure out and dump?
>
> e.g.
>
> /sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
> /sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
Thanks.

Do you mean creating a directory for each device, whether it supports
PASID or not?
It seems a PASID can be assigned at runtime, so the IOMMU driver would
need to create debugfs files at runtime.
It looks like this requires modifying the IOMMU driver.

BR,
Jingqi
  
Tian, Kevin July 4, 2023, 7:54 a.m. UTC | #3
> From: Liu, Jingqi <jingqi.liu@intel.com>
> Sent: Monday, July 3, 2023 10:37 PM
> 
> On 7/3/2023 3:15 PM, Tian, Kevin wrote:
> >> From: Liu, Jingqi <jingqi.liu@intel.com>
> >> Sent: Sunday, June 25, 2023 11:05 PM
> >>
> >> 2) Dump the page table of device "00:0a.0" with pasid "2".
> >>
> >> $ sudo echo 00:0a.0,2 >
> >> /sys/kernel/debug/iommu/intel/domain_translation_struct
> >> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
> > What about creating a directory layout per {dev, pasid} so the user can
> > easily figure out and dump?
> >
> > e.g.
> >
> > /sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
> > /sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
> Thanks.
> 
> Do you mean create a directory for each device, whether it supports
> PASID or not ?

Every device has a valid PASID#0, i.e. RID2PASID.

> Seems the PASID can be assigned at runtime.
> So it needs to support creating debugfs file at runtime in IOMMU driver.
> Looks like this requires modifying IOMMU driver.
> 

Isn't this patch trying to modify the driver?
  
Liu, Jingqi July 11, 2023, 1:40 a.m. UTC | #4
On 7/4/2023 3:54 PM, Tian, Kevin wrote:
>> From: Liu, Jingqi <jingqi.liu@intel.com>
>> Sent: Monday, July 3, 2023 10:37 PM
>>
>> On 7/3/2023 3:15 PM, Tian, Kevin wrote:
>>>> From: Liu, Jingqi <jingqi.liu@intel.com>
>>>> Sent: Sunday, June 25, 2023 11:05 PM
>>>>
>>>> 2) Dump the page table of device "00:0a.0" with pasid "2".
>>>>
>>>> $ sudo echo 00:0a.0,2 >
>>>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
>>> What about creating a directory layout per {dev, pasid} so the user can
>>> easily figure out and dump?
>>>
>>> e.g.
>>>
>>> /sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
>>> /sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
>> Thanks.
>>
>> Do you mean create a directory for each device, whether it supports
>> PASID or not ?
> every device has PASID#0 valid, i.e. RID2PASID.
Sorry for the late response.
Got it. Thanks.
>> Seems the PASID can be assigned at runtime.
>> So it needs to support creating debugfs file at runtime in IOMMU driver.
>> Looks like this requires modifying IOMMU driver.
>>
> Isn't this patch trying to modify the driver?
I was just trying to avoid modifying the driver outside of debugfs.
I'll try this implementation.

Thanks,
Jingqi
  
Baolu Lu July 11, 2023, 2:52 a.m. UTC | #5
On 2023/7/11 9:40, Liu, Jingqi wrote:
> On 7/4/2023 3:54 PM, Tian, Kevin wrote:
>>> From: Liu, Jingqi <jingqi.liu@intel.com>
>>> Sent: Monday, July 3, 2023 10:37 PM
>>>
>>> On 7/3/2023 3:15 PM, Tian, Kevin wrote:
>>>>> From: Liu, Jingqi <jingqi.liu@intel.com>
>>>>> Sent: Sunday, June 25, 2023 11:05 PM
>>>>>
>>>>> 2) Dump the page table of device "00:0a.0" with pasid "2".
>>>>>
>>>>> $ sudo echo 00:0a.0,2 >
>>>>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>> What about creating a directory layout per {dev, pasid} so the user can
>>>> easily figure out and dump?
>>>>
>>>> e.g.
>>>>
>>>> /sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
>>>> /sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
>>> Thanks.
>>>
>>> Do you mean create a directory for each device, whether it supports
>>> PASID or not ?
>> every device has PASID#0 valid, i.e. RID2PASID.
> Sorry for the late response.
> Got it. Thanks.
>>> Seems the PASID can be assigned at runtime.
>>> So it needs to support creating debugfs file at runtime in IOMMU driver.
>>> Looks like this requires modifying IOMMU driver.
>>>
>> Isn't this patch trying to modify the driver?
> I just tried not to modify the driver except debugfs.
> I'll try this implementation.

I'd second Kevin's suggestion.

If you check how usb xhci dumps its contexts for devices, you can see
a similar scheme.

# ls /sys/kernel/debug/usb/xhci/0000:00:14.0/devices
01  02  03  04  05

In our case, pasid 0 is special: it denotes the domain attached to the
RID.

Best regards,
baolu
  
Liu, Jingqi July 11, 2023, 6:23 a.m. UTC | #6
On 7/11/2023 10:52 AM, Baolu Lu wrote:
> On 2023/7/11 9:40, Liu, Jingqi wrote:
>> On 7/4/2023 3:54 PM, Tian, Kevin wrote:
>>>> From: Liu, Jingqi <jingqi.liu@intel.com>
>>>> Sent: Monday, July 3, 2023 10:37 PM
>>>>
>>>> On 7/3/2023 3:15 PM, Tian, Kevin wrote:
>>>>>> From: Liu, Jingqi <jingqi.liu@intel.com>
>>>>>> Sent: Sunday, June 25, 2023 11:05 PM
>>>>>>
>>>>>> 2) Dump the page table of device "00:0a.0" with pasid "2".
>>>>>>
>>>>>> $ sudo echo 00:0a.0,2 >
>>>>>> /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>>>> $ sudo cat /sys/kernel/debug/iommu/intel/domain_translation_struct
>>>>> What about creating a directory layout per {dev, pasid} so the 
>>>>> user can
>>>>> easily figure out and dump?
>>>>>
>>>>> e.g.
>>>>>
>>>>> /sys/kernel/debug/iommu/intel/00:0a.0/0/domain_translation_struct
>>>>> /sys/kernel/debug/iommu/intel/00:0a.0/2/domain_translation_struct
>>>> Thanks.
>>>>
>>>> Do you mean create a directory for each device, whether it supports
>>>> PASID or not ?
>>> every device has PASID#0 valid, i.e. RID2PASID.
>> Sorry for the late response.
>> Got it. Thanks.
>>>> Seems the PASID can be assigned at runtime.
>>>> So it needs to support creating debugfs file at runtime in IOMMU 
>>>> driver.
>>>> Looks like this requires modifying IOMMU driver.
>>>>
>>> Isn't this patch trying to modify the driver?
>> I just tried not to modify the driver except debugfs.
>> I'll try this implementation.
>
> I'd second Kevin's suggestion.
>
> If you check how usb xhci dumps its contexts for devices, you can see
> the similar scheme.
>
> # ls /sys/kernel/debug/usb/xhci/0000:00:14.0/devices
> 01  02  03  04  05
>
> In our case, pasid 0 is special which denotes the domain attached to the
> RID.
Thanks for the information.
This implementation is more user-friendly.
I'll implement it that way.

BR,
Jingqi