firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards

Message ID 20231229035735.11127-1-o-takashi@sakamocchi.jp
State New
Headers
Series firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards |

Commit Message

Takashi Sakamoto Dec. 29, 2023, 3:57 a.m. UTC
  VIA VT6306/6307/6308 provides PCI interface compliant to 1394 OHCI. When
the hardware is combined with Asmedia ASM1083/1085 PCIe-to-PCI bus bridge,
it appears that accesses to its 'Isochronous Cycle Timer' register (offset
0xf0 on PCI I/O space) often causes unexpected system reboot in any type
of AMD Ryzen machine (both 0x17 and 0x19 families). It does not appears in
the other type of machine (AMD pre-Ryzen machine, Intel machine, at least),
or in the other OHCI 1394 hardware (e.g. Texas Instruments).

The issue explicitly appears at a commit dcadfd7f7c74 ("firewire: core:
use union for callback of transaction completion") added to v6.5 kernel.
It changed 1394 OHCI driver to access to the register every time to
dispatch local asynchronous transaction. However, the issue exists in
older version of kernel as long as it runs in AMD Ryzen machine, since
the access to the register is required to maintain bus time. It is not
hard to imagine that users experience the unexpected system reboot when
generating bus reset by plugging any devices in, or reading the register
by time-aware application programs; e.g. audio sample processing.

Well, this commit suppresses the system reboot in the combination of
hardware. It avoids the access itself. As a result, the software stack can
not provide the hardware time anymore to unit drivers, userspace
applications, and nodes in the same IEEE 1394 bus. It brings apparent
disadvantage since time-aware application programs require it, while
time-unaware applications are available again; e.g. sbp2.

Cc: stable@vger.kernel.org
Reported-by: Jiri Slaby <jirislaby@kernel.org>
Closes: https://bugzilla.suse.com/show_bug.cgi?id=1215436
Reported-by: Mario Limonciello <mario.limonciello@amd.com>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217994
Reported-by: Tobias Gruetzmacher <tobias-lists@23.gs>
Closes: https://sourceforge.net/p/linux1394/mailman/message/58711901/
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2240973
Closes: https://bugs.launchpad.net/linux/+bug/2043905
Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
---
 drivers/firewire/ohci.c | 49 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 49 insertions(+)
  

Comments

kernel test robot Dec. 29, 2023, 10:56 p.m. UTC | #1
Hi Takashi,

kernel test robot noticed the following build errors:

[auto build test ERROR on ieee1394-linux1394/for-next]
[also build test ERROR on ieee1394-linux1394/for-linus linus/master v6.7-rc7 next-20231222]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Takashi-Sakamoto/firewire-ohci-suppress-unexpected-system-reboot-in-AMD-Ryzen-machines-and-ASM108x-VT630x-PCIe-cards/20231229-120311
base:   https://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394.git for-next
patch link:    https://lore.kernel.org/r/20231229035735.11127-1-o-takashi%40sakamocchi.jp
patch subject: [PATCH] firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards
config: loongarch-randconfig-r081-20231229 (https://download.01.org/0day-ci/archive/20231230/202312300612.hrtBWbnp-lkp@intel.com/config)
compiler: loongarch64-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231230/202312300612.hrtBWbnp-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202312300612.hrtBWbnp-lkp@intel.com/

All errors (new ones prefixed by >>):

   drivers/firewire/ohci.c: In function 'pci_probe':
>> drivers/firewire/ohci.c:3679:70: error: macro "detect_vt630x_with_asm1083_on_amd_ryzen_machine" passed 2 arguments, but takes just 1
    3679 |         if (detect_vt630x_with_asm1083_on_amd_ryzen_machine(dev, ohci))
         |                                                                      ^
   drivers/firewire/ohci.c:3573: note: macro "detect_vt630x_with_asm1083_on_amd_ryzen_machine" defined here
    3573 | #define detect_vt630x_with_asm1083_on_amd_ryzen_machine(pdev)   false
         | 
>> drivers/firewire/ohci.c:3679:13: error: 'detect_vt630x_with_asm1083_on_amd_ryzen_machine' undeclared (first use in this function)
    3679 |         if (detect_vt630x_with_asm1083_on_amd_ryzen_machine(dev, ohci))
         |             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   drivers/firewire/ohci.c:3679:13: note: each undeclared identifier is reported only once for each function it appears in


vim +/detect_vt630x_with_asm1083_on_amd_ryzen_machine +3679 drivers/firewire/ohci.c

  3617	
  3618	static int pci_probe(struct pci_dev *dev,
  3619				       const struct pci_device_id *ent)
  3620	{
  3621		struct fw_ohci *ohci;
  3622		u32 bus_options, max_receive, link_speed, version;
  3623		u64 guid;
  3624		int i, err;
  3625		size_t size;
  3626	
  3627		if (dev->vendor == PCI_VENDOR_ID_PINNACLE_SYSTEMS) {
  3628			dev_err(&dev->dev, "Pinnacle MovieBoard is not yet supported\n");
  3629			return -ENOSYS;
  3630		}
  3631	
  3632		ohci = devres_alloc(release_ohci, sizeof(*ohci), GFP_KERNEL);
  3633		if (ohci == NULL)
  3634			return -ENOMEM;
  3635		fw_card_initialize(&ohci->card, &ohci_driver, &dev->dev);
  3636		pci_set_drvdata(dev, ohci);
  3637		pmac_ohci_on(dev);
  3638		devres_add(&dev->dev, ohci);
  3639	
  3640		err = pcim_enable_device(dev);
  3641		if (err) {
  3642			dev_err(&dev->dev, "failed to enable OHCI hardware\n");
  3643			return err;
  3644		}
  3645	
  3646		pci_set_master(dev);
  3647		pci_write_config_dword(dev, OHCI1394_PCI_HCI_Control, 0);
  3648	
  3649		spin_lock_init(&ohci->lock);
  3650		mutex_init(&ohci->phy_reg_mutex);
  3651	
  3652		INIT_WORK(&ohci->bus_reset_work, bus_reset_work);
  3653	
  3654		if (!(pci_resource_flags(dev, 0) & IORESOURCE_MEM) ||
  3655		    pci_resource_len(dev, 0) < OHCI1394_REGISTER_SIZE) {
  3656			ohci_err(ohci, "invalid MMIO resource\n");
  3657			return -ENXIO;
  3658		}
  3659	
  3660		err = pcim_iomap_regions(dev, 1 << 0, ohci_driver_name);
  3661		if (err) {
  3662			ohci_err(ohci, "request and map MMIO resource unavailable\n");
  3663			return -ENXIO;
  3664		}
  3665		ohci->registers = pcim_iomap_table(dev)[0];
  3666	
  3667		for (i = 0; i < ARRAY_SIZE(ohci_quirks); i++)
  3668			if ((ohci_quirks[i].vendor == dev->vendor) &&
  3669			    (ohci_quirks[i].device == (unsigned short)PCI_ANY_ID ||
  3670			     ohci_quirks[i].device == dev->device) &&
  3671			    (ohci_quirks[i].revision == (unsigned short)PCI_ANY_ID ||
  3672			     ohci_quirks[i].revision >= dev->revision)) {
  3673				ohci->quirks = ohci_quirks[i].flags;
  3674				break;
  3675			}
  3676		if (param_quirks)
  3677			ohci->quirks = param_quirks;
  3678	
> 3679		if (detect_vt630x_with_asm1083_on_amd_ryzen_machine(dev, ohci))
  3680			ohci->quirks |= QUIRK_REBOOT_BY_CYCLE_TIMER_READ;
  3681	
  3682		/*
  3683		 * Because dma_alloc_coherent() allocates at least one page,
  3684		 * we save space by using a common buffer for the AR request/
  3685		 * response descriptors and the self IDs buffer.
  3686		 */
  3687		BUILD_BUG_ON(AR_BUFFERS * sizeof(struct descriptor) > PAGE_SIZE/4);
  3688		BUILD_BUG_ON(SELF_ID_BUF_SIZE > PAGE_SIZE/2);
  3689		ohci->misc_buffer = dmam_alloc_coherent(&dev->dev, PAGE_SIZE, &ohci->misc_buffer_bus,
  3690							GFP_KERNEL);
  3691		if (!ohci->misc_buffer)
  3692			return -ENOMEM;
  3693	
  3694		err = ar_context_init(&ohci->ar_request_ctx, ohci, 0,
  3695				      OHCI1394_AsReqRcvContextControlSet);
  3696		if (err < 0)
  3697			return err;
  3698	
  3699		err = ar_context_init(&ohci->ar_response_ctx, ohci, PAGE_SIZE/4,
  3700				      OHCI1394_AsRspRcvContextControlSet);
  3701		if (err < 0)
  3702			return err;
  3703	
  3704		err = context_init(&ohci->at_request_ctx, ohci,
  3705				   OHCI1394_AsReqTrContextControlSet, handle_at_packet);
  3706		if (err < 0)
  3707			return err;
  3708	
  3709		err = context_init(&ohci->at_response_ctx, ohci,
  3710				   OHCI1394_AsRspTrContextControlSet, handle_at_packet);
  3711		if (err < 0)
  3712			return err;
  3713	
  3714		reg_write(ohci, OHCI1394_IsoRecvIntMaskSet, ~0);
  3715		ohci->ir_context_channels = ~0ULL;
  3716		ohci->ir_context_support = reg_read(ohci, OHCI1394_IsoRecvIntMaskSet);
  3717		reg_write(ohci, OHCI1394_IsoRecvIntMaskClear, ~0);
  3718		ohci->ir_context_mask = ohci->ir_context_support;
  3719		ohci->n_ir = hweight32(ohci->ir_context_mask);
  3720		size = sizeof(struct iso_context) * ohci->n_ir;
  3721		ohci->ir_context_list = devm_kzalloc(&dev->dev, size, GFP_KERNEL);
  3722		if (!ohci->ir_context_list)
  3723			return -ENOMEM;
  3724	
  3725		reg_write(ohci, OHCI1394_IsoXmitIntMaskSet, ~0);
  3726		ohci->it_context_support = reg_read(ohci, OHCI1394_IsoXmitIntMaskSet);
  3727		/* JMicron JMB38x often shows 0 at first read, just ignore it */
  3728		if (!ohci->it_context_support) {
  3729			ohci_notice(ohci, "overriding IsoXmitIntMask\n");
  3730			ohci->it_context_support = 0xf;
  3731		}
  3732		reg_write(ohci, OHCI1394_IsoXmitIntMaskClear, ~0);
  3733		ohci->it_context_mask = ohci->it_context_support;
  3734		ohci->n_it = hweight32(ohci->it_context_mask);
  3735		size = sizeof(struct iso_context) * ohci->n_it;
  3736		ohci->it_context_list = devm_kzalloc(&dev->dev, size, GFP_KERNEL);
  3737		if (!ohci->it_context_list)
  3738			return -ENOMEM;
  3739	
  3740		ohci->self_id     = ohci->misc_buffer     + PAGE_SIZE/2;
  3741		ohci->self_id_bus = ohci->misc_buffer_bus + PAGE_SIZE/2;
  3742	
  3743		bus_options = reg_read(ohci, OHCI1394_BusOptions);
  3744		max_receive = (bus_options >> 12) & 0xf;
  3745		link_speed = bus_options & 0x7;
  3746		guid = ((u64) reg_read(ohci, OHCI1394_GUIDHi) << 32) |
  3747			reg_read(ohci, OHCI1394_GUIDLo);
  3748	
  3749		if (!(ohci->quirks & QUIRK_NO_MSI))
  3750			pci_enable_msi(dev);
  3751		err = devm_request_irq(&dev->dev, dev->irq, irq_handler,
  3752				       pci_dev_msi_enabled(dev) ? 0 : IRQF_SHARED, ohci_driver_name, ohci);
  3753		if (err < 0) {
  3754			ohci_err(ohci, "failed to allocate interrupt %d\n", dev->irq);
  3755			goto fail_msi;
  3756		}
  3757	
  3758		err = fw_card_add(&ohci->card, max_receive, link_speed, guid);
  3759		if (err)
  3760			goto fail_msi;
  3761	
  3762		version = reg_read(ohci, OHCI1394_Version) & 0x00ff00ff;
  3763		ohci_notice(ohci,
  3764			    "added OHCI v%x.%x device as card %d, "
  3765			    "%d IR + %d IT contexts, quirks 0x%x%s\n",
  3766			    version >> 16, version & 0xff, ohci->card.index,
  3767			    ohci->n_ir, ohci->n_it, ohci->quirks,
  3768			    reg_read(ohci, OHCI1394_PhyUpperBound) ?
  3769				", physUB" : "");
  3770	
  3771		return 0;
  3772	
  3773	 fail_msi:
  3774		pci_disable_msi(dev);
  3775	
  3776		return err;
  3777	}
  3778
  
Mario Limonciello Dec. 30, 2023, 2:30 a.m. UTC | #2
On 12/28/2023 21:57, Takashi Sakamoto wrote:
> VIA VT6306/6307/6308 provides PCI interface compliant to 1394 OHCI. When
> the hardware is combined with Asmedia ASM1083/1085 PCIe-to-PCI bus bridge,
> it appears that accesses to its 'Isochronous Cycle Timer' register (offset
> 0xf0 on PCI I/O space) often causes unexpected system reboot in any type
> of AMD Ryzen machine (both 0x17 and 0x19 families). It does not appears in
> the other type of machine (AMD pre-Ryzen machine, Intel machine, at least),
> or in the other OHCI 1394 hardware (e.g. Texas Instruments).
> 
> The issue explicitly appears at a commit dcadfd7f7c74 ("firewire: core:
> use union for callback of transaction completion") added to v6.5 kernel.
> It changed 1394 OHCI driver to access to the register every time to
> dispatch local asynchronous transaction. However, the issue exists in
> older version of kernel as long as it runs in AMD Ryzen machine, since
> the access to the register is required to maintain bus time. It is not
> hard to imagine that users experience the unexpected system reboot when
> generating bus reset by plugging any devices in, or reading the register
> by time-aware application programs; e.g. audio sample processing.
> 
> Well, this commit suppresses the system reboot in the combination of
> hardware. It avoids the access itself. As a result, the software stack can
> not provide the hardware time anymore to unit drivers, userspace
> applications, and nodes in the same IEEE 1394 bus. It brings apparent
> disadvantage since time-aware application programs require it, while
> time-unaware applications are available again; e.g. sbp2.
> 
> Cc: stable@vger.kernel.org
> Reported-by: Jiri Slaby <jirislaby@kernel.org>
> Closes: https://bugzilla.suse.com/show_bug.cgi?id=1215436
> Reported-by: Mario Limonciello <mario.limonciello@amd.com>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217994
> Reported-by: Tobias Gruetzmacher <tobias-lists@23.gs>
> Closes: https://sourceforge.net/p/linux1394/mailman/message/58711901/
> Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2240973
> Closes: https://bugs.launchpad.net/linux/+bug/2043905
> Signed-off-by: Takashi Sakamoto <o-takashi@sakamocchi.jp>
> ---
>   drivers/firewire/ohci.c | 49 +++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 49 insertions(+)
> 
> diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
> index 7e88fd489741..62af3fa39a70 100644
> --- a/drivers/firewire/ohci.c
> +++ b/drivers/firewire/ohci.c
> @@ -279,6 +279,8 @@ static char ohci_driver_name[] = KBUILD_MODNAME;
>   #define QUIRK_TI_SLLZ059		0x20
>   #define QUIRK_IR_WAKE			0x40
>   
> +#define QUIRK_REBOOT_BY_CYCLE_TIMER_READ	0x80000000
> +
>   /* In case of multiple matches in ohci_quirks[], only the first one is used. */
>   static const struct {
>   	unsigned short vendor, device, revision, flags;
> @@ -1724,6 +1726,11 @@ static u32 get_cycle_time(struct fw_ohci *ohci)
>   	s32 diff01, diff12;
>   	int i;
>   
> +#if IS_ENABLED(CONFIG_X86)
> +	if (ohci->quirks & QUIRK_REBOOT_BY_CYCLE_TIMER_READ)
> +		return 0;
> +#endif
> +
>   	c2 = reg_read(ohci, OHCI1394_IsochronousCycleTimer);
>   
>   	if (ohci->quirks & QUIRK_CYCLE_TIMER) {
> @@ -3527,6 +3534,45 @@ static const struct fw_card_driver ohci_driver = {
>   	.stop_iso		= ohci_stop_iso,
>   };
>   
> +// On PCI Express Root Complex in any type of AMD Ryzen machine, VIA VT6306/6307/6308 with Asmedia
> +// ASM1083/1085 brings an inconvenience that read accesses to 'Isochronous Cycle Timer' register
> +// (at offset 0xf0 in PCI I/O space) often causes unexpected system reboot. The mechanism is not
> +// clear, since the read access to the other registers is enough safe; e.g. 'Node ID' register,
> +// while it is probable due to detection of any type of PCIe error.
> +#if IS_ENABLED(CONFIG_X86)
> +
> +#define PCI_DEVICE_ID_ASMEDIA_ASM108X	0x1080
> +
> +static bool detect_vt630x_with_asm1083_on_amd_ryzen_machine(const struct pci_dev *pdev,
> +							    struct fw_ohci *ohci)
> +{
> +	const struct pci_dev *pcie_to_pci_bridge;
> +	const struct cpuinfo_x86 *cinfo = &cpu_data(0);
> +
> +	// Detect any type of AMD Ryzen machine.
> +	if (cinfo->x86_vendor != X86_VENDOR_AMD || cinfo->x86 < 0x17)
> +		return false;

Maybe it's better to use X86_FEATURE_ZEN?

> +
> +	// Detect VIA VT6306/6307/6308.
> +	if (pdev->vendor != PCI_VENDOR_ID_VIA)
> +		return false;
> +	if (pdev->device != PCI_DEVICE_ID_VIA_VT630X)
> +		return false;
> +
> +	// Detect Asmedia ASM1083/1085.
> +	pcie_to_pci_bridge = pdev->bus->self;
> +	if (pcie_to_pci_bridge->vendor != PCI_VENDOR_ID_ASMEDIA)
> +		return false;
> +	if (pcie_to_pci_bridge->device != PCI_DEVICE_ID_ASMEDIA_ASM108X)
> +		return false;
> +
> +	return true;
> +}
> +
> +#else
> +#define detect_vt630x_with_asm1083_on_amd_ryzen_machine(pdev)	false
> +#endif
> +
>   #ifdef CONFIG_PPC_PMAC
>   static void pmac_ohci_on(struct pci_dev *dev)
>   {
> @@ -3630,6 +3676,9 @@ static int pci_probe(struct pci_dev *dev,
>   	if (param_quirks)
>   		ohci->quirks = param_quirks;
>   
> +	if (detect_vt630x_with_asm1083_on_amd_ryzen_machine(dev, ohci))
> +		ohci->quirks |= QUIRK_REBOOT_BY_CYCLE_TIMER_READ;
> +
>   	/*
>   	 * Because dma_alloc_coherent() allocates at least one page,
>   	 * we save space by using a common buffer for the AR request/
  
kernel test robot Dec. 30, 2023, 8:43 a.m. UTC | #3
Hi Takashi,

kernel test robot noticed the following build errors:

[auto build test ERROR on ieee1394-linux1394/for-next]
[also build test ERROR on ieee1394-linux1394/for-linus linus/master v6.7-rc7 next-20231222]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Takashi-Sakamoto/firewire-ohci-suppress-unexpected-system-reboot-in-AMD-Ryzen-machines-and-ASM108x-VT630x-PCIe-cards/20231229-120311
base:   https://git.kernel.org/pub/scm/linux/kernel/git/ieee1394/linux1394.git for-next
patch link:    https://lore.kernel.org/r/20231229035735.11127-1-o-takashi%40sakamocchi.jp
patch subject: [PATCH] firewire: ohci: suppress unexpected system reboot in AMD Ryzen machines and ASM108x/VT630x PCIe cards
config: arm64-allyesconfig (https://download.01.org/0day-ci/archive/20231230/202312301629.2sCcBeRp-lkp@intel.com/config)
compiler: clang version 18.0.0git (https://github.com/llvm/llvm-project 8a4266a626914765c0c69839e8a51be383013c1a)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231230/202312301629.2sCcBeRp-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202312301629.2sCcBeRp-lkp@intel.com/

All errors (new ones prefixed by >>):

>> drivers/firewire/ohci.c:3679:59: error: too many arguments provided to function-like macro invocation
    3679 |         if (detect_vt630x_with_asm1083_on_amd_ryzen_machine(dev, ohci))
         |                                                                  ^
   drivers/firewire/ohci.c:3573:9: note: macro 'detect_vt630x_with_asm1083_on_amd_ryzen_machine' defined here
    3573 | #define detect_vt630x_with_asm1083_on_amd_ryzen_machine(pdev)   false
         |         ^
>> drivers/firewire/ohci.c:3679:6: error: use of undeclared identifier 'detect_vt630x_with_asm1083_on_amd_ryzen_machine'
    3679 |         if (detect_vt630x_with_asm1083_on_amd_ryzen_machine(dev, ohci))
         |             ^
   2 errors generated.


vim +3679 drivers/firewire/ohci.c

  3617	
  3618	static int pci_probe(struct pci_dev *dev,
  3619				       const struct pci_device_id *ent)
  3620	{
  3621		struct fw_ohci *ohci;
  3622		u32 bus_options, max_receive, link_speed, version;
  3623		u64 guid;
  3624		int i, err;
  3625		size_t size;
  3626	
  3627		if (dev->vendor == PCI_VENDOR_ID_PINNACLE_SYSTEMS) {
  3628			dev_err(&dev->dev, "Pinnacle MovieBoard is not yet supported\n");
  3629			return -ENOSYS;
  3630		}
  3631	
  3632		ohci = devres_alloc(release_ohci, sizeof(*ohci), GFP_KERNEL);
  3633		if (ohci == NULL)
  3634			return -ENOMEM;
  3635		fw_card_initialize(&ohci->card, &ohci_driver, &dev->dev);
  3636		pci_set_drvdata(dev, ohci);
  3637		pmac_ohci_on(dev);
  3638		devres_add(&dev->dev, ohci);
  3639	
  3640		err = pcim_enable_device(dev);
  3641		if (err) {
  3642			dev_err(&dev->dev, "failed to enable OHCI hardware\n");
  3643			return err;
  3644		}
  3645	
  3646		pci_set_master(dev);
  3647		pci_write_config_dword(dev, OHCI1394_PCI_HCI_Control, 0);
  3648	
  3649		spin_lock_init(&ohci->lock);
  3650		mutex_init(&ohci->phy_reg_mutex);
  3651	
  3652		INIT_WORK(&ohci->bus_reset_work, bus_reset_work);
  3653	
  3654		if (!(pci_resource_flags(dev, 0) & IORESOURCE_MEM) ||
  3655		    pci_resource_len(dev, 0) < OHCI1394_REGISTER_SIZE) {
  3656			ohci_err(ohci, "invalid MMIO resource\n");
  3657			return -ENXIO;
  3658		}
  3659	
  3660		err = pcim_iomap_regions(dev, 1 << 0, ohci_driver_name);
  3661		if (err) {
  3662			ohci_err(ohci, "request and map MMIO resource unavailable\n");
  3663			return -ENXIO;
  3664		}
  3665		ohci->registers = pcim_iomap_table(dev)[0];
  3666	
  3667		for (i = 0; i < ARRAY_SIZE(ohci_quirks); i++)
  3668			if ((ohci_quirks[i].vendor == dev->vendor) &&
  3669			    (ohci_quirks[i].device == (unsigned short)PCI_ANY_ID ||
  3670			     ohci_quirks[i].device == dev->device) &&
  3671			    (ohci_quirks[i].revision == (unsigned short)PCI_ANY_ID ||
  3672			     ohci_quirks[i].revision >= dev->revision)) {
  3673				ohci->quirks = ohci_quirks[i].flags;
  3674				break;
  3675			}
  3676		if (param_quirks)
  3677			ohci->quirks = param_quirks;
  3678	
> 3679		if (detect_vt630x_with_asm1083_on_amd_ryzen_machine(dev, ohci))
  3680			ohci->quirks |= QUIRK_REBOOT_BY_CYCLE_TIMER_READ;
  3681	
  3682		/*
  3683		 * Because dma_alloc_coherent() allocates at least one page,
  3684		 * we save space by using a common buffer for the AR request/
  3685		 * response descriptors and the self IDs buffer.
  3686		 */
  3687		BUILD_BUG_ON(AR_BUFFERS * sizeof(struct descriptor) > PAGE_SIZE/4);
  3688		BUILD_BUG_ON(SELF_ID_BUF_SIZE > PAGE_SIZE/2);
  3689		ohci->misc_buffer = dmam_alloc_coherent(&dev->dev, PAGE_SIZE, &ohci->misc_buffer_bus,
  3690							GFP_KERNEL);
  3691		if (!ohci->misc_buffer)
  3692			return -ENOMEM;
  3693	
  3694		err = ar_context_init(&ohci->ar_request_ctx, ohci, 0,
  3695				      OHCI1394_AsReqRcvContextControlSet);
  3696		if (err < 0)
  3697			return err;
  3698	
  3699		err = ar_context_init(&ohci->ar_response_ctx, ohci, PAGE_SIZE/4,
  3700				      OHCI1394_AsRspRcvContextControlSet);
  3701		if (err < 0)
  3702			return err;
  3703	
  3704		err = context_init(&ohci->at_request_ctx, ohci,
  3705				   OHCI1394_AsReqTrContextControlSet, handle_at_packet);
  3706		if (err < 0)
  3707			return err;
  3708	
  3709		err = context_init(&ohci->at_response_ctx, ohci,
  3710				   OHCI1394_AsRspTrContextControlSet, handle_at_packet);
  3711		if (err < 0)
  3712			return err;
  3713	
  3714		reg_write(ohci, OHCI1394_IsoRecvIntMaskSet, ~0);
  3715		ohci->ir_context_channels = ~0ULL;
  3716		ohci->ir_context_support = reg_read(ohci, OHCI1394_IsoRecvIntMaskSet);
  3717		reg_write(ohci, OHCI1394_IsoRecvIntMaskClear, ~0);
  3718		ohci->ir_context_mask = ohci->ir_context_support;
  3719		ohci->n_ir = hweight32(ohci->ir_context_mask);
  3720		size = sizeof(struct iso_context) * ohci->n_ir;
  3721		ohci->ir_context_list = devm_kzalloc(&dev->dev, size, GFP_KERNEL);
  3722		if (!ohci->ir_context_list)
  3723			return -ENOMEM;
  3724	
  3725		reg_write(ohci, OHCI1394_IsoXmitIntMaskSet, ~0);
  3726		ohci->it_context_support = reg_read(ohci, OHCI1394_IsoXmitIntMaskSet);
  3727		/* JMicron JMB38x often shows 0 at first read, just ignore it */
  3728		if (!ohci->it_context_support) {
  3729			ohci_notice(ohci, "overriding IsoXmitIntMask\n");
  3730			ohci->it_context_support = 0xf;
  3731		}
  3732		reg_write(ohci, OHCI1394_IsoXmitIntMaskClear, ~0);
  3733		ohci->it_context_mask = ohci->it_context_support;
  3734		ohci->n_it = hweight32(ohci->it_context_mask);
  3735		size = sizeof(struct iso_context) * ohci->n_it;
  3736		ohci->it_context_list = devm_kzalloc(&dev->dev, size, GFP_KERNEL);
  3737		if (!ohci->it_context_list)
  3738			return -ENOMEM;
  3739	
  3740		ohci->self_id     = ohci->misc_buffer     + PAGE_SIZE/2;
  3741		ohci->self_id_bus = ohci->misc_buffer_bus + PAGE_SIZE/2;
  3742	
  3743		bus_options = reg_read(ohci, OHCI1394_BusOptions);
  3744		max_receive = (bus_options >> 12) & 0xf;
  3745		link_speed = bus_options & 0x7;
  3746		guid = ((u64) reg_read(ohci, OHCI1394_GUIDHi) << 32) |
  3747			reg_read(ohci, OHCI1394_GUIDLo);
  3748	
  3749		if (!(ohci->quirks & QUIRK_NO_MSI))
  3750			pci_enable_msi(dev);
  3751		err = devm_request_irq(&dev->dev, dev->irq, irq_handler,
  3752				       pci_dev_msi_enabled(dev) ? 0 : IRQF_SHARED, ohci_driver_name, ohci);
  3753		if (err < 0) {
  3754			ohci_err(ohci, "failed to allocate interrupt %d\n", dev->irq);
  3755			goto fail_msi;
  3756		}
  3757	
  3758		err = fw_card_add(&ohci->card, max_receive, link_speed, guid);
  3759		if (err)
  3760			goto fail_msi;
  3761	
  3762		version = reg_read(ohci, OHCI1394_Version) & 0x00ff00ff;
  3763		ohci_notice(ohci,
  3764			    "added OHCI v%x.%x device as card %d, "
  3765			    "%d IR + %d IT contexts, quirks 0x%x%s\n",
  3766			    version >> 16, version & 0xff, ohci->card.index,
  3767			    ohci->n_ir, ohci->n_it, ohci->quirks,
  3768			    reg_read(ohci, OHCI1394_PhyUpperBound) ?
  3769				", physUB" : "");
  3770	
  3771		return 0;
  3772	
  3773	 fail_msi:
  3774		pci_disable_msi(dev);
  3775	
  3776		return err;
  3777	}
  3778
  
Takashi Sakamoto Jan. 2, 2024, 3:55 a.m. UTC | #4
Hi Mario,

On Fri, Dec 29, 2023 at 08:30:00PM -0600, Mario Limonciello wrote:
> On 12/28/2023 21:57, Takashi Sakamoto wrote:
> > @@ -3527,6 +3534,45 @@ static const struct fw_card_driver ohci_driver = {
> >   	.stop_iso		= ohci_stop_iso,
> >   };
> > +// On PCI Express Root Complex in any type of AMD Ryzen machine, VIA VT6306/6307/6308 with Asmedia
> > +// ASM1083/1085 brings an inconvenience that read accesses to 'Isochronous Cycle Timer' register
> > +// (at offset 0xf0 in PCI I/O space) often causes unexpected system reboot. The mechanism is not
> > +// clear, since the read access to the other registers is enough safe; e.g. 'Node ID' register,
> > +// while it is probable due to detection of any type of PCIe error.
> > +#if IS_ENABLED(CONFIG_X86)
> > +
> > +#define PCI_DEVICE_ID_ASMEDIA_ASM108X	0x1080
> > +
> > +static bool detect_vt630x_with_asm1083_on_amd_ryzen_machine(const struct pci_dev *pdev,
> > +							    struct fw_ohci *ohci)
> > +{
> > +	const struct pci_dev *pcie_to_pci_bridge;
> > +	const struct cpuinfo_x86 *cinfo = &cpu_data(0);
> > +
> > +	// Detect any type of AMD Ryzen machine.
> > +	if (cinfo->x86_vendor != X86_VENDOR_AMD || cinfo->x86 < 0x17)
> > +		return false;
> 
> Maybe it's better to use X86_FEATURE_ZEN?

Indeed. We can use it under the condition branch for CONFIG_X86, like:

+       // Detect any type of AMD Ryzen machine.
+       if (!static_cpu_has(X86_FEATURE_ZEN))
+               return false;


Thanks

Takashi Sakamoto
  

Patch

diff --git a/drivers/firewire/ohci.c b/drivers/firewire/ohci.c
index 7e88fd489741..62af3fa39a70 100644
--- a/drivers/firewire/ohci.c
+++ b/drivers/firewire/ohci.c
@@ -279,6 +279,8 @@  static char ohci_driver_name[] = KBUILD_MODNAME;
 #define QUIRK_TI_SLLZ059		0x20
 #define QUIRK_IR_WAKE			0x40
 
+#define QUIRK_REBOOT_BY_CYCLE_TIMER_READ	0x80000000
+
 /* In case of multiple matches in ohci_quirks[], only the first one is used. */
 static const struct {
 	unsigned short vendor, device, revision, flags;
@@ -1724,6 +1726,11 @@  static u32 get_cycle_time(struct fw_ohci *ohci)
 	s32 diff01, diff12;
 	int i;
 
+#if IS_ENABLED(CONFIG_X86)
+	if (ohci->quirks & QUIRK_REBOOT_BY_CYCLE_TIMER_READ)
+		return 0;
+#endif
+
 	c2 = reg_read(ohci, OHCI1394_IsochronousCycleTimer);
 
 	if (ohci->quirks & QUIRK_CYCLE_TIMER) {
@@ -3527,6 +3534,45 @@  static const struct fw_card_driver ohci_driver = {
 	.stop_iso		= ohci_stop_iso,
 };
 
+// On PCI Express Root Complex in any type of AMD Ryzen machine, VIA VT6306/6307/6308 with Asmedia
+// ASM1083/1085 brings an inconvenience that read accesses to 'Isochronous Cycle Timer' register
+// (at offset 0xf0 in PCI I/O space) often causes unexpected system reboot. The mechanism is not
+// clear, since the read access to the other registers is enough safe; e.g. 'Node ID' register,
+// while it is probable due to detection of any type of PCIe error.
+#if IS_ENABLED(CONFIG_X86)
+
+#define PCI_DEVICE_ID_ASMEDIA_ASM108X	0x1080
+
+static bool detect_vt630x_with_asm1083_on_amd_ryzen_machine(const struct pci_dev *pdev,
+							    struct fw_ohci *ohci)
+{
+	const struct pci_dev *pcie_to_pci_bridge;
+	const struct cpuinfo_x86 *cinfo = &cpu_data(0);
+
+	// Detect any type of AMD Ryzen machine.
+	if (cinfo->x86_vendor != X86_VENDOR_AMD || cinfo->x86 < 0x17)
+		return false;
+
+	// Detect VIA VT6306/6307/6308.
+	if (pdev->vendor != PCI_VENDOR_ID_VIA)
+		return false;
+	if (pdev->device != PCI_DEVICE_ID_VIA_VT630X)
+		return false;
+
+	// Detect Asmedia ASM1083/1085.
+	pcie_to_pci_bridge = pdev->bus->self;
+	if (pcie_to_pci_bridge->vendor != PCI_VENDOR_ID_ASMEDIA)
+		return false;
+	if (pcie_to_pci_bridge->device != PCI_DEVICE_ID_ASMEDIA_ASM108X)
+		return false;
+
+	return true;
+}
+
+#else
+#define detect_vt630x_with_asm1083_on_amd_ryzen_machine(pdev)	false
+#endif
+
 #ifdef CONFIG_PPC_PMAC
 static void pmac_ohci_on(struct pci_dev *dev)
 {
@@ -3630,6 +3676,9 @@  static int pci_probe(struct pci_dev *dev,
 	if (param_quirks)
 		ohci->quirks = param_quirks;
 
+	if (detect_vt630x_with_asm1083_on_amd_ryzen_machine(dev, ohci))
+		ohci->quirks |= QUIRK_REBOOT_BY_CYCLE_TIMER_READ;
+
 	/*
 	 * Because dma_alloc_coherent() allocates at least one page,
 	 * we save space by using a common buffer for the AR request/