[v2,1/2] nvme-apple: Reset controller during shutdown

Message ID 20230114-apple-nvme-suspend-fixes-v6.2-v2-1-9157bf633dba@jannau.net
State New
Series nvme-apple: Fix suspend-resume regression

Commit Message

Janne Grunau Jan. 17, 2023, 6:25 p.m. UTC
This is a functional revert of c76b8308e4c9 ("nvme-apple: fix controller
shutdown in apple_nvme_disable").

The commit broke suspend/resume since apple_nvme_reset_work() tries to
disable the controller on resume. This does not work for the apple NVMe
controller since register access only works while the co-processor
firmware is running.

Disabling the NVMe controller in the shutdown path is also required
for shutting the co-processor down. The original code was appropriate
for this hardware. Add a comment to prevent similar breaking changes
in the future.

Fixes: c76b8308e4c9 ("nvme-apple: fix controller shutdown in apple_nvme_disable")
Reported-by: Janne Grunau <j@jannau.net>
Link: https://lore.kernel.org/all/20230110174745.GA3576@jannau.net/
Signed-off-by: Janne Grunau <j@jannau.net>
---
 drivers/nvme/host/apple.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
  

Comments

Christoph Hellwig Jan. 18, 2023, 5:24 a.m. UTC | #1
On Tue, Jan 17, 2023 at 07:25:00PM +0100, Janne Grunau wrote:
> +		/*
> +		 * Always reset the NVMe controller on shutdown. The reset is
> +		 * required to shutdown the co-processor cleanly.
> +		 */

Hmm.  This comment doesn't seem to match the discussion we had last
week.  Which would be:

		/*
		 * NVMe requires a reset before setting up a controller to
		 * ensure it is in a clean state.  For NVMe PCIe this is
		 * done in the setup path to be able to deal with controllers
		 * in any kind of state.  For Apple devices, the firmware
		 * will not be available at that time and the reset will
		 * time out.  Thus reset after shutting the NVMe controller
		 * down and before shutting the firmware down.
		 */
  
Christoph Hellwig Jan. 19, 2023, 6:14 a.m. UTC | #2
Folks, can you chime in if this comment makes sense?  I'd really
like to send the patches off to Jens before rc5.

On Wed, Jan 18, 2023 at 06:24:50AM +0100, Christoph Hellwig wrote:
> On Tue, Jan 17, 2023 at 07:25:00PM +0100, Janne Grunau wrote:
> > +		/*
> > +		 * Always reset the NVMe controller on shutdown. The reset is
> > +		 * required to shutdown the co-processor cleanly.
> > +		 */
> 
> Hmm.  This comment doesn't seem to match the discussion we had last
> week.  Which would be:
> 
> 		/*
> 		 * NVMe requires a reset before setting up a controller to
> 		 * ensure it is in a clean state.  For NVMe PCIe this is
> 		 * done in the setup path to be able to deal with controllers
> 		 * in any kind of state.  For Apple devices, the firmware
> 		 * will not be available at that time and the reset will
> 		 * time out.  Thus reset after shutting the NVMe controller
> 		 * down and before shutting the firmware down.
> 		 */
---end quoted text---
  
Janne Grunau Jan. 19, 2023, 7:48 a.m. UTC | #3
Hej,

On 2023-01-18 06:24:50 +0100, Christoph Hellwig wrote:
> On Tue, Jan 17, 2023 at 07:25:00PM +0100, Janne Grunau wrote:
> > +		/*
> > +		 * Always reset the NVMe controller on shutdown. The reset is
> > +		 * required to shutdown the co-processor cleanly.
> > +		 */
> 
> Hmm.  This comment doesn't seem to match the discussion we had last
> week.  Which would be:
> 
> 		/*
> 		 * NVMe requires a reset before setting up a controller to
> 		 * ensure it is in a clean state.  For NVMe PCIe this is
> 		 * done in the setup path to be able to deal with controllers
> 		 * in any kind of state.  For Apple devices, the firmware
> 		 * will not be available at that time and the reset will
> 		 * time out.  Thus reset after shutting the NVMe controller
> 		 * down and before shutting the firmware down.
> 		 */

Yes, it differs from the discussion last week. I tried to issue the
reset later in the setup path after the firmware was brought back up.
That fixes the hang but the device is still not usable. So it appears
we need to reset the controller before the firmware is shut down.

Janne
  
Hector Martin Jan. 19, 2023, 7:58 a.m. UTC | #4
(Replying from mobile, please excuse formatting)

I'm actually not sure exactly how this works any more. The previous series I sent (which had slightly different logic) worked for me on a t8103 Mac Mini in smoke tests, and I'd assumed it fixed the issue, but it turned out to fail (in a different way) on other machines and under other circumstances. This one seems to work everywhere, but I can't explain exactly why. Maybe we do in fact need to issue an NVMe disable before shutting down the firmware for the controller to come up reliably on firmware restart.

Maybe something like this?

/*
 * Always disable the NVMe controller after shutdown.
 * We need to do this to bring it back up later anyway,
 * and we can't do it while the firmware is not running
 * (e.g. in the resume reset path before RTKit is
 * initialized), so for Apple controllers it makes sense to
 * unconditionally do it here. Additionally, this sequence
 * of events is reliable, while others (like disabling after
 * bringing back the firmware on resume) seem to run
 * into trouble under some circumstances.
 *
 * Both U-Boot and m1n1 also use this convention
 * (i.e. an ANS NVMe controller is handed off with
 * firmware shut down, in an NVMe disabled state,
 * after a clean shutdown).
 */
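
The ordering described in the comment above can be illustrated with a small stand-alone toy model. This is only a sketch: struct toy_ans and all toy_* functions are hypothetical names invented for illustration and are not part of the driver. The model assumes that register access fails while the firmware is down and that the controller can only be enabled from a disabled state. It does not capture the observation that disabling after the firmware is back up still leaves the device unusable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy state; not the driver's struct apple_nvme. */
struct toy_ans {
	bool fw_running;   /* co-processor firmware is up */
	bool ctrl_enabled; /* NVMe controller is enabled */
};

/*
 * Disabling needs register access, which the model only allows while
 * the firmware is running. A false return stands in for the timeout.
 */
static bool toy_ctrl_disable(struct toy_ans *ans)
{
	assert(ans != NULL);
	if (!ans->fw_running)
		return false;
	ans->ctrl_enabled = false;
	return true;
}

/* Assumption: enabling requires firmware up and a disabled controller. */
static bool toy_ctrl_enable(struct toy_ans *ans)
{
	assert(ans != NULL);
	if (!ans->fw_running || ans->ctrl_enabled)
		return false;
	ans->ctrl_enabled = true;
	return true;
}

/* Suspend: disable the controller first, then stop the firmware. */
static bool toy_suspend(struct toy_ans *ans)
{
	bool ok = toy_ctrl_disable(ans);

	ans->fw_running = false;
	return ok;
}

/* Resume: start the firmware, then enable. No disable is needed here. */
static bool toy_resume(struct toy_ans *ans)
{
	ans->fw_running = true;
	return toy_ctrl_enable(ans);
}
```

In this model, a controller that was suspended through toy_suspend() is handed off with firmware down and the controller disabled, so toy_resume() succeeds without touching registers before the firmware is back.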

On January 19, 2023 15:14:52 JST, Christoph Hellwig <hch@lst.de> wrote:
>Folks, can you chime in if this comment makes sense?  I'd really
>like to send the patches off to Jens before rc5.
>
>On Wed, Jan 18, 2023 at 06:24:50AM +0100, Christoph Hellwig wrote:
>> On Tue, Jan 17, 2023 at 07:25:00PM +0100, Janne Grunau wrote:
>> > +		/*
>> > +		 * Always reset the NVMe controller on shutdown. The reset is
>> > +		 * required to shutdown the co-processor cleanly.
>> > +		 */
>> 
>> Hmm.  This comment doesn't seem to match the discussion we had last
>> week.  Which would be:
>> 
>> 		/*
>> 		 * NVMe requires a reset before setting up a controller to
>> 		 * ensure it is in a clean state.  For NVMe PCIe this is
>> 		 * done in the setup path to be able to deal with controllers
>> 		 * in any kind of state.  For Apple devices, the firmware
>> 		 * will not be available at that time and the reset will
>> 		 * time out.  Thus reset after shutting the NVMe controller
>> 		 * down and before shutting the firmware down.
>> 		 */
>---end quoted text---
>
  
Christoph Hellwig Jan. 19, 2023, 8:08 a.m. UTC | #5
Thanks, this looks good.  Updated commit here:

http://git.infradead.org/nvme.git/commitdiff/c06ba7b892a50b48522ad441a40053f483dfee9e
  
Janne Grunau Jan. 19, 2023, 8:12 a.m. UTC | #6
On 2023-01-19 09:08:39 +0100, Christoph Hellwig wrote:
> Thanks, this looks good.  Updated commit here:
> 
> http://git.infradead.org/nvme.git/commitdiff/c06ba7b892a50b48522ad441a40053f483dfee9e

Looks good to me as well.

thanks

Janne
  

Patch

diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index bf1c60edb7f9..2a1f11b30615 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -829,7 +829,13 @@  static void apple_nvme_disable(struct apple_nvme *anv, bool shutdown)
 			apple_nvme_remove_cq(anv);
 		}
 
-		nvme_disable_ctrl(&anv->ctrl, shutdown);
+		/*
+		 * Always reset the NVMe controller on shutdown. The reset is
+		 * required to shutdown the co-processor cleanly.
+		 */
+		if (shutdown)
+			nvme_disable_ctrl(&anv->ctrl, shutdown);
+		nvme_disable_ctrl(&anv->ctrl, false);
 	}
 
 	WRITE_ONCE(anv->ioq.enabled, false);
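
For readers who want to see the resulting call sequence in isolation, here is a minimal stand-alone sketch of the control flow in the hunk above. It is not driver code: struct toy_ctrl and the toy_* functions are hypothetical stand-ins, and the sketch assumes that the shutdown flag of nvme_disable_ctrl() selects between a shutdown request and a plain disable (reset) of the controller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operation tags, used only to record what was requested. */
enum toy_op { TOY_OP_SHUTDOWN, TOY_OP_RESET };

/* Hypothetical stand-in for the controller; not struct nvme_ctrl. */
struct toy_ctrl {
	enum toy_op log[4]; /* operations in the order they were issued */
	size_t nr_ops;
};

/* Toy counterpart of nvme_disable_ctrl(): records the requested path. */
static void toy_disable_ctrl(struct toy_ctrl *ctrl, bool shutdown)
{
	assert(ctrl->nr_ops < sizeof(ctrl->log) / sizeof(ctrl->log[0]));
	ctrl->log[ctrl->nr_ops++] = shutdown ? TOY_OP_SHUTDOWN : TOY_OP_RESET;
}

/* Mirrors the patched logic: optional shutdown, then always a reset. */
static void toy_apple_nvme_disable(struct toy_ctrl *ctrl, bool shutdown)
{
	if (shutdown)
		toy_disable_ctrl(ctrl, true);
	toy_disable_ctrl(ctrl, false);
}
```

With shutdown set, the sketch records a shutdown followed by a reset. Without it, only the reset is recorded, which matches the intent that the controller is always left disabled before the firmware goes down.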