[v2,2/5] nvme-fc: do not retry when auth fails or connection is refused

Message ID 20240221132404.6311-3-dwagner@suse.de
State New
Headers
Series nvme-fc: fix blktests nvme/041 |

Commit Message

Daniel Wagner Feb. 21, 2024, 1:24 p.m. UTC
  There is no point in retrying to connect if the authentication fails.

Connection refused is also issued from the authentication path, thus
also do not retry.

Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/nvme/host/fc.c | 2 ++
 1 file changed, 2 insertions(+)
  

Comments

Hannes Reinecke Feb. 21, 2024, 3:53 p.m. UTC | #1
On 2/21/24 14:24, Daniel Wagner wrote:
> There is no point in retrying to connect if the authentication fails.
> 
> Connection refused is also issued from the authentication path, thus
> also do not retry.
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>   drivers/nvme/host/fc.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index a5b29e9ad342..b81046c9f171 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -3312,6 +3312,8 @@ nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status)
>   			ctrl->cnum, status);
>   		if (status > 0 && (status & NVME_SC_DNR))
>   			recon = false;
> +		if (status == NVME_SC_AUTH_REQUIRED || status == -ECONNREFUSED)
> +			recon = false;
>   	} else if (time_after_eq(jiffies, rport->dev_loss_end))
>   		recon = false;
>   
Is this still required after the patchset to retry authentication errors?

Cheers,

Hannes
  
Daniel Wagner Feb. 21, 2024, 4:37 p.m. UTC | #2
On Wed, Feb 21, 2024 at 04:53:44PM +0100, Hannes Reinecke wrote:
> On 2/21/24 14:24, Daniel Wagner wrote:
> > There is no point in retrying to connect if the authentication fails.
> > 
> > Connection refused is also issued from the authentication path, thus
> > also do not retry.
> > 
> > Reviewed-by: Christoph Hellwig <hch@lst.de>
> > Signed-off-by: Daniel Wagner <dwagner@suse.de>
> > ---
> >   drivers/nvme/host/fc.c | 2 ++
> >   1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> > index a5b29e9ad342..b81046c9f171 100644
> > --- a/drivers/nvme/host/fc.c
> > +++ b/drivers/nvme/host/fc.c
> > @@ -3312,6 +3312,8 @@ nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status)
> >   			ctrl->cnum, status);
> >   		if (status > 0 && (status & NVME_SC_DNR))
> >   			recon = false;
> > +		if (status == NVME_SC_AUTH_REQUIRED || status == -ECONNREFUSED)
> > +			recon = false;
> >   	} else if (time_after_eq(jiffies, rport->dev_loss_end))
> >   		recon = false;
> Is this still required after the patchset to retry authentication errors?

Do you mean

  48dae46676d1 ("nvme: enable retries for authentication commands")

?

In this case yes, I've tested on top of this patch. This breaks the loop
where the provided key is invalid or is missing. The loop would happy
retry until reaching max of retries.
  
Hannes Reinecke Feb. 22, 2024, 6:46 a.m. UTC | #3
On 2/21/24 17:37, Daniel Wagner wrote:
> On Wed, Feb 21, 2024 at 04:53:44PM +0100, Hannes Reinecke wrote:
>> On 2/21/24 14:24, Daniel Wagner wrote:
>>> There is no point in retrying to connect if the authentication fails.
>>>
>>> Connection refused is also issued from the authentication path, thus
>>> also do not retry.
>>>
>>> Reviewed-by: Christoph Hellwig <hch@lst.de>
>>> Signed-off-by: Daniel Wagner <dwagner@suse.de>
>>> ---
>>>    drivers/nvme/host/fc.c | 2 ++
>>>    1 file changed, 2 insertions(+)
>>>
>>> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
>>> index a5b29e9ad342..b81046c9f171 100644
>>> --- a/drivers/nvme/host/fc.c
>>> +++ b/drivers/nvme/host/fc.c
>>> @@ -3312,6 +3312,8 @@ nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status)
>>>    			ctrl->cnum, status);
>>>    		if (status > 0 && (status & NVME_SC_DNR))
>>>    			recon = false;
>>> +		if (status == NVME_SC_AUTH_REQUIRED || status == -ECONNREFUSED)
>>> +			recon = false;
>>>    	} else if (time_after_eq(jiffies, rport->dev_loss_end))
>>>    		recon = false;
>> Is this still required after the patchset to retry authentication errors?
> 
> Do you mean
> 
>    48dae46676d1 ("nvme: enable retries for authentication commands")
> 
> ?
Yes.

> 
> In this case yes, I've tested on top of this patch. This breaks the loop
> where the provided key is invalid or is missing. The loop would happy
> retry until reaching max of retries.

But that's to be expected, no? After all, that's _precisely_ what 
NVME_SC_DNR is for; if you shouldn't retry, that bit is set.
If it's not set, you should.
So why is NVME_SC_AUTH_REQUIRED an exception?

Cheers,

Hannes
  
Daniel Wagner Feb. 22, 2024, 7:45 a.m. UTC | #4
On Thu, Feb 22, 2024 at 07:46:12AM +0100, Hannes Reinecke wrote:
> On 2/21/24 17:37, Daniel Wagner wrote:
> > On Wed, Feb 21, 2024 at 04:53:44PM +0100, Hannes Reinecke wrote:
> > In this case yes, I've tested on top of this patch. This breaks the loop
> > where the provided key is invalid or is missing. The loop would happy
> > retry until reaching max of retries.
> 
> But that's to be expected, no?

Why? If the key is wrong/missing it will be likely wrong/missing the
next retry again. So what's the point in retrying?

> After all, that's _precisely_ what
> NVME_SC_DNR is for;
> if you shouldn't retry, that bit is set.
> If it's not set, you should.

Okay, in this case there is a bug in the auth code somewhere.

> So why is NVME_SC_AUTH_REQUIRED an exception?

status is either NVME_SC_AUTH_REQUIRED or -ECONNREFUSED for the first
part of the nvme/041 (no key set). In this case DNR bit is not set.

Note, when -ECONNREFUSED is return

	if (status > 0 && (status & NVME_SC_DNR))

will not catch it either.

Glad we have these tests now.
  
Daniel Wagner Feb. 22, 2024, 5:02 p.m. UTC | #5
On Thu, Feb 22, 2024 at 08:45:04AM +0100, Daniel Wagner wrote:
> On Thu, Feb 22, 2024 at 07:46:12AM +0100, Hannes Reinecke wrote:
> > On 2/21/24 17:37, Daniel Wagner wrote:
> > > On Wed, Feb 21, 2024 at 04:53:44PM +0100, Hannes Reinecke wrote:
> > > In this case yes, I've tested on top of this patch. This breaks the loop
> > > where the provided key is invalid or is missing. The loop would happy
> > > retry until reaching max of retries.
> > 
> > But that's to be expected, no?
> 
> Why? If the key is wrong/missing it will be likely wrong/missing the
> next retry again. So what's the point in retrying?
> 
> > After all, that's _precisely_ what
> > NVME_SC_DNR is for;
> > if you shouldn't retry, that bit is set.
> > If it's not set, you should.
> 
> Okay, in this case there is a bug in the auth code somewhere.

With the change below nvme/041 also passes:

modified   drivers/nvme/host/fabrics.c
@@ -467,7 +467,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
 		if (result & NVME_CONNECT_AUTHREQ_ASCR) {
 			dev_warn(ctrl->device,
 				 "qid 0: secure concatenation is not supported\n");
-			ret = NVME_SC_AUTH_REQUIRED;
+			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
 			goto out_free_data;
 		}
 		/* Authentication required */
@@ -475,7 +475,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
 		if (ret) {
 			dev_warn(ctrl->device,
 				 "qid 0: authentication setup failed\n");
-			ret = NVME_SC_AUTH_REQUIRED;
+			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
 			goto out_free_data;
 		}
 		ret = nvme_auth_wait(ctrl, 0);
@@ -540,8 +540,9 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
 		/* Secure concatenation is not implemented */
 		if (result & NVME_CONNECT_AUTHREQ_ASCR) {
 			dev_warn(ctrl->device,
-				 "qid 0: secure concatenation is not supported\n");
-			ret = NVME_SC_AUTH_REQUIRED;
+				 "qid %d: secure concatenation is not supported\n",
+				 qid);
+			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
 			goto out_free_data;
 		}
 		/* Authentication required */
@@ -549,7 +550,7 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
 		if (ret) {
 			dev_warn(ctrl->device,
 				 "qid %d: authentication setup failed\n", qid);
-			ret = NVME_SC_AUTH_REQUIRED;
+			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
 		} else {
 			ret = nvme_auth_wait(ctrl, qid);
 			if (ret)

Is this what you had in mind?
  
Hannes Reinecke Feb. 23, 2024, 11:58 a.m. UTC | #6
On 2/22/24 18:02, Daniel Wagner wrote:
> On Thu, Feb 22, 2024 at 08:45:04AM +0100, Daniel Wagner wrote:
>> On Thu, Feb 22, 2024 at 07:46:12AM +0100, Hannes Reinecke wrote:
>>> On 2/21/24 17:37, Daniel Wagner wrote:
>>>> On Wed, Feb 21, 2024 at 04:53:44PM +0100, Hannes Reinecke wrote:
>>>> In this case yes, I've tested on top of this patch. This breaks the loop
>>>> where the provided key is invalid or is missing. The loop would happy
>>>> retry until reaching max of retries.
>>>
>>> But that's to be expected, no?
>>
>> Why? If the key is wrong/missing it will be likely wrong/missing the
>> next retry again. So what's the point in retrying?
>>
>>> After all, that's _precisely_ what
>>> NVME_SC_DNR is for;
>>> if you shouldn't retry, that bit is set.
>>> If it's not set, you should.
>>
>> Okay, in this case there is a bug in the auth code somewhere.
> 
> With the change below nvme/041 also passes:
> 
> modified   drivers/nvme/host/fabrics.c
> @@ -467,7 +467,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
>   		if (result & NVME_CONNECT_AUTHREQ_ASCR) {
>   			dev_warn(ctrl->device,
>   				 "qid 0: secure concatenation is not supported\n");
> -			ret = NVME_SC_AUTH_REQUIRED;
> +			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
>   			goto out_free_data;
>   		}
>   		/* Authentication required */
> @@ -475,7 +475,7 @@ int nvmf_connect_admin_queue(struct nvme_ctrl *ctrl)
>   		if (ret) {
>   			dev_warn(ctrl->device,
>   				 "qid 0: authentication setup failed\n");
> -			ret = NVME_SC_AUTH_REQUIRED;
> +			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
>   			goto out_free_data;
>   		}
>   		ret = nvme_auth_wait(ctrl, 0);
> @@ -540,8 +540,9 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
>   		/* Secure concatenation is not implemented */
>   		if (result & NVME_CONNECT_AUTHREQ_ASCR) {
>   			dev_warn(ctrl->device,
> -				 "qid 0: secure concatenation is not supported\n");
> -			ret = NVME_SC_AUTH_REQUIRED;
> +				 "qid %d: secure concatenation is not supported\n",
> +				 qid);
> +			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
>   			goto out_free_data;
>   		}
>   		/* Authentication required */
> @@ -549,7 +550,7 @@ int nvmf_connect_io_queue(struct nvme_ctrl *ctrl, u16 qid)
>   		if (ret) {
>   			dev_warn(ctrl->device,
>   				 "qid %d: authentication setup failed\n", qid);
> -			ret = NVME_SC_AUTH_REQUIRED;
> +			ret = NVME_SC_AUTH_REQUIRED | NVME_SC_DNR;
>   		} else {
>   			ret = nvme_auth_wait(ctrl, qid);
>   			if (ret)
> 
> Is this what you had in mind?

Which, incidentally, is basically the patch I just posted.

Cheers,

Hannes
  

Patch

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index a5b29e9ad342..b81046c9f171 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3312,6 +3312,8 @@  nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status)
 			ctrl->cnum, status);
 		if (status > 0 && (status & NVME_SC_DNR))
 			recon = false;
+		if (status == NVME_SC_AUTH_REQUIRED || status == -ECONNREFUSED)
+			recon = false;
 	} else if (time_after_eq(jiffies, rport->dev_loss_end))
 		recon = false;