[v3,4/9] cxl/mem: Skip intermediate port enumeration of restricted endpoints (RCDs)

Message ID 20221109104059.766720-5-rrichter@amd.com
State New
Headers
Series cxl: Add support for Restricted CXL hosts (RCD mode) |

Commit Message

Robert Richter Nov. 9, 2022, 10:40 a.m. UTC
  When an endpoint is found, all ports in beetween the endpoint and the
CXL host bridge need to be created. In the RCH case there are no ports
in between a host bridge and the endpoint. Skip the enumeration of
intermediate ports.

The port enumeration does not only create all ports, it also
initializes the endpoint chain by adding the endpoint to every
downstream port up to the root bridge. This must be done also in RCD
mode, but is much more simple as the endpoint only needs to be added
to the host bridge's dport.

Note: For endpoint removal the cxl_detach_ep() is not needed as it is
released in cxl_port_release().

Signed-off-by: Robert Richter <rrichter@amd.com>
---
 drivers/cxl/core/port.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)
  

Comments

Dave Jiang Nov. 9, 2022, 4:55 p.m. UTC | #1
On 11/9/2022 2:40 AM, Robert Richter wrote:
> When an endpoint is found, all ports in beetween the endpoint and the
s/beetween/between/

DJ
> CXL host bridge need to be created. In the RCH case there are no ports
> in between a host bridge and the endpoint. Skip the enumeration of
> intermediate ports.
> 
> The port enumeration does not only create all ports, it also
> initializes the endpoint chain by adding the endpoint to every
> downstream port up to the root bridge. This must be done also in RCD
> mode, but is much more simple as the endpoint only needs to be added
> to the host bridge's dport.
> 
> Note: For endpoint removal the cxl_detach_ep() is not needed as it is
> released in cxl_port_release().
> 
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
>   drivers/cxl/core/port.c | 18 ++++++++++++++++--
>   1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index d10c3580719b..e21a9c3fe4da 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1366,8 +1366,24 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>   {
>   	struct device *dev = &cxlmd->dev;
>   	struct device *iter;
> +	struct cxl_dport *dport;
> +	struct cxl_port *port;
>   	int rc;
>   
> +	/*
> +	 * Skip intermediate port enumeration in the RCH case, there
> +	 * are no ports in between a host bridge and an endpoint. Only
> +	 * initialize the EP chain.
> +	 */
> +	if (is_cxl_restricted(cxlmd)) {
> +		port = cxl_mem_find_port(cxlmd, &dport);
> +		if (!port)
> +			return -ENXIO;
> +		rc = cxl_add_ep(dport, &cxlmd->dev);
> +		put_device(&port->dev);
> +		return rc;
> +	}
> +
>   	rc = devm_add_action_or_reset(&cxlmd->dev, cxl_detach_ep, cxlmd);
>   	if (rc)
>   		return rc;
> @@ -1381,8 +1397,6 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>   	for (iter = dev; iter; iter = grandparent(iter)) {
>   		struct device *dport_dev = grandparent(iter);
>   		struct device *uport_dev;
> -		struct cxl_dport *dport;
> -		struct cxl_port *port;
>   
>   		if (!dport_dev)
>   			return 0;
  
Dan Williams Nov. 15, 2022, 12:07 a.m. UTC | #2
Robert Richter wrote:
> When an endpoint is found, all ports in beetween the endpoint and the
> CXL host bridge need to be created. In the RCH case there are no ports
> in between a host bridge and the endpoint. Skip the enumeration of
> intermediate ports.
> 
> The port enumeration does not only create all ports, it also
> initializes the endpoint chain by adding the endpoint to every
> downstream port up to the root bridge. This must be done also in RCD
> mode, but is much more simple as the endpoint only needs to be added
> to the host bridge's dport.
> 
> Note: For endpoint removal the cxl_detach_ep() is not needed as it is
> released in cxl_port_release().
> 
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
>  drivers/cxl/core/port.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index d10c3580719b..e21a9c3fe4da 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1366,8 +1366,24 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>  {
>  	struct device *dev = &cxlmd->dev;
>  	struct device *iter;
> +	struct cxl_dport *dport;
> +	struct cxl_port *port;
>  	int rc;
>  
> +	/*
> +	 * Skip intermediate port enumeration in the RCH case, there
> +	 * are no ports in between a host bridge and an endpoint. Only
> +	 * initialize the EP chain.
> +	 */
> +	if (is_cxl_restricted(cxlmd)) {

I changed this to:

	if (cxlmd->cxlds->rcd) {

...where the cxl_pci driver sets that rcd flag. Aside from keeping some
PCI specifics out of this helper it also allows RCH/RCD configurations
to be mocked with cxl_test.

> +		port = cxl_mem_find_port(cxlmd, &dport);
> +		if (!port)
> +			return -ENXIO;
> +		rc = cxl_add_ep(dport, &cxlmd->dev);

Ah, good catch, I had missed this detail previously.

> +		put_device(&port->dev);
> +		return rc;
> +	}
> +
>  	rc = devm_add_action_or_reset(&cxlmd->dev, cxl_detach_ep, cxlmd);
>  	if (rc)
>  		return rc;
> @@ -1381,8 +1397,6 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>  	for (iter = dev; iter; iter = grandparent(iter)) {
>  		struct device *dport_dev = grandparent(iter);
>  		struct device *uport_dev;
> -		struct cxl_dport *dport;
> -		struct cxl_port *port;
>  
>  		if (!dport_dev)
>  			return 0;
> -- 
> 2.30.2
>
  
Dan Williams Nov. 15, 2022, 12:24 a.m. UTC | #3
Robert Richter wrote:
> When an endpoint is found, all ports in beetween the endpoint and the
> CXL host bridge need to be created. In the RCH case there are no ports
> in between a host bridge and the endpoint. Skip the enumeration of
> intermediate ports.
> 
> The port enumeration does not only create all ports, it also
> initializes the endpoint chain by adding the endpoint to every
> downstream port up to the root bridge. This must be done also in RCD
> mode, but is much more simple as the endpoint only needs to be added
> to the host bridge's dport.
> 
> Note: For endpoint removal the cxl_detach_ep() is not needed as it is
> released in cxl_port_release().
> 
> Signed-off-by: Robert Richter <rrichter@amd.com>
> ---
>  drivers/cxl/core/port.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> index d10c3580719b..e21a9c3fe4da 100644
> --- a/drivers/cxl/core/port.c
> +++ b/drivers/cxl/core/port.c
> @@ -1366,8 +1366,24 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
>  {
>  	struct device *dev = &cxlmd->dev;
>  	struct device *iter;
> +	struct cxl_dport *dport;
> +	struct cxl_port *port;
>  	int rc;
>  
> +	/*
> +	 * Skip intermediate port enumeration in the RCH case, there
> +	 * are no ports in between a host bridge and an endpoint. Only
> +	 * initialize the EP chain.
> +	 */
> +	if (is_cxl_restricted(cxlmd)) {
> +		port = cxl_mem_find_port(cxlmd, &dport);
> +		if (!port)
> +			return -ENXIO;
> +		rc = cxl_add_ep(dport, &cxlmd->dev);

On second look, this seems problematic. While yes it will be deleted
when the root CXL port dies, it will not be deleted if the cxl_pci
driver is reloaded. I will code up a unit test to double check.

I note that cxl_add_ep() for the VH case is skipped for the root CXL
port, so I do not suspect it is needed here either. Did you add it for a
specific reason?
  
Robert Richter Nov. 15, 2022, 1:17 p.m. UTC | #4
On 14.11.22 16:07:58, Dan Williams wrote:
> Robert Richter wrote:
> > When an endpoint is found, all ports in beetween the endpoint and the
> > CXL host bridge need to be created. In the RCH case there are no ports
> > in between a host bridge and the endpoint. Skip the enumeration of
> > intermediate ports.
> > 
> > The port enumeration does not only create all ports, it also
> > initializes the endpoint chain by adding the endpoint to every
> > downstream port up to the root bridge. This must be done also in RCD
> > mode, but is much more simple as the endpoint only needs to be added
> > to the host bridge's dport.
> > 
> > Note: For endpoint removal the cxl_detach_ep() is not needed as it is
> > released in cxl_port_release().
> > 
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> >  drivers/cxl/core/port.c | 18 ++++++++++++++++--
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > index d10c3580719b..e21a9c3fe4da 100644
> > --- a/drivers/cxl/core/port.c
> > +++ b/drivers/cxl/core/port.c
> > @@ -1366,8 +1366,24 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >  {
> >  	struct device *dev = &cxlmd->dev;
> >  	struct device *iter;
> > +	struct cxl_dport *dport;
> > +	struct cxl_port *port;
> >  	int rc;
> >  
> > +	/*
> > +	 * Skip intermediate port enumeration in the RCH case, there
> > +	 * are no ports in between a host bridge and an endpoint. Only
> > +	 * initialize the EP chain.
> > +	 */
> > +	if (is_cxl_restricted(cxlmd)) {
> 
> I changed this to:
> 
> 	if (cxlmd->cxlds->rcd) {

I a mail to Bjorn I suggested to have cxl_dev_cap and a cxl_port_cap
in the pci_dev struct that could be looked up too including RCD mode.
Checking the pci_dev looks more reasonable to me here, though we could
have a flag of it in cxlds too.

-Robert

> 
> ...where the cxl_pci driver sets that rcd flag. Aside from keeping some
> PCI specifics out of this helper it also allows RCH/RCD configurations
> to be mocked with cxl_test.
> 
> > +		port = cxl_mem_find_port(cxlmd, &dport);
> > +		if (!port)
> > +			return -ENXIO;
> > +		rc = cxl_add_ep(dport, &cxlmd->dev);
> 
> Ah, good catch, I had missed this detail previously.
> 
> > +		put_device(&port->dev);
> > +		return rc;
> > +	}
> > +
> >  	rc = devm_add_action_or_reset(&cxlmd->dev, cxl_detach_ep, cxlmd);
> >  	if (rc)
> >  		return rc;
> > @@ -1381,8 +1397,6 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >  	for (iter = dev; iter; iter = grandparent(iter)) {
> >  		struct device *dport_dev = grandparent(iter);
> >  		struct device *uport_dev;
> > -		struct cxl_dport *dport;
> > -		struct cxl_port *port;
> >  
> >  		if (!dport_dev)
> >  			return 0;
> > -- 
> > 2.30.2
> > 
> 
>
  
Robert Richter Nov. 15, 2022, 1:28 p.m. UTC | #5
On 14.11.22 16:24:06, Dan Williams wrote:
> Robert Richter wrote:
> > When an endpoint is found, all ports in beetween the endpoint and the
> > CXL host bridge need to be created. In the RCH case there are no ports
> > in between a host bridge and the endpoint. Skip the enumeration of
> > intermediate ports.
> > 
> > The port enumeration does not only create all ports, it also
> > initializes the endpoint chain by adding the endpoint to every
> > downstream port up to the root bridge. This must be done also in RCD
> > mode, but is much more simple as the endpoint only needs to be added
> > to the host bridge's dport.
> > 
> > Note: For endpoint removal the cxl_detach_ep() is not needed as it is
> > released in cxl_port_release().
> > 
> > Signed-off-by: Robert Richter <rrichter@amd.com>
> > ---
> >  drivers/cxl/core/port.c | 18 ++++++++++++++++--
> >  1 file changed, 16 insertions(+), 2 deletions(-)
> > 
> > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > index d10c3580719b..e21a9c3fe4da 100644
> > --- a/drivers/cxl/core/port.c
> > +++ b/drivers/cxl/core/port.c
> > @@ -1366,8 +1366,24 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> >  {
> >  	struct device *dev = &cxlmd->dev;
> >  	struct device *iter;
> > +	struct cxl_dport *dport;
> > +	struct cxl_port *port;
> >  	int rc;
> >  
> > +	/*
> > +	 * Skip intermediate port enumeration in the RCH case, there
> > +	 * are no ports in between a host bridge and an endpoint. Only
> > +	 * initialize the EP chain.
> > +	 */
> > +	if (is_cxl_restricted(cxlmd)) {
> > +		port = cxl_mem_find_port(cxlmd, &dport);
> > +		if (!port)
> > +			return -ENXIO;
> > +		rc = cxl_add_ep(dport, &cxlmd->dev);
> 
> On second look, this seems problematic. While yes it will be deleted
> when the root CXL port dies, it will not be deleted if the cxl_pci
> driver is reloaded. I will code up a unit test to double check.
> 
> I note that cxl_add_ep() for the VH case is skipped for the root CXL
> port, so I do not suspect it is needed here either. Did you add it for a
> specific reason?

Yes, all endpoint iterators need to be reworked. Also true, the first
endpoint is skipped in the chain. So only intermediate EP structs are
touched by the loops actually.

In particular, cxl_ep_load() returned NULL for the first lookup if the
ep is missing. We could stop the iteration then. I tried to avoid a
rework here, but maybe it is not too extensive as I expected first.

-Robert
  
Dan Williams Nov. 15, 2022, 6:08 p.m. UTC | #6
Robert Richter wrote:
> On 14.11.22 16:07:58, Dan Williams wrote:
> > Robert Richter wrote:
> > > When an endpoint is found, all ports in beetween the endpoint and the
> > > CXL host bridge need to be created. In the RCH case there are no ports
> > > in between a host bridge and the endpoint. Skip the enumeration of
> > > intermediate ports.
> > > 
> > > The port enumeration does not only create all ports, it also
> > > initializes the endpoint chain by adding the endpoint to every
> > > downstream port up to the root bridge. This must be done also in RCD
> > > mode, but is much more simple as the endpoint only needs to be added
> > > to the host bridge's dport.
> > > 
> > > Note: For endpoint removal the cxl_detach_ep() is not needed as it is
> > > released in cxl_port_release().
> > > 
> > > Signed-off-by: Robert Richter <rrichter@amd.com>
> > > ---
> > >  drivers/cxl/core/port.c | 18 ++++++++++++++++--
> > >  1 file changed, 16 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > > index d10c3580719b..e21a9c3fe4da 100644
> > > --- a/drivers/cxl/core/port.c
> > > +++ b/drivers/cxl/core/port.c
> > > @@ -1366,8 +1366,24 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> > >  {
> > >  	struct device *dev = &cxlmd->dev;
> > >  	struct device *iter;
> > > +	struct cxl_dport *dport;
> > > +	struct cxl_port *port;
> > >  	int rc;
> > >  
> > > +	/*
> > > +	 * Skip intermediate port enumeration in the RCH case, there
> > > +	 * are no ports in between a host bridge and an endpoint. Only
> > > +	 * initialize the EP chain.
> > > +	 */
> > > +	if (is_cxl_restricted(cxlmd)) {
> > 
> > I changed this to:
> > 
> > 	if (cxlmd->cxlds->rcd) {
> 
> I a mail to Bjorn I suggested to have cxl_dev_cap and a cxl_port_cap
> in the pci_dev struct that could be looked up too including RCD mode.
> Checking the pci_dev looks more reasonable to me here, though we could
> have a flag of it in cxlds too.

Would that not need the PCI core to understand how to walk the RCRB
generically? As far as I understand the RCRB association is ACPI.CEDT
specific.
  
Dan Williams Nov. 15, 2022, 6:09 p.m. UTC | #7
Robert Richter wrote:
> On 14.11.22 16:24:06, Dan Williams wrote:
> > Robert Richter wrote:
> > > When an endpoint is found, all ports in beetween the endpoint and the
> > > CXL host bridge need to be created. In the RCH case there are no ports
> > > in between a host bridge and the endpoint. Skip the enumeration of
> > > intermediate ports.
> > > 
> > > The port enumeration does not only create all ports, it also
> > > initializes the endpoint chain by adding the endpoint to every
> > > downstream port up to the root bridge. This must be done also in RCD
> > > mode, but is much more simple as the endpoint only needs to be added
> > > to the host bridge's dport.
> > > 
> > > Note: For endpoint removal the cxl_detach_ep() is not needed as it is
> > > released in cxl_port_release().
> > > 
> > > Signed-off-by: Robert Richter <rrichter@amd.com>
> > > ---
> > >  drivers/cxl/core/port.c | 18 ++++++++++++++++--
> > >  1 file changed, 16 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > > index d10c3580719b..e21a9c3fe4da 100644
> > > --- a/drivers/cxl/core/port.c
> > > +++ b/drivers/cxl/core/port.c
> > > @@ -1366,8 +1366,24 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> > >  {
> > >  	struct device *dev = &cxlmd->dev;
> > >  	struct device *iter;
> > > +	struct cxl_dport *dport;
> > > +	struct cxl_port *port;
> > >  	int rc;
> > >  
> > > +	/*
> > > +	 * Skip intermediate port enumeration in the RCH case, there
> > > +	 * are no ports in between a host bridge and an endpoint. Only
> > > +	 * initialize the EP chain.
> > > +	 */
> > > +	if (is_cxl_restricted(cxlmd)) {
> > > +		port = cxl_mem_find_port(cxlmd, &dport);
> > > +		if (!port)
> > > +			return -ENXIO;
> > > +		rc = cxl_add_ep(dport, &cxlmd->dev);
> > 
> > On second look, this seems problematic. While yes it will be deleted
> > when the root CXL port dies, it will not be deleted if the cxl_pci
> > driver is reloaded. I will code up a unit test to double check.
> > 
> > I note that cxl_add_ep() for the VH case is skipped for the root CXL
> > port, so I do not suspect it is needed here either. Did you add it for a
> > specific reason?
> 
> Yes, all endpoint iterators need to be reworked. Also true, the first
> endpoint is skipped in the chain. So only intermediate EP structs are
> touched by the loops actually.
> 
> In particular, cxl_ep_load() returned NULL for the first lookup if the
> ep is missing. We could stop the iteration then. I tried to avoid a
> rework here, but maybe it is not too extensive as I expected first.

Hmm, ok, let me get the unit test topology working for this to make sure
my assumptions are correct about when an @ep reference is used / needed.
  
Robert Richter Nov. 17, 2022, 6:46 p.m. UTC | #8
On 15.11.22 10:08:51, Dan Williams wrote:
> Robert Richter wrote:
> > On 14.11.22 16:07:58, Dan Williams wrote:
> > > Robert Richter wrote:
> > > > When an endpoint is found, all ports in beetween the endpoint and the
> > > > CXL host bridge need to be created. In the RCH case there are no ports
> > > > in between a host bridge and the endpoint. Skip the enumeration of
> > > > intermediate ports.
> > > > 
> > > > The port enumeration does not only create all ports, it also
> > > > initializes the endpoint chain by adding the endpoint to every
> > > > downstream port up to the root bridge. This must be done also in RCD
> > > > mode, but is much more simple as the endpoint only needs to be added
> > > > to the host bridge's dport.
> > > > 
> > > > Note: For endpoint removal the cxl_detach_ep() is not needed as it is
> > > > released in cxl_port_release().
> > > > 
> > > > Signed-off-by: Robert Richter <rrichter@amd.com>
> > > > ---
> > > >  drivers/cxl/core/port.c | 18 ++++++++++++++++--
> > > >  1 file changed, 16 insertions(+), 2 deletions(-)
> > > > 
> > > > diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
> > > > index d10c3580719b..e21a9c3fe4da 100644
> > > > --- a/drivers/cxl/core/port.c
> > > > +++ b/drivers/cxl/core/port.c
> > > > @@ -1366,8 +1366,24 @@ int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
> > > >  {
> > > >  	struct device *dev = &cxlmd->dev;
> > > >  	struct device *iter;
> > > > +	struct cxl_dport *dport;
> > > > +	struct cxl_port *port;
> > > >  	int rc;
> > > >  
> > > > +	/*
> > > > +	 * Skip intermediate port enumeration in the RCH case, there
> > > > +	 * are no ports in between a host bridge and an endpoint. Only
> > > > +	 * initialize the EP chain.
> > > > +	 */
> > > > +	if (is_cxl_restricted(cxlmd)) {
> > > 
> > > I changed this to:
> > > 
> > > 	if (cxlmd->cxlds->rcd) {
> > 
> > I a mail to Bjorn I suggested to have cxl_dev_cap and a cxl_port_cap
> > in the pci_dev struct that could be looked up too including RCD mode.
> > Checking the pci_dev looks more reasonable to me here, though we could
> > have a flag of it in cxlds too.
> 
> Would that not need the PCI core to understand how to walk the RCRB
> generically? As far as I understand the RCRB association is ACPI.CEDT
> specific.

I am thinking of doing some sort of cxl_setup_pci_dev(pdev) in
cxl_pci_probe() which extracts the caps and writes them into a struct
pci_dev. Possibly the CXL 68B Flit and VH Capable/Enable bit could be
used for RCD mode. But if that is not feasible for some reason a flag
somewhere could work too.

-Robert
  

Patch

diff --git a/drivers/cxl/core/port.c b/drivers/cxl/core/port.c
index d10c3580719b..e21a9c3fe4da 100644
--- a/drivers/cxl/core/port.c
+++ b/drivers/cxl/core/port.c
@@ -1366,8 +1366,24 @@  int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
 {
 	struct device *dev = &cxlmd->dev;
 	struct device *iter;
+	struct cxl_dport *dport;
+	struct cxl_port *port;
 	int rc;
 
+	/*
+	 * Skip intermediate port enumeration in the RCH case, there
+	 * are no ports in between a host bridge and an endpoint. Only
+	 * initialize the EP chain.
+	 */
+	if (is_cxl_restricted(cxlmd)) {
+		port = cxl_mem_find_port(cxlmd, &dport);
+		if (!port)
+			return -ENXIO;
+		rc = cxl_add_ep(dport, &cxlmd->dev);
+		put_device(&port->dev);
+		return rc;
+	}
+
 	rc = devm_add_action_or_reset(&cxlmd->dev, cxl_detach_ep, cxlmd);
 	if (rc)
 		return rc;
@@ -1381,8 +1397,6 @@  int devm_cxl_enumerate_ports(struct cxl_memdev *cxlmd)
 	for (iter = dev; iter; iter = grandparent(iter)) {
 		struct device *dport_dev = grandparent(iter);
 		struct device *uport_dev;
-		struct cxl_dport *dport;
-		struct cxl_port *port;
 
 		if (!dport_dev)
 			return 0;