PCI: mediatek-gen3: Fix translation window

Message ID 20231011122633.31559-1-jianjun.wang@mediatek.com
State New
Headers
Series PCI: mediatek-gen3: Fix translation window |

Commit Message

Jianjun Wang (王建军) Oct. 11, 2023, 12:26 p.m. UTC
The size of a translation table must be a power of 2, so deriving it
with fls() gives the wrong value when the range size is not a power of
2. For example, fls(0x3e00000) - 1 = 25, hence the PCIe translation
window size will be set to 0x2000000 instead of the expected size
0x3e00000.

Fix the translation window by splitting the MMIO space into multiple
tables if its size is not a power of 2.

Fixes: d3bf75b579b9 ("PCI: mediatek-gen3: Add MediaTek Gen3 driver for MT8192")
Signed-off-by: Jianjun Wang <jianjun.wang@mediatek.com>

---
Bootup logs on MT8195 Platform:

> Before this patch:
mtk-pcie-gen3 112f0000.pcie: Parsing ranges property...
mtk-pcie-gen3 112f0000.pcie:       IO 0x0020000000..0x00201fffff -> 0x0020000000
mtk-pcie-gen3 112f0000.pcie:      MEM 0x0020200000..0x0023ffffff -> 0x0020200000
mtk-pcie-gen3 112f0000.pcie: set IO trans window[0]: cpu_addr = 0x20000000, pci_addr = 0x20000000, size = 0x200000
mtk-pcie-gen3 112f0000.pcie: set MEM trans window[1]: cpu_addr = 0x20200000, pci_addr = 0x20200000, size = 0x3e00000

> We expect the MEM trans window size to be 0x3e00000, but its actual available size is 0x2000000.

> After applying this patch:
mtk-pcie-gen3 112f0000.pcie: Parsing ranges property...
mtk-pcie-gen3 112f0000.pcie:       IO 0x0020000000..0x00201fffff -> 0x0020000000
mtk-pcie-gen3 112f0000.pcie:      MEM 0x0020200000..0x0023ffffff -> 0x0020200000
mtk-pcie-gen3 112f0000.pcie: set IO trans window[0]: cpu_addr = 0x20000000, pci_addr = 0x20000000, size = 0x200000
mtk-pcie-gen3 112f0000.pcie: set MEM trans window[1]: cpu_addr = 0x20200000, pci_addr = 0x20200000, size = 0x200000
mtk-pcie-gen3 112f0000.pcie: set MEM trans window[2]: cpu_addr = 0x20400000, pci_addr = 0x20400000, size = 0x400000
mtk-pcie-gen3 112f0000.pcie: set MEM trans window[3]: cpu_addr = 0x20800000, pci_addr = 0x20800000, size = 0x800000
mtk-pcie-gen3 112f0000.pcie: set MEM trans window[4]: cpu_addr = 0x21000000, pci_addr = 0x21000000, size = 0x1000000
mtk-pcie-gen3 112f0000.pcie: set MEM trans window[5]: cpu_addr = 0x22000000, pci_addr = 0x22000000, size = 0x2000000

> Total available size for MEM trans window is 0x3e00000.
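
The window list above can be reproduced in userspace. The following is only a sketch of the splitting rule, not the driver code: pow2_floor() and split_windows() are hypothetical names, the 4KiB minimum-size check is left out, and `max` is a caller-chosen limit standing in for PCIE_MAX_TRANS_TABLES.

```c
#include <stdint.h>

/* Hypothetical helper: largest power of 2 that is <= v (v must be non-zero). */
static uint64_t pow2_floor(uint64_t v)
{
	uint64_t p = 1;

	while (p <= v / 2)
		p <<= 1;
	return p;
}

/*
 * Split [addr, addr + size) into power-of-2 windows, each no larger
 * than the alignment of its own start address.
 * Stores the window sizes in out[] and returns how many were needed,
 * or -1 if more than max windows would be required.
 */
static int split_windows(uint64_t addr, uint64_t size, uint64_t *out, int max)
{
	int n = 0;

	while (size) {
		uint64_t win = pow2_floor(size);

		if (addr) {
			/* Lowest set bit of addr, i.e. its natural alignment */
			uint64_t align = addr & (~addr + 1);

			if (align < win)
				win = align;
		}

		if (n == max)
			return -1;

		out[n++] = win;
		addr += win;
		size -= win;
	}

	return n;
}
```

Calling split_windows(0x20200000, 0x3e00000, out, 8) yields five windows of 0x200000, 0x400000, 0x800000, 0x1000000 and 0x2000000, matching MEM trans window[1] to window[5] in the log.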
---
 drivers/pci/controller/pcie-mediatek-gen3.c | 87 ++++++++++++---------
 1 file changed, 52 insertions(+), 35 deletions(-)
  

Comments

Alexandre Mergnat Oct. 11, 2023, 3:38 p.m. UTC | #1
On 11/10/2023 14:26, Jianjun Wang wrote:
> The size of translation table should be a power of 2, using fls() cannot 
> get the proper value when the size is not a power of 2. For example, 
> fls(0x3e00000) - 1 = 25, hence the PCIe translation window size will be 
> set to 0x2000000 instead of the expected size 0x3e00000. Fix translation 
> window by splitting the MMIO space to multiple tables if its size is not 
> a power of 2.

Hi Jianjun,

I have no knowledge of PCIe, so maybe my suggestion is stupid:

Is it mandatory to fit the translation table size to 0x3e00000 (in
this example)?
I'm asking because you could run into an issue by reaching the maximum
number of translation tables.

Is it possible to use just one table with the next power-of-2 size above
0x3e00000, i.e. 0x4000000 (fls(0x3e00000) = 26, and BIT(26) = 0x4000000)?
The downside of this method is wasted allocation space. AFAIK, I have
already seen this kind of method used for memory protection/allocation in
embedded systems, so I'm wondering if it is safer than using multiple
tables for a single size which isn't a power of 2.
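
The round-up suggested here can be sketched in userspace C as follows. This is an illustration only: fls64_sketch() is a hypothetical stand-in for the kernel's fls64(), and pow2_roundup() is a hypothetical name, not something taken from the driver.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's fls64(): 1-based index of the
 * highest set bit, or 0 when v == 0.
 */
static int fls64_sketch(uint64_t v)
{
	int bit = 0;

	while (v) {
		bit++;
		v >>= 1;
	}
	return bit;
}

/* Smallest power of 2 that is >= size (size must be in 1..2^63). */
static uint64_t pow2_roundup(uint64_t size)
{
	return 1ULL << fls64_sketch(size - 1);
}
```

For size = 0x3e00000 this gives 0x4000000, so the single window would cover 0x200000 (2MB) more than the range declared in the DT.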
  
Jianjun Wang (王建军) Oct. 12, 2023, 6:17 a.m. UTC | #2
On Wed, 2023-10-11 at 17:38 +0200, Alexandre Mergnat wrote:
>  	 
>  
> 
> On 11/10/2023 14:26, Jianjun Wang wrote:
> > The size of translation table should be a power of 2, using fls()
> cannot 
> > get the proper value when the size is not a power of 2. For
> example, 
> > fls(0x3e00000) - 1 = 25, hence the PCIe translation window size
> will be 
> > set to 0x2000000 instead of the expected size 0x3e00000. Fix
> translation 
> > window by splitting the MMIO space to multiple tables if its size
> is not 
> > a power of 2.
> 
> Hi Jianjun,
> 
> I've no knowledge in PCIE, so maybe what my suggestion is stupid:
> 
> Is it mandatory to fit the translation table size with 0x3e00000 (in 
> this example) ?
> I'm asking because you can have an issue by reaching the maximum 
> translation table number.
> 
> Is it possible to just use only one table with the power of 2 size
> above 
> 0x3e00000 => 0x4000000 ( fls(0x3e00000) = 26 = 0x4000000). The
> downside 
> of this method is wasting allocation space. AFAIK I already see this 
> kind of method for memory protection/allocation in embedded systems,
> so 
> I'm wondering if this method is safer than using multiple table for
> only 
> one size which isn't a power of 2.

Hi Alexandre,

It's not mandatory to fit the translation table size to 0x3e00000,
and yes, we can use only one table with a power-of-2 size to prevent
this.

For MediaTek's SoCs, the MMIO space range for each PCIe port is fixed
and is always a power of 2; for most of them it is 64MB. The reason we
have a size which isn't a power of 2 is that we reserve an IO space for
compatibility purposes, since some older devices may still use IO
space.

Take MT8195 as an example: its MMIO size is 64MB, and the declaration
in the DT looks like:
ranges = <0x81000000 0 0x20000000 0x0 0x20000000 0 0x200000>,
         <0x82000000 0 0x20200000 0x0 0x20200000 0 0x3e00000>;

The MMIO space is split into a 2MB IO space and a 62MB MEM space, which
causes the current risk for the MEM space range: its actual available
MEM space is 32MB. It still works for now because most devices only
require a very small amount of MEM space and will not reach ranges
higher than 32MB.

So for the concern about reaching the maximum number of translation
tables, I think maybe we can just print a warning message instead of
returning an error code, since it still works but with some limitations
(MEM space not set as the DT expects).
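
The 32MB figure follows from programming fls(size) - 1 as the window size exponent. A minimal userspace sketch of that arithmetic, where fls_sketch() is a hypothetical stand-in for the kernel's fls() and effective_window_size() is an illustrative name only:

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for the kernel's fls(): 1-based index of the
 * highest set bit, or 0 when v == 0.
 */
static int fls_sketch(uint32_t v)
{
	int bit = 0;

	while (v) {
		bit++;
		v >>= 1;
	}
	return bit;
}

/*
 * Size actually covered when fls(size) - 1 is used as the window
 * exponent (size must be non-zero).
 */
static uint32_t effective_window_size(uint32_t size)
{
	return 1U << (fls_sketch(size) - 1);
}
```

With the MT8195 values, effective_window_size(0x200000) stays at 0x200000 for the IO range, while effective_window_size(0x3e00000) is 0x2000000, i.e. 32MB out of the 62MB declared for MEM.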

Thanks.
> 
> 
> -- 
> Regards,
> Alexandre
  
Alexandre Mergnat Oct. 12, 2023, 10:27 a.m. UTC | #3
On 12/10/2023 08:17, Jianjun Wang (王建军) wrote:
> On Wed, 2023-10-11 at 17:38 +0200, Alexandre Mergnat wrote:
>>   
>>  
>> 
>> On 11/10/2023 14:26, Jianjun Wang wrote:
>> > The size of translation table should be a power of 2, using fls()
>> cannot 
>> > get the proper value when the size is not a power of 2. For
>> example, 
>> > fls(0x3e00000) - 1 = 25, hence the PCIe translation window size
>> will be 
>> > set to 0x2000000 instead of the expected size 0x3e00000. Fix
>> translation 
>> > window by splitting the MMIO space to multiple tables if its size
>> is not 
>> > a power of 2.
>> 
>> Hi Jianjun,
>> 
>> I've no knowledge in PCIE, so maybe what my suggestion is stupid:
>> 
>> Is it mandatory to fit the translation table size with 0x3e00000 (in 
>> this example) ?
>> I'm asking because you can have an issue by reaching the maximum 
>> translation table number.
>> 
>> Is it possible to just use only one table with the power of 2 size
>> above 
>> 0x3e00000 => 0x4000000 ( fls(0x3e00000) = 26 = 0x4000000). The
>> downside 
>> of this method is wasting allocation space. AFAIK I already see this 
>> kind of method for memory protection/allocation in embedded systems,
>> so 
>> I'm wondering if this method is safer than using multiple table for
>> only 
>> one size which isn't a power of 2.
> 
> Hi Alexandre,
> 
> It's not mandatory to fit the translation table size with 0x3e00000,
> and yes we can use only one table with the power of 2 size to prevent
> this.
> 
> For MediaTek's SoCs, the MMIO space range for each PCIe port is fixed,
> and it will always be a power of 2, most of them will be 64MB. The
> reason we have the size which isn't a power of 2 is that we reserve an
> IO space for compatible purpose, some older devices may still use IO
> space.
> 
> Take MT8195 as an example, its MMIO size is 64MB, and the declaration
> in the DT is like:
> ranges = <0x81000000 0 0x20000000 0x0 0x20000000 0 0x200000>,
>           <0x82000000 0 0x20200000 0x0 0x20200000 0 0x3e00000>;
> 
> The MMIO space is splited to 2MB IO space and 62MB MEM space, that's
> cause the current risk of the MEM space range, its actual available MEM
> space is 32MB. But it still works for now because most of the devices
> only require a very small amount of MEM space and will not reach ranges
> higher than 32MB.
> 
> So for the concern of reaching the maximum translation table number, I
> think maybe we can just print the warning message instead of return
> error code, since it still works but have some limitations(MEM space
> not set as DT expected).
> 

OK, understood, thanks for your explanation.
Then, IMHO, you should use only one table with the next power-of-2 size
above, to make the code simpler, more efficient, more robust and more
readable, and to avoid confusion about the warning.

This is what is done in pci-mvebu.c, AFAIK.

If you prefer to wait for another reviewer with better PCIe expertise than
me, that's OK with me. With the information I currently have, I prefer
not to approve the current implementation because, from my PoV, it
introduces unnecessary complexity.

Thanks
  
AngeloGioacchino Del Regno Oct. 12, 2023, 12:52 p.m. UTC | #4
Il 12/10/23 12:27, Alexandre Mergnat ha scritto:
> 
> 
> On 12/10/2023 08:17, Jianjun Wang (王建军) wrote:
>> On Wed, 2023-10-11 at 17:38 +0200, Alexandre Mergnat wrote:
>>>
>>>
>>> On 11/10/2023 14:26, Jianjun Wang wrote:
>>> > The size of translation table should be a power of 2, using fls() cannot
>>> > get the proper value when the size is not a power of 2. For example,
>>> > fls(0x3e00000) - 1 = 25, hence the PCIe translation window size will be
>>> > set to 0x2000000 instead of the expected size 0x3e00000. Fix translation
>>> > window by splitting the MMIO space to multiple tables if its size is not
>>> > a power of 2.
>>>
>>> Hi Jianjun,
>>>
>>> I've no knowledge in PCIE, so maybe what my suggestion is stupid:
>>>
>>> Is it mandatory to fit the translation table size with 0x3e00000 (in this 
>>> example) ?
>>> I'm asking because you can have an issue by reaching the maximum translation 
>>> table number.
>>>
>>> Is it possible to just use only one table with the power of 2 size
>>> above 0x3e00000 => 0x4000000 ( fls(0x3e00000) = 26 = 0x4000000). The
>>> downside of this method is wasting allocation space. AFAIK I already see this 
>>> kind of method for memory protection/allocation in embedded systems,
>>> so I'm wondering if this method is safer than using multiple table for
>>> only one size which isn't a power of 2.
>>
>> Hi Alexandre,
>>
>> It's not mandatory to fit the translation table size with 0x3e00000,
>> and yes we can use only one table with the power of 2 size to prevent
>> this.
>>
>> For MediaTek's SoCs, the MMIO space range for each PCIe port is fixed,
>> and it will always be a power of 2, most of them will be 64MB. The
>> reason we have the size which isn't a power of 2 is that we reserve an
>> IO space for compatible purpose, some older devices may still use IO
>> space.
>>
>> Take MT8195 as an example, its MMIO size is 64MB, and the declaration
>> in the DT is like:
>> ranges = <0x81000000 0 0x20000000 0x0 0x20000000 0 0x200000>,
>>           <0x82000000 0 0x20200000 0x0 0x20200000 0 0x3e00000>;
>>
>> The MMIO space is splited to 2MB IO space and 62MB MEM space, that's
>> cause the current risk of the MEM space range, its actual available MEM
>> space is 32MB. But it still works for now because most of the devices
>> only require a very small amount of MEM space and will not reach ranges
>> higher than 32MB.
>>
>> So for the concern of reaching the maximum translation table number, I
>> think maybe we can just print the warning message instead of return
>> error code, since it still works but have some limitations(MEM space
>> not set as DT expected).
>>
> 
> Ok understood, thanks for your explanation.
> Then, IMHO, you should use only one table with the power of 2 size above to make 
> the code simpler, efficient, robust, more readable and avoid confusion about the 
> warning.
> 
> This is what is done for pci-mvebu.c AFAII.
> 
> If you prefer waiting another reviewer with a better PCIE expertise than me, it's 
> ok for me. With the information I have currently, I prefer to not approve the 
> current implementation because, from my PoV, it introduce unnecessary complexity.
> 

From what I understand, using only one table with a size that is a power of two
won't let us use the entire MMIO space, hence the only solution to allow using
the entire range is to split it into more than one table.

I'm not sure, though, whether PCIe devices would be able to use a MEM space that
is not a power of two, or if those even exist.

If there are devices that can use 32MB < mem <= 62MB, then I completely agree
with Jianjun on this commit... so please, can any PCI(/e) maintainer comment
on this situation?

Regards,
Angelo
  
Alexandre Mergnat Oct. 12, 2023, 1:30 p.m. UTC | #5
On 12/10/2023 14:52, AngeloGioacchino Del Regno wrote:
> Il 12/10/23 12:27, Alexandre Mergnat ha scritto:
>>
>>
>> On 12/10/2023 08:17, Jianjun Wang (王建军) wrote:
>>> On Wed, 2023-10-11 at 17:38 +0200, Alexandre Mergnat wrote:
>>>>
>>>>
>>>> On 11/10/2023 14:26, Jianjun Wang wrote:
>>>> > The size of translation table should be a power of 2, using fls() cannot
>>>> > get the proper value when the size is not a power of 2. For example,
>>>> > fls(0x3e00000) - 1 = 25, hence the PCIe translation window size will be
>>>> > set to 0x2000000 instead of the expected size 0x3e00000. Fix translation
>>>> > window by splitting the MMIO space to multiple tables if its size is not
>>>> > a power of 2.
>>>>
>>>> Hi Jianjun,
>>>>
>>>> I've no knowledge in PCIE, so maybe what my suggestion is stupid:
>>>>
>>>> Is it mandatory to fit the translation table size with 0x3e00000 (in 
>>>> this example) ?
>>>> I'm asking because you can have an issue by reaching the maximum 
>>>> translation table number.
>>>>
>>>> Is it possible to just use only one table with the power of 2 size
>>>> above 0x3e00000 => 0x4000000 ( fls(0x3e00000) = 26 = 0x4000000). The
>>>> downside of this method is wasting allocation space. AFAIK I already 
>>>> see this kind of method for memory protection/allocation in embedded 
>>>> systems,
>>>> so I'm wondering if this method is safer than using multiple table for
>>>> only one size which isn't a power of 2.
>>>
>>> Hi Alexandre,
>>>
>>> It's not mandatory to fit the translation table size with 0x3e00000,
>>> and yes we can use only one table with the power of 2 size to prevent
>>> this.
>>>
>>> For MediaTek's SoCs, the MMIO space range for each PCIe port is fixed,
>>> and it will always be a power of 2, most of them will be 64MB. The
>>> reason we have the size which isn't a power of 2 is that we reserve an
>>> IO space for compatible purpose, some older devices may still use IO
>>> space.
>>>
>>> Take MT8195 as an example, its MMIO size is 64MB, and the declaration
>>> in the DT is like:
>>> ranges = <0x81000000 0 0x20000000 0x0 0x20000000 0 0x200000>,
>>>           <0x82000000 0 0x20200000 0x0 0x20200000 0 0x3e00000>;
>>>
>>> The MMIO space is splited to 2MB IO space and 62MB MEM space, that's
>>> cause the current risk of the MEM space range, its actual available MEM
>>> space is 32MB. But it still works for now because most of the devices
>>> only require a very small amount of MEM space and will not reach ranges
>>> higher than 32MB.
>>>
>>> So for the concern of reaching the maximum translation table number, I
>>> think maybe we can just print the warning message instead of return
>>> error code, since it still works but have some limitations(MEM space
>>> not set as DT expected).
>>>
>>
>> Ok understood, thanks for your explanation.
>> Then, IMHO, you should use only one table with the power of 2 size 
>> above to make the code simpler, efficient, robust, more readable and 
>> avoid confusion about the warning.
>>
>> This is what is done for pci-mvebu.c AFAII.
>>
>> If you prefer waiting another reviewer with a better PCIE expertise 
>> than me, it's ok for me. With the information I have currently, I 
>> prefer to not approve the current implementation because, from my PoV, 
>> it introduce unnecessary complexity.
>>
> 
>  From what I understand, using only one table with a size that is a 
> power of two
> won't let us use the entire MMIO space, hence the only solution to allow 
> using
> the entire range is to split to more than one table.

You can take the next power of 2 above, whose exponent is directly returned
by fls().
That lets us use the entire MMIO space.
In this example, if your size is 0x3e00000, then you will allow 0x4000000.
  
Jianjun Wang (王建军) Oct. 13, 2023, 2:52 a.m. UTC | #6
On Thu, 2023-10-12 at 15:30 +0200, Alexandre Mergnat wrote:
>  	 
>  
> 
> On 12/10/2023 14:52, AngeloGioacchino Del Regno wrote:
> > Il 12/10/23 12:27, Alexandre Mergnat ha scritto:
> >>
> >>
> >> On 12/10/2023 08:17, Jianjun Wang (王建军) wrote:
> >>> On Wed, 2023-10-11 at 17:38 +0200, Alexandre Mergnat wrote:
> >>>>
> >>>>
> >>>> On 11/10/2023 14:26, Jianjun Wang wrote:
> >>>> > The size of translation table should be a power of 2, using fls() cannot
> >>>> > get the proper value when the size is not a power of 2. For example,
> >>>> > fls(0x3e00000) - 1 = 25, hence the PCIe translation window size will be
> >>>> > set to 0x2000000 instead of the expected size 0x3e00000. Fix translation
> >>>> > window by splitting the MMIO space to multiple tables if its size is not
> >>>> > a power of 2.
> >>>>
> >>>> Hi Jianjun,
> >>>>
> >>>> I've no knowledge in PCIE, so maybe what my suggestion is
> stupid:
> >>>>
> >>>> Is it mandatory to fit the translation table size with 0x3e00000
> (in 
> >>>> this example) ?
> >>>> I'm asking because you can have an issue by reaching the
> maximum 
> >>>> translation table number.
> >>>>
> >>>> Is it possible to just use only one table with the power of 2
> size
> >>>> above 0x3e00000 => 0x4000000 ( fls(0x3e00000) = 26 = 0x4000000).
> The
> >>>> downside of this method is wasting allocation space. AFAIK I
> already 
> >>>> see this kind of method for memory protection/allocation in
> embedded 
> >>>> systems,
> >>>> so I'm wondering if this method is safer than using multiple
> table for
> >>>> only one size which isn't a power of 2.
> >>>
> >>> Hi Alexandre,
> >>>
> >>> It's not mandatory to fit the translation table size with
> 0x3e00000,
> >>> and yes we can use only one table with the power of 2 size to
> prevent
> >>> this.
> >>>
> >>> For MediaTek's SoCs, the MMIO space range for each PCIe port is
> fixed,
> >>> and it will always be a power of 2, most of them will be 64MB.
> The
> >>> reason we have the size which isn't a power of 2 is that we
> reserve an
> >>> IO space for compatible purpose, some older devices may still use
> IO
> >>> space.
> >>>
> >>> Take MT8195 as an example, its MMIO size is 64MB, and the
> declaration
> >>> in the DT is like:
> >>> ranges = <0x81000000 0 0x20000000 0x0 0x20000000 0 0x200000>,
> >>>           <0x82000000 0 0x20200000 0x0 0x20200000 0 0x3e00000>;
> >>>
> >>> The MMIO space is splited to 2MB IO space and 62MB MEM space,
> that's
> >>> cause the current risk of the MEM space range, its actual
> available MEM
> >>> space is 32MB. But it still works for now because most of the
> devices
> >>> only require a very small amount of MEM space and will not reach
> ranges
> >>> higher than 32MB.
> >>>
> >>> So for the concern of reaching the maximum translation table
> number, I
> >>> think maybe we can just print the warning message instead of
> return
> >>> error code, since it still works but have some limitations(MEM
> space
> >>> not set as DT expected).
> >>>
> >>
> >> Ok understood, thanks for your explanation.
> >> Then, IMHO, you should use only one table with the power of 2
> size 
> >> above to make the code simpler, efficient, robust, more readable
> and 
> >> avoid confusion about the warning.
> >>
> >> This is what is done for pci-mvebu.c AFAII.
> >>
> >> If you prefer waiting another reviewer with a better PCIE
> expertise 
> >> than me, it's ok for me. With the information I have currently, I 
> >> prefer to not approve the current implementation because, from my
> PoV, 
> >> it introduce unnecessary complexity.
> >>
> > 
> >  From what I understand, using only one table with a size that is
> a 
> > power of two
> > won't let us use the entire MMIO space, hence the only solution to
> allow 
> > using
> > the entire range is to split to more than one table.
> 
> You can take the power of 2 above, which is directly returned by
> fls().
> That let us use the entire MMIO space.
> In this example, if your size is 0x3e00000, the you will allow
> 0x4000000.

Taking the next power of 2 above the size is a solution, but another
concern is flexibility. With this patch, we can split the MMIO space
into multiple ranges like:
ranges = <0x82000000 0 0x20000000 0x0 0x20000000 0 0x100000>,
         <0x81000000 0 0x20100000 0x0 0x20100000 0 0x300000>,
         <0x82000000 0 0x20400000 0x0 0x20400000 0 0x3c00000>;
Not sure if that can really happen, but the ranges will overlap when
each size is rounded up to the next power of 2.
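
A small userspace sketch of that overlap check, under the assumption that the three ranges are contiguous (i.e. the third one starts at 0x20400000, right where the 0x300000-sized range ends). struct range, roundup_size() and rounded_overlaps_next() are hypothetical names used only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical description of one `ranges` entry: CPU start address and size. */
struct range {
	uint64_t start;
	uint64_t size;
};

/* Smallest power of 2 that is >= size (size must be in 1..2^63). */
static uint64_t roundup_size(uint64_t size)
{
	uint64_t p = 1;

	while (p < size)
		p <<= 1;
	return p;
}

/*
 * True if a single window starting at a.start, sized to the next power
 * of 2 above a.size, would reach into b (b is assumed to start after a).
 */
static bool rounded_overlaps_next(struct range a, struct range b)
{
	return a.start + roundup_size(a.size) > b.start;
}
```

With a = { 0x20100000, 0x300000 } and b = { 0x20400000, 0x3c00000 }, the rounded window for a is 0x400000 long and ends at 0x20500000, so it reaches 0x100000 into b. The first range (size 0x100000) is already a power of 2 and does not overlap its neighbour.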

Thanks.
> 
> 
> -- 
> Regards,
> Alexandre
  
Jianjun Wang (王建军) Oct. 13, 2023, 8:37 a.m. UTC | #7
On Wed, 2023-10-11 at 17:38 +0200, Alexandre Mergnat wrote:
>  	 
>  
> 
> On 11/10/2023 14:26, Jianjun Wang wrote:
> > The size of translation table should be a power of 2, using fls()
> cannot 
> > get the proper value when the size is not a power of 2. For
> example, 
> > fls(0x3e00000) - 1 = 25, hence the PCIe translation window size
> will be 
> > set to 0x2000000 instead of the expected size 0x3e00000. Fix
> translation 
> > window by splitting the MMIO space to multiple tables if its size
> is not 
> > a power of 2.
> 
> Hi Jianjun,
> 
> I've no knowledge in PCIE, so maybe what my suggestion is stupid:
> 
> Is it mandatory to fit the translation table size with 0x3e00000 (in 
> this example) ?
> I'm asking because you can have an issue by reaching the maximum 
> translation table number.
> 
> Is it possible to just use only one table with the power of 2 size
> above 
> 0x3e00000 => 0x4000000 ( fls(0x3e00000) = 26 = 0x4000000). The
> downside 
> of this method is wasting allocation space. AFAIK I already see this 
> kind of method for memory protection/allocation in embedded systems,
> so 
> I'm wondering if this method is safer than using multiple table for
> only 
> one size which isn't a power of 2.
Hi Alexandre,

It's not mandatory to fit the translation table size to 0x3e00000,
and yes, we can use only one table with a power-of-2 size to prevent
this.

For MediaTek's SoCs, the MMIO space range for each PCIe port is fixed
and is always a power of 2; for most of them it is 64MB.

The reason we have a size which isn't a power of 2 is that we reserve
an IO space for compatibility purposes, since some older devices may
still use IO space.

Take MT8195 as an example: its MMIO size is 64MB, and the declaration
in the DT looks like:
ranges = <0x81000000 0 0x20000000 0x0 0x20000000 0 0x200000>,
         <0x82000000 0 0x20200000 0x0 0x20200000 0 0x3e00000>;

The MMIO space is split into a 2MB IO space and a 62MB MEM space, which
causes the current risk of using the MEM space, since its actual
available MEM space is 32MB. It still works for now because most
devices only require a very small amount of MEM space and will not
reach ranges higher than 32MB.

So for the concern about reaching the maximum number of translation
tables, I think maybe we can just print a warning message instead of
returning an error code, since it still works but with some limitations
(MEM space not set as the DT expects).

Thanks.
> 
> 
> -- 
> Regards,
> Alexandre
  
Alexandre Mergnat Oct. 13, 2023, 9:52 a.m. UTC | #8
On 13/10/2023 04:52, Jianjun Wang (王建军) wrote:
> On Thu, 2023-10-12 at 15:30 +0200, Alexandre Mergnat wrote:
>>   
>>  
>> 
>> On 12/10/2023 14:52, AngeloGioacchino Del Regno wrote:
>> > Il 12/10/23 12:27, Alexandre Mergnat ha scritto:
>> >>
>> >>
>> >> On 12/10/2023 08:17, Jianjun Wang (王建军) wrote:
>> >>> On Wed, 2023-10-11 at 17:38 +0200, Alexandre Mergnat wrote:
>> >>>>
>> >>>>
>> >>>> On 11/10/2023 14:26, Jianjun Wang wrote:
>> >>>> > The size of translation table should be a power of 2, using fls() cannot
>> >>>> > get the proper value when the size is not a power of 2. For example,
>> >>>> > fls(0x3e00000) - 1 = 25, hence the PCIe translation window size will be
>> >>>> > set to 0x2000000 instead of the expected size 0x3e00000. Fix translation
>> >>>> > window by splitting the MMIO space to multiple tables if its size is not
>> >>>> > a power of 2.
>> >>>>
>> >>>> Hi Jianjun,
>> >>>>
>> >>>> I've no knowledge in PCIE, so maybe what my suggestion is
>> stupid:
>> >>>>
>> >>>> Is it mandatory to fit the translation table size with 0x3e00000
>> (in 
>> >>>> this example) ?
>> >>>> I'm asking because you can have an issue by reaching the
>> maximum 
>> >>>> translation table number.
>> >>>>
>> >>>> Is it possible to just use only one table with the power of 2
>> size
>> >>>> above 0x3e00000 => 0x4000000 ( fls(0x3e00000) = 26 = 0x4000000).
>> The
>> >>>> downside of this method is wasting allocation space. AFAIK I
>> already 
>> >>>> see this kind of method for memory protection/allocation in
>> embedded 
>> >>>> systems,
>> >>>> so I'm wondering if this method is safer than using multiple
>> table for
>> >>>> only one size which isn't a power of 2.
>> >>>
>> >>> Hi Alexandre,
>> >>>
>> >>> It's not mandatory to fit the translation table size with
>> 0x3e00000,
>> >>> and yes we can use only one table with the power of 2 size to
>> prevent
>> >>> this.
>> >>>
>> >>> For MediaTek's SoCs, the MMIO space range for each PCIe port is
>> fixed,
>> >>> and it will always be a power of 2, most of them will be 64MB.
>> The
>> >>> reason we have the size which isn't a power of 2 is that we
>> reserve an
>> >>> IO space for compatible purpose, some older devices may still use
>> IO
>> >>> space.
>> >>>
>> >>> Take MT8195 as an example, its MMIO size is 64MB, and the
>> declaration
>> >>> in the DT is like:
>> >>> ranges = <0x81000000 0 0x20000000 0x0 0x20000000 0 0x200000>,
>> >>>           <0x82000000 0 0x20200000 0x0 0x20200000 0 0x3e00000>;
>> >>>
>> >>> The MMIO space is splited to 2MB IO space and 62MB MEM space,
>> that's
>> >>> cause the current risk of the MEM space range, its actual
>> available MEM
>> >>> space is 32MB. But it still works for now because most of the
>> devices
>> >>> only require a very small amount of MEM space and will not reach
>> ranges
>> >>> higher than 32MB.
>> >>>
>> >>> So for the concern of reaching the maximum translation table
>> number, I
>> >>> think maybe we can just print the warning message instead of
>> return
>> >>> error code, since it still works but have some limitations(MEM
>> space
>> >>> not set as DT expected).
>> >>>
>> >>
>> >> Ok understood, thanks for your explanation.
>> >> Then, IMHO, you should use only one table with the power of 2
>> size 
>> >> above to make the code simpler, efficient, robust, more readable
>> and 
>> >> avoid confusion about the warning.
>> >>
>> >> This is what is done for pci-mvebu.c AFAII.
>> >>
>> >> If you prefer waiting another reviewer with a better PCIE
>> expertise 
>> >> than me, it's ok for me. With the information I have currently, I 
>> >> prefer to not approve the current implementation because, from my
>> PoV, 
>> >> it introduce unnecessary complexity.
>> >>
>> > 
>> >  From what I understand, using only one table with a size that is
>> a 
>> > power of two
>> > won't let us use the entire MMIO space, hence the only solution to
>> allow 
>> > using
>> > the entire range is to split to more than one table.
>> 
>> You can take the power of 2 above, which is directly returned by
>> fls().
>> That let us use the entire MMIO space.
>> In this example, if your size is 0x3e00000, the you will allow
>> 0x4000000.
> 
> Take the power of 2 above size is a solution, but another concern will
> be the flexibility. With this patch, we can split the MMIO space to
> multiple ranges like:
> ranges = <0x82000000 0 0x20000000 0x0 0x20000000 0 0x100000>,
>           <0x81000000 0 0x20100000 0x0 0x20100000 0 0x300000>,
>           <0x82000000 0 0x20300000 0x0 0x20300000 0 0x3c00000>;
> Not sure if that can really happen, but it will have overlap ranges
> when take the power of 2 above.

Yes, you can avoid the overlap by changing the next start address to fit the
previously allocated range. If that isn't possible or introduces too much
complexity compared to your solution, then your implementation could be
the best from my PoV. :)
  

Patch

diff --git a/drivers/pci/controller/pcie-mediatek-gen3.c b/drivers/pci/controller/pcie-mediatek-gen3.c
index e0e27645fdf4..3f2496b135ae 100644
--- a/drivers/pci/controller/pcie-mediatek-gen3.c
+++ b/drivers/pci/controller/pcie-mediatek-gen3.c
@@ -245,35 +245,62 @@  static int mtk_pcie_set_trans_table(struct mtk_gen3_pcie *pcie,
 				    resource_size_t cpu_addr,
 				    resource_size_t pci_addr,
 				    resource_size_t size,
-				    unsigned long type, int num)
+				    unsigned long type, int *num)
 {
+	resource_size_t remaining = size;
+	resource_size_t table_size;
+	resource_size_t addr_align;
+	const char *range_type;
 	void __iomem *table;
 	u32 val;
 
-	if (num >= PCIE_MAX_TRANS_TABLES) {
-		dev_err(pcie->dev, "not enough translate table for addr: %#llx, limited to [%d]\n",
-			(unsigned long long)cpu_addr, PCIE_MAX_TRANS_TABLES);
-		return -ENODEV;
-	}
+	while (remaining && (*num < PCIE_MAX_TRANS_TABLES)) {
+		/* Table size needs to be a power of 2 */
+		table_size = BIT(fls(remaining) - 1);
 
-	table = pcie->base + PCIE_TRANS_TABLE_BASE_REG +
-		num * PCIE_ATR_TLB_SET_OFFSET;
+		if (cpu_addr > 0) {
+			addr_align = BIT(ffs(cpu_addr) - 1);
+			table_size = min(table_size, addr_align);
+		}
 
-	writel_relaxed(lower_32_bits(cpu_addr) | PCIE_ATR_SIZE(fls(size) - 1),
-		       table);
-	writel_relaxed(upper_32_bits(cpu_addr),
-		       table + PCIE_ATR_SRC_ADDR_MSB_OFFSET);
-	writel_relaxed(lower_32_bits(pci_addr),
-		       table + PCIE_ATR_TRSL_ADDR_LSB_OFFSET);
-	writel_relaxed(upper_32_bits(pci_addr),
-		       table + PCIE_ATR_TRSL_ADDR_MSB_OFFSET);
+		/* Minimum size of translate table is 4KiB */
+		if (table_size < 0x1000) {
+			dev_err(pcie->dev, "illegal table size %#llx\n",
+				(unsigned long long)table_size);
+			return -EINVAL;
+		}
 
-	if (type == IORESOURCE_IO)
-		val = PCIE_ATR_TYPE_IO | PCIE_ATR_TLP_TYPE_IO;
-	else
-		val = PCIE_ATR_TYPE_MEM | PCIE_ATR_TLP_TYPE_MEM;
+		table = pcie->base + PCIE_TRANS_TABLE_BASE_REG + *num * PCIE_ATR_TLB_SET_OFFSET;
+		writel_relaxed(lower_32_bits(cpu_addr) | PCIE_ATR_SIZE(fls(table_size) - 1), table);
+		writel_relaxed(upper_32_bits(cpu_addr), table + PCIE_ATR_SRC_ADDR_MSB_OFFSET);
+		writel_relaxed(lower_32_bits(pci_addr), table + PCIE_ATR_TRSL_ADDR_LSB_OFFSET);
+		writel_relaxed(upper_32_bits(pci_addr), table + PCIE_ATR_TRSL_ADDR_MSB_OFFSET);
 
-	writel_relaxed(val, table + PCIE_ATR_TRSL_PARAM_OFFSET);
+		if (type == IORESOURCE_IO) {
+			val = PCIE_ATR_TYPE_IO | PCIE_ATR_TLP_TYPE_IO;
+			range_type = "IO";
+		} else {
+			val = PCIE_ATR_TYPE_MEM | PCIE_ATR_TLP_TYPE_MEM;
+			range_type = "MEM";
+		}
+
+		writel_relaxed(val, table + PCIE_ATR_TRSL_PARAM_OFFSET);
+
+		dev_dbg(pcie->dev, "set %s trans window[%d]: cpu_addr = %#llx, pci_addr = %#llx, size = %#llx\n",
+			range_type, *num, (unsigned long long)cpu_addr,
+			(unsigned long long)pci_addr, (unsigned long long)table_size);
+
+		cpu_addr += table_size;
+		pci_addr += table_size;
+		remaining -= table_size;
+		(*num)++;
+	}
+
+	if (remaining) {
+		dev_err(pcie->dev, "not enough translate table for addr: %#llx, limited to [%d]\n",
+			(unsigned long long)cpu_addr, PCIE_MAX_TRANS_TABLES);
+		return -ENODEV;
+	}
 
 	return 0;
 }
@@ -380,30 +407,20 @@  static int mtk_pcie_startup_port(struct mtk_gen3_pcie *pcie)
 		resource_size_t cpu_addr;
 		resource_size_t pci_addr;
 		resource_size_t size;
-		const char *range_type;
 
-		if (type == IORESOURCE_IO) {
+		if (type == IORESOURCE_IO)
 			cpu_addr = pci_pio_to_address(res->start);
-			range_type = "IO";
-		} else if (type == IORESOURCE_MEM) {
+		else if (type == IORESOURCE_MEM)
 			cpu_addr = res->start;
-			range_type = "MEM";
-		} else {
+		else
 			continue;
-		}
 
 		pci_addr = res->start - entry->offset;
 		size = resource_size(res);
 		err = mtk_pcie_set_trans_table(pcie, cpu_addr, pci_addr, size,
-					       type, table_index);
+					       type, &table_index);
 		if (err)
 			return err;
-
-		dev_dbg(pcie->dev, "set %s trans window[%d]: cpu_addr = %#llx, pci_addr = %#llx, size = %#llx\n",
-			range_type, table_index, (unsigned long long)cpu_addr,
-			(unsigned long long)pci_addr, (unsigned long long)size);
-
-		table_index++;
 	}
 
 	return 0;