[v5,03/25] mm: Make pte_next_pfn() a wrapper around pte_advance_pfn()

Message ID 20240202080756.1453939-4-ryan.roberts@arm.com
State New
Headers
Series Transparent Contiguous PTEs for User Mappings |

Commit Message

Ryan Roberts Feb. 2, 2024, 8:07 a.m. UTC
  The goal is to be able to advance a PTE by an arbitrary number of PFNs.
So introduce a new API that takes a nr param.

We are going to remove pte_next_pfn() and replace it with
pte_advance_pfn(). As a first step, implement pte_next_pfn() as a
wrapper around pte_advance_pfn() so that we can incrementally switch the
architectures over. Once all arches are moved over, we will change all
the core-mm callers to call pte_advance_pfn() directly and remove the
wrapper.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 include/linux/pgtable.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
  

Comments

David Hildenbrand Feb. 12, 2024, 12:14 p.m. UTC | #1
On 02.02.24 09:07, Ryan Roberts wrote:
> The goal is to be able to advance a PTE by an arbitrary number of PFNs.
> So introduce a new API that takes a nr param.
> 
> We are going to remove pte_next_pfn() and replace it with
> pte_advance_pfn(). As a first step, implement pte_next_pfn() as a
> wrapper around pte_advance_pfn() so that we can incrementally switch the
> architectures over. Once all arches are moved over, we will change all
> the core-mm callers to call pte_advance_pfn() directly and remove the
> wrapper.
> 
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>   include/linux/pgtable.h | 8 +++++++-
>   1 file changed, 7 insertions(+), 1 deletion(-)
> 
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 5e7eaf8f2b97..815d92dcb96b 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -214,9 +214,15 @@ static inline int pmd_dirty(pmd_t pmd)
>   
>   
>   #ifndef pte_next_pfn
> +#ifndef pte_advance_pfn
> +static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
> +{
> +	return __pte(pte_val(pte) + (nr << PFN_PTE_SHIFT));
> +}
> +#endif
>   static inline pte_t pte_next_pfn(pte_t pte)
>   {
> -	return __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
> +	return pte_advance_pfn(pte, 1);
>   }
>   #endif
>   

I do wonder if we simply want to leave pte_next_pfn() around? Especially 
patch #4, #6 don't really benefit from the change? So are the other 
set_ptes() implementations.

That is, only convert all pte_next_pfn()->pte_advance_pfn(), and leave a
pte_next_pfn() macro in place.

Any downsides to that? This patch here would become:

#ifndef pte_advance_pfn
static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
{
	return __pte(pte_val(pte) + (nr << PFN_PTE_SHIFT));
}
#endif

#ifndef pte_next_pfn
#define pte_next_pfn(pte) pte_advance_pfn(pte, 1)
#endif

As you convert the three arches, make them define pte_advance_pfn and 
udnefine pte_next_pfn. in the end, you can drop the #ifdef around 
pte_next_pfn here.
  
Ryan Roberts Feb. 12, 2024, 2:10 p.m. UTC | #2
On 12/02/2024 12:14, David Hildenbrand wrote:
> On 02.02.24 09:07, Ryan Roberts wrote:
>> The goal is to be able to advance a PTE by an arbitrary number of PFNs.
>> So introduce a new API that takes a nr param.
>>
>> We are going to remove pte_next_pfn() and replace it with
>> pte_advance_pfn(). As a first step, implement pte_next_pfn() as a
>> wrapper around pte_advance_pfn() so that we can incrementally switch the
>> architectures over. Once all arches are moved over, we will change all
>> the core-mm callers to call pte_advance_pfn() directly and remove the
>> wrapper.
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>   include/linux/pgtable.h | 8 +++++++-
>>   1 file changed, 7 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index 5e7eaf8f2b97..815d92dcb96b 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -214,9 +214,15 @@ static inline int pmd_dirty(pmd_t pmd)
>>       #ifndef pte_next_pfn
>> +#ifndef pte_advance_pfn
>> +static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
>> +{
>> +    return __pte(pte_val(pte) + (nr << PFN_PTE_SHIFT));
>> +}
>> +#endif
>>   static inline pte_t pte_next_pfn(pte_t pte)
>>   {
>> -    return __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
>> +    return pte_advance_pfn(pte, 1);
>>   }
>>   #endif
>>   
> 
> I do wonder if we simply want to leave pte_next_pfn() around? Especially patch
> #4, #6 don't really benefit from the change? So are the other set_ptes()
> implementations.
> 
> That is, only convert all pte_next_pfn()->pte_advance_pfn(), and leave a
> pte_next_pfn() macro in place.
> 
> Any downsides to that? 

The downside is just having multiple functions that effectively do the same
thing. Personally I think its cleaner and easier to understand the code with
just one generic function which we pass 1 to it where we only want to advance by
1. In the end, there are only a couple of places where pte_advance_pfn(1) is
used, so doesn't really seem valuable to me to maintain a specialization.

Unless you feel strongly that we need to keep pte_next_pfn() then I'd prefer to
leave it as I've done in this series.

> This patch here would become:
> 
> #ifndef pte_advance_pfn
> static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
> {
>     return __pte(pte_val(pte) + (nr << PFN_PTE_SHIFT));
> }
> #endif
> 
> #ifndef pte_next_pfn
> #define pte_next_pfn(pte) pte_advance_pfn(pte, 1)
> #endif
> 
> As you convert the three arches, make them define pte_advance_pfn and udnefine
> pte_next_pfn. in the end, you can drop the #ifdef around pte_next_pfn here.
>
  
David Hildenbrand Feb. 12, 2024, 2:29 p.m. UTC | #3
On 12.02.24 15:10, Ryan Roberts wrote:
> On 12/02/2024 12:14, David Hildenbrand wrote:
>> On 02.02.24 09:07, Ryan Roberts wrote:
>>> The goal is to be able to advance a PTE by an arbitrary number of PFNs.
>>> So introduce a new API that takes a nr param.
>>>
>>> We are going to remove pte_next_pfn() and replace it with
>>> pte_advance_pfn(). As a first step, implement pte_next_pfn() as a
>>> wrapper around pte_advance_pfn() so that we can incrementally switch the
>>> architectures over. Once all arches are moved over, we will change all
>>> the core-mm callers to call pte_advance_pfn() directly and remove the
>>> wrapper.
>>>
>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>> ---
>>>    include/linux/pgtable.h | 8 +++++++-
>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>>> index 5e7eaf8f2b97..815d92dcb96b 100644
>>> --- a/include/linux/pgtable.h
>>> +++ b/include/linux/pgtable.h
>>> @@ -214,9 +214,15 @@ static inline int pmd_dirty(pmd_t pmd)
>>>        #ifndef pte_next_pfn
>>> +#ifndef pte_advance_pfn
>>> +static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
>>> +{
>>> +    return __pte(pte_val(pte) + (nr << PFN_PTE_SHIFT));
>>> +}
>>> +#endif
>>>    static inline pte_t pte_next_pfn(pte_t pte)
>>>    {
>>> -    return __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
>>> +    return pte_advance_pfn(pte, 1);
>>>    }
>>>    #endif
>>>    
>>
>> I do wonder if we simply want to leave pte_next_pfn() around? Especially patch
>> #4, #6 don't really benefit from the change? So are the other set_ptes()
>> implementations.
>>
>> That is, only convert all pte_next_pfn()->pte_advance_pfn(), and leave a
>> pte_next_pfn() macro in place.
>>
>> Any downsides to that?
> 
> The downside is just having multiple functions that effectively do the same
> thing. Personally I think its cleaner and easier to understand the code with
> just one generic function which we pass 1 to it where we only want to advance by
> 1. In the end, there are only a couple of places where pte_advance_pfn(1) is
> used, so doesn't really seem valuable to me to maintain a specialization.

Well, not really functions, just a macro. Like we have set_pte_at() 
translating to set_ptes().

Arguably, we have more callers of set_pte_at().

"Easier to understand", I don't know. :)

> 
> Unless you feel strongly that we need to keep pte_next_pfn() then I'd prefer to
> leave it as I've done in this series.

Well, it makes you patch set shorter and there is less code churn.

So personally, I'd just leave pte_next_pfn() in there. But whatever you 
prefer, not the end of the world.
  
Ryan Roberts Feb. 12, 2024, 9:34 p.m. UTC | #4
On 12/02/2024 14:29, David Hildenbrand wrote:
> On 12.02.24 15:10, Ryan Roberts wrote:
>> On 12/02/2024 12:14, David Hildenbrand wrote:
>>> On 02.02.24 09:07, Ryan Roberts wrote:
>>>> The goal is to be able to advance a PTE by an arbitrary number of PFNs.
>>>> So introduce a new API that takes a nr param.
>>>>
>>>> We are going to remove pte_next_pfn() and replace it with
>>>> pte_advance_pfn(). As a first step, implement pte_next_pfn() as a
>>>> wrapper around pte_advance_pfn() so that we can incrementally switch the
>>>> architectures over. Once all arches are moved over, we will change all
>>>> the core-mm callers to call pte_advance_pfn() directly and remove the
>>>> wrapper.
>>>>
>>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>>> ---
>>>>    include/linux/pgtable.h | 8 +++++++-
>>>>    1 file changed, 7 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>>>> index 5e7eaf8f2b97..815d92dcb96b 100644
>>>> --- a/include/linux/pgtable.h
>>>> +++ b/include/linux/pgtable.h
>>>> @@ -214,9 +214,15 @@ static inline int pmd_dirty(pmd_t pmd)
>>>>        #ifndef pte_next_pfn
>>>> +#ifndef pte_advance_pfn
>>>> +static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
>>>> +{
>>>> +    return __pte(pte_val(pte) + (nr << PFN_PTE_SHIFT));
>>>> +}
>>>> +#endif
>>>>    static inline pte_t pte_next_pfn(pte_t pte)
>>>>    {
>>>> -    return __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
>>>> +    return pte_advance_pfn(pte, 1);
>>>>    }
>>>>    #endif
>>>>    
>>>
>>> I do wonder if we simply want to leave pte_next_pfn() around? Especially patch
>>> #4, #6 don't really benefit from the change? So are the other set_ptes()
>>> implementations.
>>>
>>> That is, only convert all pte_next_pfn()->pte_advance_pfn(), and leave a
>>> pte_next_pfn() macro in place.
>>>
>>> Any downsides to that?
>>
>> The downside is just having multiple functions that effectively do the same
>> thing. Personally I think its cleaner and easier to understand the code with
>> just one generic function which we pass 1 to it where we only want to advance by
>> 1. In the end, there are only a couple of places where pte_advance_pfn(1) is
>> used, so doesn't really seem valuable to me to maintain a specialization.
> 
> Well, not really functions, just a macro. Like we have set_pte_at() translating
> to set_ptes().
> 
> Arguably, we have more callers of set_pte_at().
> 
> "Easier to understand", I don't know. :)
> 
>>
>> Unless you feel strongly that we need to keep pte_next_pfn() then I'd prefer to
>> leave it as I've done in this series.
> 
> Well, it makes you patch set shorter and there is less code churn.
> 
> So personally, I'd just leave pte_next_pfn() in there. But whatever you prefer,
> not the end of the world.

I thought about this a bit more and remembered that I'm the apprentice so I've
changed it as you suggested.
  
David Hildenbrand Feb. 13, 2024, 9:54 a.m. UTC | #5
On 12.02.24 22:34, Ryan Roberts wrote:
> On 12/02/2024 14:29, David Hildenbrand wrote:
>> On 12.02.24 15:10, Ryan Roberts wrote:
>>> On 12/02/2024 12:14, David Hildenbrand wrote:
>>>> On 02.02.24 09:07, Ryan Roberts wrote:
>>>>> The goal is to be able to advance a PTE by an arbitrary number of PFNs.
>>>>> So introduce a new API that takes a nr param.
>>>>>
>>>>> We are going to remove pte_next_pfn() and replace it with
>>>>> pte_advance_pfn(). As a first step, implement pte_next_pfn() as a
>>>>> wrapper around pte_advance_pfn() so that we can incrementally switch the
>>>>> architectures over. Once all arches are moved over, we will change all
>>>>> the core-mm callers to call pte_advance_pfn() directly and remove the
>>>>> wrapper.
>>>>>
>>>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>>>> ---
>>>>>     include/linux/pgtable.h | 8 +++++++-
>>>>>     1 file changed, 7 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>>>>> index 5e7eaf8f2b97..815d92dcb96b 100644
>>>>> --- a/include/linux/pgtable.h
>>>>> +++ b/include/linux/pgtable.h
>>>>> @@ -214,9 +214,15 @@ static inline int pmd_dirty(pmd_t pmd)
>>>>>         #ifndef pte_next_pfn
>>>>> +#ifndef pte_advance_pfn
>>>>> +static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
>>>>> +{
>>>>> +    return __pte(pte_val(pte) + (nr << PFN_PTE_SHIFT));
>>>>> +}
>>>>> +#endif
>>>>>     static inline pte_t pte_next_pfn(pte_t pte)
>>>>>     {
>>>>> -    return __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
>>>>> +    return pte_advance_pfn(pte, 1);
>>>>>     }
>>>>>     #endif
>>>>>     
>>>>
>>>> I do wonder if we simply want to leave pte_next_pfn() around? Especially patch
>>>> #4, #6 don't really benefit from the change? So are the other set_ptes()
>>>> implementations.
>>>>
>>>> That is, only convert all pte_next_pfn()->pte_advance_pfn(), and leave a
>>>> pte_next_pfn() macro in place.
>>>>
>>>> Any downsides to that?
>>>
>>> The downside is just having multiple functions that effectively do the same
>>> thing. Personally I think its cleaner and easier to understand the code with
>>> just one generic function which we pass 1 to it where we only want to advance by
>>> 1. In the end, there are only a couple of places where pte_advance_pfn(1) is
>>> used, so doesn't really seem valuable to me to maintain a specialization.
>>
>> Well, not really functions, just a macro. Like we have set_pte_at() translating
>> to set_ptes().
>>
>> Arguably, we have more callers of set_pte_at().
>>
>> "Easier to understand", I don't know. :)
>>
>>>
>>> Unless you feel strongly that we need to keep pte_next_pfn() then I'd prefer to
>>> leave it as I've done in this series.
>>
>> Well, it makes you patch set shorter and there is less code churn.
>>
>> So personally, I'd just leave pte_next_pfn() in there. But whatever you prefer,
>> not the end of the world.
> 
> I thought about this a bit more and remembered that I'm the apprentice so I've
> changed it as you suggested.

Oh, I say stupid things all the time. Please push back if you disagree. :)

[shrinking a patch set if possible and reasonable is often a good idea]
  

Patch

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 5e7eaf8f2b97..815d92dcb96b 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -214,9 +214,15 @@  static inline int pmd_dirty(pmd_t pmd)
 
 
 #ifndef pte_next_pfn
+#ifndef pte_advance_pfn
+static inline pte_t pte_advance_pfn(pte_t pte, unsigned long nr)
+{
+	return __pte(pte_val(pte) + (nr << PFN_PTE_SHIFT));
+}
+#endif
 static inline pte_t pte_next_pfn(pte_t pte)
 {
-	return __pte(pte_val(pte) + (1UL << PFN_PTE_SHIFT));
+	return pte_advance_pfn(pte, 1);
 }
 #endif