[v3,2/4] mm: Default implementation of arch_wants_pte_order()

Message ID 20230714161733.4144503-2-ryan.roberts@arm.com
State New
Headers
Series variable-order, large folios for anonymous memory |

Commit Message

Ryan Roberts July 14, 2023, 4:17 p.m. UTC
  arch_wants_pte_order() can be overridden by the arch to return the
preferred folio order for pte-mapped memory. This is useful as some
architectures (e.g. arm64) can coalesce TLB entries when the physical
memory is suitably contiguous.

The first user for this hint will be FLEXIBLE_THP, which aims to
allocate large folios for anonymous memory to reduce page faults and
other per-page operation costs.

Here we add the default implementation of the function, used when the
architecture does not define it, which returns -1, implying that the HW
has no preference. In this case, mm will choose it's own default order.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 include/linux/pgtable.h | 13 +++++++++++++
 1 file changed, 13 insertions(+)
  

Comments

Yu Zhao July 14, 2023, 4:54 p.m. UTC | #1
On Fri, Jul 14, 2023 at 10:17 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> arch_wants_pte_order() can be overridden by the arch to return the
> preferred folio order for pte-mapped memory. This is useful as some
> architectures (e.g. arm64) can coalesce TLB entries when the physical
> memory is suitably contiguous.
>
> The first user for this hint will be FLEXIBLE_THP, which aims to
> allocate large folios for anonymous memory to reduce page faults and
> other per-page operation costs.
>
> Here we add the default implementation of the function, used when the
> architecture does not define it, which returns -1, implying that the HW
> has no preference. In this case, mm will choose it's own default order.
>
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>

Reviewed-by: Yu Zhao <yuzhao@google.com>

Thanks: -1 actually is better than 0 (what I suggested) for the obvious reason.
  
Yin Fengwei July 17, 2023, 11:13 a.m. UTC | #2
On 7/15/23 00:17, Ryan Roberts wrote:
> arch_wants_pte_order() can be overridden by the arch to return the
> preferred folio order for pte-mapped memory. This is useful as some
> architectures (e.g. arm64) can coalesce TLB entries when the physical
> memory is suitably contiguous.
> 
> The first user for this hint will be FLEXIBLE_THP, which aims to
> allocate large folios for anonymous memory to reduce page faults and
> other per-page operation costs.
> 
> Here we add the default implementation of the function, used when the
> architecture does not define it, which returns -1, implying that the HW
> has no preference. In this case, mm will choose it's own default order.
> 
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Yin Fengwei <fengwei.yin@intel.com>


Regards
Yin, Fengwei

> ---
>  include/linux/pgtable.h | 13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 5063b482e34f..2a1d83775837 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -313,6 +313,19 @@ static inline bool arch_has_hw_pte_young(void)
>  }
>  #endif
>  
> +#ifndef arch_wants_pte_order
> +/*
> + * Returns preferred folio order for pte-mapped memory. Must be in range [0,
> + * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
> + * to be at least order-2. Negative value implies that the HW has no preference
> + * and mm will choose it's own default order.
> + */
> +static inline int arch_wants_pte_order(void)
> +{
> +	return -1;
> +}
> +#endif
> +
>  #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
>  static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>  				       unsigned long address,
  
David Hildenbrand July 17, 2023, 1:01 p.m. UTC | #3
On 14.07.23 18:17, Ryan Roberts wrote:
> arch_wants_pte_order() can be overridden by the arch to return the
> preferred folio order for pte-mapped memory. This is useful as some
> architectures (e.g. arm64) can coalesce TLB entries when the physical
> memory is suitably contiguous.
> 
> The first user for this hint will be FLEXIBLE_THP, which aims to
> allocate large folios for anonymous memory to reduce page faults and
> other per-page operation costs.
> 
> Here we add the default implementation of the function, used when the
> architecture does not define it, which returns -1, implying that the HW
> has no preference. In this case, mm will choose it's own default order.
> 
> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
> ---
>   include/linux/pgtable.h | 13 +++++++++++++
>   1 file changed, 13 insertions(+)
> 
> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> index 5063b482e34f..2a1d83775837 100644
> --- a/include/linux/pgtable.h
> +++ b/include/linux/pgtable.h
> @@ -313,6 +313,19 @@ static inline bool arch_has_hw_pte_young(void)
>   }
>   #endif
>   
> +#ifndef arch_wants_pte_order
> +/*
> + * Returns preferred folio order for pte-mapped memory. Must be in range [0,
> + * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
> + * to be at least order-2. Negative value implies that the HW has no preference
> + * and mm will choose it's own default order.
> + */
> +static inline int arch_wants_pte_order(void)
> +{
> +	return -1;
> +}
> +#endif
> +
>   #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
>   static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>   				       unsigned long address,

What is the reason to have this into a separate patch? That should 
simply be squashed into the actual user -- patch #3.
  
Ryan Roberts July 17, 2023, 1:15 p.m. UTC | #4
On 17/07/2023 14:01, David Hildenbrand wrote:
> On 14.07.23 18:17, Ryan Roberts wrote:
>> arch_wants_pte_order() can be overridden by the arch to return the
>> preferred folio order for pte-mapped memory. This is useful as some
>> architectures (e.g. arm64) can coalesce TLB entries when the physical
>> memory is suitably contiguous.
>>
>> The first user for this hint will be FLEXIBLE_THP, which aims to
>> allocate large folios for anonymous memory to reduce page faults and
>> other per-page operation costs.
>>
>> Here we add the default implementation of the function, used when the
>> architecture does not define it, which returns -1, implying that the HW
>> has no preference. In this case, mm will choose it's own default order.
>>
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>   include/linux/pgtable.h | 13 +++++++++++++
>>   1 file changed, 13 insertions(+)
>>
>> diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
>> index 5063b482e34f..2a1d83775837 100644
>> --- a/include/linux/pgtable.h
>> +++ b/include/linux/pgtable.h
>> @@ -313,6 +313,19 @@ static inline bool arch_has_hw_pte_young(void)
>>   }
>>   #endif
>>   +#ifndef arch_wants_pte_order
>> +/*
>> + * Returns preferred folio order for pte-mapped memory. Must be in range [0,
>> + * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
>> + * to be at least order-2. Negative value implies that the HW has no preference
>> + * and mm will choose it's own default order.
>> + */
>> +static inline int arch_wants_pte_order(void)
>> +{
>> +    return -1;
>> +}
>> +#endif
>> +
>>   #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
>>   static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>>                          unsigned long address,
> 
> What is the reason to have this into a separate patch? That should simply be
> squashed into the actual user -- patch #3.

There was a lot more in this at v1 IIRC, so made more sense as standalone. I
agree it can be squashed into the next patch now. Will do for next version.

>
  

Patch

diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 5063b482e34f..2a1d83775837 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -313,6 +313,19 @@  static inline bool arch_has_hw_pte_young(void)
 }
 #endif
 
+#ifndef arch_wants_pte_order
+/*
+ * Returns preferred folio order for pte-mapped memory. Must be in range [0,
+ * PMD_SHIFT-PAGE_SHIFT) and must not be order-1 since THP requires large folios
+ * to be at least order-2. Negative value implies that the HW has no preference
+ * and mm will choose it's own default order.
+ */
+static inline int arch_wants_pte_order(void)
+{
+	return -1;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_GET_AND_CLEAR
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long address,