[v2,2/2] clk: divider: Fix divisions

Message ID 20230526171057.66876-3-sebastian.reichel@collabora.com
State New
Series Fix 64 bit issues in common clock framework

Commit Message

Sebastian Reichel May 26, 2023, 5:10 p.m. UTC
The clock framework handles clock rates as "unsigned long", so u32 on
32-bit architectures and u64 on 64-bit architectures.

The current code pointlessly casts the dividend to u64 on 32-bit
architectures, which needlessly reduces performance.

On 64-bit architectures, on the other hand, the divisor is masked and only
the lower 32 bits are used. Thus requesting a frequency >= 4.3 GHz results
in incorrect values. For example requesting 4300000000 (4.3 GHz) will
effectively request ca. 5 MHz. Requesting clk_round_rate(clk, ULONG_MAX)
is a bit of a special case, since that still returns correct values as
long as the parent clock is below 8.5 GHz.

Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
---
 drivers/clk/clk-divider.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
  

Comments

AngeloGioacchino Del Regno May 29, 2023, 8:50 a.m. UTC | #1
On 26/05/23 19:10, Sebastian Reichel wrote:
> The clock framework handles clock rates as "unsigned long", so u32 on
> 32-bit architectures and u64 on 64-bit architectures.
> 
> The current code pointlessly casts the dividend to u64 on 32-bit
> architectures and thus pointlessly reducing the performance.
> 
> On the other hand on 64-bit architectures the divisor is masked and only
> the lower 32-bit are used. Thus requesting a frequency >= 4.3GHz results
> in incorrect values. For example requesting 4300000000 (4.3 GHz) will
> effectively request ca. 5 MHz. Requesting clk_round_rate(clk, ULONG_MAX)
> is a bit of a special case, since that still returns correct values as
> long as the parent clock is below 8.5 GHz.
> 
> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>

Reviewed-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
  
Stephen Boyd June 13, 2023, 12:41 a.m. UTC | #2
Quoting Sebastian Reichel (2023-05-26 10:10:57)
> The clock framework handles clock rates as "unsigned long", so u32 on
> 32-bit architectures and u64 on 64-bit architectures.
> 
> The current code pointlessly casts the dividend to u64 on 32-bit
> architectures and thus pointlessly reducing the performance.

It looks like that was done to make the DIV_ROUND_UP() macro not
overflow the dividend on 32-bit machines (from 9556f9dad8f5):

  DIV_ROUND_UP(3000000000, 1500000000) = (3.0G + 1.5G - 1) / 1.5G
                                       = OVERFLOW / 1.5G

but I agree, the u64 cast is not necessary if DIV_ROUND_UP_ULL() is
used as that macro casts the dividend to unsigned long long anyway.
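
To make the overflow concrete, here is a minimal userspace sketch. It is
not kernel code: uint32_t stands in for a 32-bit "unsigned long", and both
helper names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: round-up division as a 32-bit "unsigned long" does it. */
static uint32_t div_round_up_u32(uint32_t n, uint32_t d)
{
	/* n + d - 1 is reduced modulo 2^32 here, as on a 32-bit machine. */
	uint32_t sum = n + d - 1u;

	return sum / d;
}

/* Hypothetical helper: same formula with the dividend widened to 64 bits. */
static uint64_t div_round_up_wide(uint32_t n, uint32_t d)
{
	uint64_t sum = (uint64_t)n + d - 1u;

	return sum / d;
}
```

With n = 3000000000 and d = 1500000000, div_round_up_u32() returns 0
because the sum wraps around, while div_round_up_wide() returns the
expected 2.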

> 
> On the other hand on 64-bit architectures the divisor is masked and only
> the lower 32-bit are used. Thus requesting a frequency >= 4.3GHz results
> in incorrect values. For example requesting 4300000000 (4.3 GHz) will
> effectively request ca. 5 MHz.

Nice catch. But I'm concerned that the case above is broken by changing
to DIV_ROUND_UP(). As this code is generic, I fear we'll have to change
this code that divides rates to use DIV64_U64_ROUND_UP() because we
don't know how large the rate is (i.e. it could be larger than 32-bits
on a 64-bit machine).
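
The divisor truncation can be illustrated with a small userspace sketch.
This is a simplified model of the behavior described in the commit
message (only the low 32 bits of the divisor take part in the division),
not the kernel macros themselves. All function names are hypothetical,
and the full-width variant is just one possible way to round up without
overflowing the sum.

```c
#include <stdint.h>

/* What is left of a 64-bit rate once it is squeezed into 32 bits. */
static uint32_t low32(uint64_t rate)
{
	return (uint32_t)rate;
}

/*
 * Simplified model: round-up division where the divisor loses its upper
 * 32 bits. Assumes the truncated divisor is non-zero.
 */
static uint64_t div_round_up_low32(uint64_t n, uint64_t d)
{
	uint32_t d32 = low32(d);

	return (n + d32 - 1) / d32;
}

/* Full 64-by-64 round-up division that never computes n + d - 1. */
static uint64_t div_round_up_full(uint64_t n, uint64_t d)
{
	return n / d + (n % d != 0);
}
```

For a requested rate of 4300000000, low32() yields 5032704, which matches
the "ca. 5 MHz" from the commit message. With a hypothetical parent rate
of 8600000000, div_round_up_full() returns 2, whereas the truncating
model returns a divider far above 2.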

> Requesting clk_round_rate(clk, ULONG_MAX)
> is a bit of a special case, since that still returns correct values as
> long as the parent clock is below 8.5 GHz.
> 
> Signed-off-by: Sebastian Reichel <sebastian.reichel@collabora.com>
> ---
>  drivers/clk/clk-divider.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
> index a2c2b5203b0a..c38e8aa60e54 100644
> --- a/drivers/clk/clk-divider.c
> +++ b/drivers/clk/clk-divider.c
> @@ -220,7 +220,7 @@ static int _div_round_up(const struct clk_div_table *table,
>                          unsigned long parent_rate, unsigned long rate,
>                          unsigned long flags)
>  {
> -       int div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
> +       int div = DIV_ROUND_UP(parent_rate, rate);
>  
>         if (flags & CLK_DIVIDER_POWER_OF_TWO)
>                 div = __roundup_pow_of_two(div);
> @@ -237,7 +237,7 @@ static int _div_round_closest(const struct clk_div_table *table,
>         int up, down;
>         unsigned long up_rate, down_rate;
>  
> -       up = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
> +       up = DIV_ROUND_UP(parent_rate, rate);
>         down = parent_rate / rate;
>  
>         if (flags & CLK_DIVIDER_POWER_OF_TWO) {
> @@ -473,7 +473,7 @@ int divider_get_val(unsigned long rate, unsigned long parent_rate,
>  {
>         unsigned int div, value;
>  
> -       div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
> +       div = DIV_ROUND_UP(parent_rate, rate);
>  
>         if (!_is_valid_div(table, div, flags))
>                 return -EINVAL;

This is undoing parts of commit 9556f9dad8f5 ("clk: divider: handle
integer overflow when dividing large clock rates"). Please pair this
patch with extensive kunit tests in a new test suite clk-divider_test.c
file. I don't know if UML supports changing sizeof(long), but that would
be a cool feature to tease out these sorts of issues. I suppose we'll
just have to run the kunit tests on various architectures to cover the
possibilities.
  
David Laight June 13, 2023, 8:05 a.m. UTC | #3
From: Stephen Boyd
> Sent: 13 June 2023 01:42
> 
> Quoting Sebastian Reichel (2023-05-26 10:10:57)
> > The clock framework handles clock rates as "unsigned long", so u32 on
> > 32-bit architectures and u64 on 64-bit architectures.
> >
> > The current code pointlessly casts the dividend to u64 on 32-bit
> > architectures and thus pointlessly reducing the performance.
> 
> It looks like that was done to make the DIV_ROUND_UP() macro not
> overflow the dividend on 32-bit machines (from 9556f9dad8f5):
> 
>   DIV_ROUND_UP(3000000000, 1500000000) = (3.0G + 1.5G - 1) / 1.5G
>                                        = OVERFLOW / 1.5G

Maybe add:
#define DIV_ROUND_UP_NZ(x, y) (((x) - 1)/(y) + 1)
which doesn't overflow but requires x != 0.
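
A minimal userspace sketch of that macro, assuming uint32_t as a stand-in
for a 32-bit "unsigned long" (the wrapper function name is hypothetical):

```c
#include <stdint.h>

/* The macro proposed above: no n + d - 1 sum, so no overflow for x != 0. */
#define DIV_ROUND_UP_NZ(x, y) (((x) - 1) / (y) + 1)

/* Hypothetical wrapper that pins the arithmetic to 32 bits. */
static uint32_t div_round_up_nz_u32(uint32_t x, uint32_t y)
{
	/* Caller must guarantee x != 0; x - 1 would wrap to 0xffffffff. */
	return DIV_ROUND_UP_NZ(x, y);
}
```

div_round_up_nz_u32(3000000000, 1500000000) gives 2 in pure 32-bit
arithmetic. For x == 0 the result is wrong (3 instead of 0 with the same
divisor), so a caller would have to rule out a zero dividend first.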

	David

  

Patch

diff --git a/drivers/clk/clk-divider.c b/drivers/clk/clk-divider.c
index a2c2b5203b0a..c38e8aa60e54 100644
--- a/drivers/clk/clk-divider.c
+++ b/drivers/clk/clk-divider.c
@@ -220,7 +220,7 @@  static int _div_round_up(const struct clk_div_table *table,
 			 unsigned long parent_rate, unsigned long rate,
 			 unsigned long flags)
 {
-	int div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
+	int div = DIV_ROUND_UP(parent_rate, rate);
 
 	if (flags & CLK_DIVIDER_POWER_OF_TWO)
 		div = __roundup_pow_of_two(div);
@@ -237,7 +237,7 @@  static int _div_round_closest(const struct clk_div_table *table,
 	int up, down;
 	unsigned long up_rate, down_rate;
 
-	up = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
+	up = DIV_ROUND_UP(parent_rate, rate);
 	down = parent_rate / rate;
 
 	if (flags & CLK_DIVIDER_POWER_OF_TWO) {
@@ -473,7 +473,7 @@  int divider_get_val(unsigned long rate, unsigned long parent_rate,
 {
 	unsigned int div, value;
 
-	div = DIV_ROUND_UP_ULL((u64)parent_rate, rate);
+	div = DIV_ROUND_UP(parent_rate, rate);
 
 	if (!_is_valid_div(table, div, flags))
 		return -EINVAL;