soc: qcom: stats: Fix division issue on 32-bit platforms

Message ID 20231205-qcom_stats-aeabi_uldivmod-fix-v1-1-f94ecec5e894@quicinc.com
State New
Series soc: qcom: stats: Fix division issue on 32-bit platforms

Commit Message

Bjorn Andersson Dec. 6, 2023, 12:44 a.m. UTC
  commit 'e84e61bdb97c ("soc: qcom: stats: Add DDR sleep stats")' made it
in with a mult_frac() which causes link errors on Arm and PowerPC
builds:

  ERROR: modpost: "__aeabi_uldivmod" [drivers/soc/qcom/qcom_stats.ko] undefined!

Expand the mult_frac() to avoid this problem.

Fixes: e84e61bdb97c ("soc: qcom: stats: Add DDR sleep stats")
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
---
 drivers/soc/qcom/qcom_stats.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)


---
base-commit: adcad44bd1c73a5264bff525e334e2f6fc01bb9b
change-id: 20231205-qcom_stats-aeabi_uldivmod-fix-4a63c7ec013f

Best regards,
  

Comments

Randy Dunlap Dec. 6, 2023, 1:52 a.m. UTC | #1
On 12/5/23 16:44, Bjorn Andersson wrote:
> commit 'e84e61bdb97c ("soc: qcom: stats: Add DDR sleep stats")' made it
> in with a mult_frac() which causes link errors on Arm and PowerPC
> builds:
> 
>   ERROR: modpost: "__aeabi_uldivmod" [drivers/soc/qcom/qcom_stats.ko] undefined!
> 
> Expand the mult_frac() to avoid this problem.
> 
> Fixes: e84e61bdb97c ("soc: qcom: stats: Add DDR sleep stats")
> Reported-by: Randy Dunlap <rdunlap@infradead.org>
> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>

That works. Thanks.

Acked-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org> # build-tested



> ---
>  drivers/soc/qcom/qcom_stats.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/soc/qcom/qcom_stats.c b/drivers/soc/qcom/qcom_stats.c
> index 4763d62a8cb0..5ba61232313e 100644
> --- a/drivers/soc/qcom/qcom_stats.c
> +++ b/drivers/soc/qcom/qcom_stats.c
> @@ -221,7 +221,8 @@ static int qcom_ddr_stats_show(struct seq_file *s, void *unused)
>  
>  	for (i = 0; i < ddr.entry_count; i++) {
>  		/* Convert the period to ms */
> -		entry[i].dur = mult_frac(MSEC_PER_SEC, entry[i].dur, ARCH_TIMER_FREQ);
> +		entry[i].dur *= MSEC_PER_SEC;
> +		entry[i].dur = div_u64(entry[i].dur, ARCH_TIMER_FREQ);
>  	}
>  
>  	for (i = 0; i < ddr.entry_count; i++)
> 
> ---
> base-commit: adcad44bd1c73a5264bff525e334e2f6fc01bb9b
> change-id: 20231205-qcom_stats-aeabi_uldivmod-fix-4a63c7ec013f
> 
> Best regards,
  
Konrad Dybcio Dec. 6, 2023, 12:21 p.m. UTC | #2
On 12/6/23 01:44, Bjorn Andersson wrote:
> commit 'e84e61bdb97c ("soc: qcom: stats: Add DDR sleep stats")' made it
> in with a mult_frac() which causes link errors on Arm and PowerPC
> builds:
> 
>    ERROR: modpost: "__aeabi_uldivmod" [drivers/soc/qcom/qcom_stats.ko] undefined!
> 
> Expand the mult_frac() to avoid this problem.
> 
> Fixes: e84e61bdb97c ("soc: qcom: stats: Add DDR sleep stats")
> Reported-by: Randy Dunlap <rdunlap@infradead.org>
> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
> ---
Thanks, I keep believing mult_frac is generic enough to work
on something other than arm64..

Reviewed-by: Konrad Dybcio <konrad.dybcio@linaro.org>

Konrad
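
For readers wondering why mult_frac() is not 32-bit safe here: assuming the macro expands to plain C `/` and `%` on the types of its arguments (as in include/linux/math.h), the original call ends up dividing a 64-bit product with the ordinary `/` operator. On 32-bit Arm the compiler lowers that to a call to the libgcc helper __aeabi_uldivmod, which the kernel does not provide, hence the modpost error. The following is a userspace sketch of that expansion, not the kernel source: `mult_frac_model`, `div_u64_model` and `ticks_to_ms` are hypothetical names, and the value of ARCH_TIMER_FREQ (19.2 MHz) is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MSEC_PER_SEC    1000L      /* same value as the kernel constant */
#define ARCH_TIMER_FREQ 19200000   /* assumed: 19.2 MHz timer frequency */

/*
 * Hypothetical model of mult_frac(x, n, d) with the argument types of the
 * original call: x is a long constant, n is the 64-bit duration.
 */
static uint64_t mult_frac_model(long x, uint64_t n, long d)
{
	assert(d > 0);

	long q = x / d;	/* native-width division, no helper needed */
	long r = x % d;

	/*
	 * r * n is promoted to a 64-bit unsigned value, so this '/' is a
	 * 64-bit division. A 32-bit Arm compiler emits a call to
	 * __aeabi_uldivmod for it.
	 */
	return q * n + r * n / d;
}

/* Userspace stand-in for the kernel's div_u64(u64 dividend, u32 divisor). */
static uint64_t div_u64_model(uint64_t dividend, uint32_t divisor)
{
	return dividend / divisor;
}

/* The shape of the fix in this patch: multiply first, then div_u64(). */
static uint64_t ticks_to_ms(uint64_t ticks)
{
	return div_u64_model(ticks * MSEC_PER_SEC, ARCH_TIMER_FREQ);
}
```

In userspace both functions link and agree on the result; the difference only shows up in a kernel build for a 32-bit target, where div_u64() routes the division through the kernel's own helper instead of libgcc.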
  
David Laight Dec. 6, 2023, 2:07 p.m. UTC | #3
From: Bjorn Andersson
> Sent: 06 December 2023 00:44
> 
> commit 'e84e61bdb97c ("soc: qcom: stats: Add DDR sleep stats")' made it
> in with a mult_frac() which causes link errors on Arm and PowerPC
> builds:
> 
>   ERROR: modpost: "__aeabi_uldivmod" [drivers/soc/qcom/qcom_stats.ko] undefined!
> 
> Expand the mult_frac() to avoid this problem.
> 
> Fixes: e84e61bdb97c ("soc: qcom: stats: Add DDR sleep stats")
> Reported-by: Randy Dunlap <rdunlap@infradead.org>
> Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
> ---
>  drivers/soc/qcom/qcom_stats.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/soc/qcom/qcom_stats.c b/drivers/soc/qcom/qcom_stats.c
> index 4763d62a8cb0..5ba61232313e 100644
> --- a/drivers/soc/qcom/qcom_stats.c
> +++ b/drivers/soc/qcom/qcom_stats.c
> @@ -221,7 +221,8 @@ static int qcom_ddr_stats_show(struct seq_file *s, void *unused)
> 
>  	for (i = 0; i < ddr.entry_count; i++) {
>  		/* Convert the period to ms */
> -		entry[i].dur = mult_frac(MSEC_PER_SEC, entry[i].dur, ARCH_TIMER_FREQ);
> +		entry[i].dur *= MSEC_PER_SEC;
> +		entry[i].dur = div_u64(entry[i].dur, ARCH_TIMER_FREQ);

Is that right?
At a guess mult_frac(a, b, c) is doing a 32x32 multiply and then a 64x32
divide to generate a 32bit result.
So I'd guess entry[i].dur is 32bit? (this code isn't in -rc4 ...).
Which means you are now discarding the high bits.

You've also added a very slow 64bit divide.
A multiply-by-reciprocal calculation will be much better.
Since absolute accuracy almost certainly doesn't matter here convert:
	dur * 1000 / FREQ
to
	(dur * (u32)((1000ull << 32) / FREQ)) >> 32
which will be fine provided FREQ >= 1000

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
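
The multiply-by-reciprocal idea above can be sketched in userspace as follows. This is a minimal illustration under stated assumptions, not code from the patch: `ticks_to_ms_recip` is a hypothetical name, `dur` is taken to be 32-bit as David guessed, and the shift is parenthesized so that it happens before the division. The scaled reciprocal only fits in 32 bits when FREQ is strictly greater than 1000, since at FREQ == 1000 it would be exactly 2^32.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Convert timer ticks to milliseconds without a 64-bit division at run
 * time. With a constant freq the reciprocal folds to a compile-time
 * constant, leaving one 32x32->64 multiply and a shift.
 */
static uint32_t ticks_to_ms_recip(uint32_t dur, uint32_t freq)
{
	/* The scaled reciprocal must fit in 32 bits. */
	assert(freq > 1000);

	/* floor(1000 * 2^32 / freq) */
	uint32_t recip = (uint32_t)((1000ULL << 32) / freq);

	/* (dur * recip) / 2^32, computed in 64 bits so it cannot overflow */
	return (uint32_t)(((uint64_t)dur * recip) >> 32);
}
```

Because the reciprocal is rounded down, the result can be one less than the exact quotient `dur * 1000 / freq`, which is the loss of absolute accuracy mentioned above. With a 64-bit `dur` the product would no longer fit in 64 bits, so the sketch does not carry over unchanged to that case.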
  
Bjorn Andersson Dec. 9, 2023, 3:23 a.m. UTC | #4
On Wed, Dec 06, 2023 at 02:07:16PM +0000, David Laight wrote:
> From: Bjorn Andersson
> > Sent: 06 December 2023 00:44
> > 
> > commit 'e84e61bdb97c ("soc: qcom: stats: Add DDR sleep stats")' made it
> > in with a mult_frac() which causes link errors on Arm and PowerPC
> > builds:
> > 
> >   ERROR: modpost: "__aeabi_uldivmod" [drivers/soc/qcom/qcom_stats.ko] undefined!
> > 
> > Expand the mult_frac() to avoid this problem.
> > 
> > Fixes: e84e61bdb97c ("soc: qcom: stats: Add DDR sleep stats")
> > Reported-by: Randy Dunlap <rdunlap@infradead.org>
> > Signed-off-by: Bjorn Andersson <quic_bjorande@quicinc.com>
> > ---
> >  drivers/soc/qcom/qcom_stats.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> > 
> > diff --git a/drivers/soc/qcom/qcom_stats.c b/drivers/soc/qcom/qcom_stats.c
> > index 4763d62a8cb0..5ba61232313e 100644
> > --- a/drivers/soc/qcom/qcom_stats.c
> > +++ b/drivers/soc/qcom/qcom_stats.c
> > @@ -221,7 +221,8 @@ static int qcom_ddr_stats_show(struct seq_file *s, void *unused)
> > 
> >  	for (i = 0; i < ddr.entry_count; i++) {
> >  		/* Convert the period to ms */
> > -		entry[i].dur = mult_frac(MSEC_PER_SEC, entry[i].dur, ARCH_TIMER_FREQ);
> > +		entry[i].dur *= MSEC_PER_SEC;
> > +		entry[i].dur = div_u64(entry[i].dur, ARCH_TIMER_FREQ);
> 
> Is that right?
> At a guess mult_frac(a, b, c) is doing a 32x32 multiply and then a 64x32
> divide to generate a 32bit result.
> So I'd guess entry[i].dur is 32bit? (this code isn't in -rc4 ...).
> Which means you are now discarding the high bits.
> 

entry[i].dur is 64 bit, so this should work just fine.

Arnd proposed that, as ARCH_TIMER_FREQ is evenly divisible by
MSEC_PER_SEC, we just use div_u64(dur, ARCH_TIMER_FREQ / MSEC_PER_SEC),
and I picked that patch instead.

> You've also added a very slow 64bit divide.

Without checking the generated code, I'd expect this to be a slow 64-bit
division already. But this is a debug function, so it should be fine to
take that penalty.

> A multiply-by-reciprocal calculation will be much better.
> Since absolute accuracy almost certainly doesn't matter here convert:
> 	dur * 1000 / FREQ
> to
> 	(dur * (u32)((1000ull << 32) / FREQ)) >> 32
> which will be fine provided FREQ >= 1000
> 

I'm quite sure you're right regarding the accuracy. I think as this
isn't in a hot path, the more readable div_u64() feels like a reasonable
choice.

Thank you for your input and suggestion though!

Regards,
Bjorn
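
The variant attributed to Arnd above, dividing by a pre-scaled constant, can be compared with the multiply-then-divide form in a small userspace sketch. The function names are hypothetical, plain `/` stands in for the kernel's div_u64(), and the 19.2 MHz value of ARCH_TIMER_FREQ is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define MSEC_PER_SEC    1000ULL
#define ARCH_TIMER_FREQ 19200000ULL   /* assumed: 19.2 MHz timer frequency */

/* The pre-scaled form is only exact when the frequency divides evenly. */
static_assert(ARCH_TIMER_FREQ % MSEC_PER_SEC == 0,
	      "timer frequency must be a multiple of MSEC_PER_SEC");

/* Shape of this patch: multiply first, then divide by the frequency. */
static uint64_t ms_mul_then_div(uint64_t ticks)
{
	return ticks * MSEC_PER_SEC / ARCH_TIMER_FREQ;
}

/* Shape of the pre-scaled variant: one division by ticks-per-millisecond. */
static uint64_t ms_prescaled(uint64_t ticks)
{
	return ticks / (ARCH_TIMER_FREQ / MSEC_PER_SEC);
}
```

The two forms give the same result as long as `ticks * MSEC_PER_SEC` fits in 64 bits. The pre-scaled form has no intermediate product, so it cannot wrap for very large tick counts.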

> 	David
  

Patch

diff --git a/drivers/soc/qcom/qcom_stats.c b/drivers/soc/qcom/qcom_stats.c
index 4763d62a8cb0..5ba61232313e 100644
--- a/drivers/soc/qcom/qcom_stats.c
+++ b/drivers/soc/qcom/qcom_stats.c
@@ -221,7 +221,8 @@  static int qcom_ddr_stats_show(struct seq_file *s, void *unused)
 
 	for (i = 0; i < ddr.entry_count; i++) {
 		/* Convert the period to ms */
-		entry[i].dur = mult_frac(MSEC_PER_SEC, entry[i].dur, ARCH_TIMER_FREQ);
+		entry[i].dur *= MSEC_PER_SEC;
+		entry[i].dur = div_u64(entry[i].dur, ARCH_TIMER_FREQ);
 	}
 
 	for (i = 0; i < ddr.entry_count; i++)