cpufreq: tegra186: Use flexible array to simplify memory allocation

Message ID f6b75a33df6f5fd94da3cfecb1e9e7590bf8cd37.1668963937.git.christophe.jaillet@wanadoo.fr
State New
Series cpufreq: tegra186: Use flexible array to simplify memory allocation

Commit Message

Christophe JAILLET Nov. 20, 2022, 5:19 p.m. UTC
Use a flexible array member to simplify memory allocation.
This saves some memory, avoids an indirection when reading the 'clusters'
array and removes a few lines of code.


Detailed explanation:
====================
Knowing that:
  - each devm_ allocation over-allocates 40 bytes for internal needs
  - the memory allocator rounds each request up to the next of the 8, 16,
    32, 64, 96, 128, 192, 256, 512, 1024, 2048, 4096, 8192 size boundaries

and that:
  - sizeof(struct tegra186_cpufreq_data) = 24
  - sizeof(struct tegra186_cpufreq_cluster) = 16

Memory allocations in tegra186_cpufreq_probe() are:
  data:           24 + 40 = 64      => 64 bytes
  data->clusters: 2 * 16 + 40 = 72  => 96 bytes
So a total of 160 bytes is allocated:
56 for the real need, 80 for internal use and 24 wasted by rounding.


If 'struct tegra186_cpufreq_data' is reordered so that 'clusters' is a
flexible array:
  - it saves one pointer in the structure
  - only one allocation is needed

So, only 96 bytes are allocated:
  16 + 2 * 16 + 40 = 88  => 96 bytes

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
---
Compile-tested only.
---
 drivers/cpufreq/tegra186-cpufreq.c | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)
  

Comments

Viresh Kumar Dec. 1, 2022, 9:20 a.m. UTC | #1
Applied. Thanks.
  

Patch

diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c
index 6c88827f4e62..f98f53bf1011 100644
--- a/drivers/cpufreq/tegra186-cpufreq.c
+++ b/drivers/cpufreq/tegra186-cpufreq.c
@@ -65,8 +65,8 @@  struct tegra186_cpufreq_cluster {
 
 struct tegra186_cpufreq_data {
 	void __iomem *regs;
-	struct tegra186_cpufreq_cluster *clusters;
 	const struct tegra186_cpufreq_cpu *cpus;
+	struct tegra186_cpufreq_cluster clusters[];
 };
 
 static int tegra186_cpufreq_init(struct cpufreq_policy *policy)
@@ -221,15 +221,12 @@  static int tegra186_cpufreq_probe(struct platform_device *pdev)
 	struct tegra_bpmp *bpmp;
 	unsigned int i = 0, err;
 
-	data = devm_kzalloc(&pdev->dev, sizeof(*data), GFP_KERNEL);
+	data = devm_kzalloc(&pdev->dev,
+			    struct_size(data, clusters, TEGRA186_NUM_CLUSTERS),
+			    GFP_KERNEL);
 	if (!data)
 		return -ENOMEM;
 
-	data->clusters = devm_kcalloc(&pdev->dev, TEGRA186_NUM_CLUSTERS,
-				      sizeof(*data->clusters), GFP_KERNEL);
-	if (!data->clusters)
-		return -ENOMEM;
-
 	data->cpus = tegra186_cpus;
 
 	bpmp = tegra_bpmp_get(&pdev->dev);