Message ID | 20230606231252.94838-11-william.zhang@broadcom.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp3738060vqr; Tue, 6 Jun 2023 16:38:55 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7nbEQjmCoO2I8UHm2a9BUN5aYLo2CJi8UrnJxC5g4J3To2W16o4TFoW3/JeCYeKdT7Wpkx X-Received: by 2002:a05:6a00:99b:b0:655:6bcf:208a with SMTP id u27-20020a056a00099b00b006556bcf208amr4328948pfg.6.1686094735273; Tue, 06 Jun 2023 16:38:55 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1686094735; cv=none; d=google.com; s=arc-20160816; b=dYU5wqZcXl4FyrSLjUQaO3qB+38lmX/fvPh5H+hW0fYu/pIPAVBMECKFbqBeeW/iEY 4GIrThdWXYs14wUBQPf/Br+2VRO16PhJc2kCTFRb2amAXhDt3bgWUUl19HR+8u4r/wy8 +cT7nbeplJha9lor/N+N4qQqNgx1N1iju07wjgIuqKLiXzghUsyKWy01rFFw4Ue1Fqcn w64AmtFXFQxqbwAzBp08yuKmiuDxj/5nr0/cUC0e6RMWBlFORcidCVoyQr6fWAdHgg9e KSvrxnqsRGMlEWXMdYKHjcS7N8Tc8en11KqCnx9/s/OxZrSZx5YBlHIdUKRCB1nxfPC3 2hdA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=U+Z84/AfzP1MYacvhjEamKy0f7DBRojW2nAeEre6Y5c=; b=Bzo0+58aftkitKLZ21QVp42DKeUPorzNVHgeR8kGmM+4n4JU7EZ3Aj8rc1jS9hek+r AH/o6OGNUz1NX4XC8AWoYe5rSSrcLpvRg2IqUa4yNlIbsH8TjuUiQMyNLUrSdn+HZEaW AWv3zkehoL10pTqAWk5Ei4jKlUfQcElkHSuJpgLQcqleb3XGD66Tnwty7taggOWHLd+A 2le2QgFOsa71/gtUSUlAjuQcX7g4J8HEnrmfh+FsZh7ohAWq+n20gYgXWP54iEb6Rkgi tz1JfL5u6A/51KhwLEzfXg+usqIfGJt7/1LffRBm0dlcLzbexz5CS7n3pzCZtKkR1AZR JAwQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b="dWM/Zn8f"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y12-20020aa79aec000000b0064f44863407si6188383pfp.384.2023.06.06.16.38.41; Tue, 06 Jun 2023 16:38:54 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@broadcom.com header.s=google header.b="dWM/Zn8f"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=broadcom.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S240315AbjFFXPQ (ORCPT <rfc822;xxoosimple@gmail.com> + 99 others); Tue, 6 Jun 2023 19:15:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49304 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S240236AbjFFXOq (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Tue, 6 Jun 2023 19:14:46 -0400 Received: from mail-qv1-xf2a.google.com (mail-qv1-xf2a.google.com [IPv6:2607:f8b0:4864:20::f2a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A6CD81723 for <linux-kernel@vger.kernel.org>; Tue, 6 Jun 2023 16:14:45 -0700 (PDT) Received: by mail-qv1-xf2a.google.com with SMTP id 6a1803df08f44-62b699b88e1so4354076d6.1 for <linux-kernel@vger.kernel.org>; Tue, 06 Jun 2023 16:14:45 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; t=1686093285; x=1688685285; h=mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:from:to:cc:subject:date:message-id:reply-to; bh=U+Z84/AfzP1MYacvhjEamKy0f7DBRojW2nAeEre6Y5c=; b=dWM/Zn8fIv72Rap3CadIPAy7izpJgg1EUz+A4LqaCoNOfUTJEoOcevr97mTdfsPbvG JQmFoCXVqr8q5Ytlb0NroBv5KT8A4JF6IHlrY80YbZG5qivCGKKE3vu0I8iIhc9mO7Iq pKjxxELjXXcJXpGot5mMb+7Up+ZGf1OroYIyU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686093285; x=1688685285; h=mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=U+Z84/AfzP1MYacvhjEamKy0f7DBRojW2nAeEre6Y5c=; b=NtQCysrmXSMcIKG3c/T5wRlRKe2Hqeq1uwXHNOPAqF9s7J6s48VoctjiQf71HVM+qD 6qVeBCYIYLBzMv1hBETfql4c+S9svjd9C87KzTb6OlryJZGjjJQ1YH7eydv81G+DUYRe lG4S3dPlFnysqVFCxRP9/ims3WpMtJOAujT09vlRqS6sL1ZAMzLCgu+7OpMdjIi8phnN G73FDQC/cWs/dQMKMeDBa7On2fIZ0oYTLfqNEwcLxNC5NUIQw8g4yXmQzzfT0OCDs/vm j1OQ01IrDCCsGz2ruheEKIuekOdQlkfhEhRnWyZgfK92P+F3LCol3KW18QCaykMozW+9 yEUw== X-Gm-Message-State: AC+VfDytWE30q/VkYDi7Uz/Ehy1Bd7RkfT5jKYCL0i3qXQIpXifgmKUV +Ao+w/cfeJ0oQqaBW51K86slYg== X-Received: by 2002:ad4:5c8e:0:b0:625:aa49:19f3 with SMTP id o14-20020ad45c8e000000b00625aa4919f3mr1208739qvh.64.1686093284715; Tue, 06 Jun 2023 16:14:44 -0700 (PDT) Received: from ubuntu-22.localdomain ([192.19.222.250]) by smtp.gmail.com with ESMTPSA id x9-20020ae9e909000000b0075b23e55640sm5221519qkf.123.2023.06.06.16.14.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 06 Jun 2023 16:14:44 -0700 (PDT) From: William Zhang <william.zhang@broadcom.com> To: Broadcom Kernel List <bcm-kernel-feedback-list@broadcom.com>, Linux MTD List <linux-mtd@lists.infradead.org> Cc: f.fainelli@gmail.com, rafal@milecki.pl, kursad.oney@broadcom.com, joel.peshkin@broadcom.com, computersforpeace@gmail.com, anand.gore@broadcom.com, dregan@mail.com, kamal.dasu@broadcom.com, tomer.yacoby@broadcom.com, dan.beygelman@broadcom.com, William Zhang <william.zhang@broadcom.com>, Miquel Raynal <miquel.raynal@bootlin.com>, linux-kernel@vger.kernel.org, Vignesh Raghavendra <vigneshr@ti.com>, Richard Weinberger <richard@nod.at>, Kamal Dasu <kdasu.kdev@gmail.com>, linux-arm-kernel@lists.infradead.org Subject: [PATCH 10/12] mtd: rawnand: brcmnand: Add BCMBCA read data bus interface Date: Tue, 6 Jun 2023 16:12:50 -0700 Message-Id: <20230606231252.94838-11-william.zhang@broadcom.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230606231252.94838-1-william.zhang@broadcom.com> References: <20230606231252.94838-1-william.zhang@broadcom.com> MIME-Version: 1.0 Content-Type: multipart/signed; protocol="application/pkcs7-signature"; micalg=sha-256; boundary="0000000000004ff41a05fd7e2f0b" X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1767998472933372452?= X-GMAIL-MSGID: =?utf-8?q?1767998472933372452?= |
Series |
mtd: rawnand: brcmnand: driver and doc updates
|
|
Commit Message
William Zhang
June 6, 2023, 11:12 p.m. UTC
The BCMBCA broadband SoC integrates the NAND controller differently than
STB, iProc and other SoCs. It has different endianness for NAND cache
data and ONFI parameter data.
Add a SoC read data bus shim for BCMBCA to meet the specific SoC need
and performance improvement using the optimized memcpy function on NAND
cache memory.
Signed-off-by: William Zhang <william.zhang@broadcom.com>
---
drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++
drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++-------
drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 +
3 files changed, 68 insertions(+), 14 deletions(-)
Comments
Hi William, william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: > The BCMBCA broadband SoC integrates the NAND controller differently than > STB, iProc and other SoCs. It has different endianness for NAND cache > data and ONFI parameter data. > > Add a SoC read data bus shim for BCMBCA to meet the specific SoC need > and performance improvement using the optimized memcpy function on NAND > cache memory. > > Signed-off-by: William Zhang <william.zhang@broadcom.com> > --- > > drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ > drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- > drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + > 3 files changed, 68 insertions(+), 14 deletions(-) > > diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > index 7e48b6a0bfa2..899103a62c98 100644 > --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > @@ -26,6 +26,18 @@ enum { > BCMBCA_CTLRDY = BIT(4), > }; > > +#if defined(CONFIG_ARM64) > +#define ALIGN_REQ 8 > +#else > +#define ALIGN_REQ 4 > +#endif > + > +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) > +{ > + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && > + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); > +} > + > static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) > { > struct bcmbca_nand_soc *priv = > @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) > brcmnand_writel(val, mmio); > } > > +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, > + void __iomem *flash_cache, u32 *buffer, > + int fc_words, bool is_param) > +{ > + int i; > + > + if (!is_param) { > + /* > + * memcpy can do unaligned aligned access depending on source > + * and dest address, which is incompatible with nand cache. Fallback > + * to the memcpy for io version > + */ > + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) > + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); > + else > + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); > + } else { > + /* Flash cache has same endian as the host for parameter pages */ > + for (i = 0; i < fc_words; i++, buffer++) > + *buffer = __raw_readl(flash_cache + i * 4); > + } > +} > + > static int bcmbca_nand_probe(struct platform_device *pdev) > { > struct device *dev = &pdev->dev; > @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) > > soc->ctlrdy_ack = bcmbca_nand_intc_ack; > soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; > + soc->read_data_bus = bcmbca_read_data_bus; > > return brcmnand_probe(pdev, soc); > } > diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > index d920e88c7f5b..656be4d73016 100644 > --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c > +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, > return brcmnand_readl(ctrl->edu_base + offs); > } > > +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, > + void __iomem *flash_cache, u32 *buffer, > + int fc_words, bool is_param) I strongly dislike this "is_param" boolean. When is the data in host endianness? When is it not? If we think about an exec_op() conversion and drop cmdfunc(), what would be the discriminant? > +{ > + struct brcmnand_soc *soc = ctrl->soc; > + int i; > + > + if (soc->read_data_bus) { > + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); > + } else { > + if (!is_param) { > + for (i = 0; i < fc_words; i++, buffer++) > + *buffer = brcmnand_read_fc(ctrl, i); > + } else { > + for (i = 0; i < fc_words; i++) > + /* > + * Flash cache is big endian for parameter pages, at > + * least on STB SoCs > + */ > + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > + } > + } > +} > + > static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) > { > > @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, > native_cmd == CMD_PARAMETER_CHANGE_COL) { > /* Copy flash cache word-wise */ > u32 *flash_cache = (u32 *)ctrl->flash_cache; > - int i; > > brcmnand_soc_data_bus_prepare(ctrl->soc, true); > > - /* > - * Must cache the FLASH_CACHE now, since changes in > - * SECTOR_SIZE_1K may invalidate it > - */ > - for (i = 0; i < FC_WORDS; i++) > - /* > - * Flash cache is big endian for parameter pages, at > - * least on STB SoCs > - */ > - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, > + FC_WORDS, true); > > brcmnand_soc_data_bus_unprepare(ctrl->soc, true); > > @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > { > struct brcmnand_host *host = nand_get_controller_data(chip); > struct brcmnand_controller *ctrl = host->ctrl; > - int i, j, ret = 0; > + int i, ret = 0; > > brcmnand_clear_ecc_addr(ctrl); > > @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > if (likely(buf)) { > brcmnand_soc_data_bus_prepare(ctrl->soc, false); > > - for (j = 0; j < FC_WORDS; j++, buf++) > - *buf = brcmnand_read_fc(ctrl, j); > + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, > + FC_WORDS, false); > + buf += FC_WORDS; > > brcmnand_soc_data_bus_unprepare(ctrl->soc, false); > } > diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > index f1f93d85f50d..88819bc395f8 100644 > --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h > +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > @@ -24,6 +24,8 @@ struct brcmnand_soc { > void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); > void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, > bool is_param); > + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, > + u32 *buffer, int fc_words, bool is_param); > const struct brcmnand_io_ops *ops; > }; > Thanks, Miquèl
Hi William, william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: > The BCMBCA broadband SoC integrates the NAND controller differently than > STB, iProc and other SoCs. It has different endianness for NAND cache > data and ONFI parameter data. > > Add a SoC read data bus shim for BCMBCA to meet the specific SoC need > and performance improvement using the optimized memcpy function on NAND > cache memory. > > Signed-off-by: William Zhang <william.zhang@broadcom.com> > --- > > drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ > drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- > drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + > 3 files changed, 68 insertions(+), 14 deletions(-) > > diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > index 7e48b6a0bfa2..899103a62c98 100644 > --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > @@ -26,6 +26,18 @@ enum { > BCMBCA_CTLRDY = BIT(4), > }; > > +#if defined(CONFIG_ARM64) > +#define ALIGN_REQ 8 > +#else > +#define ALIGN_REQ 4 > +#endif > + > +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) > +{ > + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && > + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); > +} > + > static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) > { > struct bcmbca_nand_soc *priv = > @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) > brcmnand_writel(val, mmio); > } > > +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, > + void __iomem *flash_cache, u32 *buffer, > + int fc_words, bool is_param) > +{ > + int i; > + > + if (!is_param) { > + /* > + * memcpy can do unaligned aligned access depending on source > + * and dest address, which is incompatible with nand cache. Fallback > + * to the memcpy for io version > + */ > + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) > + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); > + else > + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); > + } else { > + /* Flash cache has same endian as the host for parameter pages */ > + for (i = 0; i < fc_words; i++, buffer++) > + *buffer = __raw_readl(flash_cache + i * 4); > + } > +} > + > static int bcmbca_nand_probe(struct platform_device *pdev) > { > struct device *dev = &pdev->dev; > @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) > > soc->ctlrdy_ack = bcmbca_nand_intc_ack; > soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; > + soc->read_data_bus = bcmbca_read_data_bus; > > return brcmnand_probe(pdev, soc); > } > diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > index d920e88c7f5b..656be4d73016 100644 > --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c > +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, > return brcmnand_readl(ctrl->edu_base + offs); > } > > +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, > + void __iomem *flash_cache, u32 *buffer, > + int fc_words, bool is_param) > +{ > + struct brcmnand_soc *soc = ctrl->soc; > + int i; > + > + if (soc->read_data_bus) { > + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); > + } else { > + if (!is_param) { > + for (i = 0; i < fc_words; i++, buffer++) > + *buffer = brcmnand_read_fc(ctrl, i); > + } else { > + for (i = 0; i < fc_words; i++) > + /* > + * Flash cache is big endian for parameter pages, at > + * least on STB SoCs > + */ > + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > + } > + } Perhaps we could have a single function that is statically assigned at probe time instead of a first helper with two conditions which calls in one case another hook... This can be simplified I guess. > +} > + > static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) > { > > @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, > native_cmd == CMD_PARAMETER_CHANGE_COL) { > /* Copy flash cache word-wise */ > u32 *flash_cache = (u32 *)ctrl->flash_cache; > - int i; > > brcmnand_soc_data_bus_prepare(ctrl->soc, true); > > - /* > - * Must cache the FLASH_CACHE now, since changes in > - * SECTOR_SIZE_1K may invalidate it > - */ > - for (i = 0; i < FC_WORDS; i++) > - /* > - * Flash cache is big endian for parameter pages, at > - * least on STB SoCs > - */ > - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, > + FC_WORDS, true); > > brcmnand_soc_data_bus_unprepare(ctrl->soc, true); > > @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > { > struct brcmnand_host *host = nand_get_controller_data(chip); > struct brcmnand_controller *ctrl = host->ctrl; > - int i, j, ret = 0; > + int i, ret = 0; > > brcmnand_clear_ecc_addr(ctrl); > > @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > if (likely(buf)) { > brcmnand_soc_data_bus_prepare(ctrl->soc, false); > > - for (j = 0; j < FC_WORDS; j++, buf++) > - *buf = brcmnand_read_fc(ctrl, j); > + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, > + FC_WORDS, false); > + buf += FC_WORDS; > > brcmnand_soc_data_bus_unprepare(ctrl->soc, false); > } > diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > index f1f93d85f50d..88819bc395f8 100644 > --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h > +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > @@ -24,6 +24,8 @@ struct brcmnand_soc { > void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); > void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, > bool is_param); > + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, > + u32 *buffer, int fc_words, bool is_param); > const struct brcmnand_io_ops *ops; > }; > Thanks, Miquèl
Hi Miquel, On 06/07/2023 01:20 AM, Miquel Raynal wrote: > Hi William, > > william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: > >> The BCMBCA broadband SoC integrates the NAND controller differently than >> STB, iProc and other SoCs. It has different endianness for NAND cache >> data and ONFI parameter data. >> >> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need >> and performance improvement using the optimized memcpy function on NAND >> cache memory. >> >> Signed-off-by: William Zhang <william.zhang@broadcom.com> >> --- >> >> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ >> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- >> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + >> 3 files changed, 68 insertions(+), 14 deletions(-) >> >> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >> index 7e48b6a0bfa2..899103a62c98 100644 >> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >> @@ -26,6 +26,18 @@ enum { >> BCMBCA_CTLRDY = BIT(4), >> }; >> >> +#if defined(CONFIG_ARM64) >> +#define ALIGN_REQ 8 >> +#else >> +#define ALIGN_REQ 4 >> +#endif >> + >> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) >> +{ >> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && >> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); >> +} >> + >> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) >> { >> struct bcmbca_nand_soc *priv = >> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) >> brcmnand_writel(val, mmio); >> } >> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, >> + void __iomem *flash_cache, u32 *buffer, >> + int fc_words, bool is_param) >> +{ >> + int i; >> + >> + if (!is_param) { >> + /* >> + * memcpy can do unaligned aligned access depending on source >> + * and dest address, which is incompatible with nand cache. Fallback >> + * to the memcpy for io version >> + */ >> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) >> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); >> + else >> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); >> + } else { >> + /* Flash cache has same endian as the host for parameter pages */ >> + for (i = 0; i < fc_words; i++, buffer++) >> + *buffer = __raw_readl(flash_cache + i * 4); >> + } >> +} >> + >> static int bcmbca_nand_probe(struct platform_device *pdev) >> { >> struct device *dev = &pdev->dev; >> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) >> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; >> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; >> + soc->read_data_bus = bcmbca_read_data_bus; >> >> return brcmnand_probe(pdev, soc); >> } >> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >> index d920e88c7f5b..656be4d73016 100644 >> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c >> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, >> return brcmnand_readl(ctrl->edu_base + offs); >> } >> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, >> + void __iomem *flash_cache, u32 *buffer, >> + int fc_words, bool is_param) > > I strongly dislike this "is_param" boolean. > > When is the data in host endianness? When is it not? This is little bit complicated. We have two type data read from nand cache. One for page read and the other for parameter and onfi data read from the controller side. But it depends on how SoC integrate the nand cache to system. In broadband SoC, both page and parameter data are in host endianess but other SoCs is not the same. I am open to suggestion for is_param function argument but to factor out this common code in more structured way, I don't see other way around. > > If we think about an exec_op() conversion and drop cmdfunc(), what > would be the discriminant? > If we need to implement exec_op in the future, the data is not coming from nand cache but some other low level data register which may not subject to the endianess issue. >> +{ >> + struct brcmnand_soc *soc = ctrl->soc; >> + int i; >> + >> + if (soc->read_data_bus) { >> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); >> + } else { >> + if (!is_param) { >> + for (i = 0; i < fc_words; i++, buffer++) >> + *buffer = brcmnand_read_fc(ctrl, i); >> + } else { >> + for (i = 0; i < fc_words; i++) >> + /* >> + * Flash cache is big endian for parameter pages, at >> + * least on STB SoCs >> + */ >> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >> + } >> + } >> +} >> + >> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) >> { >> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, >> native_cmd == CMD_PARAMETER_CHANGE_COL) { >> /* Copy flash cache word-wise */ >> u32 *flash_cache = (u32 *)ctrl->flash_cache; >> - int i; >> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); >> >> - /* >> - * Must cache the FLASH_CACHE now, since changes in >> - * SECTOR_SIZE_1K may invalidate it >> - */ >> - for (i = 0; i < FC_WORDS; i++) >> - /* >> - * Flash cache is big endian for parameter pages, at >> - * least on STB SoCs >> - */ >> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, >> + FC_WORDS, true); >> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); >> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >> { >> struct brcmnand_host *host = nand_get_controller_data(chip); >> struct brcmnand_controller *ctrl = host->ctrl; >> - int i, j, ret = 0; >> + int i, ret = 0; >> >> brcmnand_clear_ecc_addr(ctrl); >> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >> if (likely(buf)) { >> brcmnand_soc_data_bus_prepare(ctrl->soc, false); >> >> - for (j = 0; j < FC_WORDS; j++, buf++) >> - *buf = brcmnand_read_fc(ctrl, j); >> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, >> + FC_WORDS, false); >> + buf += FC_WORDS; >> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); >> } >> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >> index f1f93d85f50d..88819bc395f8 100644 >> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h >> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >> @@ -24,6 +24,8 @@ struct brcmnand_soc { >> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); >> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, >> bool is_param); >> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, >> + u32 *buffer, int fc_words, bool is_param); >> const struct brcmnand_io_ops *ops; >> }; >> > > > Thanks, > Miquèl >
Hi Miquel, On 06/07/2023 01:22 AM, Miquel Raynal wrote: > Hi William, > > william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: > >> The BCMBCA broadband SoC integrates the NAND controller differently than >> STB, iProc and other SoCs. It has different endianness for NAND cache >> data and ONFI parameter data. >> >> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need >> and performance improvement using the optimized memcpy function on NAND >> cache memory. >> >> Signed-off-by: William Zhang <william.zhang@broadcom.com> >> --- >> >> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ >> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- >> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + >> 3 files changed, 68 insertions(+), 14 deletions(-) >> >> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >> index 7e48b6a0bfa2..899103a62c98 100644 >> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >> @@ -26,6 +26,18 @@ enum { >> BCMBCA_CTLRDY = BIT(4), >> }; >> >> +#if defined(CONFIG_ARM64) >> +#define ALIGN_REQ 8 >> +#else >> +#define ALIGN_REQ 4 >> +#endif >> + >> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) >> +{ >> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && >> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); >> +} >> + >> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) >> { >> struct bcmbca_nand_soc *priv = >> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) >> brcmnand_writel(val, mmio); >> } >> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, >> + void __iomem *flash_cache, u32 *buffer, >> + int fc_words, bool is_param) >> +{ >> + int i; >> + >> + if (!is_param) { >> + /* >> + * memcpy can do unaligned aligned access depending on source >> + * and dest address, which is incompatible with nand cache. Fallback >> + * to the memcpy for io version >> + */ >> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) >> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); >> + else >> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); >> + } else { >> + /* Flash cache has same endian as the host for parameter pages */ >> + for (i = 0; i < fc_words; i++, buffer++) >> + *buffer = __raw_readl(flash_cache + i * 4); >> + } >> +} >> + >> static int bcmbca_nand_probe(struct platform_device *pdev) >> { >> struct device *dev = &pdev->dev; >> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) >> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; >> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; >> + soc->read_data_bus = bcmbca_read_data_bus; >> >> return brcmnand_probe(pdev, soc); >> } >> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >> index d920e88c7f5b..656be4d73016 100644 >> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c >> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, >> return brcmnand_readl(ctrl->edu_base + offs); >> } >> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, >> + void __iomem *flash_cache, u32 *buffer, >> + int fc_words, bool is_param) >> +{ >> + struct brcmnand_soc *soc = ctrl->soc; >> + int i; >> + >> + if (soc->read_data_bus) { >> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); >> + } else { >> + if (!is_param) { >> + for (i = 0; i < fc_words; i++, buffer++) >> + *buffer = brcmnand_read_fc(ctrl, i); >> + } else { >> + for (i = 0; i < fc_words; i++) >> + /* >> + * Flash cache is big endian for parameter pages, at >> + * least on STB SoCs >> + */ >> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >> + } >> + } > > Perhaps we could have a single function that is statically assigned at > probe time instead of a first helper with two conditions which calls in > one case another hook... This can be simplified I guess. > Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. Not sure how much this can be simplified... Or we have default implementation in brcmnand.c but then there is one condition check too. Page read is done at 512 bytes burst. One or two conditions check outside of the per 512 bytes read loop does not sounds too bad if performance is concern. >> +} >> + >> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) >> { >> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, >> native_cmd == CMD_PARAMETER_CHANGE_COL) { >> /* Copy flash cache word-wise */ >> u32 *flash_cache = (u32 *)ctrl->flash_cache; >> - int i; >> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); >> >> - /* >> - * Must cache the FLASH_CACHE now, since changes in >> - * SECTOR_SIZE_1K may invalidate it >> - */ >> - for (i = 0; i < FC_WORDS; i++) >> - /* >> - * Flash cache is big endian for parameter pages, at >> - * least on STB SoCs >> - */ >> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, >> + FC_WORDS, true); >> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); >> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >> { >> struct brcmnand_host *host = nand_get_controller_data(chip); >> struct brcmnand_controller *ctrl = host->ctrl; >> - int i, j, ret = 0; >> + int i, ret = 0; >> >> brcmnand_clear_ecc_addr(ctrl); >> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >> if (likely(buf)) { >> brcmnand_soc_data_bus_prepare(ctrl->soc, false); >> >> - for (j = 0; j < FC_WORDS; j++, buf++) >> - *buf = brcmnand_read_fc(ctrl, j); >> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, >> + FC_WORDS, false); >> + buf += FC_WORDS; >> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); >> } >> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >> index f1f93d85f50d..88819bc395f8 100644 >> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h >> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >> @@ -24,6 +24,8 @@ struct brcmnand_soc { >> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); >> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, >> bool is_param); >> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, >> + u32 *buffer, int fc_words, bool is_param); >> const struct brcmnand_io_ops *ops; >> }; >> > > > Thanks, > Miquèl >
Hi William, william.zhang@broadcom.com wrote on Wed, 7 Jun 2023 13:12:02 -0700: > Hi Miquel, > > On 06/07/2023 01:20 AM, Miquel Raynal wrote: > > Hi William, > > > > william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: > > > >> The BCMBCA broadband SoC integrates the NAND controller differently than > >> STB, iProc and other SoCs. It has different endianness for NAND cache > >> data and ONFI parameter data. > >> > >> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need > >> and performance improvement using the optimized memcpy function on NAND > >> cache memory. > >> > >> Signed-off-by: William Zhang <william.zhang@broadcom.com> > >> --- > >> > >> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ > >> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- > >> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + > >> 3 files changed, 68 insertions(+), 14 deletions(-) > >> > >> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >> index 7e48b6a0bfa2..899103a62c98 100644 > >> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >> @@ -26,6 +26,18 @@ enum { > >> BCMBCA_CTLRDY = BIT(4), > >> }; > >> >> +#if defined(CONFIG_ARM64) > >> +#define ALIGN_REQ 8 > >> +#else > >> +#define ALIGN_REQ 4 > >> +#endif > >> + > >> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) > >> +{ > >> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && > >> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); > >> +} > >> + > >> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) > >> { > >> struct bcmbca_nand_soc *priv = > >> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) > >> brcmnand_writel(val, mmio); > >> } > >> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, > >> + void __iomem *flash_cache, u32 *buffer, > >> + int fc_words, bool is_param) > >> +{ > >> + int i; > >> + > >> + if (!is_param) { > >> + /* > >> + * memcpy can do unaligned aligned access depending on source > >> + * and dest address, which is incompatible with nand cache. Fallback > >> + * to the memcpy for io version > >> + */ > >> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) > >> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); > >> + else > >> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); > >> + } else { > >> + /* Flash cache has same endian as the host for parameter pages */ > >> + for (i = 0; i < fc_words; i++, buffer++) > >> + *buffer = __raw_readl(flash_cache + i * 4); > >> + } > >> +} > >> + > >> static int bcmbca_nand_probe(struct platform_device *pdev) > >> { > >> struct device *dev = &pdev->dev; > >> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) > >> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; > >> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; > >> + soc->read_data_bus = bcmbca_read_data_bus; > >> >> return brcmnand_probe(pdev, soc); > >> } > >> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >> index d920e88c7f5b..656be4d73016 100644 > >> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, > >> return brcmnand_readl(ctrl->edu_base + offs); > >> } > >> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, > >> + void __iomem *flash_cache, u32 *buffer, > >> + int fc_words, bool is_param) > > > > I strongly dislike this "is_param" boolean. > > > > When is the data in host endianness? When is it not? > This is little bit complicated. We have two type data read from nand cache. One for page read and the other for parameter and onfi data read from the controller side. But it depends on how SoC integrate the nand cache to system. In broadband SoC, both page and parameter data are in host endianess but other SoCs is not the same. > > I am open to suggestion for is_param function argument but to factor out this common code in more structured way, I don't see other way around. Alright, so this is SoC dependent, very well -> a (sub)compatible per SoC + platform data associated to it with the right function. > > If we think about an exec_op() conversion and drop cmdfunc(), what > > would be the discriminant? > > > If we need to implement exec_op in the future, the data is not coming from nand cache but some other low level data register which may not subject to the endianess issue. Can't you use the same cache all the time here as well then? And avoid the need for this overly complex logic? > > >> +{ > >> + struct brcmnand_soc *soc = ctrl->soc; > >> + int i; > >> + > >> + if (soc->read_data_bus) { > >> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); > >> + } else { > >> + if (!is_param) { > >> + for (i = 0; i < fc_words; i++, buffer++) > >> + *buffer = brcmnand_read_fc(ctrl, i); > >> + } else { > >> + for (i = 0; i < fc_words; i++) > >> + /* > >> + * Flash cache is big endian for parameter pages, at > >> + * least on STB SoCs > >> + */ > >> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > >> + } > >> + } > >> +} > >> + > >> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) > >> { > >> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, > >> native_cmd == CMD_PARAMETER_CHANGE_COL) { > >> /* Copy flash cache word-wise */ > >> u32 *flash_cache = (u32 *)ctrl->flash_cache; > >> - int i; > >> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); > >> >> - /* > >> - * Must cache the FLASH_CACHE now, since changes in > >> - * SECTOR_SIZE_1K may invalidate it > >> - */ > >> - for (i = 0; i < FC_WORDS; i++) > >> - /* > >> - * Flash cache is big endian for parameter pages, at > >> - * least on STB SoCs > >> - */ > >> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > >> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, > >> + FC_WORDS, true); > >> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); > >> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > >> { > >> struct brcmnand_host *host = nand_get_controller_data(chip); > >> struct brcmnand_controller *ctrl = host->ctrl; > >> - int i, j, ret = 0; > >> + int i, ret = 0; > >> >> brcmnand_clear_ecc_addr(ctrl); > >> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > >> if (likely(buf)) { > >> brcmnand_soc_data_bus_prepare(ctrl->soc, false); > >> >> - for (j = 0; j < FC_WORDS; j++, buf++) > >> - *buf = brcmnand_read_fc(ctrl, j); > >> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, > >> + FC_WORDS, false); > >> + buf += FC_WORDS; > >> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); > >> } > >> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >> index f1f93d85f50d..88819bc395f8 100644 > >> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >> @@ -24,6 +24,8 @@ struct brcmnand_soc { > >> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); > >> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, > >> bool is_param); > >> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, > >> + u32 *buffer, int fc_words, bool is_param); > >> const struct brcmnand_io_ops *ops; > >> }; > >> > > > > Thanks, > > Miquèl > > Thanks, Miquèl
Hi William, william.zhang@broadcom.com wrote on Wed, 7 Jun 2023 13:24:23 -0700: > Hi Miquel, > > On 06/07/2023 01:22 AM, Miquel Raynal wrote: > > Hi William, > > > > william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: > > > >> The BCMBCA broadband SoC integrates the NAND controller differently than > >> STB, iProc and other SoCs. It has different endianness for NAND cache > >> data and ONFI parameter data. > >> > >> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need > >> and performance improvement using the optimized memcpy function on NAND > >> cache memory. > >> > >> Signed-off-by: William Zhang <william.zhang@broadcom.com> > >> --- > >> > >> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ > >> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- > >> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + > >> 3 files changed, 68 insertions(+), 14 deletions(-) > >> > >> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >> index 7e48b6a0bfa2..899103a62c98 100644 > >> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >> @@ -26,6 +26,18 @@ enum { > >> BCMBCA_CTLRDY = BIT(4), > >> }; > >> >> +#if defined(CONFIG_ARM64) > >> +#define ALIGN_REQ 8 > >> +#else > >> +#define ALIGN_REQ 4 > >> +#endif > >> + > >> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) > >> +{ > >> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && > >> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); > >> +} > >> + > >> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) > >> { > >> struct bcmbca_nand_soc *priv = > >> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) > >> brcmnand_writel(val, mmio); > >> } > >> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, > >> + void __iomem *flash_cache, u32 *buffer, > >> + int fc_words, bool is_param) > >> +{ > >> + int i; > >> + > >> + if (!is_param) { > >> + /* > >> + * memcpy can do unaligned aligned access depending on source > >> + * and dest address, which is incompatible with nand cache. Fallback > >> + * to the memcpy for io version > >> + */ > >> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) > >> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); > >> + else > >> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); > >> + } else { > >> + /* Flash cache has same endian as the host for parameter pages */ > >> + for (i = 0; i < fc_words; i++, buffer++) > >> + *buffer = __raw_readl(flash_cache + i * 4); > >> + } > >> +} > >> + > >> static int bcmbca_nand_probe(struct platform_device *pdev) > >> { > >> struct device *dev = &pdev->dev; > >> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) > >> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; > >> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; > >> + soc->read_data_bus = bcmbca_read_data_bus; > >> >> return brcmnand_probe(pdev, soc); > >> } > >> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >> index d920e88c7f5b..656be4d73016 100644 > >> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, > >> return brcmnand_readl(ctrl->edu_base + offs); > >> } > >> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, > >> + void __iomem *flash_cache, u32 *buffer, > >> + int fc_words, bool is_param) > >> +{ > >> + struct brcmnand_soc *soc = ctrl->soc; > >> + int i; > >> + > >> + if (soc->read_data_bus) { > >> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); > >> + } else { > >> + if (!is_param) { > >> + for (i = 0; i < fc_words; i++, buffer++) > >> + *buffer = brcmnand_read_fc(ctrl, i); > >> + } else { > >> + for (i = 0; i < fc_words; i++) > >> + /* > >> + * Flash cache is big endian for parameter pages, at > >> + * least on STB SoCs > >> + */ > >> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > >> + } > >> + } > > > > Perhaps we could have a single function that is statically assigned at > > probe time instead of a first helper with two conditions which calls in > > one case another hook... This can be simplified I guess. > > > Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. You told me in case we would use exec_op we could avoid the param cache. If that's true then the whole support can be simplified. > Not sure how much this can be simplified... Or we have default > implementation in brcmnand.c but then there is one condition check > too. Page read is done at 512 bytes burst. One or two conditions > check outside of the per 512 bytes read loop does not sounds too bad > if performance is concern. It is unreadable. That is my main concern. > > >> +} > >> + > >> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) > >> { > >> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, > >> native_cmd == CMD_PARAMETER_CHANGE_COL) { > >> /* Copy flash cache word-wise */ > >> u32 *flash_cache = (u32 *)ctrl->flash_cache; > >> - int i; > >> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); > >> >> - /* > >> - * Must cache the FLASH_CACHE now, since changes in > >> - * SECTOR_SIZE_1K may invalidate it > >> - */ > >> - for (i = 0; i < FC_WORDS; i++) > >> - /* > >> - * Flash cache is big endian for parameter pages, at > >> - * least on STB SoCs > >> - */ > >> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > >> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, > >> + FC_WORDS, true); > >> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); > >> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > >> { > >> struct brcmnand_host *host = nand_get_controller_data(chip); > >> struct brcmnand_controller *ctrl = host->ctrl; > >> - int i, j, ret = 0; > >> + int i, ret = 0; > >> >> brcmnand_clear_ecc_addr(ctrl); > >> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > >> if (likely(buf)) { > >> brcmnand_soc_data_bus_prepare(ctrl->soc, false); > >> >> - for (j = 0; j < FC_WORDS; j++, buf++) > >> - *buf = brcmnand_read_fc(ctrl, j); > >> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, > >> + FC_WORDS, false); > >> + buf += FC_WORDS; > >> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); > >> } > >> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >> index f1f93d85f50d..88819bc395f8 100644 > >> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >> @@ -24,6 +24,8 @@ struct brcmnand_soc { > >> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); > >> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, > >> bool is_param); > >> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, > >> + u32 *buffer, int fc_words, bool is_param); > >> const struct brcmnand_io_ops *ops; > >> }; > >> > > > > Thanks, > > Miquèl > > Thanks, Miquèl
On 06/07/2023 11:15 PM, Miquel Raynal wrote: > Hi William, > > william.zhang@broadcom.com wrote on Wed, 7 Jun 2023 13:12:02 -0700: > >> Hi Miquel, >> >> On 06/07/2023 01:20 AM, Miquel Raynal wrote: >>> Hi William, >>> >>> william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: >>> >>>> The BCMBCA broadband SoC integrates the NAND controller differently than >>>> STB, iProc and other SoCs. It has different endianness for NAND cache >>>> data and ONFI parameter data. >>>> >>>> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need >>>> and performance improvement using the optimized memcpy function on NAND >>>> cache memory. >>>> >>>> Signed-off-by: William Zhang <william.zhang@broadcom.com> >>>> --- >>>> >>>> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ >>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- >>>> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + >>>> 3 files changed, 68 insertions(+), 14 deletions(-) >>>> >>>> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>> index 7e48b6a0bfa2..899103a62c98 100644 >>>> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>> @@ -26,6 +26,18 @@ enum { >>>> BCMBCA_CTLRDY = BIT(4), >>>> }; >>>> >> +#if defined(CONFIG_ARM64) >>>> +#define ALIGN_REQ 8 >>>> +#else >>>> +#define ALIGN_REQ 4 >>>> +#endif >>>> + >>>> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) >>>> +{ >>>> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && >>>> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); >>>> +} >>>> + >>>> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) >>>> { >>>> struct bcmbca_nand_soc *priv = >>>> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) >>>> brcmnand_writel(val, mmio); >>>> } >>>> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, >>>> + void __iomem *flash_cache, u32 *buffer, >>>> + int fc_words, bool is_param) >>>> +{ >>>> + int i; >>>> + >>>> + if (!is_param) { >>>> + /* >>>> + * memcpy can do unaligned aligned access depending on source >>>> + * and dest address, which is incompatible with nand cache. Fallback >>>> + * to the memcpy for io version >>>> + */ >>>> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) >>>> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); >>>> + else >>>> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); >>>> + } else { >>>> + /* Flash cache has same endian as the host for parameter pages */ >>>> + for (i = 0; i < fc_words; i++, buffer++) >>>> + *buffer = __raw_readl(flash_cache + i * 4); >>>> + } >>>> +} >>>> + >>>> static int bcmbca_nand_probe(struct platform_device *pdev) >>>> { >>>> struct device *dev = &pdev->dev; >>>> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) >>>> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; >>>> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; >>>> + soc->read_data_bus = bcmbca_read_data_bus; >>>> >> return brcmnand_probe(pdev, soc); >>>> } >>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>> index d920e88c7f5b..656be4d73016 100644 >>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, >>>> return brcmnand_readl(ctrl->edu_base + offs); >>>> } >>>> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, >>>> + void __iomem *flash_cache, u32 *buffer, >>>> + int fc_words, bool is_param) >>> >>> I strongly dislike this "is_param" boolean. >>> >>> When is the data in host endianness? When is it not? >> This is little bit complicated. We have two type data read from nand cache. One for page read and the other for parameter and onfi data read from the controller side. But it depends on how SoC integrate the nand cache to system. In broadband SoC, both page and parameter data are in host endianess but other SoCs is not the same. >> >> I am open to suggestion for is_param function argument but to factor out this common code in more structured way, I don't see other way around. > > Alright, so this is SoC dependent, very well -> a (sub)compatible per > SoC + platform data associated to it with the right function. > Right we have per SoC compatible and can have per SoC implementation but I prefer to have a default implementation in the brcmnand.c because right now only bcmcba SoC need some different handling. The other four implementations are the same. To make the code a little more readable and less complicated, I am thinking to separate the brcmnand_read_data_bus into brcmnand_read_page_data and brcmnand_read_param_data as default in brcmnand.c. But bcmbca will override them. Would that be okay with you? >>> If we think about an exec_op() conversion and drop cmdfunc(), what >>> would be the discriminant? >>> >> If we need to implement exec_op in the future, the data is not coming from nand cache but some other low level data register which may not subject to the endianess issue. > > Can't you use the same cache all the time here as well then? And avoid > the need for this overly complex logic? > Unfortunately exec_op will not use nand cache for parameter data read but some other low level data register. This is dictated by the controller. >> >>>> +{ >>>> + struct brcmnand_soc *soc = ctrl->soc; >>>> + int i; >>>> + >>>> + if (soc->read_data_bus) { >>>> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); >>>> + } else { >>>> + if (!is_param) { >>>> + for (i = 0; i < fc_words; i++, buffer++) >>>> + *buffer = brcmnand_read_fc(ctrl, i); >>>> + } else { >>>> + for (i = 0; i < fc_words; i++) >>>> + /* >>>> + * Flash cache is big endian for parameter pages, at >>>> + * least on STB SoCs >>>> + */ >>>> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >>>> + } >>>> + } >>>> +} >>>> + >>>> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) >>>> { >>>> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, >>>> native_cmd == CMD_PARAMETER_CHANGE_COL) { >>>> /* Copy flash cache word-wise */ >>>> u32 *flash_cache = (u32 *)ctrl->flash_cache; >>>> - int i; >>>> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); >>>> >> - /* >>>> - * Must cache the FLASH_CACHE now, since changes in >>>> - * SECTOR_SIZE_1K may invalidate it >>>> - */ >>>> - for (i = 0; i < FC_WORDS; i++) >>>> - /* >>>> - * Flash cache is big endian for parameter pages, at >>>> - * least on STB SoCs >>>> - */ >>>> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, >>>> + FC_WORDS, true); >>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); >>>> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >>>> { >>>> struct brcmnand_host *host = nand_get_controller_data(chip); >>>> struct brcmnand_controller *ctrl = host->ctrl; >>>> - int i, j, ret = 0; >>>> + int i, ret = 0; >>>> >> brcmnand_clear_ecc_addr(ctrl); >>>> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >>>> if (likely(buf)) { >>>> brcmnand_soc_data_bus_prepare(ctrl->soc, false); >>>> >> - for (j = 0; j < FC_WORDS; j++, buf++) >>>> - *buf = brcmnand_read_fc(ctrl, j); >>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, >>>> + FC_WORDS, false); >>>> + buf += FC_WORDS; >>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); >>>> } >>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>> index f1f93d85f50d..88819bc395f8 100644 >>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>> @@ -24,6 +24,8 @@ struct brcmnand_soc { >>>> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); >>>> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, >>>> bool is_param); >>>> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, >>>> + u32 *buffer, int fc_words, bool is_param); >>>> const struct brcmnand_io_ops *ops; >>>> }; >>>> > > >>> Thanks, >>> Miquèl >>> > > > Thanks, > Miquèl >
On 06/07/2023 11:18 PM, Miquel Raynal wrote: > Hi William, > > william.zhang@broadcom.com wrote on Wed, 7 Jun 2023 13:24:23 -0700: > >> Hi Miquel, >> >> On 06/07/2023 01:22 AM, Miquel Raynal wrote: >>> Hi William, >>> >>> william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: >>> >>>> The BCMBCA broadband SoC integrates the NAND controller differently than >>>> STB, iProc and other SoCs. It has different endianness for NAND cache >>>> data and ONFI parameter data. >>>> >>>> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need >>>> and performance improvement using the optimized memcpy function on NAND >>>> cache memory. >>>> >>>> Signed-off-by: William Zhang <william.zhang@broadcom.com> >>>> --- >>>> >>>> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ >>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- >>>> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + >>>> 3 files changed, 68 insertions(+), 14 deletions(-) >>>> >>>> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>> index 7e48b6a0bfa2..899103a62c98 100644 >>>> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>> @@ -26,6 +26,18 @@ enum { >>>> BCMBCA_CTLRDY = BIT(4), >>>> }; >>>> >> +#if defined(CONFIG_ARM64) >>>> +#define ALIGN_REQ 8 >>>> +#else >>>> +#define ALIGN_REQ 4 >>>> +#endif >>>> + >>>> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) >>>> +{ >>>> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && >>>> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); >>>> +} >>>> + >>>> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) >>>> { >>>> struct bcmbca_nand_soc *priv = >>>> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) >>>> brcmnand_writel(val, mmio); >>>> } >>>> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, >>>> + void __iomem *flash_cache, u32 *buffer, >>>> + int fc_words, bool is_param) >>>> +{ >>>> + int i; >>>> + >>>> + if (!is_param) { >>>> + /* >>>> + * memcpy can do unaligned aligned access depending on source >>>> + * and dest address, which is incompatible with nand cache. Fallback >>>> + * to the memcpy for io version >>>> + */ >>>> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) >>>> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); >>>> + else >>>> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); >>>> + } else { >>>> + /* Flash cache has same endian as the host for parameter pages */ >>>> + for (i = 0; i < fc_words; i++, buffer++) >>>> + *buffer = __raw_readl(flash_cache + i * 4); >>>> + } >>>> +} >>>> + >>>> static int bcmbca_nand_probe(struct platform_device *pdev) >>>> { >>>> struct device *dev = &pdev->dev; >>>> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) >>>> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; >>>> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; >>>> + soc->read_data_bus = bcmbca_read_data_bus; >>>> >> return brcmnand_probe(pdev, soc); >>>> } >>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>> index d920e88c7f5b..656be4d73016 100644 >>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, >>>> return brcmnand_readl(ctrl->edu_base + offs); >>>> } >>>> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, >>>> + void __iomem *flash_cache, u32 *buffer, >>>> + int fc_words, bool is_param) >>>> +{ >>>> + struct brcmnand_soc *soc = ctrl->soc; >>>> + int i; >>>> + >>>> + if (soc->read_data_bus) { >>>> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); >>>> + } else { >>>> + if (!is_param) { >>>> + for (i = 0; i < fc_words; i++, buffer++) >>>> + *buffer = brcmnand_read_fc(ctrl, i); >>>> + } else { >>>> + for (i = 0; i < fc_words; i++) >>>> + /* >>>> + * Flash cache is big endian for parameter pages, at >>>> + * least on STB SoCs >>>> + */ >>>> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >>>> + } >>>> + } >>> >>> Perhaps we could have a single function that is statically assigned at >>> probe time instead of a first helper with two conditions which calls in >>> one case another hook... This can be simplified I guess. >>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. > > You told me in case we would use exec_op we could avoid the param > cache. If that's true then the whole support can be simplified. > Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. >> Not sure how much this can be simplified... Or we have default >> implementation in brcmnand.c but then there is one condition check >> too. Page read is done at 512 bytes burst. One or two conditions >> check outside of the per 512 bytes read loop does not sounds too bad >> if performance is concern. > > It is unreadable. That is my main concern. > >> >>>> +} >>>> + >>>> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) >>>> { >>>> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, >>>> native_cmd == CMD_PARAMETER_CHANGE_COL) { >>>> /* Copy flash cache word-wise */ >>>> u32 *flash_cache = (u32 *)ctrl->flash_cache; >>>> - int i; >>>> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); >>>> >> - /* >>>> - * Must cache the FLASH_CACHE now, since changes in >>>> - * SECTOR_SIZE_1K may invalidate it >>>> - */ >>>> - for (i = 0; i < FC_WORDS; i++) >>>> - /* >>>> - * Flash cache is big endian for parameter pages, at >>>> - * least on STB SoCs >>>> - */ >>>> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, >>>> + FC_WORDS, true); >>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); >>>> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >>>> { >>>> struct brcmnand_host *host = nand_get_controller_data(chip); >>>> struct brcmnand_controller *ctrl = host->ctrl; >>>> - int i, j, ret = 0; >>>> + int i, ret = 0; >>>> >> brcmnand_clear_ecc_addr(ctrl); >>>> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >>>> if (likely(buf)) { >>>> brcmnand_soc_data_bus_prepare(ctrl->soc, false); >>>> >> - for (j = 0; j < FC_WORDS; j++, buf++) >>>> - *buf = brcmnand_read_fc(ctrl, j); >>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, >>>> + FC_WORDS, false); >>>> + buf += FC_WORDS; >>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); >>>> } >>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>> index f1f93d85f50d..88819bc395f8 100644 >>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>> @@ -24,6 +24,8 @@ struct brcmnand_soc { >>>> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); >>>> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, >>>> bool is_param); >>>> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, >>>> + u32 *buffer, int fc_words, bool is_param); >>>> const struct brcmnand_io_ops *ops; >>>> }; >>>> > > >>> Thanks, >>> Miquèl >>> > > > Thanks, > Miquèl >
Hi William, william.zhang@broadcom.com wrote on Thu, 8 Jun 2023 12:10:06 -0700: > On 06/07/2023 11:18 PM, Miquel Raynal wrote: > > Hi William, > > > > william.zhang@broadcom.com wrote on Wed, 7 Jun 2023 13:24:23 -0700: > > > >> Hi Miquel, > >> > >> On 06/07/2023 01:22 AM, Miquel Raynal wrote: > >>> Hi William, > >>> > >>> william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: > >>> >>>> The BCMBCA broadband SoC integrates the NAND controller differently than > >>>> STB, iProc and other SoCs. It has different endianness for NAND cache > >>>> data and ONFI parameter data. > >>>> > >>>> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need > >>>> and performance improvement using the optimized memcpy function on NAND > >>>> cache memory. > >>>> > >>>> Signed-off-by: William Zhang <william.zhang@broadcom.com> > >>>> --- > >>>> > >>>> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ > >>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- > >>>> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + > >>>> 3 files changed, 68 insertions(+), 14 deletions(-) > >>>> > >>>> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >>>> index 7e48b6a0bfa2..899103a62c98 100644 > >>>> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >>>> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >>>> @@ -26,6 +26,18 @@ enum { > >>>> BCMBCA_CTLRDY = BIT(4), > >>>> }; > >>>> >> +#if defined(CONFIG_ARM64) > >>>> +#define ALIGN_REQ 8 > >>>> +#else > >>>> +#define ALIGN_REQ 4 > >>>> +#endif > >>>> + > >>>> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) > >>>> +{ > >>>> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && > >>>> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); > >>>> +} > >>>> + > >>>> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) > >>>> { > >>>> struct bcmbca_nand_soc *priv = > >>>> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) > >>>> brcmnand_writel(val, mmio); > >>>> } > >>>> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, > >>>> + void __iomem *flash_cache, u32 *buffer, > >>>> + int fc_words, bool is_param) > >>>> +{ > >>>> + int i; > >>>> + > >>>> + if (!is_param) { > >>>> + /* > >>>> + * memcpy can do unaligned aligned access depending on source > >>>> + * and dest address, which is incompatible with nand cache. Fallback > >>>> + * to the memcpy for io version > >>>> + */ > >>>> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) > >>>> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); > >>>> + else > >>>> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); > >>>> + } else { > >>>> + /* Flash cache has same endian as the host for parameter pages */ > >>>> + for (i = 0; i < fc_words; i++, buffer++) > >>>> + *buffer = __raw_readl(flash_cache + i * 4); > >>>> + } > >>>> +} > >>>> + > >>>> static int bcmbca_nand_probe(struct platform_device *pdev) > >>>> { > >>>> struct device *dev = &pdev->dev; > >>>> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) > >>>> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; > >>>> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; > >>>> + soc->read_data_bus = bcmbca_read_data_bus; > >>>> >> return brcmnand_probe(pdev, soc); > >>>> } > >>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >>>> index d920e88c7f5b..656be4d73016 100644 > >>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >>>> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, > >>>> return brcmnand_readl(ctrl->edu_base + offs); > >>>> } > >>>> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, > >>>> + void __iomem *flash_cache, u32 *buffer, > >>>> + int fc_words, bool is_param) > >>>> +{ > >>>> + struct brcmnand_soc *soc = ctrl->soc; > >>>> + int i; > >>>> + > >>>> + if (soc->read_data_bus) { > >>>> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); > >>>> + } else { > >>>> + if (!is_param) { > >>>> + for (i = 0; i < fc_words; i++, buffer++) > >>>> + *buffer = brcmnand_read_fc(ctrl, i); > >>>> + } else { > >>>> + for (i = 0; i < fc_words; i++) > >>>> + /* > >>>> + * Flash cache is big endian for parameter pages, at > >>>> + * least on STB SoCs > >>>> + */ > >>>> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > >>>> + } > >>>> + } > >>> > >>> Perhaps we could have a single function that is statically assigned at > >>> probe time instead of a first helper with two conditions which calls in > >>> one case another hook... This can be simplified I guess. > >>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. > > > > You told me in case we would use exec_op we could avoid the param > > cache. If that's true then the whole support can be simplified. > > > Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. > > So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. I am sorry but this series is totally backwards, you're trying to guess what comes next with the 'is_param' thing, it's exactly what we are fighting against since 2017. There are plenty of ->exec_op() conversions out there, I don't believe this one will be harder. You need to convert the driver to this new API and get rid of this whole endianness non-sense to simplify a lot the driver. > > >> Not sure how much this can be simplified... Or we have default > >> implementation in brcmnand.c but then there is one condition check > >> too. Page read is done at 512 bytes burst. One or two conditions > >> check outside of the per 512 bytes read loop does not sounds too bad > >> if performance is concern. > > > > It is unreadable. That is my main concern. > > > >> > >>>> +} > >>>> + > >>>> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) > >>>> { > >>>> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, > >>>> native_cmd == CMD_PARAMETER_CHANGE_COL) { > >>>> /* Copy flash cache word-wise */ > >>>> u32 *flash_cache = (u32 *)ctrl->flash_cache; > >>>> - int i; > >>>> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); > >>>> >> - /* > >>>> - * Must cache the FLASH_CACHE now, since changes in > >>>> - * SECTOR_SIZE_1K may invalidate it > >>>> - */ > >>>> - for (i = 0; i < FC_WORDS; i++) > >>>> - /* > >>>> - * Flash cache is big endian for parameter pages, at > >>>> - * least on STB SoCs > >>>> - */ > >>>> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > >>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, > >>>> + FC_WORDS, true); > >>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); > >>>> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > >>>> { > >>>> struct brcmnand_host *host = nand_get_controller_data(chip); > >>>> struct brcmnand_controller *ctrl = host->ctrl; > >>>> - int i, j, ret = 0; > >>>> + int i, ret = 0; > >>>> >> brcmnand_clear_ecc_addr(ctrl); > >>>> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > >>>> if (likely(buf)) { > >>>> brcmnand_soc_data_bus_prepare(ctrl->soc, false); > >>>> >> - for (j = 0; j < FC_WORDS; j++, buf++) > >>>> - *buf = brcmnand_read_fc(ctrl, j); > >>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, > >>>> + FC_WORDS, false); > >>>> + buf += FC_WORDS; > >>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); > >>>> } > >>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >>>> index f1f93d85f50d..88819bc395f8 100644 > >>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >>>> @@ -24,6 +24,8 @@ struct brcmnand_soc { > >>>> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); > >>>> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, > >>>> bool is_param); > >>>> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, > >>>> + u32 *buffer, int fc_words, bool is_param); > >>>> const struct brcmnand_io_ops *ops; > >>>> }; > >>>> > > > >>> Thanks, > >>> Miquèl > >>> > > > > Thanks, > > Miquèl > > Thanks, Miquèl
Hi Miquel, On 06/09/2023 01:35 AM, Miquel Raynal wrote: > Hi William, > > william.zhang@broadcom.com wrote on Thu, 8 Jun 2023 12:10:06 -0700: > >> On 06/07/2023 11:18 PM, Miquel Raynal wrote: >>> Hi William, >>> >>> william.zhang@broadcom.com wrote on Wed, 7 Jun 2023 13:24:23 -0700: >>> >>>> Hi Miquel, >>>> >>>> On 06/07/2023 01:22 AM, Miquel Raynal wrote: >>>>> Hi William, >>>>> >>>>> william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: >>>>> >>>> The BCMBCA broadband SoC integrates the NAND controller differently than >>>>>> STB, iProc and other SoCs. It has different endianness for NAND cache >>>>>> data and ONFI parameter data. >>>>>> >>>>>> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need >>>>>> and performance improvement using the optimized memcpy function on NAND >>>>>> cache memory. >>>>>> >>>>>> Signed-off-by: William Zhang <william.zhang@broadcom.com> >>>>>> --- >>>>>> >>>>>> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ >>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- >>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + >>>>>> 3 files changed, 68 insertions(+), 14 deletions(-) >>>>>> >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>>>> index 7e48b6a0bfa2..899103a62c98 100644 >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>>>> @@ -26,6 +26,18 @@ enum { >>>>>> BCMBCA_CTLRDY = BIT(4), >>>>>> }; >>>>>> >> +#if defined(CONFIG_ARM64) >>>>>> +#define ALIGN_REQ 8 >>>>>> +#else >>>>>> +#define ALIGN_REQ 4 >>>>>> +#endif >>>>>> + >>>>>> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) >>>>>> +{ >>>>>> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && >>>>>> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); >>>>>> +} >>>>>> + >>>>>> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) >>>>>> { >>>>>> struct bcmbca_nand_soc *priv = >>>>>> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) >>>>>> brcmnand_writel(val, mmio); >>>>>> } >>>>>> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, >>>>>> + void __iomem *flash_cache, u32 *buffer, >>>>>> + int fc_words, bool is_param) >>>>>> +{ >>>>>> + int i; >>>>>> + >>>>>> + if (!is_param) { >>>>>> + /* >>>>>> + * memcpy can do unaligned aligned access depending on source >>>>>> + * and dest address, which is incompatible with nand cache. Fallback >>>>>> + * to the memcpy for io version >>>>>> + */ >>>>>> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) >>>>>> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); >>>>>> + else >>>>>> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); >>>>>> + } else { >>>>>> + /* Flash cache has same endian as the host for parameter pages */ >>>>>> + for (i = 0; i < fc_words; i++, buffer++) >>>>>> + *buffer = __raw_readl(flash_cache + i * 4); >>>>>> + } >>>>>> +} >>>>>> + >>>>>> static int bcmbca_nand_probe(struct platform_device *pdev) >>>>>> { >>>>>> struct device *dev = &pdev->dev; >>>>>> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) >>>>>> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; >>>>>> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; >>>>>> + soc->read_data_bus = bcmbca_read_data_bus; >>>>>> >> return brcmnand_probe(pdev, soc); >>>>>> } >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>>>> index d920e88c7f5b..656be4d73016 100644 >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>>>> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, >>>>>> return brcmnand_readl(ctrl->edu_base + offs); >>>>>> } >>>>>> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, >>>>>> + void __iomem *flash_cache, u32 *buffer, >>>>>> + int fc_words, bool is_param) >>>>>> +{ >>>>>> + struct brcmnand_soc *soc = ctrl->soc; >>>>>> + int i; >>>>>> + >>>>>> + if (soc->read_data_bus) { >>>>>> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); >>>>>> + } else { >>>>>> + if (!is_param) { >>>>>> + for (i = 0; i < fc_words; i++, buffer++) >>>>>> + *buffer = brcmnand_read_fc(ctrl, i); >>>>>> + } else { >>>>>> + for (i = 0; i < fc_words; i++) >>>>>> + /* >>>>>> + * Flash cache is big endian for parameter pages, at >>>>>> + * least on STB SoCs >>>>>> + */ >>>>>> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >>>>>> + } >>>>>> + } >>>>> >>>>> Perhaps we could have a single function that is statically assigned at >>>>> probe time instead of a first helper with two conditions which calls in >>>>> one case another hook... This can be simplified I guess. >>>>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. >>> >>> You told me in case we would use exec_op we could avoid the param >>> cache. If that's true then the whole support can be simplified. >>> >> Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. >> >> So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. > > I am sorry but this series is totally backwards, you're trying to guess > what comes next with the 'is_param' thing, it's exactly what we are > fighting against since 2017. There are plenty of ->exec_op() > conversions out there, I don't believe this one will be harder. You > need to convert the driver to this new API and get rid of this whole > endianness non-sense to simplify a lot the driver. > I am not guessing anything but just factor out the existing common nand cache read logic into the single default function(or one for page read and another for parameter read as I mentioned in another thread) and allow SoC to overrides the implementation when needed. I agree ->exec_op can possibly get rid of the parameter page read function and is the way to go. But it won't help on the page read for endianess. It's not that I am against exec_op but I want to take one step a time and I'd like to get these fixes and support for bcmbca soc first and then work on the exec_op API to minimize the change and reduce the risk. >> >>>> Not sure how much this can be simplified... Or we have default >>>> implementation in brcmnand.c but then there is one condition check >>>> too. Page read is done at 512 bytes burst. One or two conditions >>>> check outside of the per 512 bytes read loop does not sounds too bad >>>> if performance is concern. >>> >>> It is unreadable. That is my main concern. >>> >>>> >>>>>> +} >>>>>> + >>>>>> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) >>>>>> { >>>>>> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, >>>>>> native_cmd == CMD_PARAMETER_CHANGE_COL) { >>>>>> /* Copy flash cache word-wise */ >>>>>> u32 *flash_cache = (u32 *)ctrl->flash_cache; >>>>>> - int i; >>>>>> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); >>>>>> >> - /* >>>>>> - * Must cache the FLASH_CACHE now, since changes in >>>>>> - * SECTOR_SIZE_1K may invalidate it >>>>>> - */ >>>>>> - for (i = 0; i < FC_WORDS; i++) >>>>>> - /* >>>>>> - * Flash cache is big endian for parameter pages, at >>>>>> - * least on STB SoCs >>>>>> - */ >>>>>> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >>>>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, >>>>>> + FC_WORDS, true); >>>>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); >>>>>> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >>>>>> { >>>>>> struct brcmnand_host *host = nand_get_controller_data(chip); >>>>>> struct brcmnand_controller *ctrl = host->ctrl; >>>>>> - int i, j, ret = 0; >>>>>> + int i, ret = 0; >>>>>> >> brcmnand_clear_ecc_addr(ctrl); >>>>>> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >>>>>> if (likely(buf)) { >>>>>> brcmnand_soc_data_bus_prepare(ctrl->soc, false); >>>>>> >> - for (j = 0; j < FC_WORDS; j++, buf++) >>>>>> - *buf = brcmnand_read_fc(ctrl, j); >>>>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, >>>>>> + FC_WORDS, false); >>>>>> + buf += FC_WORDS; >>>>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); >>>>>> } >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>>>> index f1f93d85f50d..88819bc395f8 100644 >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>>>> @@ -24,6 +24,8 @@ struct brcmnand_soc { >>>>>> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); >>>>>> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, >>>>>> bool is_param); >>>>>> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, >>>>>> + u32 *buffer, int fc_words, bool is_param); >>>>>> const struct brcmnand_io_ops *ops; >>>>>> }; >>>>>> > > >>>>> Thanks, >>>>> Miquèl >>>>> > > >>> Thanks, >>> Miquèl >>> > > > Thanks, > Miquèl >
Hi William, kernel test robot noticed the following build warnings: [auto build test WARNING on mtd/nand/next] [also build test WARNING on mtd/mtd/next mtd/mtd/fixes linus/master v6.4-rc5 next-20230609] [cannot apply to robh/for-next] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/William-Zhang/mtd-rawnand-brcmnand-Fix-ECC-level-field-setting-for-v7-2-controller/20230607-071834 base: https://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux.git nand/next patch link: https://lore.kernel.org/r/20230606231252.94838-11-william.zhang%40broadcom.com patch subject: [PATCH 10/12] mtd: rawnand: brcmnand: Add BCMBCA read data bus interface config: arm-randconfig-s052-20230611 (https://download.01.org/0day-ci/archive/20230611/202306111741.1Hs0TVE4-lkp@intel.com/config) compiler: arm-linux-gnueabi-gcc (GCC) 12.3.0 reproduce: mkdir -p ~/bin wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # apt-get install sparse # sparse version: v0.6.4-39-gce1a6720-dirty # https://github.com/intel-lab-lkp/linux/commit/2fce7300f3e21ad0cb55442e1acfeaf60f41bf7d git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review William-Zhang/mtd-rawnand-brcmnand-Fix-ECC-level-field-setting-for-v7-2-controller/20230607-071834 git checkout 2fce7300f3e21ad0cb55442e1acfeaf60f41bf7d # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.3.0 ~/bin/make.cross C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__' O=build_dir ARCH=arm SHELL=/bin/bash drivers/mtd/nand/raw/brcmnand/ If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add following tags | Reported-by: kernel test robot <lkp@intel.com> | Closes: https://lore.kernel.org/oe-kbuild-all/202306111741.1Hs0TVE4-lkp@intel.com/ sparse warnings: (new ones prefixed by >>) drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c:83:48: sparse: sparse: incorrect type in argument 1 (different address spaces) @@ expected void *flash_cache @@ got void [noderef] __iomem *flash_cache @@ drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c:83:48: sparse: expected void *flash_cache drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c:83:48: sparse: got void [noderef] __iomem *flash_cache drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c:84:25: sparse: sparse: cast removes address space '__iomem' of expression drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c:84:25: sparse: sparse: cast removes address space '__iomem' of expression drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c:84:25: sparse: sparse: cast removes address space '__iomem' of expression drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c:86:25: sparse: sparse: cast removes address space '__iomem' of expression >> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c:86:25: sparse: sparse: incorrect type in argument 2 (different address spaces) @@ expected void const volatile [noderef] __iomem *from @@ got void * @@ drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c:86:25: sparse: expected void const volatile [noderef] __iomem *from drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c:86:25: sparse: got void * vim +86 drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c 70 71 static void bcmbca_read_data_bus(struct brcmnand_soc *soc, 72 void __iomem *flash_cache, u32 *buffer, 73 int fc_words, bool is_param) 74 { 75 int i; 76 77 if (!is_param) { 78 /* 79 * memcpy can do unaligned aligned access depending on source 80 * and dest address, which is incompatible with nand cache. Fallback 81 * to the memcpy for io version 82 */ > 83 if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) 84 memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); 85 else > 86 memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); 87 } else { 88 /* Flash cache has same endian as the host for parameter pages */ 89 for (i = 0; i < fc_words; i++, buffer++) 90 *buffer = __raw_readl(flash_cache + i * 4); 91 } 92 } 93
Hi William, william.zhang@broadcom.com wrote on Fri, 9 Jun 2023 12:16:27 -0700: > Hi Miquel, > > On 06/09/2023 01:35 AM, Miquel Raynal wrote: > > Hi William, > > > > william.zhang@broadcom.com wrote on Thu, 8 Jun 2023 12:10:06 -0700: > > > >> On 06/07/2023 11:18 PM, Miquel Raynal wrote: > >>> Hi William, > >>> > >>> william.zhang@broadcom.com wrote on Wed, 7 Jun 2023 13:24:23 -0700: > >>> >>>> Hi Miquel, > >>>> > >>>> On 06/07/2023 01:22 AM, Miquel Raynal wrote: > >>>>> Hi William, > >>>>> > >>>>> william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: > >>>>> >>>> The BCMBCA broadband SoC integrates the NAND controller differently than > >>>>>> STB, iProc and other SoCs. It has different endianness for NAND cache > >>>>>> data and ONFI parameter data. > >>>>>> > >>>>>> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need > >>>>>> and performance improvement using the optimized memcpy function on NAND > >>>>>> cache memory. > >>>>>> > >>>>>> Signed-off-by: William Zhang <william.zhang@broadcom.com> > >>>>>> --- > >>>>>> > >>>>>> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ > >>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- > >>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + > >>>>>> 3 files changed, 68 insertions(+), 14 deletions(-) > >>>>>> > >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >>>>>> index 7e48b6a0bfa2..899103a62c98 100644 > >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c > >>>>>> @@ -26,6 +26,18 @@ enum { > >>>>>> BCMBCA_CTLRDY = BIT(4), > >>>>>> }; > >>>>>> >> +#if defined(CONFIG_ARM64) > >>>>>> +#define ALIGN_REQ 8 > >>>>>> +#else > >>>>>> +#define ALIGN_REQ 4 > >>>>>> +#endif > >>>>>> + > >>>>>> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) > >>>>>> +{ > >>>>>> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && > >>>>>> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); > >>>>>> +} > >>>>>> + > >>>>>> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) > >>>>>> { > >>>>>> struct bcmbca_nand_soc *priv = > >>>>>> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) > >>>>>> brcmnand_writel(val, mmio); > >>>>>> } > >>>>>> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, > >>>>>> + void __iomem *flash_cache, u32 *buffer, > >>>>>> + int fc_words, bool is_param) > >>>>>> +{ > >>>>>> + int i; > >>>>>> + > >>>>>> + if (!is_param) { > >>>>>> + /* > >>>>>> + * memcpy can do unaligned aligned access depending on source > >>>>>> + * and dest address, which is incompatible with nand cache. Fallback > >>>>>> + * to the memcpy for io version > >>>>>> + */ > >>>>>> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) > >>>>>> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); > >>>>>> + else > >>>>>> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); > >>>>>> + } else { > >>>>>> + /* Flash cache has same endian as the host for parameter pages */ > >>>>>> + for (i = 0; i < fc_words; i++, buffer++) > >>>>>> + *buffer = __raw_readl(flash_cache + i * 4); > >>>>>> + } > >>>>>> +} > >>>>>> + > >>>>>> static int bcmbca_nand_probe(struct platform_device *pdev) > >>>>>> { > >>>>>> struct device *dev = &pdev->dev; > >>>>>> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) > >>>>>> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; > >>>>>> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; > >>>>>> + soc->read_data_bus = bcmbca_read_data_bus; > >>>>>> >> return brcmnand_probe(pdev, soc); > >>>>>> } > >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >>>>>> index d920e88c7f5b..656be4d73016 100644 > >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c > >>>>>> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, > >>>>>> return brcmnand_readl(ctrl->edu_base + offs); > >>>>>> } > >>>>>> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, > >>>>>> + void __iomem *flash_cache, u32 *buffer, > >>>>>> + int fc_words, bool is_param) > >>>>>> +{ > >>>>>> + struct brcmnand_soc *soc = ctrl->soc; > >>>>>> + int i; > >>>>>> + > >>>>>> + if (soc->read_data_bus) { > >>>>>> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); > >>>>>> + } else { > >>>>>> + if (!is_param) { > >>>>>> + for (i = 0; i < fc_words; i++, buffer++) > >>>>>> + *buffer = brcmnand_read_fc(ctrl, i); > >>>>>> + } else { > >>>>>> + for (i = 0; i < fc_words; i++) > >>>>>> + /* > >>>>>> + * Flash cache is big endian for parameter pages, at > >>>>>> + * least on STB SoCs > >>>>>> + */ > >>>>>> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > >>>>>> + } > >>>>>> + } > >>>>> > >>>>> Perhaps we could have a single function that is statically assigned at > >>>>> probe time instead of a first helper with two conditions which calls in > >>>>> one case another hook... This can be simplified I guess. > >>>>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. > >>> > >>> You told me in case we would use exec_op we could avoid the param > >>> cache. If that's true then the whole support can be simplified. > >>> >> Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. > >> > >> So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. > > > > I am sorry but this series is totally backwards, you're trying to guess > > what comes next with the 'is_param' thing, it's exactly what we are > > fighting against since 2017. There are plenty of ->exec_op() > > conversions out there, I don't believe this one will be harder. You > > need to convert the driver to this new API and get rid of this whole > > endianness non-sense to simplify a lot the driver. > > > I am not guessing anything but just factor out the existing common nand cache read logic into the single default function(or one for page read and another for parameter read as I mentioned in another thread) and allow SoC to overrides the implementation when needed. No, you are trying to guess what type of read the core is performing, either a regular data page read or a parameter page read. > I agree ->exec_op can possibly get rid of the parameter page read function and is the way to go. But it won't help on the page read for endianess. You told me there is no endianess issue with the data pages, so why it won't help on the page read? > It's not that I am against exec_op but I want to take one step a time > and I'd like to get these fixes I don't see any fix here? Let me know if I am missing something but right now I see a new version of the controller being supported with its own constraints. If you are fixing existing code for already supported platform, then make it clear and we can discuss this. But if you just want to support the bcmbca flavor, then there is no risk mitigation involved here, and a conversion is the right step :) > and support for bcmbca soc first and > then work on the exec_op API to minimize the change and reduce the > risk. > > >> > >>>> Not sure how much this can be simplified... Or we have default > >>>> implementation in brcmnand.c but then there is one condition check > >>>> too. Page read is done at 512 bytes burst. One or two conditions > >>>> check outside of the per 512 bytes read loop does not sounds too bad > >>>> if performance is concern. > >>> > >>> It is unreadable. That is my main concern. > >>> >>>> >>>>>> +} > >>>>>> + > >>>>>> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) > >>>>>> { > >>>>>> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, > >>>>>> native_cmd == CMD_PARAMETER_CHANGE_COL) { > >>>>>> /* Copy flash cache word-wise */ > >>>>>> u32 *flash_cache = (u32 *)ctrl->flash_cache; > >>>>>> - int i; > >>>>>> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); > >>>>>> >> - /* > >>>>>> - * Must cache the FLASH_CACHE now, since changes in > >>>>>> - * SECTOR_SIZE_1K may invalidate it > >>>>>> - */ > >>>>>> - for (i = 0; i < FC_WORDS; i++) > >>>>>> - /* > >>>>>> - * Flash cache is big endian for parameter pages, at > >>>>>> - * least on STB SoCs > >>>>>> - */ > >>>>>> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); > >>>>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, > >>>>>> + FC_WORDS, true); > >>>>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); > >>>>>> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > >>>>>> { > >>>>>> struct brcmnand_host *host = nand_get_controller_data(chip); > >>>>>> struct brcmnand_controller *ctrl = host->ctrl; > >>>>>> - int i, j, ret = 0; > >>>>>> + int i, ret = 0; > >>>>>> >> brcmnand_clear_ecc_addr(ctrl); > >>>>>> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, > >>>>>> if (likely(buf)) { > >>>>>> brcmnand_soc_data_bus_prepare(ctrl->soc, false); > >>>>>> >> - for (j = 0; j < FC_WORDS; j++, buf++) > >>>>>> - *buf = brcmnand_read_fc(ctrl, j); > >>>>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, > >>>>>> + FC_WORDS, false); > >>>>>> + buf += FC_WORDS; > >>>>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); > >>>>>> } > >>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >>>>>> index f1f93d85f50d..88819bc395f8 100644 > >>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h > >>>>>> @@ -24,6 +24,8 @@ struct brcmnand_soc { > >>>>>> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); > >>>>>> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, > >>>>>> bool is_param); > >>>>>> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, > >>>>>> + u32 *buffer, int fc_words, bool is_param); > >>>>>> const struct brcmnand_io_ops *ops; > >>>>>> }; > >>>>>> > > > >>>>> Thanks, > >>>>> Miquèl > >>>>> > > > >>> Thanks, > >>> Miquèl > >>> > > > > Thanks, > > Miquèl > > Thanks, Miquèl
Hello again, > > >>>>> Perhaps we could have a single function that is statically assigned at > > >>>>> probe time instead of a first helper with two conditions which calls in > > >>>>> one case another hook... This can be simplified I guess. > > >>>>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. > > >>> > > >>> You told me in case we would use exec_op we could avoid the param > > >>> cache. If that's true then the whole support can be simplified. > > >>> >> Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. > > >> > > >> So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. > > > > > > I am sorry but this series is totally backwards, you're trying to guess > > > what comes next with the 'is_param' thing, it's exactly what we are > > > fighting against since 2017. There are plenty of ->exec_op() > > > conversions out there, I don't believe this one will be harder. You > > > need to convert the driver to this new API and get rid of this whole > > > endianness non-sense to simplify a lot the driver. > > > > > I am not guessing anything but just factor out the existing common nand cache read logic into the single default function(or one for page read and another for parameter read as I mentioned in another thread) and allow SoC to overrides the implementation when needed. > > No, you are trying to guess what type of read the core is performing, > either a regular data page read or a parameter page read. > > > I agree ->exec_op can possibly get rid of the parameter page read function and is the way to go. But it won't help on the page read for endianess. > > You told me there is no endianess issue with the data pages, so why it > won't help on the page read? > > > It's not that I am against exec_op but I want to take one step a time > > and I'd like to get these fixes > > I don't see any fix here? Let me know if I am missing something but > right now I see a new version of the controller being supported with > its own constraints. If you are fixing existing code for already > supported platform, then make it clear and we can discuss this. But if > you just want to support the bcmbca flavor, then there is no risk > mitigation involved here, and a conversion is the right step :) > I forgot to mention: the exec_op conversion is almost ready, Boris worked on it but he lacked the hardware so maybe you'll just need to revive the few patches which target your platform and do a little bit of debugging? https://github.com/bbrezillon/linux/commits/nand/exec-op-conversion?after=8a3cf6fd25d5e15c6667f9e95c1fc86e4cb735e6+34&branch=nand%2Fexec-op-conversion&qualified_name=refs%2Fheads%2Fnand%2Fexec-op-conversion Cheers, Miquèl
On 06/12/2023 10:49 AM, Miquel Raynal wrote: > Hi William, > > william.zhang@broadcom.com wrote on Fri, 9 Jun 2023 12:16:27 -0700: > >> Hi Miquel, >> >> On 06/09/2023 01:35 AM, Miquel Raynal wrote: >>> Hi William, >>> >>> william.zhang@broadcom.com wrote on Thu, 8 Jun 2023 12:10:06 -0700: >>> >>>> On 06/07/2023 11:18 PM, Miquel Raynal wrote: >>>>> Hi William, >>>>> >>>>> william.zhang@broadcom.com wrote on Wed, 7 Jun 2023 13:24:23 -0700: >>>>> >>>> Hi Miquel, >>>>>> >>>>>> On 06/07/2023 01:22 AM, Miquel Raynal wrote: >>>>>>> Hi William, >>>>>>> >>>>>>> william.zhang@broadcom.com wrote on Tue, 6 Jun 2023 16:12:50 -0700: >>>>>>> >>>> The BCMBCA broadband SoC integrates the NAND controller differently than >>>>>>>> STB, iProc and other SoCs. It has different endianness for NAND cache >>>>>>>> data and ONFI parameter data. >>>>>>>> >>>>>>>> Add a SoC read data bus shim for BCMBCA to meet the specific SoC need >>>>>>>> and performance improvement using the optimized memcpy function on NAND >>>>>>>> cache memory. >>>>>>>> >>>>>>>> Signed-off-by: William Zhang <william.zhang@broadcom.com> >>>>>>>> --- >>>>>>>> >>>>>>>> drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c | 36 +++++++++++++++++ >>>>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.c | 44 ++++++++++++++------- >>>>>>>> drivers/mtd/nand/raw/brcmnand/brcmnand.h | 2 + >>>>>>>> 3 files changed, 68 insertions(+), 14 deletions(-) >>>>>>>> >>>>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>>>>>> index 7e48b6a0bfa2..899103a62c98 100644 >>>>>>>> --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c >>>>>>>> @@ -26,6 +26,18 @@ enum { >>>>>>>> BCMBCA_CTLRDY = BIT(4), >>>>>>>> }; >>>>>>>> >> +#if defined(CONFIG_ARM64) >>>>>>>> +#define ALIGN_REQ 8 >>>>>>>> +#else >>>>>>>> +#define ALIGN_REQ 4 >>>>>>>> +#endif >>>>>>>> + >>>>>>>> +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) >>>>>>>> +{ >>>>>>>> + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && >>>>>>>> + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); >>>>>>>> +} >>>>>>>> + >>>>>>>> static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) >>>>>>>> { >>>>>>>> struct bcmbca_nand_soc *priv = >>>>>>>> @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) >>>>>>>> brcmnand_writel(val, mmio); >>>>>>>> } >>>>>>>> >> +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, >>>>>>>> + void __iomem *flash_cache, u32 *buffer, >>>>>>>> + int fc_words, bool is_param) >>>>>>>> +{ >>>>>>>> + int i; >>>>>>>> + >>>>>>>> + if (!is_param) { >>>>>>>> + /* >>>>>>>> + * memcpy can do unaligned aligned access depending on source >>>>>>>> + * and dest address, which is incompatible with nand cache. Fallback >>>>>>>> + * to the memcpy for io version >>>>>>>> + */ >>>>>>>> + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) >>>>>>>> + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); >>>>>>>> + else >>>>>>>> + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); >>>>>>>> + } else { >>>>>>>> + /* Flash cache has same endian as the host for parameter pages */ >>>>>>>> + for (i = 0; i < fc_words; i++, buffer++) >>>>>>>> + *buffer = __raw_readl(flash_cache + i * 4); >>>>>>>> + } >>>>>>>> +} >>>>>>>> + >>>>>>>> static int bcmbca_nand_probe(struct platform_device *pdev) >>>>>>>> { >>>>>>>> struct device *dev = &pdev->dev; >>>>>>>> @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) >>>>>>>> >> soc->ctlrdy_ack = bcmbca_nand_intc_ack; >>>>>>>> soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; >>>>>>>> + soc->read_data_bus = bcmbca_read_data_bus; >>>>>>>> >> return brcmnand_probe(pdev, soc); >>>>>>>> } >>>>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>>>>>> index d920e88c7f5b..656be4d73016 100644 >>>>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c >>>>>>>> @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, >>>>>>>> return brcmnand_readl(ctrl->edu_base + offs); >>>>>>>> } >>>>>>>> >> +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, >>>>>>>> + void __iomem *flash_cache, u32 *buffer, >>>>>>>> + int fc_words, bool is_param) >>>>>>>> +{ >>>>>>>> + struct brcmnand_soc *soc = ctrl->soc; >>>>>>>> + int i; >>>>>>>> + >>>>>>>> + if (soc->read_data_bus) { >>>>>>>> + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); >>>>>>>> + } else { >>>>>>>> + if (!is_param) { >>>>>>>> + for (i = 0; i < fc_words; i++, buffer++) >>>>>>>> + *buffer = brcmnand_read_fc(ctrl, i); >>>>>>>> + } else { >>>>>>>> + for (i = 0; i < fc_words; i++) >>>>>>>> + /* >>>>>>>> + * Flash cache is big endian for parameter pages, at >>>>>>>> + * least on STB SoCs >>>>>>>> + */ >>>>>>>> + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >>>>>>>> + } >>>>>>>> + } >>>>>>> >>>>>>> Perhaps we could have a single function that is statically assigned at >>>>>>> probe time instead of a first helper with two conditions which calls in >>>>>>> one case another hook... This can be simplified I guess. >>>>>>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. >>>>> >>>>> You told me in case we would use exec_op we could avoid the param >>>>> cache. If that's true then the whole support can be simplified. >>>>> >> Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. >>>> >>>> So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. >>> >>> I am sorry but this series is totally backwards, you're trying to guess >>> what comes next with the 'is_param' thing, it's exactly what we are >>> fighting against since 2017. There are plenty of ->exec_op() >>> conversions out there, I don't believe this one will be harder. You >>> need to convert the driver to this new API and get rid of this whole >>> endianness non-sense to simplify a lot the driver. >>> >> I am not guessing anything but just factor out the existing common nand cache read logic into the single default function(or one for page read and another for parameter read as I mentioned in another thread) and allow SoC to overrides the implementation when needed. > > No, you are trying to guess what type of read the core is performing, > either a regular data page read or a parameter page read. > Okay this is what you mean by guessing. I didn't realize that ;) >> I agree ->exec_op can possibly get rid of the parameter page read function and is the way to go. But it won't help on the page read for endianess. > > You told me there is no endianess issue with the data pages, so why it > won't help on the page read? > Even with exec_op, the page read path for brcmand(chip->ecc.read_page) will still need brcmnand_read_page function which eventually I need per SoC implementation at least for bcmbca for now besides different endianess between SoC. For bcmbca, I also use the memcpy in the patch as the nand cache in bcmbca chip can handled the optimized copy code as long as the buffer is aligned for better performance. >> It's not that I am against exec_op but I want to take one step a time >> and I'd like to get these fixes > > I don't see any fix here? Let me know if I am missing something but > right now I see a new version of the controller being supported with > its own constraints. If you are fixing existing code for already > supported platform, then make it clear and we can discuss this. But if > you just want to support the bcmbca flavor, then there is no risk > mitigation involved here, and a conversion is the right step :) > I mean the patch 1 to 4 in this series. The exec_op will apply to all the five SoCs under brcmnand folder, not just bcmbca. It will take lot of time even just find people to test/debug all of them as I don't have access to other SoC and boards, on top of the nature of this big change. >> and support for bcmbca soc first and >> then work on the exec_op API to minimize the change and reduce the >> risk. >> >>>> >>>>>> Not sure how much this can be simplified... Or we have default >>>>>> implementation in brcmnand.c but then there is one condition check >>>>>> too. Page read is done at 512 bytes burst. One or two conditions >>>>>> check outside of the per 512 bytes read loop does not sounds too bad >>>>>> if performance is concern. >>>>> >>>>> It is unreadable. That is my main concern. >>>>> >>>> >>>>>> +} >>>>>>>> + >>>>>>>> static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) >>>>>>>> { >>>>>>>> >> @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, >>>>>>>> native_cmd == CMD_PARAMETER_CHANGE_COL) { >>>>>>>> /* Copy flash cache word-wise */ >>>>>>>> u32 *flash_cache = (u32 *)ctrl->flash_cache; >>>>>>>> - int i; >>>>>>>> >> brcmnand_soc_data_bus_prepare(ctrl->soc, true); >>>>>>>> >> - /* >>>>>>>> - * Must cache the FLASH_CACHE now, since changes in >>>>>>>> - * SECTOR_SIZE_1K may invalidate it >>>>>>>> - */ >>>>>>>> - for (i = 0; i < FC_WORDS; i++) >>>>>>>> - /* >>>>>>>> - * Flash cache is big endian for parameter pages, at >>>>>>>> - * least on STB SoCs >>>>>>>> - */ >>>>>>>> - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); >>>>>>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, >>>>>>>> + FC_WORDS, true); >>>>>>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, true); >>>>>>>> >> @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >>>>>>>> { >>>>>>>> struct brcmnand_host *host = nand_get_controller_data(chip); >>>>>>>> struct brcmnand_controller *ctrl = host->ctrl; >>>>>>>> - int i, j, ret = 0; >>>>>>>> + int i, ret = 0; >>>>>>>> >> brcmnand_clear_ecc_addr(ctrl); >>>>>>>> >> @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, >>>>>>>> if (likely(buf)) { >>>>>>>> brcmnand_soc_data_bus_prepare(ctrl->soc, false); >>>>>>>> >> - for (j = 0; j < FC_WORDS; j++, buf++) >>>>>>>> - *buf = brcmnand_read_fc(ctrl, j); >>>>>>>> + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, >>>>>>>> + FC_WORDS, false); >>>>>>>> + buf += FC_WORDS; >>>>>>>> >> brcmnand_soc_data_bus_unprepare(ctrl->soc, false); >>>>>>>> } >>>>>>>> diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>>>>>> index f1f93d85f50d..88819bc395f8 100644 >>>>>>>> --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>>>>>> +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h >>>>>>>> @@ -24,6 +24,8 @@ struct brcmnand_soc { >>>>>>>> void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); >>>>>>>> void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, >>>>>>>> bool is_param); >>>>>>>> + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, >>>>>>>> + u32 *buffer, int fc_words, bool is_param); >>>>>>>> const struct brcmnand_io_ops *ops; >>>>>>>> }; >>>>>>>> > > >>>>>>> Thanks, >>>>>>> Miquèl >>>>>>> > > >>>>> Thanks, >>>>> Miquèl >>>>> > > >>> Thanks, >>> Miquèl >>> > > > Thanks, > Miquèl >
On 06/12/2023 10:53 AM, Miquel Raynal wrote: > Hello again, > >>>>>>>> Perhaps we could have a single function that is statically assigned at >>>>>>>> probe time instead of a first helper with two conditions which calls in >>>>>>>> one case another hook... This can be simplified I guess. >>>>>>>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. >>>>>> >>>>>> You told me in case we would use exec_op we could avoid the param >>>>>> cache. If that's true then the whole support can be simplified. >>>>>> >> Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. >>>>> >>>>> So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. >>>> >>>> I am sorry but this series is totally backwards, you're trying to guess >>>> what comes next with the 'is_param' thing, it's exactly what we are >>>> fighting against since 2017. There are plenty of ->exec_op() >>>> conversions out there, I don't believe this one will be harder. You >>>> need to convert the driver to this new API and get rid of this whole >>>> endianness non-sense to simplify a lot the driver. >>>> >>> I am not guessing anything but just factor out the existing common nand cache read logic into the single default function(or one for page read and another for parameter read as I mentioned in another thread) and allow SoC to overrides the implementation when needed. >> >> No, you are trying to guess what type of read the core is performing, >> either a regular data page read or a parameter page read. >> >>> I agree ->exec_op can possibly get rid of the parameter page read function and is the way to go. But it won't help on the page read for endianess. >> >> You told me there is no endianess issue with the data pages, so why it >> won't help on the page read? >> >>> It's not that I am against exec_op but I want to take one step a time >>> and I'd like to get these fixes >> >> I don't see any fix here? Let me know if I am missing something but >> right now I see a new version of the controller being supported with >> its own constraints. If you are fixing existing code for already >> supported platform, then make it clear and we can discuss this. But if >> you just want to support the bcmbca flavor, then there is no risk >> mitigation involved here, and a conversion is the right step :) >> > > I forgot to mention: the exec_op conversion is almost ready, Boris > worked on it but he lacked the hardware so maybe you'll just need to > revive the few patches which target your platform and do a little bit of > debugging? > > https://github.com/bbrezillon/linux/commits/nand/exec-op-conversion?after=8a3cf6fd25d5e15c6667f9e95c1fc86e4cb735e6+34&branch=nand%2Fexec-op-conversion&qualified_name=refs%2Fheads%2Fnand%2Fexec-op-conversion > Yes this is the patch what our exec_op work is based on. Thanks Boris! The issue with patch is that performance is very slow for anything that rely on nand_read_page_op as the patch implementing it using the low level cmd and data register to transfer the data byte by byte. I actually sent out email regarding this to Boris and he cc'ed you in sept last year. We have to use the nand parser to match the page read from exec_op so we can actually match and use the brcmnand_page_read fast path. But there are many situations that we need to match so the project to migrate exce_op are still work in progress just on our bcmbca chip as of now. Just forward that email again to you and I appreciate it if you have any inputs there. So IMHO it is just too risky and too big of scope to have the exec_op added to this patch series and definitively better to do it afterwards with a dedicated patch. > Cheers, > Miquèl >
Hi William, william.zhang@broadcom.com wrote on Mon, 12 Jun 2023 12:18:58 -0700: > On 06/12/2023 10:53 AM, Miquel Raynal wrote: > > Hello again, > > > >>>>>>>> Perhaps we could have a single function that is statically assigned at > >>>>>>>> probe time instead of a first helper with two conditions which calls in > >>>>>>>> one case another hook... This can be simplified I guess. > >>>>>>>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. > >>>>>> > >>>>>> You told me in case we would use exec_op we could avoid the param > >>>>>> cache. If that's true then the whole support can be simplified. > >>>>>> >> Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. > >>>>> > >>>>> So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. > >>>> > >>>> I am sorry but this series is totally backwards, you're trying to guess > >>>> what comes next with the 'is_param' thing, it's exactly what we are > >>>> fighting against since 2017. There are plenty of ->exec_op() > >>>> conversions out there, I don't believe this one will be harder. You > >>>> need to convert the driver to this new API and get rid of this whole > >>>> endianness non-sense to simplify a lot the driver. > >>>> >>> I am not guessing anything but just factor out the existing common nand cache read logic into the single default function(or one for page read and another for parameter read as I mentioned in another thread) and allow SoC to overrides the implementation when needed. > >> > >> No, you are trying to guess what type of read the core is performing, > >> either a regular data page read or a parameter page read. > >> > >>> I agree ->exec_op can possibly get rid of the parameter page read function and is the way to go. But it won't help on the page read for endianess. > >> > >> You told me there is no endianess issue with the data pages, so why it > >> won't help on the page read? > >> > >>> It's not that I am against exec_op but I want to take one step a time > >>> and I'd like to get these fixes > >> > >> I don't see any fix here? Let me know if I am missing something but > >> right now I see a new version of the controller being supported with > >> its own constraints. If you are fixing existing code for already > >> supported platform, then make it clear and we can discuss this. But if > >> you just want to support the bcmbca flavor, then there is no risk > >> mitigation involved here, and a conversion is the right step :) > >> > > > > I forgot to mention: the exec_op conversion is almost ready, Boris > > worked on it but he lacked the hardware so maybe you'll just need to > > revive the few patches which target your platform and do a little bit of > > debugging? > > > > https://github.com/bbrezillon/linux/commits/nand/exec-op-conversion?after=8a3cf6fd25d5e15c6667f9e95c1fc86e4cb735e6+34&branch=nand%2Fexec-op-conversion&qualified_name=refs%2Fheads%2Fnand%2Fexec-op-conversion > > > Yes this is the patch what our exec_op work is based on. Thanks Boris! The issue with patch is that performance is very slow for anything that rely on nand_read_page_op as the patch implementing it using the low level cmd and data register to transfer the data byte by byte. You don't need to use exec_op for your read_page/write_page hooks, quite the opposite actually. exec_op is not meant for high throughput. exec_op is meant to be simple. You can have fast I/Os with a different mechanism in your read/write_page hooks. > I actually sent out email regarding this to Boris and he cc'ed you in > sept last year. We have to use the nand parser to match the page read > from exec_op so we can actually match and use the brcmnand_page_read > fast path. But there are many situations that we need to match so the > project to migrate exce_op are still work in progress just on our > bcmbca chip as of now. Just forward that email again to you and I > appreciate it if you have any inputs there. So IMHO it is just too > risky and too big of scope to have the exec_op added to this patch > series and definitively better to do it afterwards with a dedicated > patch. As long as you add small and orthogonal changes to cmd_ctrl/cmd_func I don't mind, but what you want now is to force me to pull dirty changes "first", the type of change we are refusing since 2018, making me expect you'll perform the conversion after. It would have been terribly less dirty and you would have all your code already upstreamed if you had performed the exec_op conversion since September. Thanks, Miquèl
Hi Miquel, On 06/12/2023 11:42 PM, Miquel Raynal wrote: > Hi William, > > william.zhang@broadcom.com wrote on Mon, 12 Jun 2023 12:18:58 -0700: > >> On 06/12/2023 10:53 AM, Miquel Raynal wrote: >>> Hello again, >>> >>>>>>>>>> Perhaps we could have a single function that is statically assigned at >>>>>>>>>> probe time instead of a first helper with two conditions which calls in >>>>>>>>>> one case another hook... This can be simplified I guess. >>>>>>>>>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. >>>>>>>> >>>>>>>> You told me in case we would use exec_op we could avoid the param >>>>>>>> cache. If that's true then the whole support can be simplified. >>>>>>>> >> Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. >>>>>>> >>>>>>> So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. >>>>>> >>>>>> I am sorry but this series is totally backwards, you're trying to guess >>>>>> what comes next with the 'is_param' thing, it's exactly what we are >>>>>> fighting against since 2017. There are plenty of ->exec_op() >>>>>> conversions out there, I don't believe this one will be harder. You >>>>>> need to convert the driver to this new API and get rid of this whole >>>>>> endianness non-sense to simplify a lot the driver. >>>>>> >>> I am not guessing anything but just factor out the existing common nand cache read logic into the single default function(or one for page read and another for parameter read as I mentioned in another thread) and allow SoC to overrides the implementation when needed. >>>> >>>> No, you are trying to guess what type of read the core is performing, >>>> either a regular data page read or a parameter page read. >>>> >>>>> I agree ->exec_op can possibly get rid of the parameter page read function and is the way to go. But it won't help on the page read for endianess. >>>> >>>> You told me there is no endianess issue with the data pages, so why it >>>> won't help on the page read? >>>> >>>>> It's not that I am against exec_op but I want to take one step a time >>>>> and I'd like to get these fixes >>>> >>>> I don't see any fix here? Let me know if I am missing something but >>>> right now I see a new version of the controller being supported with >>>> its own constraints. If you are fixing existing code for already >>>> supported platform, then make it clear and we can discuss this. But if >>>> you just want to support the bcmbca flavor, then there is no risk >>>> mitigation involved here, and a conversion is the right step :) >>>> >>> >>> I forgot to mention: the exec_op conversion is almost ready, Boris >>> worked on it but he lacked the hardware so maybe you'll just need to >>> revive the few patches which target your platform and do a little bit of >>> debugging? >>> >>> https://github.com/bbrezillon/linux/commits/nand/exec-op-conversion?after=8a3cf6fd25d5e15c6667f9e95c1fc86e4cb735e6+34&branch=nand%2Fexec-op-conversion&qualified_name=refs%2Fheads%2Fnand%2Fexec-op-conversion >>> >> Yes this is the patch what our exec_op work is based on. Thanks Boris! The issue with patch is that performance is very slow for anything that rely on nand_read_page_op as the patch implementing it using the low level cmd and data register to transfer the data byte by byte. > > You don't need to use exec_op for your read_page/write_page hooks, > quite the opposite actually. exec_op is not meant for high throughput. > exec_op is meant to be simple. You can have fast I/Os with a different > mechanism in your read/write_page hooks. > Right it does not impact our fast path: controller based ecc read/write. But things like on-chip ecc nand driver that uses exec_op API get impacted badly. We need to add nand op parser, several matching rules and other logics to use fast path page read/write instead of the low level data register read/write. >> I actually sent out email regarding this to Boris and he cc'ed you in >> sept last year. We have to use the nand parser to match the page read >> from exec_op so we can actually match and use the brcmnand_page_read >> fast path. But there are many situations that we need to match so the >> project to migrate exce_op are still work in progress just on our >> bcmbca chip as of now. Just forward that email again to you and I >> appreciate it if you have any inputs there. So IMHO it is just too >> risky and too big of scope to have the exec_op added to this patch >> series and definitively better to do it afterwards with a dedicated >> patch. > > As long as you add small and orthogonal changes to cmd_ctrl/cmd_func > I don't mind, but what you want now is to force me to pull dirty > changes "first", the type of change we are refusing since 2018, making > me expect you'll perform the conversion after. It would have been > terribly less dirty and you would have all your code already upstreamed > if you had performed the exec_op conversion since September. > I didn't work on open source 5 years ago. I am sorry that I missed the background of the rejected changes since then but I do not agree that this change is dirty change just because I factor out the code with is_param argument(and I offered an alternative to remove is_param with two data read functions). I see your point with exec_op and agree that is the way to go. We had an initial look of the Borris exec_op patch last Sept and noticed the performance issue but we haven't got the chance to actively work on improving the performance and prepare for up-streaming until recently. What if we bring in the original exec_op patch in this series so we don't need to add the parameter data read function(if we verify it works on difference SoCs without endianess)? Or better to have exec_op as separate patch first and then this series? Then we provide another patch to improve the performance for exec_op as this work is still in progress and require more testing. > Thanks, > Miquèl >
Hi William, william.zhang@broadcom.com wrote on Tue, 13 Jun 2023 17:00:19 -0700: > Hi Miquel, > > On 06/12/2023 11:42 PM, Miquel Raynal wrote: > > Hi William, > > > > william.zhang@broadcom.com wrote on Mon, 12 Jun 2023 12:18:58 -0700: > > > >> On 06/12/2023 10:53 AM, Miquel Raynal wrote: > >>> Hello again, > >>> >>>>>>>>>> Perhaps we could have a single function that is statically assigned at > >>>>>>>>>> probe time instead of a first helper with two conditions which calls in > >>>>>>>>>> one case another hook... This can be simplified I guess. > >>>>>>>>>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. > >>>>>>>> > >>>>>>>> You told me in case we would use exec_op we could avoid the param > >>>>>>>> cache. If that's true then the whole support can be simplified. > >>>>>>>> >> Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. > >>>>>>> > >>>>>>> So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. > >>>>>> > >>>>>> I am sorry but this series is totally backwards, you're trying to guess > >>>>>> what comes next with the 'is_param' thing, it's exactly what we are > >>>>>> fighting against since 2017. There are plenty of ->exec_op() > >>>>>> conversions out there, I don't believe this one will be harder. You > >>>>>> need to convert the driver to this new API and get rid of this whole > >>>>>> endianness non-sense to simplify a lot the driver. > >>>>>> >>> I am not guessing anything but just factor out the existing common nand cache read logic into the single default function(or one for page read and another for parameter read as I mentioned in another thread) and allow SoC to overrides the implementation when needed. > >>>> > >>>> No, you are trying to guess what type of read the core is performing, > >>>> either a regular data page read or a parameter page read. > >>>> >>>>> I agree ->exec_op can possibly get rid of the parameter page read function and is the way to go. But it won't help on the page read for endianess. > >>>> > >>>> You told me there is no endianess issue with the data pages, so why it > >>>> won't help on the page read? > >>>> >>>>> It's not that I am against exec_op but I want to take one step a time > >>>>> and I'd like to get these fixes > >>>> > >>>> I don't see any fix here? Let me know if I am missing something but > >>>> right now I see a new version of the controller being supported with > >>>> its own constraints. If you are fixing existing code for already > >>>> supported platform, then make it clear and we can discuss this. But if > >>>> you just want to support the bcmbca flavor, then there is no risk > >>>> mitigation involved here, and a conversion is the right step :) > >>>> >>> > >>> I forgot to mention: the exec_op conversion is almost ready, Boris > >>> worked on it but he lacked the hardware so maybe you'll just need to > >>> revive the few patches which target your platform and do a little bit of > >>> debugging? > >>> > >>> https://github.com/bbrezillon/linux/commits/nand/exec-op-conversion?after=8a3cf6fd25d5e15c6667f9e95c1fc86e4cb735e6+34&branch=nand%2Fexec-op-conversion&qualified_name=refs%2Fheads%2Fnand%2Fexec-op-conversion > >>> >> Yes this is the patch what our exec_op work is based on. Thanks Boris! The issue with patch is that performance is very slow for anything that rely on nand_read_page_op as the patch implementing it using the low level cmd and data register to transfer the data byte by byte. > > > > You don't need to use exec_op for your read_page/write_page hooks, > > quite the opposite actually. exec_op is not meant for high throughput. > > exec_op is meant to be simple. You can have fast I/Os with a different > > mechanism in your read/write_page hooks. > > > Right it does not impact our fast path: controller based ecc read/write. But things like on-chip ecc nand driver that uses exec_op API get impacted badly. We need to add nand op parser, several matching rules and other logics to use fast path page read/write instead of the low level data register read/write. > > >> I actually sent out email regarding this to Boris and he cc'ed you in > >> sept last year. We have to use the nand parser to match the page read > >> from exec_op so we can actually match and use the brcmnand_page_read > >> fast path. But there are many situations that we need to match so the > >> project to migrate exce_op are still work in progress just on our > >> bcmbca chip as of now. Just forward that email again to you and I > >> appreciate it if you have any inputs there. So IMHO it is just too > >> risky and too big of scope to have the exec_op added to this patch > >> series and definitively better to do it afterwards with a dedicated > >> patch. > > > > As long as you add small and orthogonal changes to cmd_ctrl/cmd_func > > I don't mind, but what you want now is to force me to pull dirty > > changes "first", the type of change we are refusing since 2018, making > > me expect you'll perform the conversion after. It would have been > > terribly less dirty and you would have all your code already upstreamed > > if you had performed the exec_op conversion since September. > > > I didn't work on open source 5 years ago. I am sorry that I missed the background of the rejected changes since then but I do not agree that this change is dirty change just because I factor out the code with is_param argument(and I offered an alternative to remove is_param with two data read functions). This _is_ dirty because you cannot know with the cmd_ctrl/cmdfunc API whether we read a parameter page or a page of data. So your are _guessing_. There are plenty ways of reading one of the others, the heuristics on the controller side will _always_ be wrong. That is why exec_op() was introduced. > I see your point with exec_op and agree that is the way to go. We had an initial look of the Borris exec_op patch last Sept and noticed the performance issue but we haven't got the chance to actively work on improving the performance and prepare for up-streaming until recently. What if we bring in the original exec_op patch in this series so we don't need to add the parameter data read function(if we verify it works on difference SoCs without endianess)? Or better to have exec_op as separate patch first and then this series? This one is my favorite: 1/ Add exec_op support 2/ Remove legacy hooks 3/ Add support for the bcmbca SoC Then you can improve the performance for on-die ECC situations, but to be honest this improvement looks little a very little addition. You can take example from the existing hooks, how they match specific operations in the parser and then hook them to specific helpers. Nothing terribly complex, there are dozens of conversions available now. Good luck :) Miquèl
On 06/13/2023 11:22 PM, Miquel Raynal wrote: > Hi William, > > william.zhang@broadcom.com wrote on Tue, 13 Jun 2023 17:00:19 -0700: > >> Hi Miquel, >> >> On 06/12/2023 11:42 PM, Miquel Raynal wrote: >>> Hi William, >>> >>> william.zhang@broadcom.com wrote on Mon, 12 Jun 2023 12:18:58 -0700: >>> >>>> On 06/12/2023 10:53 AM, Miquel Raynal wrote: >>>>> Hello again, >>>>> >>>>>>>>>> Perhaps we could have a single function that is statically assigned at >>>>>>>>>>>> probe time instead of a first helper with two conditions which calls in >>>>>>>>>>>> one case another hook... This can be simplified I guess. >>>>>>>>>>>> >> Well this will need to be done at the SoC specific implementation level (bcm<xxx>_nand.c) and each SoC will need to have either general data bus read func with is_param option or data_bus_read_page, data_bus_read_param. >>>>>>>>>> >>>>>>>>>> You told me in case we would use exec_op we could avoid the param >>>>>>>>>> cache. If that's true then the whole support can be simplified. >>>>>>>>>> >> Correct we may possibly unified the parameter data read but exec_op is long shot and we are not fully ready for that yet. It also depends on if the low level data register has endianess difference for the parameter data between difference SoCs. >>>>>>>>> >>>>>>>>> So I would like to push the current implementation and we can explore the exec_op option late which will be a much big and complete different implementation. >>>>>>>> >>>>>>>> I am sorry but this series is totally backwards, you're trying to guess >>>>>>>> what comes next with the 'is_param' thing, it's exactly what we are >>>>>>>> fighting against since 2017. There are plenty of ->exec_op() >>>>>>>> conversions out there, I don't believe this one will be harder. You >>>>>>>> need to convert the driver to this new API and get rid of this whole >>>>>>>> endianness non-sense to simplify a lot the driver. >>>>>>>> >>> I am not guessing anything but just factor out the existing common nand cache read logic into the single default function(or one for page read and another for parameter read as I mentioned in another thread) and allow SoC to overrides the implementation when needed. >>>>>> >>>>>> No, you are trying to guess what type of read the core is performing, >>>>>> either a regular data page read or a parameter page read. >>>>>> >>>>> I agree ->exec_op can possibly get rid of the parameter page read function and is the way to go. But it won't help on the page read for endianess. >>>>>> >>>>>> You told me there is no endianess issue with the data pages, so why it >>>>>> won't help on the page read? >>>>>> >>>>> It's not that I am against exec_op but I want to take one step a time >>>>>>> and I'd like to get these fixes >>>>>> >>>>>> I don't see any fix here? Let me know if I am missing something but >>>>>> right now I see a new version of the controller being supported with >>>>>> its own constraints. If you are fixing existing code for already >>>>>> supported platform, then make it clear and we can discuss this. But if >>>>>> you just want to support the bcmbca flavor, then there is no risk >>>>>> mitigation involved here, and a conversion is the right step :) >>>>>> >>> >>>>> I forgot to mention: the exec_op conversion is almost ready, Boris >>>>> worked on it but he lacked the hardware so maybe you'll just need to >>>>> revive the few patches which target your platform and do a little bit of >>>>> debugging? >>>>> >>>>> https://github.com/bbrezillon/linux/commits/nand/exec-op-conversion?after=8a3cf6fd25d5e15c6667f9e95c1fc86e4cb735e6+34&branch=nand%2Fexec-op-conversion&qualified_name=refs%2Fheads%2Fnand%2Fexec-op-conversion >>>>> >> Yes this is the patch what our exec_op work is based on. Thanks Boris! The issue with patch is that performance is very slow for anything that rely on nand_read_page_op as the patch implementing it using the low level cmd and data register to transfer the data byte by byte. >>> >>> You don't need to use exec_op for your read_page/write_page hooks, >>> quite the opposite actually. exec_op is not meant for high throughput. >>> exec_op is meant to be simple. You can have fast I/Os with a different >>> mechanism in your read/write_page hooks. >>> >> Right it does not impact our fast path: controller based ecc read/write. But things like on-chip ecc nand driver that uses exec_op API get impacted badly. We need to add nand op parser, several matching rules and other logics to use fast path page read/write instead of the low level data register read/write. >> >>>> I actually sent out email regarding this to Boris and he cc'ed you in >>>> sept last year. We have to use the nand parser to match the page read >>>> from exec_op so we can actually match and use the brcmnand_page_read >>>> fast path. But there are many situations that we need to match so the >>>> project to migrate exce_op are still work in progress just on our >>>> bcmbca chip as of now. Just forward that email again to you and I >>>> appreciate it if you have any inputs there. So IMHO it is just too >>>> risky and too big of scope to have the exec_op added to this patch >>>> series and definitively better to do it afterwards with a dedicated >>>> patch. >>> >>> As long as you add small and orthogonal changes to cmd_ctrl/cmd_func >>> I don't mind, but what you want now is to force me to pull dirty >>> changes "first", the type of change we are refusing since 2018, making >>> me expect you'll perform the conversion after. It would have been >>> terribly less dirty and you would have all your code already upstreamed >>> if you had performed the exec_op conversion since September. >>> >> I didn't work on open source 5 years ago. I am sorry that I missed the background of the rejected changes since then but I do not agree that this change is dirty change just because I factor out the code with is_param argument(and I offered an alternative to remove is_param with two data read functions). > > This _is_ dirty because you cannot know with the cmd_ctrl/cmdfunc > API whether we read a parameter page or a page of data. So your are > _guessing_. There are plenty ways of reading one of the others, the > heuristics on the controller side will _always_ be wrong. That is why > exec_op() was introduced. > alright we have different definition of dirty ;) Understand it is not a preferred way to update the code in controller cmdfunc path especially for large change that can be done in exec_op. >> I see your point with exec_op and agree that is the way to go. We had an initial look of the Borris exec_op patch last Sept and noticed the performance issue but we haven't got the chance to actively work on improving the performance and prepare for up-streaming until recently. What if we bring in the original exec_op patch in this series so we don't need to add the parameter data read function(if we verify it works on difference SoCs without endianess)? Or better to have exec_op as separate patch first and then this series? > > This one is my favorite: > 1/ Add exec_op support > 2/ Remove legacy hooks > 3/ Add support for the bcmbca SoC > Sounds good. We will send exec_op series for 1 and 2 then another series for 3. And I will send v2 of this series to just include the fixes (patch 1 to patch 4) with updates based on the comments received. > Then you can improve the performance for on-die ECC situations, but to > be honest this improvement looks little a very little addition. You can > take example from the existing hooks, how they match specific > operations in the parser and then hook them to specific helpers. > Nothing terribly complex, there are dozens of conversions available > now. > > Good luck :) > Miquèl >
diff --git a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c index 7e48b6a0bfa2..899103a62c98 100644 --- a/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c +++ b/drivers/mtd/nand/raw/brcmnand/bcmbca_nand.c @@ -26,6 +26,18 @@ enum { BCMBCA_CTLRDY = BIT(4), }; +#if defined(CONFIG_ARM64) +#define ALIGN_REQ 8 +#else +#define ALIGN_REQ 4 +#endif + +static inline bool bcmbca_nand_is_buf_aligned(void *flash_cache, void *buffer) +{ + return IS_ALIGNED((uintptr_t)buffer, ALIGN_REQ) && + IS_ALIGNED((uintptr_t)flash_cache, ALIGN_REQ); +} + static bool bcmbca_nand_intc_ack(struct brcmnand_soc *soc) { struct bcmbca_nand_soc *priv = @@ -56,6 +68,29 @@ static void bcmbca_nand_intc_set(struct brcmnand_soc *soc, bool en) brcmnand_writel(val, mmio); } +static void bcmbca_read_data_bus(struct brcmnand_soc *soc, + void __iomem *flash_cache, u32 *buffer, + int fc_words, bool is_param) +{ + int i; + + if (!is_param) { + /* + * memcpy can do unaligned aligned access depending on source + * and dest address, which is incompatible with nand cache. Fallback + * to the memcpy for io version + */ + if (bcmbca_nand_is_buf_aligned(flash_cache, buffer)) + memcpy((void *)buffer, (void *)flash_cache, fc_words * 4); + else + memcpy_fromio((void *)buffer, (void *)flash_cache, fc_words * 4); + } else { + /* Flash cache has same endian as the host for parameter pages */ + for (i = 0; i < fc_words; i++, buffer++) + *buffer = __raw_readl(flash_cache + i * 4); + } +} + static int bcmbca_nand_probe(struct platform_device *pdev) { struct device *dev = &pdev->dev; @@ -75,6 +110,7 @@ static int bcmbca_nand_probe(struct platform_device *pdev) soc->ctlrdy_ack = bcmbca_nand_intc_ack; soc->ctlrdy_set_enabled = bcmbca_nand_intc_set; + soc->read_data_bus = bcmbca_read_data_bus; return brcmnand_probe(pdev, soc); } diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c index d920e88c7f5b..656be4d73016 100644 --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c @@ -814,6 +814,30 @@ static inline u32 edu_readl(struct brcmnand_controller *ctrl, return brcmnand_readl(ctrl->edu_base + offs); } +static inline void brcmnand_read_data_bus(struct brcmnand_controller *ctrl, + void __iomem *flash_cache, u32 *buffer, + int fc_words, bool is_param) +{ + struct brcmnand_soc *soc = ctrl->soc; + int i; + + if (soc->read_data_bus) { + soc->read_data_bus(soc, flash_cache, buffer, fc_words, is_param); + } else { + if (!is_param) { + for (i = 0; i < fc_words; i++, buffer++) + *buffer = brcmnand_read_fc(ctrl, i); + } else { + for (i = 0; i < fc_words; i++) + /* + * Flash cache is big endian for parameter pages, at + * least on STB SoCs + */ + buffer[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); + } + } +} + static void brcmnand_clear_ecc_addr(struct brcmnand_controller *ctrl) { @@ -1811,20 +1835,11 @@ static void brcmnand_cmdfunc(struct nand_chip *chip, unsigned command, native_cmd == CMD_PARAMETER_CHANGE_COL) { /* Copy flash cache word-wise */ u32 *flash_cache = (u32 *)ctrl->flash_cache; - int i; brcmnand_soc_data_bus_prepare(ctrl->soc, true); - /* - * Must cache the FLASH_CACHE now, since changes in - * SECTOR_SIZE_1K may invalidate it - */ - for (i = 0; i < FC_WORDS; i++) - /* - * Flash cache is big endian for parameter pages, at - * least on STB SoCs - */ - flash_cache[i] = be32_to_cpu(brcmnand_read_fc(ctrl, i)); + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, flash_cache, + FC_WORDS, true); brcmnand_soc_data_bus_unprepare(ctrl->soc, true); @@ -2137,7 +2152,7 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, { struct brcmnand_host *host = nand_get_controller_data(chip); struct brcmnand_controller *ctrl = host->ctrl; - int i, j, ret = 0; + int i, ret = 0; brcmnand_clear_ecc_addr(ctrl); @@ -2150,8 +2165,9 @@ static int brcmnand_read_by_pio(struct mtd_info *mtd, struct nand_chip *chip, if (likely(buf)) { brcmnand_soc_data_bus_prepare(ctrl->soc, false); - for (j = 0; j < FC_WORDS; j++, buf++) - *buf = brcmnand_read_fc(ctrl, j); + brcmnand_read_data_bus(ctrl, ctrl->nand_fc, buf, + FC_WORDS, false); + buf += FC_WORDS; brcmnand_soc_data_bus_unprepare(ctrl->soc, false); } diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h index f1f93d85f50d..88819bc395f8 100644 --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h @@ -24,6 +24,8 @@ struct brcmnand_soc { void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, bool is_param); + void (*read_data_bus)(struct brcmnand_soc *soc, void __iomem *flash_cache, + u32 *buffer, int fc_words, bool is_param); const struct brcmnand_io_ops *ops; };