From patchwork Wed Apr 12 23:21:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Roman Gushchin X-Patchwork-Id: 82706 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp675751vqo; Wed, 12 Apr 2023 16:50:19 -0700 (PDT) X-Google-Smtp-Source: AKy350ZE8R1S2+odR+2rAaKJQMkUQX03ED2GzuItF1aas0wiEAHO4FRhyktLzXxK4WKcGSo316tm X-Received: by 2002:a05:6a00:2313:b0:634:db05:d477 with SMTP id h19-20020a056a00231300b00634db05d477mr889612pfh.10.1681343419546; Wed, 12 Apr 2023 16:50:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1681343419; cv=none; d=google.com; s=arc-20160816; b=OMVSwREZuF0EpiJ82bieC6NwgmS71rLi5nDsME1iQYta42QRwUR8wC5igteNVKQ4Ip oHPbKU4son8gqEWpB1Ys4Ip3e3y54UfJs9rCOzNY5ABMgoJgfHipmV7BLyJjJ9aIL88a tvwjX6SZwAyhmn7AuU5U2qyQxNNywES++XDlCf6dFfUdretgyVAp5GKJZqyII07zg1X/ Bo2ALPKqqTPvqBuZ0okCtBPeGOSzGfUM+SgKSUPg+ohU8S+x9e1QkSmLxRA/UGPmf5uF dsLZPyYqY2yOCNRevLYD0g8W7li2UY09SXoB8J7OhGnMD8xAGRkcf3DZlHfdnQj0nPle 2Yrw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=ash0rC8Buu4gkiLlAq8MxDEdJ92MQI6+z71zxIkVBiY=; b=mJMQxRrgOQSwzF1sIQR1cylofqS2zbsjJvSiGPya1JFtdaYCGytxzJI/5pVYexAHMj /u8aB9PvOrBZBEFJqat9MJNyd4U8NOGNnjw6Pzdz8skJtA7fly/0C0ywokVhlvOfIKgi ZlkCcD/M1sZTrSPnP2oeEGGEvN52iz5KCaCxLrYHtFQxUyEVxjaDSg/rS9p9Hh/GVl8/ RqcOxlug/uDOkPkc43BBXkcxdgED6wtU2BUR6B3gt3xDjszBjeHDEC+vLLnWL4o6eb4k TfrEZysTiUj1MvTdrlwCs/TFrrVxZdmYu6e+MVY5KkS/M0sHS4ryPMdr9Dmqn1ef9ent fNTA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=sJrF7XdN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t190-20020a625fc7000000b0063b16cf691esi3403pfb.385.2023.04.12.16.50.04; Wed, 12 Apr 2023 16:50:19 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=sJrF7XdN; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229810AbjDLXW3 (ORCPT + 99 others); Wed, 12 Apr 2023 19:22:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36122 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229441AbjDLXW2 (ORCPT ); Wed, 12 Apr 2023 19:22:28 -0400 Received: from out-61.mta0.migadu.com (out-61.mta0.migadu.com [IPv6:2001:41d0:1004:224b::3d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EBFB04EEC for ; Wed, 12 Apr 2023 16:22:26 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1681341744; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=ash0rC8Buu4gkiLlAq8MxDEdJ92MQI6+z71zxIkVBiY=; b=sJrF7XdNJhWlFXHgNI2ZMyegT3rlbMT1XXa2zvgrn9n0W8gdRS7mLYmp+Vtn9BWQ5kCjm7 N6pu16RASFNuPVADhBwckIcmFZe4cIydQWc2SsPx0Qg+RSReGDsOuB2hzKBSKRFSJv1rED 8oFHcec1CgqQ/ilQPCruih6hgJ33BRk= From: Roman Gushchin To: netdev@vger.kernel.org Cc: Nicolas Ferre , Claudiu Beznea , "David S . Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Rafal Ozieblo , Harini Katakam , linux-kernel@vger.kernel.org, Roman Gushchin , Lars-Peter Clausen Subject: [PATCH net v2] net: macb: fix a memory corruption in extended buffer descriptor mode Date: Wed, 12 Apr 2023 16:21:44 -0700 Message-Id: <20230412232144.770336-1-roman.gushchin@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1763016357720100432?= X-GMAIL-MSGID: =?utf-8?q?1763016357720100432?= For quite some time we were chasing a bug which looked like a sudden permanent failure of networking and mmc on some of our devices. The bug was very sensitive to any software changes and even more to any kernel debug options. Finally we got a setup where the problem was reproducible with CONFIG_DMA_API_DEBUG=y and it revealed the issue with the rx dma: [ 16.992082] ------------[ cut here ]------------ [ 16.996779] DMA-API: macb ff0b0000.ethernet: device driver tries to free DMA memory it has not allocated [device address=0x0000000875e3e244] [size=1536 bytes] [ 17.011049] WARNING: CPU: 0 PID: 85 at kernel/dma/debug.c:1011 check_unmap+0x6a0/0x900 [ 17.018977] Modules linked in: xxxxx [ 17.038823] CPU: 0 PID: 85 Comm: irq/55-8000f000 Not tainted 5.4.0 #28 [ 17.045345] Hardware name: xxxxx [ 17.049528] pstate: 60000005 (nZCv daif -PAN -UAO) [ 17.054322] pc : check_unmap+0x6a0/0x900 [ 17.058243] lr : check_unmap+0x6a0/0x900 [ 17.062163] sp : ffffffc010003c40 [ 17.065470] x29: ffffffc010003c40 x28: 000000004000c03c [ 17.070783] x27: ffffffc010da7048 x26: ffffff8878e38800 [ 17.076095] x25: ffffff8879d22810 x24: ffffffc010003cc8 [ 17.081407] x23: 0000000000000000 x22: ffffffc010a08750 [ 17.086719] x21: ffffff8878e3c7c0 x20: ffffffc010acb000 [ 17.092032] x19: 0000000875e3e244 x18: 0000000000000010 [ 17.097343] x17: 0000000000000000 x16: 0000000000000000 [ 17.102647] x15: ffffff8879e4a988 x14: 0720072007200720 [ 17.107959] x13: 0720072007200720 x12: 0720072007200720 [ 17.113261] x11: 0720072007200720 x10: 0720072007200720 [ 17.118565] x9 : 0720072007200720 x8 : 000000000000022d [ 17.123869] x7 : 0000000000000015 x6 : 0000000000000098 [ 17.129173] x5 : 0000000000000000 x4 : 0000000000000000 [ 17.134475] x3 : 00000000ffffffff x2 : ffffffc010a1d370 [ 17.139778] x1 : b420c9d75d27bb00 x0 : 0000000000000000 [ 17.145082] Call trace: [ 17.147524] check_unmap+0x6a0/0x900 [ 17.151091] debug_dma_unmap_page+0x88/0x90 [ 17.155266] gem_rx+0x114/0x2f0 [ 17.158396] macb_poll+0x58/0x100 [ 17.161705] net_rx_action+0x118/0x400 [ 17.165445] __do_softirq+0x138/0x36c [ 17.169100] irq_exit+0x98/0xc0 [ 17.172234] __handle_domain_irq+0x64/0xc0 [ 17.176320] gic_handle_irq+0x5c/0xc0 [ 17.179974] el1_irq+0xb8/0x140 [ 17.183109] xiic_process+0x5c/0xe30 [ 17.186677] irq_thread_fn+0x28/0x90 [ 17.190244] irq_thread+0x208/0x2a0 [ 17.193724] kthread+0x130/0x140 [ 17.196945] ret_from_fork+0x10/0x20 [ 17.200510] ---[ end trace 7240980785f81d6f ]--- [ 237.021490] ------------[ cut here ]------------ [ 237.026129] DMA-API: exceeded 7 overlapping mappings of cacheline 0x0000000021d79e7b [ 237.033886] WARNING: CPU: 0 PID: 0 at kernel/dma/debug.c:499 add_dma_entry+0x214/0x240 [ 237.041802] Modules linked in: xxxxx [ 237.061637] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G W 5.4.0 #28 [ 237.068941] Hardware name: xxxxx [ 237.073116] pstate: 80000085 (Nzcv daIf -PAN -UAO) [ 237.077900] pc : add_dma_entry+0x214/0x240 [ 237.081986] lr : add_dma_entry+0x214/0x240 [ 237.086072] sp : ffffffc010003c30 [ 237.089379] x29: ffffffc010003c30 x28: ffffff8878a0be00 [ 237.094683] x27: 0000000000000180 x26: ffffff8878e387c0 [ 237.099987] x25: 0000000000000002 x24: 0000000000000000 [ 237.105290] x23: 000000000000003b x22: ffffffc010a0fa00 [ 237.110594] x21: 0000000021d79e7b x20: ffffffc010abe600 [ 237.115897] x19: 00000000ffffffef x18: 0000000000000010 [ 237.121201] x17: 0000000000000000 x16: 0000000000000000 [ 237.126504] x15: ffffffc010a0fdc8 x14: 0720072007200720 [ 237.131807] x13: 0720072007200720 x12: 0720072007200720 [ 237.137111] x11: 0720072007200720 x10: 0720072007200720 [ 237.142415] x9 : 0720072007200720 x8 : 0000000000000259 [ 237.147718] x7 : 0000000000000001 x6 : 0000000000000000 [ 237.153022] x5 : ffffffc010003a20 x4 : 0000000000000001 [ 237.158325] x3 : 0000000000000006 x2 : 0000000000000007 [ 237.163628] x1 : 8ac721b3a7dc1c00 x0 : 0000000000000000 [ 237.168932] Call trace: [ 237.171373] add_dma_entry+0x214/0x240 [ 237.175115] debug_dma_map_page+0xf8/0x120 [ 237.179203] gem_rx_refill+0x190/0x280 [ 237.182942] gem_rx+0x224/0x2f0 [ 237.186075] macb_poll+0x58/0x100 [ 237.189384] net_rx_action+0x118/0x400 [ 237.193125] __do_softirq+0x138/0x36c [ 237.196780] irq_exit+0x98/0xc0 [ 237.199914] __handle_domain_irq+0x64/0xc0 [ 237.204000] gic_handle_irq+0x5c/0xc0 [ 237.207654] el1_irq+0xb8/0x140 [ 237.210789] arch_cpu_idle+0x40/0x200 [ 237.214444] default_idle_call+0x18/0x30 [ 237.218359] do_idle+0x200/0x280 [ 237.221578] cpu_startup_entry+0x20/0x30 [ 237.225493] rest_init+0xe4/0xf0 [ 237.228713] arch_call_rest_init+0xc/0x14 [ 237.232714] start_kernel+0x47c/0x4a8 [ 237.236367] ---[ end trace 7240980785f81d70 ]--- Lars was fast to find an explanation: according to the datasheet bit 2 of the rx buffer descriptor entry has a different meaning in the extended mode: Address [2] of beginning of buffer, or in extended buffer descriptor mode (DMA configuration register [28] = 1), indicates a valid timestamp in the buffer descriptor entry. The macb driver didn't mask this bit while getting an address and it eventually caused a memory corruption and a dma failure. The problem is resolved by explicitly clearing the problematic bit if hw timestamping is used. Fixes: 7b4296148066 ("net: macb: Add support for PTP timestamps in DMA descriptors") Signed-off-by: Roman Gushchin Co-developed-by: Lars-Peter Clausen Signed-off-by: Lars-Peter Clausen Acked-by: Nicolas Ferre Reviewed-by: Jacob Keller --- drivers/net/ethernet/cadence/macb_main.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/net/ethernet/cadence/macb_main.c b/drivers/net/ethernet/cadence/macb_main.c index f77bd1223c8f..541e4dda7950 100644 --- a/drivers/net/ethernet/cadence/macb_main.c +++ b/drivers/net/ethernet/cadence/macb_main.c @@ -1063,6 +1063,10 @@ static dma_addr_t macb_get_addr(struct macb *bp, struct macb_dma_desc *desc) } #endif addr |= MACB_BF(RX_WADDR, MACB_BFEXT(RX_WADDR, desc->addr)); +#ifdef CONFIG_MACB_USE_HWSTAMP + if (bp->hw_dma_cap & HW_DMA_CAP_PTP) + addr &= ~GEM_BIT(DMA_RXVALID); +#endif return addr; }