From patchwork Thu Nov 30 03:05:59 2023
X-Patchwork-Submitter: Shifeng Li
X-Patchwork-Id: 171721
From: Shifeng Li
To: saeedm@nvidia.com, leon@kernel.org, davem@davemloft.net,
 edumazet@google.com, kuba@kernel.org, pabeni@redhat.com,
 eranbe@mellanox.com, moshe@mellanox.com
Cc: netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
 linux-kernel@vger.kernel.org, dinghui@sangfor.com.cn,
 lishifeng1992@126.com, Shifeng Li, Moshe Shemesh
Subject: [PATCH v3] net/mlx5e: Fix a race in command alloc flow
Date: Wed, 29 Nov 2023 19:05:59 -0800
Message-Id: <20231130030559.622165-1-lishifeng@sangfor.com.cn>
X-Mailer: git-send-email 2.25.1

Fix a cmd->ent use-after-free caused by a race on the command entry.

The race occurs when one command releases its last refcount and frees
its index and entry while another process, running the command flush
flow, takes a refcount on that same command entry. The process handling
the flush may treat the command as needing to be flushed if the other
process has already allocated ent->idx but has not yet stored ent in
cmd->ent_arr in cmd_work_handler().

Fix it by moving the assignment to cmd->ent_arr under the spin lock
(cmd->alloc_lock), so that the index allocation and the ent_arr update
happen in the same critical section.

[70013.081955] BUG: KASAN: use-after-free in mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
[70013.081967] Write of size 4 at addr ffff88880b1510b4 by task kworker/26:1/1433361
[70013.081968]
[70013.082028] Workqueue: events aer_isr
[70013.082053] Call Trace:
[70013.082067]  dump_stack+0x8b/0xbb
[70013.082086]  print_address_description+0x6a/0x270
[70013.082102]  kasan_report+0x179/0x2c0
[70013.082173]  mlx5_cmd_trigger_completions+0x1e2/0x4c0 [mlx5_core]
[70013.082267]  mlx5_cmd_flush+0x80/0x180 [mlx5_core]
[70013.082304]  mlx5_enter_error_state+0x106/0x1d0 [mlx5_core]
[70013.082338]  mlx5_try_fast_unload+0x2ea/0x4d0 [mlx5_core]
[70013.082377]  remove_one+0x200/0x2b0 [mlx5_core]
[70013.082409]  pci_device_remove+0xf3/0x280
[70013.082439]  device_release_driver_internal+0x1c3/0x470
[70013.082453]  pci_stop_bus_device+0x109/0x160
[70013.082468]  pci_stop_and_remove_bus_device+0xe/0x20
[70013.082485]  pcie_do_fatal_recovery+0x167/0x550
[70013.082493]  aer_isr+0x7d2/0x960
[70013.082543]  process_one_work+0x65f/0x12d0
[70013.082556]  worker_thread+0x87/0xb50
[70013.082571]  kthread+0x2e9/0x3a0
[70013.082592]  ret_from_fork+0x1f/0x40

Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
Reviewed-by: Moshe Shemesh
Signed-off-by: Shifeng Li
---
 drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

---
v1->v2: fix code conflicts.
v2->v3: modify Fixes line and massage git log.

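Note for readers: the invariant this patch is after can be sketched in
userspace. The following is a minimal illustration only, not driver code.
Every name in it (cmd_table, work_ent, alloc_index, flush_take_refs,
MAX_CMDS) is hypothetical, and a C11 atomic_flag spinlock stands in for
cmd->alloc_lock. It assumes, as in the driver, that a set bit in the mask
means "slot free". Because the slot is claimed and the entry is published
in the same critical section, a flush-side walker that holds the lock
should never see an in-use bit without a matching entry pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MAX_CMDS 8 /* hypothetical slot count, not the driver's value */

struct work_ent {
	int idx;
	int refcnt;
};

struct cmd_table {
	atomic_flag lock;        /* stands in for cmd->alloc_lock */
	unsigned long free_mask; /* bit set = slot free */
	struct work_ent *ent_arr[MAX_CMDS];
};

static void table_lock(struct cmd_table *t)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
		;
}

static void table_unlock(struct cmd_table *t)
{
	atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

static void table_init(struct cmd_table *t)
{
	atomic_flag_clear(&t->lock);
	t->free_mask = (1UL << MAX_CMDS) - 1; /* all slots free */
	for (int i = 0; i < MAX_CMDS; i++)
		t->ent_arr[i] = NULL;
}

/* Claim the first free slot and publish the entry in one critical section. */
static int alloc_index(struct cmd_table *t, struct work_ent *ent)
{
	int ret = -1;

	table_lock(t);
	for (int i = 0; i < MAX_CMDS; i++) {
		if (t->free_mask & (1UL << i)) {
			t->free_mask &= ~(1UL << i); /* mark slot in use */
			ent->idx = i;
			t->ent_arr[i] = ent;         /* publish before unlocking */
			ret = i;
			break;
		}
	}
	table_unlock(t);
	return ret;
}

/* Flush-side walk: take a reference on every in-use slot's entry. */
static int flush_take_refs(struct cmd_table *t)
{
	int taken = 0;

	table_lock(t);
	for (int i = 0; i < MAX_CMDS; i++) {
		if (!(t->free_mask & (1UL << i))) {
			/* Holds because alloc_index publishes under the lock. */
			assert(t->ent_arr[i] != NULL);
			t->ent_arr[i]->refcnt++;
			taken++;
		}
	}
	table_unlock(t);
	return taken;
}
```

If the store to ent_arr were moved after table_unlock() in alloc_index(),
a concurrent flush_take_refs() could run between the two steps and find
an in-use bit whose slot does not yet hold the new entry, which is the
window the commit message describes.
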
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index f8f0a712c943..a7b1f9686c09 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -156,15 +156,18 @@ static u8 alloc_token(struct mlx5_cmd *cmd)
 	return token;
 }
 
-static int cmd_alloc_index(struct mlx5_cmd *cmd)
+static int cmd_alloc_index(struct mlx5_cmd *cmd, struct mlx5_cmd_work_ent *ent)
 {
 	unsigned long flags;
 	int ret;
 
 	spin_lock_irqsave(&cmd->alloc_lock, flags);
 	ret = find_first_bit(&cmd->vars.bitmask, cmd->vars.max_reg_cmds);
-	if (ret < cmd->vars.max_reg_cmds)
+	if (ret < cmd->vars.max_reg_cmds) {
 		clear_bit(ret, &cmd->vars.bitmask);
+		ent->idx = ret;
+		cmd->ent_arr[ent->idx] = ent;
+	}
 	spin_unlock_irqrestore(&cmd->alloc_lock, flags);
 
 	return ret < cmd->vars.max_reg_cmds ? ret : -ENOMEM;
@@ -979,7 +982,7 @@ static void cmd_work_handler(struct work_struct *work)
 	sem = ent->page_queue ? &cmd->vars.pages_sem : &cmd->vars.sem;
 	down(sem);
 	if (!ent->page_queue) {
-		alloc_ret = cmd_alloc_index(cmd);
+		alloc_ret = cmd_alloc_index(cmd, ent);
 		if (alloc_ret < 0) {
 			mlx5_core_err_rl(dev, "failed to allocate command entry\n");
 			if (ent->callback) {
@@ -994,15 +997,14 @@ static void cmd_work_handler(struct work_struct *work)
 			up(sem);
 			return;
 		}
-		ent->idx = alloc_ret;
 	} else {
 		ent->idx = cmd->vars.max_reg_cmds;
 		spin_lock_irqsave(&cmd->alloc_lock, flags);
 		clear_bit(ent->idx, &cmd->vars.bitmask);
+		cmd->ent_arr[ent->idx] = ent;
 		spin_unlock_irqrestore(&cmd->alloc_lock, flags);
 	}
 
-	cmd->ent_arr[ent->idx] = ent;
 	lay = get_inst(cmd, ent->idx);
 	ent->lay = lay;
 	memset(lay, 0, sizeof(*lay));