Message ID | 20230929102726.2985188-22-john.g.garry@oracle.com |
---|---|
State | New |
Headers |
From: John Garry <john.g.garry@oracle.com>
To: axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me, jejb@linux.ibm.com, martin.petersen@oracle.com, djwong@kernel.org, viro@zeniv.linux.org.uk, brauner@kernel.org, chandan.babu@oracle.com, dchinner@redhat.com
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, tytso@mit.edu, jbongio@google.com, linux-api@vger.kernel.org, Alan Adamson <alan.adamson@oracle.com>, John Garry <john.g.garry@oracle.com>
Subject: [PATCH 21/21] nvme: Support atomic writes
Date: Fri, 29 Sep 2023 10:27:26 +0000
Message-Id: <20230929102726.2985188-22-john.g.garry@oracle.com>
In-Reply-To: <20230929102726.2985188-1-john.g.garry@oracle.com>
References: <20230929102726.2985188-1-john.g.garry@oracle.com> |
Series | block atomic writes |
Commit Message
John Garry
Sept. 29, 2023, 10:27 a.m. UTC
From: Alan Adamson <alan.adamson@oracle.com>

Support reading atomic write registers to fill in request_queue properties.

Use the following method to calculate the limits:

atomic_write_max_bytes = flp2(NAWUPF ?: AWUPF)
atomic_write_unit_min = logical_block_size
atomic_write_unit_max = flp2(NAWUPF ?: AWUPF)
atomic_write_boundary = NABSPF

Signed-off-by: Alan Adamson <alan.adamson@oracle.com>
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 drivers/nvme/host/core.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
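The limit derivation above can be sketched in standalone C. Here `flp2()` is the classic "floor to a power of two" (what the kernel's `rounddown_pow_of_two()` provides), and NAWUPF/AWUPF are the NVMe 0's-based counts of logical blocks; the function names below are illustrative, not driver code.

```c
#include <assert.h>
#include <stdint.h>

/* flp2: round down to the nearest power of two (0 stays 0) */
static uint32_t flp2(uint32_t x)
{
	while (x & (x - 1))
		x &= x - 1;	/* clear the lowest set bit until one remains */
	return x;
}

/* Derive atomic_write_unit_max in bytes from the identify fields.
 * nawupf/awupf are 0's-based logical-block counts; lbs is the
 * logical block size in bytes. Illustrative sketch only. */
static uint32_t atomic_write_unit_max(uint32_t nawupf, uint32_t awupf,
				      uint32_t lbs)
{
	uint32_t units = (nawupf ? nawupf : awupf) + 1;

	return flp2(units * lbs);
}
```

For example, a namespace reporting NAWUPF=15 (16 blocks) with 512-byte blocks yields 8192 bytes, while a controller-level AWUPF=4 (5 blocks) rounds 2560 down to 2048.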
Comments
> +++ b/drivers/nvme/host/core.c
> @@ -1926,6 +1926,35 @@ static void nvme_update_disk_info(struct gendisk *disk,
>  	blk_queue_io_min(disk->queue, phys_bs);
>  	blk_queue_io_opt(disk->queue, io_opt);
>
> +	atomic_bs = rounddown_pow_of_two(atomic_bs);
> +	if (id->nsfeat & NVME_NS_FEAT_ATOMICS && id->nawupf) {
> +		if (id->nabo) {
> +			dev_err(ns->ctrl->device, "Support atomic NABO=%x\n",
> +				id->nabo);
> +		} else {
> +			u32 boundary = 0;
> +
> +			if (le16_to_cpu(id->nabspf))
> +				boundary = (le16_to_cpu(id->nabspf) + 1) * bs;
> +
> +			if (is_power_of_2(boundary) || !boundary) {
> +				blk_queue_atomic_write_max_bytes(disk->queue, atomic_bs);
> +				blk_queue_atomic_write_unit_min_sectors(disk->queue, 1);
> +				blk_queue_atomic_write_unit_max_sectors(disk->queue,
> +							atomic_bs / bs);

blk_queue_atomic_write_unit_[min|max]_sectors expects sectors (512-byte units) as input, but no conversion is done here from the device logical block size to sectors.

> +				blk_queue_atomic_write_boundary_bytes(disk->queue, boundary);
> +			} else {
> +				dev_err(ns->ctrl->device, "Unsupported atomic boundary=0x%x\n",
> +					boundary);
> +			}
On 04/10/2023 12:39, Pankaj Raghav wrote:
>> +++ b/drivers/nvme/host/core.c
>> @@ -1926,6 +1926,35 @@ static void nvme_update_disk_info(struct gendisk *disk,
>>  	blk_queue_io_min(disk->queue, phys_bs);
>>  	blk_queue_io_opt(disk->queue, io_opt);
>>
>> +	atomic_bs = rounddown_pow_of_two(atomic_bs);
>> +	if (id->nsfeat & NVME_NS_FEAT_ATOMICS && id->nawupf) {
>> +		if (id->nabo) {
>> +			dev_err(ns->ctrl->device, "Support atomic NABO=%x\n",
>> +				id->nabo);
>> +		} else {
>> +			u32 boundary = 0;
>> +
>> +			if (le16_to_cpu(id->nabspf))
>> +				boundary = (le16_to_cpu(id->nabspf) + 1) * bs;
>> +
>> +			if (is_power_of_2(boundary) || !boundary) {

note to self/Alan: the boundary just needs to be a multiple of the atomic write unit max, and not necessarily a power-of-2

>> +				blk_queue_atomic_write_max_bytes(disk->queue, atomic_bs);
>> +				blk_queue_atomic_write_unit_min_sectors(disk->queue, 1);
>> +				blk_queue_atomic_write_unit_max_sectors(disk->queue,
>> +							atomic_bs / bs);
> blk_queue_atomic_write_unit_[min|max]_sectors expects sectors (512-byte units)
> as input but no conversion is done here from the device logical block size
> to sectors.

Yeah, you are right. I think that we can just use:

blk_queue_atomic_write_unit_max_sectors(disk->queue,
			atomic_bs >> SECTOR_SHIFT);

Thanks,
John

>> +				blk_queue_atomic_write_boundary_bytes(disk->queue, boundary);
>> +			} else {
>> +				dev_err(ns->ctrl->device, "Unsupported atomic boundary=0x%x\n",
>> +					boundary);
>> +			}
>>> +	blk_queue_atomic_write_max_bytes(disk->queue, atomic_bs);
>>> +	blk_queue_atomic_write_unit_min_sectors(disk->queue, 1);
>>> +	blk_queue_atomic_write_unit_max_sectors(disk->queue,
>>> +						atomic_bs / bs);
>> blk_queue_atomic_write_unit_[min|max]_sectors expects sectors (512-byte units)
>> as input but no conversion is done here from the device logical block size
>> to sectors.
>
> Yeah, you are right. I think that we can just use:
>
> blk_queue_atomic_write_unit_max_sectors(disk->queue,
> 			atomic_bs >> SECTOR_SHIFT);
>

Makes sense.

I still don't grok the difference between max_bytes and unit_max_sectors here. (Maybe the NVMe spec does not differentiate them?)

I assume min_sectors should be as follows instead of setting it to 1 (512 bytes)?

blk_queue_atomic_write_unit_min_sectors(disk->queue, bs >> SECTOR_SHIFT);

> Thanks,
> John
>
>>> +	blk_queue_atomic_write_boundary_bytes(disk->queue, boundary);
>>> +	} else {
>>> +		dev_err(ns->ctrl->device, "Unsupported atomic boundary=0x%x\n",
>>> +			boundary);
>>> +	}
>>>
On 05/10/2023 14:32, Pankaj Raghav wrote:
>>> blk_queue_atomic_write_unit_[min|max]_sectors expects sectors (512-byte units)
>>> as input but no conversion is done here from the device logical block size
>>> to sectors.
>> Yeah, you are right. I think that we can just use:
>>
>> blk_queue_atomic_write_unit_max_sectors(disk->queue,
>> 			atomic_bs >> SECTOR_SHIFT);
>>
> Makes sense.
> I still don't grok the difference between max_bytes and unit_max_sectors here.
> (Maybe the NVMe spec does not differentiate them?)

I think that max_bytes does not need to be a power-of-2 and could be relaxed. Having said that, max_bytes comes into play for merging of bios - so if we are in a scenario with no merging, then we may as well leave atomic_write_max_bytes == atomic_write_unit_max. But let us check this proposal to relax.

> I assume min_sectors should be as follows instead of setting it to 1 (512 bytes)?
>
> blk_queue_atomic_write_unit_min_sectors(disk->queue, bs >> SECTOR_SHIFT);

Yeah, right, we want unit_min to be the logical block size.

Thanks,
John
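The unit mix-up settled above can be shown standalone: `atomic_bs / bs` yields logical blocks, while the `blk_queue_atomic_write_unit_{min,max}_sectors()` helpers expect 512-byte block-layer sectors, hence the shift. The helper name below is invented for illustration; `SECTOR_SHIFT` is the block layer's usual constant.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9	/* block layer sectors are always 512 bytes */

/* Illustrative only: convert a byte count to 512-byte sectors, the unit
 * the blk_queue_atomic_write_unit_{min,max}_sectors() helpers expect. */
static uint32_t bytes_to_sectors(uint32_t bytes)
{
	return bytes >> SECTOR_SHIFT;
}
```

With atomic_bs = 16384 and a 4096-byte logical block size, `atomic_bs / bs` is 4 (logical blocks, the buggy value), while `bytes_to_sectors(16384)` is 32, the value the helper actually expects.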
> +	if (le16_to_cpu(id->nabspf))
> +		boundary = (le16_to_cpu(id->nabspf) + 1) * bs;
> +
> +	if (is_power_of_2(boundary) || !boundary) {
> +		blk_queue_atomic_write_max_bytes(disk->queue, atomic_bs);
> +		blk_queue_atomic_write_unit_min_sectors(disk->queue, 1);
> +		blk_queue_atomic_write_unit_max_sectors(disk->queue,
> +					atomic_bs / bs);
> +		blk_queue_atomic_write_boundary_bytes(disk->queue, boundary);
> +	} else {
> +		dev_err(ns->ctrl->device, "Unsupported atomic boundary=0x%x\n",
> +			boundary);
> +	}

Please figure out a way to split the atomic configuration into a helper and avoid all those crazy long lines, and preferably also avoid the double calls to the block helpers while you're at it.

Also, I really want a check in the NVMe I/O path that any request with the atomic flag set actually adheres to the limits, to at least partially paper over the annoying lack of a separate write atomic command in NVMe.
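The helper split being requested could look roughly like the following, here recast as a testable userspace sketch rather than the actual driver change: derive all four limits in one place, validate the boundary, and let the caller apply the queue helpers once. The struct and function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result of a split-out atomic-limits helper. All values
 * are derived from the identify fields; "valid" is false when the
 * reported boundary cannot be supported. Illustrative sketch only. */
struct atomic_limits {
	uint32_t max_bytes;
	uint32_t unit_min_sectors;
	uint32_t unit_max_sectors;
	uint32_t boundary_bytes;
	bool valid;
};

static bool is_pow2(uint32_t x) { return x && !(x & (x - 1)); }

/* atomic_bs: atomic write size in bytes (already a power of two);
 * bs: logical block size in bytes; nabspf: 0's-based boundary field. */
static struct atomic_limits nvme_atomic_limits(uint32_t atomic_bs,
					       uint32_t bs, uint32_t nabspf)
{
	struct atomic_limits l = { .valid = false };
	uint32_t boundary = nabspf ? (nabspf + 1) * bs : 0;

	if (boundary && !is_pow2(boundary))
		return l;	/* unsupported boundary: leave limits unset */

	l.max_bytes = atomic_bs;
	l.unit_min_sectors = bs >> 9;		/* logical block, in sectors */
	l.unit_max_sectors = atomic_bs >> 9;	/* sectors, not logical blocks */
	l.boundary_bytes = boundary;
	l.valid = true;
	return l;
}
```

A caller would then issue each `blk_queue_atomic_write_*` call exactly once, from one site, which also addresses the long-lines complaint.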
On Thu, Nov 09, 2023 at 04:36:03PM +0100, Christoph Hellwig wrote:
> Also I really want a check in the NVMe I/O path that any request
> with the atomic flag set actually adheres to the limits to at least
> partially paper over the annoying lack of a separate write atomic
> command in nvme.

That wasn't the model we had in mind. In our thinking, it was fine to send a write that crossed the atomic write limit, but the drive wouldn't guarantee that it was atomic except at the atomic write boundary.

Eg with an AWUN of 16kB, you could send five 16kB writes, combine them into a single 80kB write, and if the power failed midway through, the drive would guarantee that it had written 0, 16kB, 32kB, 48kB, 64kB or all 80kB. Not necessarily in order; it might have written bytes 16-32kB and 64-80kB and not the other three.
On Thu, Nov 09, 2023 at 03:42:40PM +0000, Matthew Wilcox wrote:
> That wasn't the model we had in mind. In our thinking, it was fine to
> send a write that crossed the atomic write limit, but the drive wouldn't
> guarantee that it was atomic except at the atomic write boundary.
> Eg with an AWUN of 16kB, you could send five 16kB writes, combine them
> into a single 80kB write, and if the power failed midway through, the
> drive would guarantee that it had written 0, 16kB, 32kB, 48kB, 64kB or
> all 80kB. Not necessarily in order; it might have written bytes 16-32kB,
> 64-80kB and not the other three.

I can see some use for that, but I'm really worried that debugging problems in the I/O merging and splitting will be absolute hell.
On 09/11/2023 15:46, Christoph Hellwig wrote:
> On Thu, Nov 09, 2023 at 03:42:40PM +0000, Matthew Wilcox wrote:
>> That wasn't the model we had in mind. In our thinking, it was fine to
>> send a write that crossed the atomic write limit, but the drive wouldn't
>> guarantee that it was atomic except at the atomic write boundary.
>> Eg with an AWUN of 16kB, you could send five 16kB writes, combine them
>> into a single 80kB write, and if the power failed midway through, the
>> drive would guarantee that it had written 0, 16kB, 32kB, 48kB, 64kB or
>> all 80kB. Not necessarily in order; it might have written bytes 16-32kB,
>> 64-80kB and not the other three.

I didn't think that there are any atomic write guarantees at all if we ever exceed AWUN or AWUPF or cross the atomic write boundary (if any).

> I can see some use for that, but I'm really worried that debugging
> problems in the I/O merging and splitting will be absolute hell.

Even if bios were merged for NVMe, the total request length still should not exceed AWUPF. However, a check can be added to ensure this for a submitted atomic write request.

As for splitting, it is not permitted for atomic writes, and only a single bio is permitted to be created per write. Are more integrity checks required?

Thanks,
John
On Thu, Nov 09, 2023 at 07:08:40PM +0000, John Garry wrote:
>>> send a write that crossed the atomic write limit, but the drive wouldn't
>>> guarantee that it was atomic except at the atomic write boundary.
>>> Eg with an AWUN of 16kB, you could send five 16kB writes, combine them
>>> into a single 80kB write, and if the power failed midway through, the
>>> drive would guarantee that it had written 0, 16kB, 32kB, 48kB, 64kB or
>>> all 80kB. Not necessarily in order; it might have written bytes 16-32kB,
>>> 64-80kB and not the other three.
>
> I didn't think that there are any atomic write guarantees at all if we ever
> exceed AWUN or AWUPF or cross the atomic write boundary (if any).

You're quoting a few mails before me, but I agree.

>> I can see some use for that, but I'm really worried that debugging
>> problems in the I/O merging and splitting will be absolute hell.
>
> Even if bios were merged for NVMe the total request length still should not
> exceed AWUPF. However a check can be added to ensure this for a submitted
> atomic write request.

Yes.

> As for splitting, it is not permitted for atomic writes and only a single
> bio is permitted to be created per write. Are more integrity checks
> required?

I'm more worried about the problem where we accidentally add a split. The whole bio merge/split path is convoluted, and we had plenty of bugs in the past by not looking at all the correct flags or opcodes.
On 10/11/2023 06:29, Christoph Hellwig wrote:
> Yes.
>
>> As for splitting, it is not permitted for atomic writes and only a single
>> bio is permitted to be created per write. Are more integrity checks
>> required?
> I'm more worried about the problem where we accidentally add a split.
> The whole bio merge/split path is convoluted and we had plenty of
> bugs in the past by not looking at all the correct flags or opcodes.

Yes, this is always a concern. Some thoughts on things which could be done:

- For no merging, ensure the request length is a power-of-2 when enqueuing to the block driver. This is simple but not watertight.
- Create a per-bio checksum when the bio is created for the atomic write, and verify its integrity when queuing to the block driver.
- Add a new block layer datapath which ensures no merging or splitting, but this seems a bit OTT.

BTW, on the topic of splitting, that NVMe virt boundary is a pain and I hope that we could ignore/avoid it for atomic writes.

Thanks,
John
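The driver-side check discussed in this subthread - reject an atomic write request that exceeds the reported unit max or straddles an atomic write boundary - can be sketched as below, assuming a power-of-2 boundary as the patch requires. The function and parameter names are illustrative, not kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative submit-path check for an atomic write request.
 * start/len are in bytes; unit_max and boundary come from the
 * queue limits; boundary == 0 means "no boundary reported". */
static bool atomic_write_req_ok(uint64_t start, uint32_t len,
				uint32_t unit_max, uint32_t boundary)
{
	if (!len || len > unit_max)
		return false;	/* exceeds AWUPF-derived unit max */

	if (boundary) {
		/* power-of-2 boundary: the first and last byte must fall
		 * into the same boundary-aligned window */
		uint64_t mask = ~((uint64_t)boundary - 1);

		if ((start & mask) != ((start + len - 1) & mask))
			return false;
	}
	return true;
}
```

For example, with a 16kB unit max and 16kB boundary, a 4kB write at offset 0 passes, while an 8kB write at offset 12kB fails because it crosses the 16kB boundary.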
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 21783aa2ee8e..aa0daacf4d7c 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1926,6 +1926,35 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	blk_queue_io_min(disk->queue, phys_bs);
 	blk_queue_io_opt(disk->queue, io_opt);
 
+	atomic_bs = rounddown_pow_of_two(atomic_bs);
+	if (id->nsfeat & NVME_NS_FEAT_ATOMICS && id->nawupf) {
+		if (id->nabo) {
+			dev_err(ns->ctrl->device, "Support atomic NABO=%x\n",
+				id->nabo);
+		} else {
+			u32 boundary = 0;
+
+			if (le16_to_cpu(id->nabspf))
+				boundary = (le16_to_cpu(id->nabspf) + 1) * bs;
+
+			if (is_power_of_2(boundary) || !boundary) {
+				blk_queue_atomic_write_max_bytes(disk->queue, atomic_bs);
+				blk_queue_atomic_write_unit_min_sectors(disk->queue, 1);
+				blk_queue_atomic_write_unit_max_sectors(disk->queue,
+							atomic_bs / bs);
+				blk_queue_atomic_write_boundary_bytes(disk->queue, boundary);
+			} else {
+				dev_err(ns->ctrl->device, "Unsupported atomic boundary=0x%x\n",
+					boundary);
+			}
+		}
+	} else if (ns->ctrl->subsys->awupf) {
+		blk_queue_atomic_write_max_bytes(disk->queue, atomic_bs);
+		blk_queue_atomic_write_unit_min_sectors(disk->queue, 1);
+		blk_queue_atomic_write_unit_max_sectors(disk->queue, atomic_bs / bs);
+		blk_queue_atomic_write_boundary_bytes(disk->queue, 0);
+	}
+
 	/*
 	 * Register a metadata profile for PI, or the plain non-integrity NVMe
 	 * metadata masquerading as Type 0 if supported, otherwise reject block