From patchwork Fri Oct 28 21:43:21 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 12573 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1058934wru; Fri, 28 Oct 2022 14:48:16 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6nxUXcleX6iYTE2w/D1y4Nu3/050zFHVTglOFR6EPsQgXZ2JiARuDSyB7IciznoYcN3Wh2 X-Received: by 2002:a05:6402:ea8:b0:456:d188:b347 with SMTP id h40-20020a0564020ea800b00456d188b347mr1437979eda.15.1666993696517; Fri, 28 Oct 2022 14:48:16 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666993696; cv=none; d=google.com; s=arc-20160816; b=yuN35xyxz+29mxgkIqr9xNRFmjm9uPkNd/9oL1ou3ti1Pv7AABlX72OxzXYBWcnsE2 FcuUppEzSJxh+YgexrZOPYdD4y7HpApZp0IHuN+F7HbPqWO53Gv+EP41HEn501jKzaU9 HdX+V765tE/WPDcZ6nOtDHTf83zfx5cCNsglJW14yk5xA4S9RzuTUEcMxf3iLTRfH6k2 iZNuajZcnqGidEC5863URmkmddTUFFw1I9weLKEThCrLZ0JSMBSVifSTRL9kkzwdh53O HISFlbEG7gJICfv4KxbHY2B9/vQ62VeuqeEktOZWWZyxScPKO4UkDe0/FRKUK8PXZpSK CxBg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=8PHeEQvI03GO1u4arOs5eKGxrrWj+QcJXenB1+d+Ev4=; b=GhCgWBX9pDbWT96ZxcwHNV9fJ7zPzrGJhO56KQT+e/n0s9+qbJ+Ia/kA+9fzjCRCR+ KGY6mkAtp3SYHg+OO+hwX42LOo34CHUGBzoSAMjVrQ6y0oU5vqEx5m4UQ+7l0eeD2STA oC23MMf/NDBXL0N4L1lugq1kH9ztBJ9xaIbystzjBGhIvEPyuElxIWUD2M8lVvIJhpcP VqmvsnQ2aVznGv6cXIp1fNpIBE5kiLYJ0DDe4fzupVSX4n2Fn7N2EXmIRxVnC1ud/QLM yG6voJrDoXnDrpFeJfXsn9BCLtHznbwrW0WqhXpQCTRVYjB6PSSHX7ivqTpgZq3ybDTY 9LXA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel-dk.20210112.gappssmtp.com header.s=20210112 header.b="l/6HssGh"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from 
out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id bb6-20020a1709070a0600b007aacd494fe9si6041837ejc.311.2022.10.28.14.47.53; Fri, 28 Oct 2022 14:48:16 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel-dk.20210112.gappssmtp.com header.s=20210112 header.b="l/6HssGh"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230054AbiJ1Vnk (ORCPT + 99 others); Fri, 28 Oct 2022 17:43:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39816 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229919AbiJ1Vnc (ORCPT ); Fri, 28 Oct 2022 17:43:32 -0400 Received: from mail-pj1-x1034.google.com (mail-pj1-x1034.google.com [IPv6:2607:f8b0:4864:20::1034]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4EA5A23B692 for ; Fri, 28 Oct 2022 14:43:31 -0700 (PDT) Received: by mail-pj1-x1034.google.com with SMTP id o7so2538124pjj.1 for ; Fri, 28 Oct 2022 14:43:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=8PHeEQvI03GO1u4arOs5eKGxrrWj+QcJXenB1+d+Ev4=; b=l/6HssGhPRExvh/FxdFaGlFZfok/tFw3IcJbJihU0Wuk8reK2iRnYuAKeLKryPObFk 8+myno8kH2k+Mx/mhpfG6lxbkp75xrFHJBDOoc0Hi+nCjGGX5XPhIzioZiLQHdYG7m+Z zH6qRIKT7+DBebh6PfSCHHRhFyJrVib7ADoS1iUOnw26Fx2qSxnkfWZ15xoHpPi+Cfkc uX4D3Uu/y7LbtcBERELE4v/eGWzbk+A791IJM5YqSq27H4KLResQNlvAYGSZxE6iTCWr J4Bj2quMAldC3LRqMQU5FPn24iu+tqiGbbAVHn1An+yZFSmKvu/tseANxOGiRShg+W+1 BCrw== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8PHeEQvI03GO1u4arOs5eKGxrrWj+QcJXenB1+d+Ev4=; b=V+MFuoI/ORtSQNDSVam+37Ew8GI9mmxFfq93bZx3nB8lEJfQoT0wayYTBPV/LtR5Z3 58ZmAKu+5UXnTycWVS/yZ0ATGMF+n/R52rnFX6jWf0yalqlqhGe55Ke8H9hBAzMrQiY6 J4+nhvOLw9PSIBbJhxYj/rNZAcpY78rFQrMLR9AYyD1tgfdn1guLEXq8Wv7G3uTsjyl8 6B4D8Egmcsozup2TrutwsbvHBFOqaHmkVRNuqkywhNNyhgpYbus2PlKy21e0BsnsgmVB VwyGyVpDevOcNUKl/tdmtg1Da2MU3xsfKsz02jnQDh0144nutamKNaRX8o1ikXNxFHqZ 7V2w== X-Gm-Message-State: ACrzQf2W19a9SLrctZWKlgcN93DBxrT9OA4lIaMxnDNB291KtD50/fB7 08NU0bPNmrQpixIQ6m1LXw4s6bPKIWj9Tqld X-Received: by 2002:a17:902:f685:b0:186:fa9c:2fdc with SMTP id l5-20020a170902f68500b00186fa9c2fdcmr1086126plg.25.1666993410580; Fri, 28 Oct 2022 14:43:30 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id u6-20020a17090a1d4600b002130c269b6fsm2993855pju.1.2022.10.28.14.43.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 28 Oct 2022 14:43:30 -0700 (PDT) From: Jens Axboe To: linux-kernel@vger.kernel.org, netdev@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 1/5] eventpoll: cleanup branches around sleeping for events Date: Fri, 28 Oct 2022 15:43:21 -0600 Message-Id: <20221028214325.13496-2-axboe@kernel.dk> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20221028214325.13496-1-axboe@kernel.dk> References: <20221028214325.13496-1-axboe@kernel.dk> MIME-Version: 1.0 X-Spam-Status: No, score=1.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_NONE,RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: 
=?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747969582237234331?= X-GMAIL-MSGID: =?utf-8?q?1747969582237234331?= Rather than have two separate branches here, collapse them into a single one instead. No functional changes here, just a cleanup in preparation for changes in this area. Signed-off-by: Jens Axboe --- fs/eventpoll.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 52954d4637b5..3061bdde6cba 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -1869,14 +1869,15 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, * important. */ eavail = ep_events_available(ep); - if (!eavail) + if (!eavail) { __add_wait_queue_exclusive(&ep->wq, &wait); - - write_unlock_irq(&ep->lock); - - if (!eavail) + write_unlock_irq(&ep->lock); timed_out = !schedule_hrtimeout_range(to, slack, HRTIMER_MODE_ABS); + } else { + write_unlock_irq(&ep->lock); + } + __set_current_state(TASK_RUNNING); /* From patchwork Fri Oct 28 21:43:22 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 12574 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1059137wru; Fri, 28 Oct 2022 14:48:56 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4N8LZTSGmI3OSh3j1oHQsKqpe+9wPljkEFBsCn5HUAzu3nhzhM3j06PRDSO/6mwUcjpSFT X-Received: by 2002:a17:906:3607:b0:7ad:a798:cdc0 with SMTP id q7-20020a170906360700b007ada798cdc0mr1220021ejb.357.1666993736214; Fri, 28 Oct 2022 14:48:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666993736; cv=none; d=google.com; s=arc-20160816; b=iTClhYoyWQNZyFLOoEsLc7N3d0zEshFNloTjVX+49GuZ2pbzVVndY09Jd4+Cc+Lbrl o0QWnnZEzOBh9TJPADuFba7ovGGxu755hstvkqfiZRzRpLwEybFqxMRTEJa4pXD9GCL1 qEj7mU7ZQNnx0VG/Q8P6cxtvPVHUSTFyo3eqJA2DcruS6yjA1JKKh3UsLoqFXRQF6EnZ DlBxeED8Bx9DCcYr6ZfOl8SItryZRHlA6dhmIXPXE0u/Qm4ydWdlRcy3WBMs4RtIak0+ 
6/nSaKdaaa/7sxAdhh8eFMltlnJ6Jt8KtDKL45263BL/ixtBPvhpqvGrQFQl0uqRKCw5 iqtA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=tiguh3u9eggshsoZh3GPe1CsJMkdP/JI4kBBBED1cmY=; b=UtPCYu0FRXCEzBHXeiaxaMsZOekWTYeyIR5e9wxPthttz6kTF5oRvKgUD50X1BGAHO AUG+dC+NuTDSvAqtztPRd6yFPsifM/vTaweY07FsAId/wHfbOD9pmFqcx9it0EH/P309 6/gZVXA+l2+f2+8csQfjaZgYhlam+e/PtYLGD6f9HIYN6BfsnZiSUGZsg1uhI69vL6KB /GDRj0ytUn7SvKOYlDkT3XIXCL8UzEieLN/T78h7o+KYyBMedHIKrJ0uOgn0iFGrO5ht d1QJetBhtEPsjnYffdTRf4SbLm4o/qjxbETdp+3Z0g/8MJ/YdHZ7Y8EXrVKhaZrjkX1j DBkQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel-dk.20210112.gappssmtp.com header.s=20210112 header.b=hx1oQADE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id n14-20020a05640205ce00b00461fa05b004si6448849edx.105.2022.10.28.14.48.33; Fri, 28 Oct 2022 14:48:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel-dk.20210112.gappssmtp.com header.s=20210112 header.b=hx1oQADE; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229862AbiJ1Vnq (ORCPT + 99 others); Fri, 28 Oct 2022 17:43:46 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39892 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230015AbiJ1Vnd (ORCPT ); Fri, 28 Oct 2022 17:43:33 -0400 Received: from mail-pj1-x102f.google.com (mail-pj1-x102f.google.com [IPv6:2607:f8b0:4864:20::102f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6E9FA24BA88 for ; Fri, 28 Oct 2022 14:43:32 -0700 (PDT) Received: by mail-pj1-x102f.google.com with SMTP id b11so5739766pjp.2 for ; Fri, 28 Oct 2022 14:43:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tiguh3u9eggshsoZh3GPe1CsJMkdP/JI4kBBBED1cmY=; b=hx1oQADEdyNFaX9tPTLZ7yu68JZNzS/ZRcV4/4ftjbbkrAqlVDlR59NVfME3EASL7q SmUZXZegPFn0vYGmx5BxprclqeaTTW4QnM2arglNSSFwKQrl1iodtZu8BXYqFEqnVuFi ljvLjzPtuZoTpUh1IE0dt0gDTqUuRuJZSvKmAldJaFcuvir511BxZioWBr8snZtpOmNU kGRxJs0HfTGxIVuRLAEQsgxVWerlzfBzFwlw1DdShu+GHUP2kGzSd8BQMNa4mjKlAhrd 4Xx36O4qXEWvcpXh4xVohKWyW6vQoZAyKJl8g8ZBtW4724mea8IKCS9fUeXxyXQ3S0AK SwRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tiguh3u9eggshsoZh3GPe1CsJMkdP/JI4kBBBED1cmY=; b=PWhpEskgsKfnHwgo6xpnEJTKnEdENFlYC5LH546PrL9Ga9DRxhsKFyxpHRfWDgej8e e9thchDMOZEaSFkUX2/GVtr7mVbxLBvyyrTs24R6YnujK0NfNjs5hB8Z6cnSBYX540u/ C6tborJgwwW8rFtrj4JoreSGyd7I4Ir0XipxG5ic+c8RYj0Gf9On6eqserXO5c53vs6k ek5yA33Jo7V5Qa1SHUIqXHdVamD8GlxR4f5/9A4G8cHvQN+xEb5kOXKojSq1DcXz5e6g 5unZ7ncANUQZe3euhPamlyMDhHRfci0bXpDr3zuU/oWJ5pMgtdTKe34f68KKpltkGoVb VxCA== X-Gm-Message-State: ACrzQf0i2ZFuwjrkqmZQ0EHfkZThDFlkAeK1sUm06M9DeOggyoBnl0N4 2o6FJnnTUWELXz1WtIkDlTPb5NkM5MJ78dZ0 X-Received: by 2002:a17:90a:4ece:b0:213:1130:ca9c with SMTP id v14-20020a17090a4ece00b002131130ca9cmr17984235pjl.17.1666993411637; Fri, 28 Oct 2022 14:43:31 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id u6-20020a17090a1d4600b002130c269b6fsm2993855pju.1.2022.10.28.14.43.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 28 Oct 2022 14:43:31 -0700 (PDT) From: Jens Axboe To: linux-kernel@vger.kernel.org, netdev@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 2/5] eventpoll: split out wait handling Date: Fri, 28 Oct 2022 15:43:22 -0600 Message-Id: <20221028214325.13496-3-axboe@kernel.dk> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20221028214325.13496-1-axboe@kernel.dk> References: <20221028214325.13496-1-axboe@kernel.dk> MIME-Version: 1.0 X-Spam-Status: No, score=1.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_NONE,RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747969624139229520?= 
X-GMAIL-MSGID: =?utf-8?q?1747969624139229520?= In preparation for making changes to how wakeups and sleeps are done, move the timeout scheduling into a helper and manage it rather than rely on schedule_hrtimeout_range(). Signed-off-by: Jens Axboe --- fs/eventpoll.c | 70 ++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 56 insertions(+), 14 deletions(-) diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 3061bdde6cba..f53bb4ec9e91 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -1762,6 +1762,47 @@ static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry, return ret; } +struct epoll_wq { + wait_queue_entry_t wait; + struct hrtimer timer; + bool timed_out; +}; + +static enum hrtimer_restart ep_timer(struct hrtimer *timer) +{ + struct epoll_wq *ewq = container_of(timer, struct epoll_wq, timer); + struct task_struct *task = ewq->wait.private; + + ewq->timed_out = true; + wake_up_process(task); + return HRTIMER_NORESTART; +} + +static void ep_schedule(struct eventpoll *ep, struct epoll_wq *ewq, ktime_t *to, + u64 slack) +{ + if (ewq->timed_out) + return; + if (to && *to == 0) { + ewq->timed_out = true; + return; + } + if (!to) { + schedule(); + return; + } + + hrtimer_init_on_stack(&ewq->timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS); + ewq->timer.function = ep_timer; + hrtimer_set_expires_range_ns(&ewq->timer, *to, slack); + hrtimer_start_expires(&ewq->timer, HRTIMER_MODE_ABS); + + schedule(); + + hrtimer_cancel(&ewq->timer); + destroy_hrtimer_on_stack(&ewq->timer); +} + /** * ep_poll - Retrieves ready events, and delivers them to the caller-supplied * event buffer. 
@@ -1782,13 +1823,15 @@ static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry, static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, int maxevents, struct timespec64 *timeout) { - int res, eavail, timed_out = 0; + int res, eavail; u64 slack = 0; - wait_queue_entry_t wait; ktime_t expires, *to = NULL; + struct epoll_wq ewq; lockdep_assert_irqs_enabled(); + ewq.timed_out = false; + if (timeout && (timeout->tv_sec | timeout->tv_nsec)) { slack = select_estimate_accuracy(timeout); to = &expires; @@ -1798,7 +1841,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, * Avoid the unnecessary trip to the wait queue loop, if the * caller specified a non blocking operation. */ - timed_out = 1; + ewq.timed_out = 1; } /* @@ -1823,10 +1866,10 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, return res; } - if (timed_out) + if (ewq.timed_out) return 0; - eavail = ep_busy_loop(ep, timed_out); + eavail = ep_busy_loop(ep, ewq.timed_out); if (eavail) continue; @@ -1850,8 +1893,8 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, * performance issue if a process is killed, causing all of its * threads to wake up without being removed normally. 
*/ - init_wait(&wait); - wait.func = ep_autoremove_wake_function; + init_wait(&ewq.wait); + ewq.wait.func = ep_autoremove_wake_function; write_lock_irq(&ep->lock); /* @@ -1870,10 +1913,9 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, */ eavail = ep_events_available(ep); if (!eavail) { - __add_wait_queue_exclusive(&ep->wq, &wait); + __add_wait_queue_exclusive(&ep->wq, &ewq.wait); write_unlock_irq(&ep->lock); - timed_out = !schedule_hrtimeout_range(to, slack, - HRTIMER_MODE_ABS); + ep_schedule(ep, &ewq, to, slack); } else { write_unlock_irq(&ep->lock); } @@ -1887,7 +1929,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, */ eavail = 1; - if (!list_empty_careful(&wait.entry)) { + if (!list_empty_careful(&ewq.wait.entry)) { write_lock_irq(&ep->lock); /* * If the thread timed out and is not on the wait queue, @@ -1896,9 +1938,9 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, * Thus, when wait.entry is empty, it needs to harvest * events. 
*/ - if (timed_out) - eavail = list_empty(&wait.entry); - __remove_wait_queue(&ep->wq, &wait); + if (ewq.timed_out) + eavail = list_empty(&ewq.wait.entry); + __remove_wait_queue(&ep->wq, &ewq.wait); write_unlock_irq(&ep->lock); } } From patchwork Fri Oct 28 21:43:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 12571 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1058584wru; Fri, 28 Oct 2022 14:47:11 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4jpoY8J/yESU8FD4DKbhln0sPod5FP0P+F+hIogFPyj4paRtY3tRgVQhnGjPMBcdL6cjMu X-Received: by 2002:a17:907:c03:b0:781:fd5a:c093 with SMTP id ga3-20020a1709070c0300b00781fd5ac093mr1272527ejc.89.1666993631137; Fri, 28 Oct 2022 14:47:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666993631; cv=none; d=google.com; s=arc-20160816; b=LN3lH2d9C7rVsu2V3wTqfnZh0UlwTuitRu4iH/5MTmC4gU/b+elxDtLGCL/Sma9l9d BbonJcIXC5dcLSNbxeNSOya6fXVhptycCI7YgK3t/bMEChhrAffCL322c9fIObnw6DMZ UaK4d6vZIlYikwiVOYVYaGnbIjPcH3rPEJNKaVyOuoULKDQB8K3KsBTC6qqofIgZi+FN 2lUsuJ/HNCPpbv45cqWvqspcNEG5/xJ/3k5kDjKTimyV05yI/hnCf8FD2B4CYM2bY2ME 5UAp2iinCoqGXJdTKraPfDqrjj+42SQCtzmzrQLQamVb101641Jpvlerae57I8qHnBvF Ri1A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=MWMQ1OOnzvPZ3+daCqGm09tO9BG/1DEjXdpVAErFn0Q=; b=UUcI4f25EUNZjhYj15+/171Z2qBnWXjXmydUccf+EJfFteYnRtQvhrg7Oh2QYA6eNQ tv5Gzm8SllkrBl/XXXKvMNF6+VFOek9zGNBUZEyHmhn8ZCMZHxojVb+uiIy058FRKDiV +kUxwKHpXGcsFjFfc0UUsjabwZhB329U2/OJQb/TjO5UmHSXjFLOzz5mhkhxyj/QMhT6 bvI3mV1dp9jIXeq+QyqluggkSuGCvjtz/LryiH+zpH+luYY3sD9xuWpRfIBF2I7aHdj+ b/r3y+d5qkEKYpOHG4czbnEow8WFeQu8P2OmLQrk8TF/dVLDuiShxdo7lmktFnEChdc/ ksXw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass 
header.i=@kernel-dk.20210112.gappssmtp.com header.s=20210112 header.b=TXcQvK4Y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id my22-20020a1709065a5600b0077c5ec87ec2si4628076ejc.297.2022.10.28.14.46.46; Fri, 28 Oct 2022 14:47:11 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel-dk.20210112.gappssmtp.com header.s=20210112 header.b=TXcQvK4Y; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230092AbiJ1Vns (ORCPT + 99 others); Fri, 28 Oct 2022 17:43:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39856 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230003AbiJ1Vne (ORCPT ); Fri, 28 Oct 2022 17:43:34 -0400 Received: from mail-pj1-x102b.google.com (mail-pj1-x102b.google.com [IPv6:2607:f8b0:4864:20::102b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7D18B24BA98 for ; Fri, 28 Oct 2022 14:43:33 -0700 (PDT) Received: by mail-pj1-x102b.google.com with SMTP id c15-20020a17090a1d0f00b0021365864446so5645930pjd.4 for ; Fri, 28 Oct 2022 14:43:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=MWMQ1OOnzvPZ3+daCqGm09tO9BG/1DEjXdpVAErFn0Q=; b=TXcQvK4YM0rjh/gPGX+lSDLs/Ke3EOgKgIbxYBPbl5Bwlrr02MTRdmLdtc8qwoAkJM 
ZnTfMgGMHMSCWpxIVjkrwLMxMq7Aa5+X1Pw5Y0mYN4QS4pzHAUEb9rTS6GVTIGd9dnBw IEsmLKUjSwfELvRFwzBYFYfvUtbWTA7owJNYHUiYBArJRyJkWoP5yZp0qWjYQnpqE72K iIstp90EbFOvVeQZqEjtyiCUPVnAoOcZI/gUcdUuldBpo/+sb4MubpdLhHuez1mWbXOy e8nwkXwTzG56g5jJQZTktuB6nSknxKEWVF1c4LUCvz1ZBmeyPV3XQR7dc5pBLSjfIamR QWgA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MWMQ1OOnzvPZ3+daCqGm09tO9BG/1DEjXdpVAErFn0Q=; b=yTyNBSzoYCuV1fr84VRp1+gaItLhMEBVIrBkLvCidYI3h96mQlZixRMm6guUz3QGNu Msn+2XGcLCSS/lakSnMN5S0BDHGsozBmUsYinpOArjXErI8UG5inWTiO5r3Rn/xo17jB V/WTw56OJ4+7NxMrMvSswinfWjHl54eFL0kO+2b4a8sKV0Bx07w8dMpm5mV6dKImX2Zo eWOSE6olUXTSY+2Fi9+mSjTyyXClVhqGl9qJIkNeCC9bBM03MvPUxzsNcW2muLXAdkPx r95EU8GECtUzGKjIx5uOrQYa+XKlsqwnw8z1J7CVUNffBcfkzwzkPwtXM95bs4iujiIa 0GHA== X-Gm-Message-State: ACrzQf1jOQxhOVuD3nRoq4nSe3ZzBTIdGVKqatQ78mKgZr84SOyvBc5K ChIdEgHtVIfRRNG/CRQCxxPOfI95l1FYstp7 X-Received: by 2002:a17:902:e54b:b0:186:5fba:13a5 with SMTP id n11-20020a170902e54b00b001865fba13a5mr1073947plf.173.1666993412746; Fri, 28 Oct 2022 14:43:32 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id u6-20020a17090a1d4600b002130c269b6fsm2993855pju.1.2022.10.28.14.43.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 28 Oct 2022 14:43:32 -0700 (PDT) From: Jens Axboe To: linux-kernel@vger.kernel.org, netdev@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 3/5] eventpoll: move expires to epoll_wq Date: Fri, 28 Oct 2022 15:43:23 -0600 Message-Id: <20221028214325.13496-4-axboe@kernel.dk> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20221028214325.13496-1-axboe@kernel.dk> References: <20221028214325.13496-1-axboe@kernel.dk> MIME-Version: 1.0 X-Spam-Status: No, score=1.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, 
DKIM_VALID,RCVD_IN_DNSWL_NONE,RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747969513565786040?= X-GMAIL-MSGID: =?utf-8?q?1747969513565786040?= This makes the expiration available to the wakeup handler. No functional changes expected in this patch, purely in preparation for being able to use the timeout on the wakeup side. Signed-off-by: Jens Axboe --- fs/eventpoll.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/fs/eventpoll.c b/fs/eventpoll.c index f53bb4ec9e91..8b3c94ab7762 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -1765,6 +1765,7 @@ static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry, struct epoll_wq { wait_queue_entry_t wait; struct hrtimer timer; + ktime_t timeout_ts; bool timed_out; }; @@ -1825,7 +1826,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, { int res, eavail; u64 slack = 0; - ktime_t expires, *to = NULL; + ktime_t *to = NULL; struct epoll_wq ewq; lockdep_assert_irqs_enabled(); @@ -1834,7 +1835,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events, if (timeout && (timeout->tv_sec | timeout->tv_nsec)) { slack = select_estimate_accuracy(timeout); - to = &expires; + to = &ewq.timeout_ts; *to = timespec64_to_ktime(*timeout); } else if (timeout) { /* From patchwork Fri Oct 28 21:43:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 12572 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1058591wru; Fri, 28 Oct 2022 14:47:11 -0700 (PDT) X-Google-Smtp-Source: 
AMsMyM5bnZb6UIJkfyobyhyKM4lRaW9PbFjuIPayjBw7gs+QQ2+I9EUMrl6nJ7EbuWQWrc6JhSge X-Received: by 2002:a05:6402:2201:b0:44f:443e:2a78 with SMTP id cq1-20020a056402220100b0044f443e2a78mr1457171edb.76.1666993631842; Fri, 28 Oct 2022 14:47:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666993631; cv=none; d=google.com; s=arc-20160816; b=mYBCrr7J8KiorGmEvjO11vJczsuMBlvXUSeDxH9Wv7KD+LKbentYy2Y/PjDGeAn7zr Y09pGtiu+ctog7hhBkyDpdpvp1kK4Alj7bKX9mloZy6k2UzqBRuzji2hTfQcnT7jZ9pF m6CYClSciGFwK3qw5YR3l9TmWPKKtLcp7sD+GMREw8uydDvMd/Q7ez5Kr6hBPEPPRtTN 85NUrsuSww4Z63rjJWi+IYRRI0Uk9Nvt+5vwa1V3j+K0Nx9wwZ4aVQSq8pBUlT9P5hyZ PkiAnpccZQPzf5oM5qIHhXfeyv6adxxfUbNAsrVJ2cPKMYIU7Jp+Kxt5Qna0kBJwq116 nPuA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=oO43xlLma+tv8tmMXvsUkJytaNEXn7co0M840UITGjM=; b=R/dp25aVnflM17vkp4IlNuv9sj8kdn0OzFWX+tKrOfHSIC9MFCKc7C3gcL6NChg6cm sfuAlH+YRvWV6QgcA6F/1Id5YNw73MK5M6ZNd0ADR/BC0XYazzy62OnGwsHinAD3GwKs kF7taepz61s/DYTAnIy7OJhDrLqsGzeXDvIHzYUjuD6hOthRuVShO9oMewRtahkCOEw3 MuzHPkZ40KvXv9swpEeiWB6AY8d5a5qheENd826oS9+D+Ypzb7W1ZuNYQdyrA3oqBNY2 EqohUQ0YuJhLCSfW8bFjcDrQcHeHZps4EsQppOgAt7I/ugaEQ/RIMbzulT1VhXnA1ghM gXsg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel-dk.20210112.gappssmtp.com header.s=20210112 header.b=acHh5cwg; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id hs9-20020a1709073e8900b0072a6c18f1fasi6550175ejc.639.2022.10.28.14.46.47; Fri, 28 Oct 2022 14:47:11 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel-dk.20210112.gappssmtp.com header.s=20210112 header.b=acHh5cwg; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229665AbiJ1Vnw (ORCPT + 99 others); Fri, 28 Oct 2022 17:43:52 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39856 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230039AbiJ1Vnf (ORCPT ); Fri, 28 Oct 2022 17:43:35 -0400 Received: from mail-pf1-x42a.google.com (mail-pf1-x42a.google.com [IPv6:2607:f8b0:4864:20::42a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A473624AE26 for ; Fri, 28 Oct 2022 14:43:34 -0700 (PDT) Received: by mail-pf1-x42a.google.com with SMTP id e4so5867744pfl.2 for ; Fri, 28 Oct 2022 14:43:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=oO43xlLma+tv8tmMXvsUkJytaNEXn7co0M840UITGjM=; b=acHh5cwgkAmQ3dtoHxW9mLeYEZVag9qz5bdERqpVL2WrAMRtFTbFR4LDwgRwGSBlIQ 5CjocgoZo+dqh/Kn1Gdf/IIsiyaMWIaedyvrwHGQnNYO+Boy5qfbvJJKmEoxO/GOR8/S wOHqfc4E0Q47fcdWM9poaiM9muj6DDd5I3qsbcUkrF5oL7CbaEgCZU2bWpRU/2OHKv7k UiYt0WpcrTEUbi9+89xe3KRybHGilPnNUEDqTL7C9nlv0GTP/eE41KOyezm2dNi93MZP hZgJcWKCT41ljDihKIHqSsW2tbgVuJrYN/0ceMa68ovP+M7Q+Z9AgCda4s7+zAV0nAVW +OPg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=oO43xlLma+tv8tmMXvsUkJytaNEXn7co0M840UITGjM=; b=JLaENCwgK90MBDv5sDuejn+r7g7vDLsrRdwkWFVNLUntk46kvF2od9u9zrZD69yK6T x/LrVm4TrNjNuWZCZ1NeUWsxhzP5kQvGW52WcscK0CX15JL9+5ErXila9ulJ4t10cuOP qLUrzOi1dpdYyAv0+BJuzmX6TPFKLOwq/DQtkRJzAZVni4jJwMOSQkFzVyyUg3USUbtK Ic5J5geMpJRTQJP+XAwrE2c6Nni6lqK5sDROVBUaDyVzkvFNZICs/4hZE+oxfYSyqbH3 ufqeOm2cwzdqtlXWw4gN/9aw9tkvqsLmAUBDvpxoXJ/Job9jKde+pQIXDiIFXzJTBcDw Ht0Q== X-Gm-Message-State: ACrzQf0ztir3K12UFOXY+IIhsK9L65xs729iCQTYvp5ZhN2joddzvK4b Z8t6F2Znt1tQSH5zLhN5hZSz3vV1hXWPwWNt X-Received: by 2002:a63:d241:0:b0:43c:474c:c6c6 with SMTP id t1-20020a63d241000000b0043c474cc6c6mr1342605pgi.523.1666993413875; Fri, 28 Oct 2022 14:43:33 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id u6-20020a17090a1d4600b002130c269b6fsm2993855pju.1.2022.10.28.14.43.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 28 Oct 2022 14:43:33 -0700 (PDT) From: Jens Axboe To: linux-kernel@vger.kernel.org, netdev@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 4/5] eventpoll: move file checking earlier for epoll_ctl() Date: Fri, 28 Oct 2022 15:43:24 -0600 Message-Id: <20221028214325.13496-5-axboe@kernel.dk> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20221028214325.13496-1-axboe@kernel.dk> References: <20221028214325.13496-1-axboe@kernel.dk> MIME-Version: 1.0 X-Spam-Status: No, score=1.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_NONE,RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.6 X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: 
=?utf-8?q?1747969514606788469?= X-GMAIL-MSGID: =?utf-8?q?1747969514606788469?= This just cleans up the checking a bit, in preparation for a change that will need access to 'ep' earlier. Signed-off-by: Jens Axboe --- fs/eventpoll.c | 26 ++++++++++++++++---------- 1 file changed, 16 insertions(+), 10 deletions(-) diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 8b3c94ab7762..cd2138d02bda 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -2111,6 +2111,20 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds, if (!f.file) goto error_return; + /* + * We have to check that the file structure underneath the file + * descriptor the user passed to us _is_ an eventpoll file. + */ + error = -EINVAL; + if (!is_file_epoll(f.file)) + goto error_fput; + + /* + * At this point it is safe to assume that the "private_data" contains + * our own data structure. + */ + ep = f.file->private_data; + /* Get the "struct file *" for the target file */ tf = fdget(fd); if (!tf.file) @@ -2126,12 +2140,10 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds, ep_take_care_of_epollwakeup(epds); /* - * We have to check that the file structure underneath the file descriptor - * the user passed to us _is_ an eventpoll file. And also we do not permit - * adding an epoll file descriptor inside itself. + * We do not permit adding an epoll file descriptor inside itself. */ error = -EINVAL; - if (f.file == tf.file || !is_file_epoll(f.file)) + if (f.file == tf.file) goto error_tgt_fput; /* @@ -2147,12 +2159,6 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds, goto error_tgt_fput; } - /* - * At this point it is safe to assume that the "private_data" contains - * our own data structure. 
-	 */
-	ep = f.file->private_data;
-
 	/*
 	 * When we insert an epoll file descriptor inside another epoll file
 	 * descriptor, there is the chance of creating closed loops, which are

From patchwork Fri Oct 28 21:43:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jens Axboe X-Patchwork-Id: 12575 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp1059242wru; Fri, 28 Oct 2022 14:49:29 -0700 (PDT) X-Google-Smtp-Source: AMsMyM488+nfq286eSa/3nJ1+0hHbVj3QgfFoPyc0eK1dSWYwYD0kBVVfA5brTkSsJLTU+pVLeYR X-Received: by 2002:a05:6402:428d:b0:460:b26c:82a5 with SMTP id g13-20020a056402428d00b00460b26c82a5mr1516881edc.66.1666993769429; Fri, 28 Oct 2022 14:49:29 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666993769; cv=none; d=google.com; s=arc-20160816; b=iM4k9hyHnD9lsadFVRao3RmWR30TXWRIlGjjW1BbFTOQUHOdsQjUvqqYd99TWx0/hO e3dWOiS9LjRq2PR8+0LeXEGA7IloEHP4MyEZU18TgaHy9xYGSLbuqbOeRAZoYnl0aKBP onJV58OyMzc3OTfzV4W/7TLv+5BTPzfZFlmc3wTEUYj7XDPhdyD00IrllgMsbz6EaHQU BexAPyAdPGOwrQ42X38DPcFvxUcRJKHyM140i7903UjZ1LglXsQrqhuIvuCGhyoqhDHS GG63ILen4J83YJqB8l1Vp29Uu/5b/yMDr9X8nUWB4ethX4YPRCo+eRw+8nPf25CgPsR8 2qEA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=2NM4YfZaBNFzd1+6iQahn7Cz4qt0108M3OXrThmpXB0=; b=t/IU3xNy/EZ6LCVLDgZZdxgxRZgteP81SaD14+a/7SXuG8AUGiPKzUmU31yety2gib +lPio5XZvvjhtNbx2KBH41U3xmemzwd9gUX1qlPENRvN0sMceIyAld8Juq1Fy5C3e4oE xwx5ewA7+HrzviIbKl+d/UKykfDiFkjUTngW7lhBSs4A+Km5blI1uoeU5YeOIjvQEQm6 +nHyTn38pK75LgGTfjAfxysYLY3P0tInW0rx3id5IQB81BLvFhfESUj1t32BpWMRUfO/ tGk3mXaf8GmKiYgBKvFgaIbyn6cS2pEgWo5JqDcrTKKCJDJj4Ff/hT8CnC2CdTFgJueB 0oxA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel-dk.20210112.gappssmtp.com header.s=20210112 header.b=QSevLRS9; spf=pass
(google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id f17-20020a50a6d1000000b0045d8bff7b1asi5207508edc.403.2022.10.28.14.49.05; Fri, 28 Oct 2022 14:49:29 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel-dk.20210112.gappssmtp.com header.s=20210112 header.b=QSevLRS9; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230103AbiJ1Vn4 (ORCPT + 99 others); Fri, 28 Oct 2022 17:43:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39846 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230013AbiJ1Vnh (ORCPT ); Fri, 28 Oct 2022 17:43:37 -0400 Received: from mail-pj1-x102d.google.com (mail-pj1-x102d.google.com [IPv6:2607:f8b0:4864:20::102d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E6C5124BA90 for ; Fri, 28 Oct 2022 14:43:35 -0700 (PDT) Received: by mail-pj1-x102d.google.com with SMTP id 3-20020a17090a0f8300b00212d5cd4e5eso11019170pjz.4 for ; Fri, 28 Oct 2022 14:43:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=2NM4YfZaBNFzd1+6iQahn7Cz4qt0108M3OXrThmpXB0=; b=QSevLRS9OTiO72mHM8DveErxfnY+PPoTQqAemtA7ZmP7zXeOfUkue/zXHJ8/mkdf0m sXP4l2vc/RUP6ag7sC0cZVDBjKTWfOMOdb8zhLO+kQ+58pJPfBkkDKUlAI9KsZN5MNzD 
AWVJfLl1GYoZJrFJCyW2UjuBirgE9HkkKejDzGQ8Ks9YfPMiZm62NK7f6LcuARcx0NUC aojtrmB9RMAf0YVCzztXWbarybI2q8PTtvsHH2nluNW8M8IbNGqbXW/r2R/npCK4kwA4 9a3OW6U/OJalaH0ZSbjkGoArC/dCB1UWBi5VTLtUNwt4/5m4Nxd8OcKP4KcyUC5nUqk0 JhTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2NM4YfZaBNFzd1+6iQahn7Cz4qt0108M3OXrThmpXB0=; b=46bgZ0S0pyyOAVJU4oO25shovnoUEXIV15NmOJEveryV0U5p1U6/itrUu4LNR6CtH5 Qv/CL9QfiJVUPAPKIQBgCrOv2Kjr6oII90xg/MXkWNSQe2t3PBx77CwBM74AmbyoXW6A wfJZ3TDpV8jdirsLGB1bGqYpvLNbKfPSH34DLqrXQVTZHCCIxWP/ZkTDR+VOK7lE66CH VEv/5SmM2uzzr02b7MxQI/W42a06dOGwbcbDQXEXF1uoanQflPwfc1d66IGOQCWwkx/U UVOZXPW7ubhM+fr/q9fYO3T+vBDjRDOWWM8DoxbnZzAv2robu/49HFGqfeGp1J1fqClg OtNQ== X-Gm-Message-State: ACrzQf0CU8mxfeNFtpH2SwxvZ4MQw6zBmS9drd6rO/L5A7lT/7Cb1jMh ySZMyuhEHGLiBPP3lnCZATpMIN/Jlrlt/dcX X-Received: by 2002:a17:90b:1d0f:b0:20d:1ec3:f732 with SMTP id on15-20020a17090b1d0f00b0020d1ec3f732mr1385609pjb.84.1666993415000; Fri, 28 Oct 2022 14:43:35 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id u6-20020a17090a1d4600b002130c269b6fsm2993855pju.1.2022.10.28.14.43.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 28 Oct 2022 14:43:34 -0700 (PDT) From: Jens Axboe To: linux-kernel@vger.kernel.org, netdev@vger.kernel.org Cc: Jens Axboe Subject: [PATCH 5/5] eventpoll: add support for min-wait Date: Fri, 28 Oct 2022 15:43:25 -0600 Message-Id: <20221028214325.13496-6-axboe@kernel.dk> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20221028214325.13496-1-axboe@kernel.dk> References: <20221028214325.13496-1-axboe@kernel.dk> MIME-Version: 1.0 X-Spam-Status: No, score=1.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,RCVD_IN_DNSWL_NONE,RCVD_IN_SBL_CSS,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.6 
X-Spam-Level: * X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747969658797239475?= X-GMAIL-MSGID: =?utf-8?q?1747969658797239475?=

Rather than just have a timeout value for waiting on events, add
EPOLL_CTL_MIN_WAIT to allow setting a minimum time that epoll_wait()
should always wait for events to arrive.

To improve efficiency at medium load, some production workloads inject
artificial timers or sleeps before calling epoll_wait() to get better
batching. While this does help, it's not as efficient as it could be.
By supporting this directly in epoll_wait(), we can avoid extra context
switches as well as scheduler and timer overhead.

As an example, running an A/B test on an identical workload at ~370K
reqs/second, without this change and with the sleep hack mentioned
above (using 200 usec as the timeout), we're doing 310K-340K
non-voluntary context switches per second. Idle CPU on the host is
27-34%. With the sleep hack removed and epoll set to the same 200 usec
value, we're handling the exact same load but at 292K-315K non-voluntary
context switches per second and idle CPU of 33-41%, a substantial win.
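
For illustration, the userspace "sleep hack" described above might look roughly like the following. This is only a sketch: wait_batched() and its batch_usec parameter are hypothetical names, not taken from any real workload, and it relies only on the existing epoll_wait() and usleep() calls.

```c
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Hypothetical helper illustrating the userspace workaround: sleep for
 * batch_usec first so that more events can accumulate, then collect
 * whatever is ready. The sleep is a separate blocking call in front of
 * epoll_wait(), which is the extra work min-wait is meant to fold into
 * epoll_wait() itself.
 */
static int wait_batched(int efd, struct epoll_event *events, int maxevents,
			unsigned int batch_usec, int timeout_ms)
{
	if (batch_usec)
		usleep(batch_usec);

	return epoll_wait(efd, events, maxevents, timeout_ms);
}
```

A caller would use wait_batched(efd, events, maxevents, 200, -1) in place of a plain epoll_wait() call to approximate the 200 usec setting used in the numbers above.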
Basic test case:

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/epoll.h>

/* New op added by this patch, so not in existing libc headers */
#ifndef EPOLL_CTL_MIN_WAIT
#define EPOLL_CTL_MIN_WAIT	4
#endif

struct d {
	int p1, p2;
};

static void *fn(void *data)
{
	struct d *d = data;
	char b = 0x89;

	/* Generate 2 events 10 msec apart, the second one at 20 msec */
	usleep(10000);
	write(d->p1, &b, sizeof(b));
	usleep(10000);
	write(d->p2, &b, sizeof(b));

	return NULL;
}

int main(int argc, char *argv[])
{
	struct epoll_event ev, events[2];
	pthread_t thread;
	int p1[2], p2[2];
	struct d d;
	int efd, ret;

	efd = epoll_create1(0);
	if (efd < 0) {
		perror("epoll_create");
		return 1;
	}

	if (pipe(p1) < 0) {
		perror("pipe");
		return 1;
	}
	if (pipe(p2) < 0) {
		perror("pipe");
		return 1;
	}

	ev.events = EPOLLIN;
	ev.data.fd = p1[0];
	if (epoll_ctl(efd, EPOLL_CTL_ADD, p1[0], &ev) < 0) {
		perror("epoll add");
		return 1;
	}
	ev.events = EPOLLIN;
	ev.data.fd = p2[0];
	if (epoll_ctl(efd, EPOLL_CTL_ADD, p2[0], &ev) < 0) {
		perror("epoll add");
		return 1;
	}

	/* always wait 200 msec for events */
	ev.data.u64 = 200000;
	if (epoll_ctl(efd, EPOLL_CTL_MIN_WAIT, -1, &ev) < 0) {
		perror("epoll add set timeout");
		return 1;
	}

	d.p1 = p1[1];
	d.p2 = p2[1];
	pthread_create(&thread, NULL, fn, &d);

	/* expect to get 2 events here rather than just 1 */
	ret = epoll_wait(efd, events, 2, -1);
	printf("epoll_wait=%d\n", ret);

	return 0;
}

Signed-off-by: Jens Axboe
---
 fs/eventpoll.c                 | 100 ++++++++++++++++++++++++++++-----
 include/linux/eventpoll.h      |   2 +-
 include/uapi/linux/eventpoll.h |   1 +
 3 files changed, 87 insertions(+), 16 deletions(-)

diff --git a/fs/eventpoll.c b/fs/eventpoll.c
index cd2138d02bda..828e2b9771d6 100644
--- a/fs/eventpoll.c
+++ b/fs/eventpoll.c
@@ -117,6 +117,9 @@ struct eppoll_entry {
 	/* The "base" pointer is set to the container "struct epitem" */
 	struct epitem *base;
 
+	/* min wait time if (min_wait_ts) & 1 != 0 */
+	ktime_t min_wait_ts;
+
 	/*
 	 * Wait queue item that will be linked to the target file wait
 	 * queue head.
@@ -217,6 +220,9 @@ struct eventpoll {
 	u64 gen;
 	struct hlist_head refs;
 
+	/* min wait for epoll_wait() */
+	unsigned int min_wait_ts;
+
 #ifdef CONFIG_NET_RX_BUSY_POLL
 	/* used to track busy poll napi_id */
 	unsigned int napi_id;
@@ -1747,6 +1753,32 @@ static struct timespec64 *ep_timeout_to_timespec(struct timespec64 *to, long ms)
 	return to;
 }
 
+struct epoll_wq {
+	wait_queue_entry_t wait;
+	struct hrtimer timer;
+	ktime_t timeout_ts;
+	ktime_t min_wait_ts;
+	struct eventpoll *ep;
+	bool timed_out;
+	int maxevents;
+	int wakeups;
+};
+
+static bool ep_should_min_wait(struct epoll_wq *ewq)
+{
+	if (ewq->min_wait_ts & 1) {
+		/* just an approximation */
+		if (++ewq->wakeups >= ewq->maxevents)
+			goto stop_wait;
+		if (ktime_before(ktime_get_ns(), ewq->min_wait_ts))
+			return true;
+	}
+
+stop_wait:
+	ewq->min_wait_ts &= ~(u64) 1;
+	return false;
+}
+
 /*
  * autoremove_wake_function, but remove even on failure to wake up, because we
  * know that default_wake_function/ttwu will only fail if the thread is already
@@ -1756,27 +1788,37 @@ static struct timespec64 *ep_timeout_to_timespec(struct timespec64 *to, long ms)
 static int ep_autoremove_wake_function(struct wait_queue_entry *wq_entry,
 				       unsigned int mode, int sync, void *key)
 {
-	int ret = default_wake_function(wq_entry, mode, sync, key);
+	struct epoll_wq *ewq = container_of(wq_entry, struct epoll_wq, wait);
+	int ret;
+
+	/*
+	 * If min wait time hasn't been satisfied yet, keep waiting
+	 */
+	if (ep_should_min_wait(ewq))
+		return 0;
 
+	ret = default_wake_function(wq_entry, mode, sync, key);
 	list_del_init(&wq_entry->entry);
 	return ret;
 }
 
-struct epoll_wq {
-	wait_queue_entry_t wait;
-	struct hrtimer timer;
-	ktime_t timeout_ts;
-	bool timed_out;
-};
-
 static enum hrtimer_restart ep_timer(struct hrtimer *timer)
 {
 	struct epoll_wq *ewq = container_of(timer, struct epoll_wq, timer);
 	struct task_struct *task = ewq->wait.private;
+	const bool is_min_wait = ewq->min_wait_ts & 1;
+
+	if (!is_min_wait || ep_events_available(ewq->ep)) {
+		if (!is_min_wait)
+			ewq->timed_out = true;
+		ewq->min_wait_ts &= ~(u64) 1;
+		wake_up_process(task);
+		return HRTIMER_NORESTART;
+	}
 
-	ewq->timed_out = true;
-	wake_up_process(task);
-	return HRTIMER_NORESTART;
+	ewq->min_wait_ts &= ~(u64) 1;
+	hrtimer_set_expires_range_ns(&ewq->timer, ewq->timeout_ts, 0);
+	return HRTIMER_RESTART;
 }
 
 static void ep_schedule(struct eventpoll *ep, struct epoll_wq *ewq, ktime_t *to,
@@ -1831,12 +1873,14 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 
 	lockdep_assert_irqs_enabled();
 
+	ewq.ep = ep;
 	ewq.timed_out = false;
+	ewq.maxevents = maxevents;
+	ewq.wakeups = 0;
 
 	if (timeout && (timeout->tv_sec | timeout->tv_nsec)) {
 		slack = select_estimate_accuracy(timeout);
-		to = &ewq.timeout_ts;
-		*to = timespec64_to_ktime(*timeout);
+		ewq.timeout_ts = timespec64_to_ktime(*timeout);
 	} else if (timeout) {
 		/*
 		 * Avoid the unnecessary trip to the wait queue loop, if the
@@ -1845,6 +1889,21 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 		ewq.timed_out = 1;
 	}
 
+	/*
+	 * If min_wait is set for this epoll instance, note the min_wait
+	 * time. Ensure the lowest bit is set in ewq.min_wait_ts, that's
+	 * the state bit for whether or not min_wait is enabled.
+	 */
+	if (ep->min_wait_ts) {
+		ewq.min_wait_ts = ktime_add_us(ktime_get_ns(),
+						ep->min_wait_ts);
+		ewq.min_wait_ts |= (u64) 1;
+		to = &ewq.min_wait_ts;
+	} else {
+		ewq.min_wait_ts = 0;
+		to = &ewq.timeout_ts;
+	}
+
 	/*
 	 * This call is racy: We may or may not see events that are being added
 	 * to the ready list under the lock (e.g., in IRQ callbacks). For cases
@@ -1913,7 +1972,7 @@ static int ep_poll(struct eventpoll *ep, struct epoll_event __user *events,
 		 * important.
 		 */
 		eavail = ep_events_available(ep);
-		if (!eavail) {
+		if (!eavail || ewq.min_wait_ts & 1) {
 			__add_wait_queue_exclusive(&ep->wq, &ewq.wait);
 			write_unlock_irq(&ep->lock);
 			ep_schedule(ep, &ewq, to, slack);
@@ -2125,6 +2184,17 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
 	 */
 	ep = f.file->private_data;
 
+	/*
+	 * Handle EPOLL_CTL_MIN_WAIT upfront as we don't need to care about
+	 * the fd being passed in.
+	 */
+	if (op == EPOLL_CTL_MIN_WAIT) {
+		/* return old value */
+		error = ep->min_wait_ts;
+		ep->min_wait_ts = epds->data;
+		goto error_fput;
+	}
+
 	/* Get the "struct file *" for the target file */
 	tf = fdget(fd);
 	if (!tf.file)
@@ -2257,7 +2327,7 @@ SYSCALL_DEFINE4(epoll_ctl, int, epfd, int, op, int, fd,
 {
 	struct epoll_event epds;
 
-	if (ep_op_has_event(op) &&
+	if ((ep_op_has_event(op) || op == EPOLL_CTL_MIN_WAIT) &&
 	    copy_from_user(&epds, event, sizeof(struct epoll_event)))
 		return -EFAULT;
 
diff --git a/include/linux/eventpoll.h b/include/linux/eventpoll.h
index 3337745d81bd..cbef635cb7e4 100644
--- a/include/linux/eventpoll.h
+++ b/include/linux/eventpoll.h
@@ -59,7 +59,7 @@ int do_epoll_ctl(int epfd, int op, int fd, struct epoll_event *epds,
 /* Tells if the epoll_ctl(2) operation needs an event copy from userspace */
 static inline int ep_op_has_event(int op)
 {
-	return op != EPOLL_CTL_DEL;
+	return op != EPOLL_CTL_DEL && op != EPOLL_CTL_MIN_WAIT;
 }
 
 #else
diff --git a/include/uapi/linux/eventpoll.h b/include/uapi/linux/eventpoll.h
index 8a3432d0f0dc..81ecb1ca36e0 100644
--- a/include/uapi/linux/eventpoll.h
+++ b/include/uapi/linux/eventpoll.h
@@ -26,6 +26,7 @@
 #define EPOLL_CTL_ADD 1
 #define EPOLL_CTL_DEL 2
 #define EPOLL_CTL_MOD 3
+#define EPOLL_CTL_MIN_WAIT	4
 
 /* Epoll event masks */
 #define EPOLLIN		(__force __poll_t)0x00000001
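
The state handling in ep_should_min_wait() may be easier to follow outside the kernel. The sketch below is a userspace approximation, not the kernel code: struct min_wait_state, min_wait_arm() and min_wait_should_wait() are hypothetical names, the current time is passed in explicitly as nanoseconds, and a plain integer comparison stands in for ktime_before(). It shows bit 0 of the deadline doubling as the "min-wait armed" flag, and the wait ending early once the wakeup count reaches maxevents.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the min-wait fields of struct epoll_wq. */
struct min_wait_state {
	uint64_t min_wait_ts;	/* deadline in ns; bit 0 set means "armed" */
	int maxevents;
	int wakeups;
};

/* Arm min-wait: deadline is now + min_wait_us, with bit 0 forced on. */
static void min_wait_arm(struct min_wait_state *s, uint64_t now_ns,
			 uint64_t min_wait_us, int maxevents)
{
	s->min_wait_ts = (now_ns + min_wait_us * 1000) | 1;
	s->maxevents = maxevents;
	s->wakeups = 0;
}

/*
 * Called once per wakeup. Returns true if the waiter should keep
 * sleeping. Otherwise clears the armed bit and returns false, either
 * because enough wakeups were seen or because the deadline has passed.
 */
static bool min_wait_should_wait(struct min_wait_state *s, uint64_t now_ns)
{
	if (s->min_wait_ts & 1) {
		/* wakeups only approximate the number of ready events */
		if (++s->wakeups < s->maxevents && now_ns < s->min_wait_ts)
			return true;
	}

	s->min_wait_ts &= ~(uint64_t)1;
	return false;
}
```

Keeping the flag in the low bit of the timestamp costs at most one nanosecond of precision on the deadline and avoids a separate boolean field.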