From patchwork Tue Apr 25 18:48:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 87527 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp3614354vqo; Tue, 25 Apr 2023 12:08:12 -0700 (PDT) X-Google-Smtp-Source: AKy350Z13NaDfkGOeT+5L/r2nr6eXmjD67wMtpWLrBLKxaG2wtIkSwUzy1tEZ5qe+lMMx8ePZXHI X-Received: by 2002:a05:6a00:892:b0:63b:6774:8f1d with SMTP id q18-20020a056a00089200b0063b67748f1dmr23936548pfj.31.1682449691939; Tue, 25 Apr 2023 12:08:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1682449691; cv=none; d=google.com; s=arc-20160816; b=xK9HI+ARZFwLDdm9m6v4BUJYxWs13rv7CPmFMmKy0SeIwRl37mqWRjk4RmZ3cDVZI/ nmJfmKup3Hm3J2Yo03w/CAGM0mvUgQD5fIhZOR2okxqt09cwLnCs6X8XCXt9FTGYP0I8 BUsJALHhSjR1NMwiRiF2uOzSqaYy5x1aSpeZhC9l94N/h/yO88uKO4luvgxdrHHQ5HIj ajUusdwMWaYTkfqBjOBCpSgTdDk/lsblTYCQK24j8a7Y2ZxXUOqhNTlP1m00KuZTBMVD p3W0P81K1Ih72PkiM/4bSdCyihFtgOgh5gFi2b978E6/yHWrQboL75iPtL3GJ0QDt6sJ 5GLw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:date:mime-version:references:subject:cc:to:from :dkim-signature:dkim-signature:message-id; bh=ma15CcBtPBAO08sKCc2bZL60SF+cskQ6NkR0WFr8uz4=; b=lrP7LaUyNr1ki4HPNWrS4cF9CukZifaq6oQxX9vBgsbr62lyhmvzI8/Go9lttWXQLw oTsaV0xEiLshDw2kJhLHuSSnpxHU3TMMaHsW9SmYw/vAMfRifUJ5exSl+hsD3Um16q+8 MOyboqDO/NhZbohLs8mFc7oX/TFohHMqKsjcDuPtgK3i5kaEu1Klhp8vMzTORN5jk2D6 kvw8CUqQDMZrm8UqGjdezWTN4ygJuZkxYmMqy6WGMLCzgHhKE2/jxKYcfEFfOfcfo4gw SbxIV/iA0CvZmCjapzNW9qYhH7xgkfoXdeSyaBFstqhXn34rI8chwSmrHxaQxKf/394+ Axuw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=wXqXLZK0; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b="qoln3u/j"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id p20-20020a63c154000000b0050300b179f3si13811448pgi.444.2023.04.25.12.07.56; Tue, 25 Apr 2023 12:08:11 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=wXqXLZK0; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b="qoln3u/j"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234720AbjDYStN (ORCPT + 99 others); Tue, 25 Apr 2023 14:49:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44658 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234679AbjDYStG (ORCPT ); Tue, 25 Apr 2023 14:49:06 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 800BE16F3C for ; Tue, 25 Apr 2023 11:49:00 -0700 (PDT) Message-ID: <20230425183312.932345089@linutronix.de> DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1682448539; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=ma15CcBtPBAO08sKCc2bZL60SF+cskQ6NkR0WFr8uz4=; b=wXqXLZK0I9S1eIVXdIW73RV90sJQGz+JCRCrczABAACivcLflQxHoUpEMchlSc1f5ROQ6/ PCa45iAF2VdtMYjV40MAbKsBITKjz1FGH61BNwMudqo5aesOxVGGo/VUt3Kt1tsEvh3x7y no6HKxjnr2njW6V1W/ElACb3cmotfqKznm28lI6zeK0d5FJChkY2kK4jjqMVDQGM8d8V/x 8WckZ/mUJ2xDUfFw/BT8TIHN8vujJURTILNMUlHnRchFhGPoXjhHihaU77MYDDuBZON55k rNyitzkSBXmE3DiWabte4FJQ/0UPhRyco4TElk4/vwA3CBOrwMWXcIdOf4a80A== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1682448539; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: references:references; bh=ma15CcBtPBAO08sKCc2bZL60SF+cskQ6NkR0WFr8uz4=; b=qoln3u/jsJxr2/gaQYE10qZLx/DDjeDSwFMjMKBCngltWOrSa+H1Ux1/vC+YlJTtND1ZQJ R/9J3lKVyanqYADw== From: Thomas Gleixner To: LKML Cc: Frederic Weisbecker , Anna-Maria Behnsen , Peter Zijlstra , syzbot+5c54bd3eb218bb595aa9@syzkaller.appspotmail.com, Dmitry Vyukov , Sebastian Siewior , Michael Kerrisk Subject: [patch 02/20] posix-timers: Ensure timer ID search-loop limit is valid References: <20230425181827.219128101@linutronix.de> MIME-Version: 1.0 Date: Tue, 25 Apr 2023 20:48:58 +0200 (CEST) X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1764176368182784315?= X-GMAIL-MSGID: =?utf-8?q?1764176368182784315?= posix_timer_add() tries to allocate a posix timer ID by starting from the cached ID which was stored by the last successful allocation. This is done in a loop searching the ID space for a free slot one by one. The loop has to terminate when the search wrapped around to the starting point. But that's racy vs. establishing the starting point. That is read out lockless, which leads to the following problem: CPU0 CPU1 posix_timer_add() start = sig->posix_timer_id; lock(hash_lock); ... posix_timer_add() if (++sig->posix_timer_id < 0) start = sig->posix_timer_id; sig->posix_timer_id = 0; So CPU1 can observe a negative start value, i.e. -1, and the loop break never happens because the condition can never be true: if (sig->posix_timer_id == start) break; While this is unlikely to ever turn into an endless loop as the ID space is huge (INT_MAX), the racy read of the start value caught the attention of KCSAN and Dmitry unearthed that incorrectness. Rewrite it so that the start condition can never observe the negative value and annotate the read and the write with READ_ONCE()/WRITE_ONCE(). Reported-by: syzbot+5c54bd3eb218bb595aa9@syzkaller.appspotmail.com Reported-by: Dmitry Vyukov Signed-off-by: Thomas Gleixner Signed-off-by:, hope it's ok: --- include/linux/sched/signal.h | 2 +- kernel/time/posix-timers.c | 30 +++++++++++++++++------------- 2 files changed, 18 insertions(+), 14 deletions(-) --- a/include/linux/sched/signal.h +++ b/include/linux/sched/signal.h @@ -135,7 +135,7 @@ struct signal_struct { #ifdef CONFIG_POSIX_TIMERS /* POSIX.1b Interval Timers */ - int posix_timer_id; + unsigned int next_posix_timer_id; struct list_head posix_timers; /* ITIMER_REAL timer for the process */ --- a/kernel/time/posix-timers.c +++ b/kernel/time/posix-timers.c @@ -140,25 +140,29 @@ static struct k_itimer *posix_timer_by_i static int posix_timer_add(struct k_itimer *timer) { struct signal_struct *sig = current->signal; - int first_free_id = sig->posix_timer_id; struct hlist_head *head; - int ret = -ENOENT; + unsigned int start, id; - do { + /* Can be written by a different task concurrently in the loop below */ + start = READ_ONCE(sig->next_posix_timer_id); + + for (id = ~start; start != id; id++) { spin_lock(&hash_lock); - head = &posix_timers_hashtable[hash(sig, sig->posix_timer_id)]; - if (!__posix_timers_find(head, sig, sig->posix_timer_id)) { + id = sig->next_posix_timer_id; + + /* Write the next ID back. Clamp it to the positive space */ + WRITE_ONCE(sig->next_posix_timer_id, (id + 1) & INT_MAX); + + head = &posix_timers_hashtable[hash(sig, id)]; + if (!__posix_timers_find(head, sig, id)) { hlist_add_head_rcu(&timer->t_hash, head); - ret = sig->posix_timer_id; + spin_unlock(&hash_lock); + return id; } - if (++sig->posix_timer_id < 0) - sig->posix_timer_id = 0; - if ((sig->posix_timer_id == first_free_id) && (ret == -ENOENT)) - /* Loop over all possible ids completed */ - ret = -EAGAIN; spin_unlock(&hash_lock); - } while (ret == -ENOENT); - return ret; + } + /* POSIX return code when no timer ID could be allocated */ + return -EAGAIN; } static inline void unlock_timer(struct k_itimer *timr, unsigned long flags)