From patchwork Mon Oct 24 17:44:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Waiman Long
X-Patchwork-Id: 9957
From: Waiman Long
To: Peter Zijlstra, Ingo Molnar, Will Deacon, Boqun Feng
Cc: linux-kernel@vger.kernel.org, john.p.donnelly@oracle.com,
    Hillf Danton, Mukesh Ojha, Ting11 Wang 王婷, Waiman Long,
    stable@vger.kernel.org
Subject: [PATCH v4 1/5] locking/rwsem: Prevent non-first waiter from
 spinning in down_write() slowpath
Date: Mon, 24 Oct 2022 13:44:14 -0400
Message-Id: <20221024174418.796468-2-longman@redhat.com>
In-Reply-To: <20221024174418.796468-1-longman@redhat.com>
References: <20221024174418.796468-1-longman@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

A non-first waiter can potentially spin in the for loop of
rwsem_down_write_slowpath() without sleeping but fail to acquire the
lock even if the rwsem is free, if the following sequence happens:

  Non-first RT waiter    First waiter      Lock holder
  -------------------    ------------      -----------
  Acquire wait_lock
  rwsem_try_write_lock():
    Set handoff bit if RT or
      wait too long
    Set waiter->handoff_set
  Release wait_lock
                         Acquire wait_lock
                         Inherit waiter->handoff_set
                         Release wait_lock
                                           Clear owner
                                           Release lock
  if (waiter.handoff_set) {
    rwsem_spin_on_owner();
    if (OWNER_NULL)
      goto trylock_again;
  }
  trylock_again:
  Acquire wait_lock
  rwsem_try_write_lock():
    if (first->handoff_set && (waiter != first))
      return false;
  Release wait_lock

A non-first waiter cannot really acquire the rwsem even if it mistakenly
believes that it can spin on an OWNER_NULL value. If that waiter happens
to be an RT task running on the same CPU as the first waiter, it can
block the first waiter from acquiring the rwsem, leading to a livelock.
Fix this problem by making sure that a non-first waiter cannot spin in
the slowpath loop without sleeping.

Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
Reviewed-and-tested-by: Mukesh Ojha
Signed-off-by: Waiman Long
Cc: stable@vger.kernel.org
---
 kernel/locking/rwsem.c | 19 +++++++++----------
 1 file changed, 9 insertions(+), 10 deletions(-)

diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 44873594de03..be2df9ea7c30 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -624,18 +624,16 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 			 */
 			if (first->handoff_set && (waiter != first))
 				return false;
-
-			/*
-			 * First waiter can inherit a previously set handoff
-			 * bit and spin on rwsem if lock acquisition fails.
-			 */
-			if (waiter == first)
-				waiter->handoff_set = true;
 		}
 
 		new = count;
 
 		if (count & RWSEM_LOCK_MASK) {
+			/*
+			 * A waiter (first or not) can set the handoff bit
+			 * if it is an RT task or wait in the wait queue
+			 * for too long.
+			 */
 			if (has_handoff || (!rt_task(waiter->task) &&
 					    !time_after(jiffies, waiter->timeout)))
 				return false;
@@ -651,11 +649,12 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
 	} while (!atomic_long_try_cmpxchg_acquire(&sem->count, &count, new));
 
 	/*
-	 * We have either acquired the lock with handoff bit cleared or
-	 * set the handoff bit.
+	 * We have either acquired the lock with handoff bit cleared or set
+	 * the handoff bit. Only the first waiter can have its handoff_set
+	 * set here to enable optimistic spinning in slowpath loop.
 	 */
 	if (new & RWSEM_FLAG_HANDOFF) {
-		waiter->handoff_set = true;
+		first->handoff_set = true;
 		lockevent_inc(rwsem_wlock_handoff);
 		return false;
 	}
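
For readers following along outside the kernel tree, here is a minimal
userspace sketch of the rule the patch enforces. The struct and helpers
below (struct waiter, grant_handoff(), slowpath_action()) are made-up
stand-ins for illustration, not the kernel's rwsem code: after the change,
the handoff grant can only land on the first waiter, so a non-first waiter
in the slowpath always sleeps instead of spinning and retrying the trylock.

/*
 * Illustrative userspace model only; the names here are not kernel APIs.
 */
#include <stdbool.h>
#include <stdio.h>

struct waiter {
	const char *name;
	bool handoff_set;	/* allowed to spin on owner and retry trylock */
};

/*
 * Stand-in for the handoff bookkeeping in rwsem_try_write_lock():
 * after the fix, the grant always lands on the first waiter.
 */
static void grant_handoff(struct waiter *first, struct waiter *self)
{
	(void)self;		/* a non-first 'self' no longer gets the grant */
	first->handoff_set = true;
}

/* Stand-in for the decision at the bottom of the slowpath wait loop. */
static const char *slowpath_action(const struct waiter *w)
{
	return w->handoff_set ? "spin on owner, then retry trylock"
			      : "schedule() and sleep";
}

int main(void)
{
	struct waiter first = { "first waiter", false };
	struct waiter rt    = { "non-first RT waiter", false };

	/* The non-first RT waiter triggers the handoff path... */
	grant_handoff(&first, &rt);

	/* ...but only the first waiter may keep spinning without sleeping. */
	printf("%s: %s\n", first.name, slowpath_action(&first));
	printf("%s: %s\n", rt.name, slowpath_action(&rt));
	return 0;
}

Compiled and run (e.g. with a plain "gcc -Wall" build), the sketch prints
that only the first waiter may spin, which is the invariant that prevents
the non-first RT waiter from starving the first waiter described above.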