Message ID: 20230315061810.653263-1-yukuai1@huaweicloud.com
Headers:
From: Yu Kuai <yukuai1@huaweicloud.com>
To: agk@redhat.com, snitzer@kernel.org, song@kernel.org
Cc: linux-kernel@vger.kernel.org, linux-raid@vger.kernel.org, yukuai3@huawei.com, yukuai1@huaweicloud.com, yi.zhang@huawei.com, yangerkun@huawei.com
Subject: [PATCH v2 0/5] md: fix uaf for sync_thread
Date: Wed, 15 Mar 2023 14:18:05 +0800
Message-Id: <20230315061810.653263-1-yukuai1@huaweicloud.com>
Series: md: fix uaf for sync_thread
Message
Yu Kuai
March 15, 2023, 6:18 a.m. UTC
From: Yu Kuai <yukuai3@huawei.com>
Changes in v2:
- fix a compile error for md-cluster in patch 2
- replace spin_lock/unlock with spin_lock/unlock_irq in patch 5
- don't wake up inside the new lock in md_wakeup_thread in patch 5
Our test reports a uaf for 'mddev->sync_thread':

T1 (md_start_sync)            T2 (raid1d)
 md_register_thread
                               md_check_recovery
                                md_reap_sync_thread
                                 md_unregister_thread
                                  kfree
 md_wakeup_thread
  wake_up
  ->sync_thread was freed
Currently, a global spinlock 'pers_lock' is borrowed to protect
'mddev->thread'. This problem could be fixed the same way; however, there
might be similar problems for other md_threads, and I really don't like the
idea of borrowing a global lock.

This patchset does some refactoring and then uses a disk-level spinlock to
protect md_thread in the relevant APIs.
I tested this patchset with the mdadm tests and there are no new
regressions. Note that the following tests fail with or without this
patchset:
01raid6integ
04r1update
05r6tor0
10ddf-create
10ddf-fail-spare
10ddf-fail-stop-readd
10ddf-geometry
Yu Kuai (5):
md: pass a md_thread pointer to md_register_thread()
md: refactor md_wakeup_thread()
md: use md_thread api to wake up sync_thread
md: pass a mddev to md_unregister_thread()
md: protect md_thread with a new disk level spin lock
drivers/md/dm-raid.c | 6 +-
drivers/md/md-bitmap.c | 6 +-
drivers/md/md-cluster.c | 39 +++++----
drivers/md/md-multipath.c | 8 +-
drivers/md/md.c | 162 ++++++++++++++++++++------------------
drivers/md/md.h | 15 ++--
drivers/md/raid1.c | 19 +++--
drivers/md/raid10.c | 31 ++++----
drivers/md/raid5-cache.c | 19 +++--
drivers/md/raid5-ppl.c | 2 +-
drivers/md/raid5.c | 48 ++++++-----
11 files changed, 177 insertions(+), 178 deletions(-)
Comments
Dear Logan,

On 15.03.23 at 07:18, Yu Kuai wrote:
> From: Yu Kuai <yukuai3@huawei.com>
>
> [...]
>
> I tested this pathset with mdadm tests, and there are no new regression,
> by the way, following test will failed with or without this patchset:
>
> 01raid6integ
> 04r1update
> 05r6tor0
> 10ddf-create
> 10ddf-fail-spare
> 10ddf-fail-stop-readd
> 10ddf-geometry

As you improved the tests in the past, can you confirm these failed on
your test systems too and are fixed now?

[...]

Kind regards,
Paul
On 2023-03-15 02:30, Paul Menzel wrote:
> On 15.03.23 at 07:18, Yu Kuai wrote:
>> I tested this pathset with mdadm tests, and there are no new regression,
>> by the way, following test will failed with or without this patchset:
>>
>> 01raid6integ
>> 04r1update
>> 05r6tor0
>> 10ddf-create
>> 10ddf-fail-spare
>> 10ddf-fail-stop-readd
>> 10ddf-geometry
>
> As you improved the tests in the past, can you confirm, these failed on
> your test systems too and are fixed now?

Hmm, well Yu did not claim that those tests were fixed. If you re-read what
was said, the tests listed failed with or without the new changes. As I read
it, Yu asserts no new regressions were created with the patch set, not that
failing tests were fixed.

Unfortunately, the tests listed are largely not ones I saw failing the last
time I ran the tests (though it's been a few months since I last tried). I
know 01raid6integ used to fail some of the time, but the other 6 tests
mentioned worked the last time I ran them; and there are many other tests
that failed when I ran them. (My notes on which tests are broken are
included in the most recent mdadm tree in tests/*.broken)

I was going to try and confirm that no new regressions were introduced by
Yu's patches, but it seems the tests are getting worse. I tried running the
tests on the current md-next branch and found that one of the early tests,
00raid5-zero, hangs indefinitely. I quickly ran the same test on v6.3-rc2
and found that it runs just fine there. So it looks like there's already a
regression in md-next that is not part of this series, and I don't have the
time to dig into the root cause right now.

Yu's patches don't apply cleanly to v6.3-rc2 and I can't run the tests
against md-next, so I didn't bother running them, but I did do a quick
review. The locking changes make sense to me, so it might be worth merging
for correctness. However, I'm not entirely sure it's the best solution --
the md thread stuff seems like a bit of a mess, and passing an mddev to
thread functions that were not related to the mddev to get a lock seems to
just make the mess a bit worse.

For example, it seems a bit ugly to me for the lock mddev->thread_lock to
protect the access of a pointer in struct r5l_log. Just spit-balling, but
perhaps RCU would be more appropriate here. Then md_wakeup_thread() would
just need to hold the RCU read lock when dereferencing, and
md_unregister_thread() would just need to synchronize_rcu() before stopping
and freeing the thread. This has the benefit of not requiring the mddev
object for every md_thread and would probably require a lot less churn than
the current patches.

Logan
Hi,

On 2023/03/16 6:55, Logan Gunthorpe wrote:
> [...]
>
> For example, it seems a bit ugly to me for the lock mddev->thread_lock
> to protect the access of a pointer in struct r5l_log. Just spit-balling,
> but perhaps RCU would be more appropriate here. Then md_wakeup_thread()
> would just need to hold the RCU read lock when dereferencing, and
> md_unregister_thread() would just need to synchronize_rcu() before
> stopping and freeing the thread. This has the benefit of not requiring
> the mddev object for every md_thread and would probably require a lot
> less churn than the current patches.

Thanks for your suggestion, this makes sense to me. I'll try to use RCU.

Thanks,
Kuai
On Wed, Mar 15, 2023 at 6:26 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> On 2023/03/16 6:55, Logan Gunthorpe wrote:
[...]
> > I was going to try and confirm that no new regressions were introduced
> > by Yu's patches, but seems the tests are getting worse. I tried running
> > the tests on the current md-next branch and found that one of the early
> > tests, 00raid5-zero, hangs indefinitely. I quickly ran the same test on

I am not able to repro the issue with 00raid5-zero. (I did a rebase before
running the test, so that might be the reason.)

> > v6.3-rc2 and found that it runs just fine there. [...]
>
> Thanks for your suggestion, this make sense to me. I'll try to use rcu.

Yu Kuai, do you plan to resend the set with Logan's suggestions?

Thanks,
Song
Hi, Song!

On 2023/03/29 7:31, Song Liu wrote:
> On Wed, Mar 15, 2023 at 6:26 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> [...]
>
> Yu Kuai, do you plan to resend the set with Logan suggestions?

Yes, of course. It's just that some other problems were triggered while I
was testing the patchset; I'll resend the set once all tests pass.

Thanks,
Kuai