[2/2] maple_tree: add a fast path case in mas_wr_slot_store()

Message ID 20230602075353.5917-2-zhangpeng.00@bytedance.com
State New
Series [1/2] maple_tree: optimize mas_wr_append(), also improve duplicating VMAs |

Commit Message

Peng Zhang June 2, 2023, 7:53 a.m. UTC
  When the new range overwrites three ranges and does not touch the
boundaries on both sides, the number of entries will not be increased,
so we can just update the pivots as a fast path. However, it may
introduce potential risks in RCU mode (although it can pass the test),
because it updates two pivots. We only enable it in non-RCU mode for now.

Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
---
 lib/maple_tree.c | 33 +++++++++++++++++++++------------
 1 file changed, 21 insertions(+), 12 deletions(-)
  

Comments

Liam R. Howlett June 2, 2023, 4:41 p.m. UTC | #1
* Peng Zhang <zhangpeng.00@bytedance.com> [230602 03:54]:
> When the new range overwrites three ranges and does not touch the
> boundaries on both sides, the number of entries will not be increased,
> so we can just update the pivots as a fast path. However, it may
> introduce potential risks in RCU mode (although it can pass the test),
> because it updates two pivots. We only enable it in non-RCU mode for now.

So what you are saying is that you are expanding one entry to consume
portions of the previous and next into a new entry.  We know this is the
case because the end of the node is not moving and we are modifying more
than one slot (so it must be 2 slots)

This scenario is not tested in the testing framework.  We should add
testing before we can add this.

> 
> Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
> ---
>  lib/maple_tree.c | 33 +++++++++++++++++++++------------
>  1 file changed, 21 insertions(+), 12 deletions(-)
> 
> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> index cfd9fad308a2..ec82441ca3e8 100644
> --- a/lib/maple_tree.c
> +++ b/lib/maple_tree.c
> @@ -4100,23 +4100,32 @@ static inline bool mas_wr_slot_store(struct ma_wr_state *wr_mas)
>  {
>  	struct ma_state *mas = wr_mas->mas;
>  	unsigned char offset = mas->offset;
> +	void __rcu **slots = wr_mas->slots;
>  	bool gap = false;
>  
> -	if (wr_mas->offset_end - offset != 1)
> -		return false;
> -
> -	gap |= !mt_slot_locked(mas->tree, wr_mas->slots, offset);
> -	gap |= !mt_slot_locked(mas->tree, wr_mas->slots, offset + 1);
> +	gap |= !mt_slot_locked(mas->tree, slots, offset);
> +	gap |= !mt_slot_locked(mas->tree, slots, offset + 1);
>  
> -	if (mas->index == wr_mas->r_min) {
> -		/* Overwriting the range and over a part of the next range. */
> -		rcu_assign_pointer(wr_mas->slots[offset], wr_mas->entry);
> -		wr_mas->pivots[offset] = mas->last;
> -	} else {
> -		/* Overwriting a part of the range and over the next range */
> -		rcu_assign_pointer(wr_mas->slots[offset + 1], wr_mas->entry);
> +	if (wr_mas->offset_end - offset == 1) {
> +		if (mas->index == wr_mas->r_min) {
> +			/* Overwriting the range and a part of the next one */
> +			rcu_assign_pointer(slots[offset], wr_mas->entry);
> +			wr_mas->pivots[offset] = mas->last;
> +		} else {
> +			/* Overwriting a part of the range and the next one */
> +			rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
> +			wr_mas->pivots[offset] = mas->index - 1;
> +			mas->offset++; /* Keep mas accurate. */
> +		}
> +	} else if (!mt_in_rcu(mas->tree)) {
> +		/* Overwriting three ranges, but don't touch the boundaries */

I find this comment misleading.  You actually touch both boundaries for
the entry in this case (start and end).  We are just increasing the
space in both directions.  You are also not overwriting two of the three
entries or ranges, you are expanding one entry in two directions, so
both the previous and next ranges will shrink but they will remain. It's
more of a "modify three ranges but don't change the outside limits." The
similar statement in the commit message should also be changed.

Right now, I don't see this code executed by the test program.
Inserting a BUG_ON() here and it will not be hit.
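The missing coverage could be exercised by a case along these lines. This is only a sketch in the style of the maple tree test suite: the helpers (`mtree_store_range()`, `mtree_load()`, `xa_mk_value()`, `MT_BUG_ON()`) are taken from the existing maple tree API and tests, but the fragment is illustrative and has not been compiled or run against the patch.

```
/* Sketch: store three adjacent ranges, then overwrite parts of all
 * three without moving the outer limits (10 and 40), which should
 * take the new non-RCU fast path. */
static noinline void check_spanning_slot_store(struct maple_tree *mt)
{
	mtree_store_range(mt, 10, 20, xa_mk_value(1), GFP_KERNEL);
	mtree_store_range(mt, 21, 30, xa_mk_value(2), GFP_KERNEL);
	mtree_store_range(mt, 31, 40, xa_mk_value(3), GFP_KERNEL);

	mtree_store_range(mt, 15, 35, xa_mk_value(4), GFP_KERNEL);

	MT_BUG_ON(mt, mtree_load(mt, 10) != xa_mk_value(1));
	MT_BUG_ON(mt, mtree_load(mt, 25) != xa_mk_value(4));
	MT_BUG_ON(mt, mtree_load(mt, 40) != xa_mk_value(3));
}
```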

> +		gap |= !mt_slot_locked(mas->tree, slots, offset + 2);
> +		rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
>  		wr_mas->pivots[offset] = mas->index - 1;
> +		wr_mas->pivots[offset + 1] = mas->last;
>  		mas->offset++; /* Keep mas accurate. */
> +	} else {

We are hitting this else in check_locky at maple.c:35780 only, I think.
You've identified a lack of testing here by the looks of it.

The VMA code does not do this today, and I don't know of any other users
which expand/contract entries like this.  Do you think this will be
common enough for the optimisation vs a node store?

> +		return false;
>  	}
>  
>  	trace_ma_write(__func__, mas, 0, wr_mas->entry);
> -- 
> 2.20.1
> 
>
  
Peng Zhang June 5, 2023, 11:11 a.m. UTC | #2
On 2023/6/3 00:41, Liam R. Howlett wrote:
> * Peng Zhang <zhangpeng.00@bytedance.com> [230602 03:54]:
>> When the new range overwrites three ranges and does not touch the
>> boundaries on both sides, the number of entries will not be increased,
>> so we can just update the pivots as a fast path. However, it may
>> introduce potential risks in RCU mode (although it can pass the test),
>> because it updates two pivots. We only enable it in non-RCU mode for now.
> 
> So what you are saying is that you are expanding one entry to consume
> portions of the previous and next into a new entry.  We know this is the
> case because the end of the node is not moving and we are modifying more
> than one slot (so it must be 2 slots)
> 
> This scenario is not tested in the testing framework.  We should add
> testing before we can add this.
> 
>>
>> Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
>> ---
>>   lib/maple_tree.c | 33 +++++++++++++++++++++------------
>>   1 file changed, 21 insertions(+), 12 deletions(-)
>>
>> diff --git a/lib/maple_tree.c b/lib/maple_tree.c
>> index cfd9fad308a2..ec82441ca3e8 100644
>> --- a/lib/maple_tree.c
>> +++ b/lib/maple_tree.c
>> @@ -4100,23 +4100,32 @@ static inline bool mas_wr_slot_store(struct ma_wr_state *wr_mas)
>>   {
>>   	struct ma_state *mas = wr_mas->mas;
>>   	unsigned char offset = mas->offset;
>> +	void __rcu **slots = wr_mas->slots;
>>   	bool gap = false;
>>   
>> -	if (wr_mas->offset_end - offset != 1)
>> -		return false;
>> -
>> -	gap |= !mt_slot_locked(mas->tree, wr_mas->slots, offset);
>> -	gap |= !mt_slot_locked(mas->tree, wr_mas->slots, offset + 1);
>> +	gap |= !mt_slot_locked(mas->tree, slots, offset);
>> +	gap |= !mt_slot_locked(mas->tree, slots, offset + 1);
>>   
>> -	if (mas->index == wr_mas->r_min) {
>> -		/* Overwriting the range and over a part of the next range. */
>> -		rcu_assign_pointer(wr_mas->slots[offset], wr_mas->entry);
>> -		wr_mas->pivots[offset] = mas->last;
>> -	} else {
>> -		/* Overwriting a part of the range and over the next range */
>> -		rcu_assign_pointer(wr_mas->slots[offset + 1], wr_mas->entry);
>> +	if (wr_mas->offset_end - offset == 1) {
>> +		if (mas->index == wr_mas->r_min) {
>> +			/* Overwriting the range and a part of the next one */
>> +			rcu_assign_pointer(slots[offset], wr_mas->entry);
>> +			wr_mas->pivots[offset] = mas->last;
>> +		} else {
>> +			/* Overwriting a part of the range and the next one */
>> +			rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
>> +			wr_mas->pivots[offset] = mas->index - 1;
>> +			mas->offset++; /* Keep mas accurate. */
>> +		}
>> +	} else if (!mt_in_rcu(mas->tree)) {
>> +		/* Overwriting three ranges, but don't touch the boundaries */
> 
> I find this comment misleading.  You actually touch both boundaries for
> the entry in this case (start and end).  We are just increasing the
> space in both directions.  You are also not overwriting two of the three
> entries or ranges, you are expanding one entry in two directions, so
> both the previous and next ranges will shrink but they will remain. It's
> more of a "modify three ranges but don't change the outside limits." The
> similar statement in the commit message should also be changed.
Yes, your understanding is correct.
Sorry, my comment was not well written. What I meant is that the left
boundary of the leftmost range and the right boundary of the rightmost
range are not touched. I will fix it in v2.

> 
> Right now, I don't see this code executed by the test program.
> Inserting a BUG_ON() here and it will not be hit.
Yes, the current test program does not reach this branch; I will add
the corresponding test cases in v2.

> 
>> +		gap |= !mt_slot_locked(mas->tree, slots, offset + 2);
>> +		rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
>>   		wr_mas->pivots[offset] = mas->index - 1;
>> +		wr_mas->pivots[offset + 1] = mas->last;
>>   		mas->offset++; /* Keep mas accurate. */
>> +	} else {
> 
> We are hitting this else in check_locky at maple.c:35780 only, I think.
> You've identified a lack of testing here by the looks of it.
> 
> The VMA code does not do this today, and I don't know of any other users
> which expand/contract entries like this.  Do you think this will be
> common enough for the optimisation vs a node store?
I also thought about this problem, but I still regard it as an
optimization of the slot store. Although it is useless for the VMA
code now, I don't know whether it will be used in the future. If we
enter this function, we will most likely take the first if branch, so
this case adds no extra overhead and has no negative impact; that is
why I tried to add it.

> 
>> +		return false;
>>   	}
>>   
>>   	trace_ma_write(__func__, mas, 0, wr_mas->entry);
>> -- 
>> 2.20.1
>>
>>
  
Liam R. Howlett June 5, 2023, 3:24 p.m. UTC | #3
* Peng Zhang <zhangpeng.00@bytedance.com> [230605 07:11]:
> 
> 
> On 2023/6/3 00:41, Liam R. Howlett wrote:
> > * Peng Zhang <zhangpeng.00@bytedance.com> [230602 03:54]:
> > > When the new range overwrites three ranges and does not touch the
> > > boundaries on both sides, the number of entries will not be increased,
> > > so we can just update the pivots as a fast path. However, it may
> > > introduce potential risks in RCU mode (although it can pass the test),
> > > because it updates two pivots. We only enable it in non-RCU mode for now.
> > 
> > So what you are saying is that you are expanding one entry to consume
> > portions of the previous and next into a new entry.  We know this is the
> > case because the end of the node is not moving and we are modifying more
> > than one slot (so it must be 2 slots)
> > 
> > This scenario is not tested in the testing framework.  We should add
> > testing before we can add this.
> > 
> > > 
> > > Signed-off-by: Peng Zhang <zhangpeng.00@bytedance.com>
> > > ---
> > >   lib/maple_tree.c | 33 +++++++++++++++++++++------------
> > >   1 file changed, 21 insertions(+), 12 deletions(-)
> > > 
> > > diff --git a/lib/maple_tree.c b/lib/maple_tree.c
> > > index cfd9fad308a2..ec82441ca3e8 100644
> > > --- a/lib/maple_tree.c
> > > +++ b/lib/maple_tree.c
> > > @@ -4100,23 +4100,32 @@ static inline bool mas_wr_slot_store(struct ma_wr_state *wr_mas)
> > >   {
> > >   	struct ma_state *mas = wr_mas->mas;
> > >   	unsigned char offset = mas->offset;
> > > +	void __rcu **slots = wr_mas->slots;
> > >   	bool gap = false;
> > > -	if (wr_mas->offset_end - offset != 1)
> > > -		return false;
> > > -
> > > -	gap |= !mt_slot_locked(mas->tree, wr_mas->slots, offset);
> > > -	gap |= !mt_slot_locked(mas->tree, wr_mas->slots, offset + 1);
> > > +	gap |= !mt_slot_locked(mas->tree, slots, offset);
> > > +	gap |= !mt_slot_locked(mas->tree, slots, offset + 1);
> > > -	if (mas->index == wr_mas->r_min) {
> > > -		/* Overwriting the range and over a part of the next range. */
> > > -		rcu_assign_pointer(wr_mas->slots[offset], wr_mas->entry);
> > > -		wr_mas->pivots[offset] = mas->last;
> > > -	} else {
> > > -		/* Overwriting a part of the range and over the next range */
> > > -		rcu_assign_pointer(wr_mas->slots[offset + 1], wr_mas->entry);
> > > +	if (wr_mas->offset_end - offset == 1) {
> > > +		if (mas->index == wr_mas->r_min) {
> > > +			/* Overwriting the range and a part of the next one */
> > > +			rcu_assign_pointer(slots[offset], wr_mas->entry);
> > > +			wr_mas->pivots[offset] = mas->last;
> > > +		} else {
> > > +			/* Overwriting a part of the range and the next one */
> > > +			rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
> > > +			wr_mas->pivots[offset] = mas->index - 1;
> > > +			mas->offset++; /* Keep mas accurate. */
> > > +		}
> > > +	} else if (!mt_in_rcu(mas->tree)) {
> > > +		/* Overwriting three ranges, but don't touch the boundaries */
> > 
> > I find this comment misleading.  You actually touch both boundaries for
> > the entry in this case (start and end).  We are just increasing the
> > space in both directions.  You are also not overwriting two of the three
> > entries or ranges, you are expanding one entry in two directions, so
> > both the previous and next ranges will shrink but they will remain. It's
> > more of a "modify three ranges but don't change the outside limits." The
> > similar statement in the commit message should also be changed.
> Yes, your understanding is correct.
> Sorry, my comment was not well written. What I meant is that the left
> boundary of the leftmost range and the right boundary of the rightmost
> range are not touched. I will fix it in v2.
> 
> > 
> > Right now, I don't see this code executed by the test program.
> > Inserting a BUG_ON() here and it will not be hit.
> Yes, the current test program does not reach this branch; I will add
> the corresponding test cases in v2.
> 
> > 
> > > +		gap |= !mt_slot_locked(mas->tree, slots, offset + 2);
> > > +		rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
> > >   		wr_mas->pivots[offset] = mas->index - 1;
> > > +		wr_mas->pivots[offset + 1] = mas->last;
> > >   		mas->offset++; /* Keep mas accurate. */
> > > +	} else {
> > 
> > We are hitting this else in check_locky at maple.c:35780 only, I think.
> > You've identified a lack of testing here by the looks of it.
> > 
> > The VMA code does not do this today, and I don't know of any other users
> > which expand/contract entries like this.  Do you think this will be
> > common enough for the optimisation vs a node store?
> I also thought about this problem, but I still regard it as an
> optimization of the slot store. Although it is useless for the VMA
> code now, I don't know whether it will be used in the future. If we
> enter this function, we will most likely take the first if branch, so
> this case adds no extra overhead and has no negative impact; that is
> why I tried to add it.

Sounds good.  Thanks.

> 
> > 
> > > +		return false;
> > >   	}
> > >   	trace_ma_write(__func__, mas, 0, wr_mas->entry);
> > > -- 
> > > 2.20.1
> > > 
> > >
  

Patch

diff --git a/lib/maple_tree.c b/lib/maple_tree.c
index cfd9fad308a2..ec82441ca3e8 100644
--- a/lib/maple_tree.c
+++ b/lib/maple_tree.c
@@ -4100,23 +4100,32 @@  static inline bool mas_wr_slot_store(struct ma_wr_state *wr_mas)
 {
 	struct ma_state *mas = wr_mas->mas;
 	unsigned char offset = mas->offset;
+	void __rcu **slots = wr_mas->slots;
 	bool gap = false;
 
-	if (wr_mas->offset_end - offset != 1)
-		return false;
-
-	gap |= !mt_slot_locked(mas->tree, wr_mas->slots, offset);
-	gap |= !mt_slot_locked(mas->tree, wr_mas->slots, offset + 1);
+	gap |= !mt_slot_locked(mas->tree, slots, offset);
+	gap |= !mt_slot_locked(mas->tree, slots, offset + 1);
 
-	if (mas->index == wr_mas->r_min) {
-		/* Overwriting the range and over a part of the next range. */
-		rcu_assign_pointer(wr_mas->slots[offset], wr_mas->entry);
-		wr_mas->pivots[offset] = mas->last;
-	} else {
-		/* Overwriting a part of the range and over the next range */
-		rcu_assign_pointer(wr_mas->slots[offset + 1], wr_mas->entry);
+	if (wr_mas->offset_end - offset == 1) {
+		if (mas->index == wr_mas->r_min) {
+			/* Overwriting the range and a part of the next one */
+			rcu_assign_pointer(slots[offset], wr_mas->entry);
+			wr_mas->pivots[offset] = mas->last;
+		} else {
+			/* Overwriting a part of the range and the next one */
+			rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
+			wr_mas->pivots[offset] = mas->index - 1;
+			mas->offset++; /* Keep mas accurate. */
+		}
+	} else if (!mt_in_rcu(mas->tree)) {
+		/* Overwriting three ranges, but don't touch the boundaries */
+		gap |= !mt_slot_locked(mas->tree, slots, offset + 2);
+		rcu_assign_pointer(slots[offset + 1], wr_mas->entry);
 		wr_mas->pivots[offset] = mas->index - 1;
+		wr_mas->pivots[offset + 1] = mas->last;
 		mas->offset++; /* Keep mas accurate. */
+	} else {
+		return false;
 	}
 
 	trace_ma_write(__func__, mas, 0, wr_mas->entry);