[GIT,PULL] Pin control fixes for v6.7

Message ID CACRpkdZtVNZFWSUgb4=gUE2mQRb=aT_3=zRv1U71Vsq0Mm34eg@mail.gmail.com
State New
Headers
Series [GIT,PULL] Pin control fixes for v6.7 |

Pull-request

git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl.git tags/pinctrl-v6.7-2

Message

Linus Walleij Nov. 29, 2023, 12:09 p.m. UTC
  Hi Linus,

this is a first belated round of pin control fixes for the v6.7 series..

The most interesting patch is the list iterator fix in the core by Maria
Yu, it took a while for me to realize what was going on there.

Some details on the fixes are in the tag.

Please pull them in!

Yours,
Linus Walleij

The following changes since commit b85ea95d086471afb4ad062012a4d73cd328fa86:

  Linux 6.7-rc1 (2023-11-12 16:19:07 -0800)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl.git
tags/pinctrl-v6.7-2

for you to fetch changes up to 90785ea8158b6923c5d6a024f2b1c076110577b5:

  dt-bindings: pinctrl: s32g2: change a maintainer email address
(2023-11-24 11:21:55 +0100)

----------------------------------------------------------------
Some pin control fixes for the v6.7 cycle:

- Fix a really interesting potential core bug in the list iterator
  requireing the use of READ_ONCE() discovered when testing kernel
  compiles with clang.

- Check devm_kcalloc() return value and an array bounds in the STM32
  driver.

- Fix an exotic string truncation issue in the s32cc driver, found
  by the kernel test robot (impressive!)

- Fix an undocumented struct member in the cy8c95x0 driver.

- Fix a symbol overlap with MIPS in the Lochnagar driver, MIPS
  defines a global symbol "RST" which is a bit too generic and
  collide with stuff. OK this one should be renamed too, we will
  fix that as well.

- Fix erroneous branch taking in the Realtek driver.

- Fix the mail address in MAINTAINERS for the s32g2 driver.

----------------------------------------------------------------
Antonio Borneo (1):
      pinctrl: stm32: fix array read out of bound

Charles Keepax (1):
      pinctrl: lochnagar: Don't build on MIPS

Chen Ni (1):
      pinctrl: stm32: Add check for devm_kcalloc

Chester Lin (2):
      pinctrl: s32cc: Avoid possible string truncation
      dt-bindings: pinctrl: s32g2: change a maintainer email address

Linus Walleij (1):
      pinctrl: cy8c95x0: Fix doc warning

Maria Yu (1):
      pinctrl: avoid reload of p state in list iteration

Tzuyi Chang (1):
      pinctrl: realtek: Fix logical error when finding descriptor

 .../bindings/pinctrl/nxp,s32g2-siul2-pinctrl.yaml           |  2 +-
 drivers/pinctrl/cirrus/Kconfig                              |  3 ++-
 drivers/pinctrl/core.c                                      |  6 +++---
 drivers/pinctrl/nxp/pinctrl-s32cc.c                         |  4 ++--
 drivers/pinctrl/pinctrl-cy8c95x0.c                          |  1 +
 drivers/pinctrl/realtek/pinctrl-rtd.c                       |  4 ++--
 drivers/pinctrl/stm32/pinctrl-stm32.c                       | 13 ++++++++++---
 7 files changed, 21 insertions(+), 12 deletions(-)
  

Comments

Linus Torvalds Nov. 29, 2023, 2:55 p.m. UTC | #1
On Wed, 29 Nov 2023 at 04:09, Linus Walleij <linus.walleij@linaro.org> wrote:
>
> The most interesting patch is the list iterator fix in the core by Maria
> Yu, it took a while for me to realize what was going on there.

That commit message still doesn't explain what the problem was.

Why is p->state volatile there? It seems to be a serious locking bug
if p->state can randomly change there, and the READ_ONCE() looks like
a "this hides the problem" rather than an actual real fix.

                   Linus
  
Linus Walleij Nov. 29, 2023, 3:08 p.m. UTC | #2
On Wed, Nov 29, 2023 at 3:56 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Wed, 29 Nov 2023 at 04:09, Linus Walleij <linus.walleij@linaro.org> wrote:
> >
> > The most interesting patch is the list iterator fix in the core by Maria
> > Yu, it took a while for me to realize what was going on there.
>
> That commit message still doesn't explain what the problem was.
>
> Why is p->state volatile there? It seems to be a serious locking bug
> if p->state can randomly change there, and the READ_ONCE() looks like
> a "this hides the problem" rather than an actual real fix.

Thanks for looking into it Linus, Maria can you look closer at this and
try to pinpoint exactly what happens?

Is the bug never manifesting with GCC for example?

In the meantime I'll cook a fixes branch without this one commit.

Yours,
Linus Walleij
  
pr-tracker-bot@kernel.org Nov. 29, 2023, 3:47 p.m. UTC | #3
The pull request you sent on Wed, 29 Nov 2023 13:09:03 +0100:

> git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl.git tags/pinctrl-v6.7-2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/3b47bc037bd44f142ac09848e8d3ecccc726be99

Thank you!
  
Aiqun Yu (Maria) Nov. 30, 2023, 5:37 a.m. UTC | #4
On 11/29/2023 11:08 PM, Linus Walleij wrote:
> On Wed, Nov 29, 2023 at 3:56 PM Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Wed, 29 Nov 2023 at 04:09, Linus Walleij <linus.walleij@linaro.org> wrote:
>>>
>>> The most interesting patch is the list iterator fix in the core by Maria
>>> Yu, it took a while for me to realize what was going on there.
>>
>> That commit message still doesn't explain what the problem was.
>>
>> Why is p->state volatile there? It seems to be a serious locking bug
>> if p->state can randomly change there, and the READ_ONCE() looks like
>> a "this hides the problem" rather than an actual real fix.
This is indeed an interesting issue. Thx for the comment, Linus.
**Let me explain how: "p->state becomes volatile in the list iterator", 
and "why READ_ONCE is suggested".

The current critical code is:
   list_for_each_entry(setting, &p->state->settings, node)

after elaborating the define list_for_each_entry, so above critical code 
will be:
   for (setting = list_head(&p->state->settings, typeof(*setting), node); \
&setting->node != (&p->state->settings); \
setting = list_next(setting , node))

The asm code(refer result from Clang version 10.0) can cleared explain 
the step of p->state reload actions:
loop:
ldr x22,[x22] ; x22=list_next(setting , node))
add x9,x8,#0x18 ; x9=&p->state->setting
cmp x22,x9 ; setting,x9
b.eq 0xFFFFFF9B24483530

ldr w9,[x22,#0x10] ; w9,[setting,#16]
cmp w9,#0x2 ; w9,#2
b.ne 0xFFFFFF9B24483504

mov x0,x22 ; x0,setting
bl 0xFFFFFF9B24486048 ; pinmux_disable_setting

ldr x8,[x19,#0x28] ; x19=p, x8=[p->state], *reload p->state*
b loop

The *reload p->state* inside the loop for checking is not needed and can 
cause possible infinite loop. So READ_ONCE is highly suggested even if 
p->state is not randomly changed. And then unnecessary "ldr 
x8,[x19,#0x28]" can be removed from above loop code.

**Comments about the locking bug:
currently pinctrl_select_state is an export symbol and don't have 
effective reentrance protect design. That's why current infinite loop 
issue was observed with customer's multi thread call with 
pinctrl_select_state without lock protection. pinctrl_select_state 
totally rely on driver module user side to ensure the reentrant state.

Usually the suggested usage from driver side who are using pinctrl would be:
LOCK;
pinctrl_select_state();
gpio pulling;
udelay();
check state;
other hardware behaviors;
UNLOCK;

So the locking bug fix I have told customer side to fix from their own 
driver part. Since usually not only a simple pinctrl_select_state call 
can finish the hardware state transaction.

I myself also is fine to have a small per pinctrl lock to only protect 
the current pinctrl_select_state->pinctrl_commit_state reentrance 
issues. Pls any pinctrl maintainer help to comment to suggest or not and 
I can prepare a change as well.
> 
> Thanks for looking into it Linus, Maria can you look closer at this and
> try to pinpoint exactly what happens? 
Sure, I am trying to explain from my end. If there is other thoughts pls 
feel free to chime in.
> 
> Is the bug never manifesting with GCC for example?
> 
> In the meantime I'll cook a fixes branch without this one commit.
> 
> Yours,
> Linus Walleij
  
Linus Walleij Dec. 1, 2023, 8:10 a.m. UTC | #5
Hi Nathan, Nick,

(just picking some LLVM compiler people I know of... and trust)

Context is this patch:
https://lore.kernel.org/linux-gpio/20231115102824.23727-1-quic_aiquny@quicinc.com/

On Thu, Nov 30, 2023 at 6:37 AM Aiqun(Maria) Yu <quic_aiquny@quicinc.com> wrote:
> On 11/29/2023 11:08 PM, Linus Walleij wrote:
> > On Wed, Nov 29, 2023 at 3:56 PM Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >> On Wed, 29 Nov 2023 at 04:09, Linus Walleij <linus.walleij@linaro.org> wrote:
> >>>
> >>> The most interesting patch is the list iterator fix in the core by Maria
> >>> Yu, it took a while for me to realize what was going on there.
> >>
> >> That commit message still doesn't explain what the problem was.
> >>
> >> Why is p->state volatile there? It seems to be a serious locking bug
> >> if p->state can randomly change there, and the READ_ONCE() looks like
> >> a "this hides the problem" rather than an actual real fix.
>
> This is indeed an interesting issue. Thx for the comment, Linus.
> **Let me explain how: "p->state becomes volatile in the list iterator",
> and "why READ_ONCE is suggested".
>
> The current critical code is:
>    list_for_each_entry(setting, &p->state->settings, node)
>
> after elaborating the define list_for_each_entry, so above critical code
> will be:
>    for (setting = list_head(&p->state->settings, typeof(*setting), node); \
> &setting->node != (&p->state->settings); \
> setting = list_next(setting , node))
>
> The asm code(refer result from Clang version 10.0) can cleared explain
> the step of p->state reload actions:
> loop:
> ldr x22,[x22] ; x22=list_next(setting , node))
> add x9,x8,#0x18 ; x9=&p->state->setting
> cmp x22,x9 ; setting,x9
> b.eq 0xFFFFFF9B24483530
>
> ldr w9,[x22,#0x10] ; w9,[setting,#16]
> cmp w9,#0x2 ; w9,#2
> b.ne 0xFFFFFF9B24483504
>
> mov x0,x22 ; x0,setting
> bl 0xFFFFFF9B24486048 ; pinmux_disable_setting
>
> ldr x8,[x19,#0x28] ; x19=p, x8=[p->state], *reload p->state*
> b loop
>
> The *reload p->state* inside the loop for checking is not needed and can
> cause possible infinite loop. So READ_ONCE is highly suggested even if
> p->state is not randomly changed. And then unnecessary "ldr
> x8,[x19,#0x28]" can be removed from above loop code.
>
> **Comments about the locking bug:
> currently pinctrl_select_state is an export symbol and don't have
> effective reentrance protect design. That's why current infinite loop
> issue was observed with customer's multi thread call with
> pinctrl_select_state without lock protection. pinctrl_select_state
> totally rely on driver module user side to ensure the reentrant state.
>
> Usually the suggested usage from driver side who are using pinctrl would be:
> LOCK;
> pinctrl_select_state();
> gpio pulling;
> udelay();
> check state;
> other hardware behaviors;
> UNLOCK;
>
> So the locking bug fix I have told customer side to fix from their own
> driver part. Since usually not only a simple pinctrl_select_state call
> can finish the hardware state transaction.
>
> I myself also is fine to have a small per pinctrl lock to only protect
> the current pinctrl_select_state->pinctrl_commit_state reentrance
> issues. Pls any pinctrl maintainer help to comment to suggest or not and
> I can prepare a change as well.

Luckily I am the pin control maintainer :D
And I also ha my morning coffee and looked over the patch again.

So tilting the compiler to generate code that is less prone to race
conditions with READ_ONCE() isn't really the solution is it? We need
to introduce a proper lock that stops this from happening if it is
a problem people are facing.

Can you try to make a patch that removes READ_ONCE()
and introduce a lock instead?

Racing is rarely an issue in pin control for reasons explained
in another context here:
https://lore.kernel.org/linux-gpio/CACRpkdZ0cnJpYuzU=47-oW-7N_YGMo2vXpKOeXeNi5PhPY7QMA@mail.gmail.com/

...but if people still manage to run into it, we better have a lock
there. Just make sure it is not just an issue with outoftree code,
but a real problem?

The change that changes the code to use the old_state variable
should stay in, it makes the code more readable, it's just the
READ_ONCE() macro which is dubious.

Yours,
Linus Walleij
  
Linus Walleij Dec. 1, 2023, 8:12 a.m. UTC | #6
On Fri, Dec 1, 2023 at 9:10 AM Linus Walleij <linus.walleij@linaro.org> wrote:

> Hi Nathan, Nick,
>
> (just picking some LLVM compiler people I know of... and trust)

Forget that part of the mail, written out-of-context when I was
running different
disassembly of GCC vs LLVM code. I figured it out and concluded that
the compilers
are fine and doing the right thing, it's a none-issue in that regard.

Yours,
Linus Walleij
  
Aiqun Yu (Maria) Dec. 1, 2023, 10:06 a.m. UTC | #7
On 12/1/2023 4:10 PM, Linus Walleij wrote:
> Hi Nathan, Nick,
> 
> (just picking some LLVM compiler people I know of... and trust)
> 
> Context is this patch:
> https://lore.kernel.org/linux-gpio/20231115102824.23727-1-quic_aiquny@quicinc.com/
> 
> On Thu, Nov 30, 2023 at 6:37 AM Aiqun(Maria) Yu <quic_aiquny@quicinc.com> wrote:
>> On 11/29/2023 11:08 PM, Linus Walleij wrote:
>>> On Wed, Nov 29, 2023 at 3:56 PM Linus Torvalds
>>> <torvalds@linux-foundation.org> wrote:
>>>> On Wed, 29 Nov 2023 at 04:09, Linus Walleij <linus.walleij@linaro.org> wrote:
>>>>>
>>>>> The most interesting patch is the list iterator fix in the core by Maria
>>>>> Yu, it took a while for me to realize what was going on there.
>>>>
>>>> That commit message still doesn't explain what the problem was.
>>>>
>>>> Why is p->state volatile there? It seems to be a serious locking bug
>>>> if p->state can randomly change there, and the READ_ONCE() looks like
>>>> a "this hides the problem" rather than an actual real fix.
>>
>> This is indeed an interesting issue. Thx for the comment, Linus.
>> **Let me explain how: "p->state becomes volatile in the list iterator",
>> and "why READ_ONCE is suggested".
>>
>> The current critical code is:
>>     list_for_each_entry(setting, &p->state->settings, node)
>>
>> after elaborating the define list_for_each_entry, so above critical code
>> will be:
>>     for (setting = list_head(&p->state->settings, typeof(*setting), node); \
>> &setting->node != (&p->state->settings); \
>> setting = list_next(setting , node))
>>
>> The asm code(refer result from Clang version 10.0) can cleared explain
>> the step of p->state reload actions:
>> loop:
>> ldr x22,[x22] ; x22=list_next(setting , node))
>> add x9,x8,#0x18 ; x9=&p->state->setting
>> cmp x22,x9 ; setting,x9
>> b.eq 0xFFFFFF9B24483530
>>
>> ldr w9,[x22,#0x10] ; w9,[setting,#16]
>> cmp w9,#0x2 ; w9,#2
>> b.ne 0xFFFFFF9B24483504
>>
>> mov x0,x22 ; x0,setting
>> bl 0xFFFFFF9B24486048 ; pinmux_disable_setting
>>
>> ldr x8,[x19,#0x28] ; x19=p, x8=[p->state], *reload p->state*
>> b loop
>>
>> The *reload p->state* inside the loop for checking is not needed and can
>> cause possible infinite loop. So READ_ONCE is highly suggested even if
>> p->state is not randomly changed. And then unnecessary "ldr
>> x8,[x19,#0x28]" can be removed from above loop code.
>>
>> **Comments about the locking bug:
>> currently pinctrl_select_state is an export symbol and don't have
>> effective reentrance protect design. That's why current infinite loop
>> issue was observed with customer's multi thread call with
>> pinctrl_select_state without lock protection. pinctrl_select_state
>> totally rely on driver module user side to ensure the reentrant state.
>>
>> Usually the suggested usage from driver side who are using pinctrl would be:
>> LOCK;
>> pinctrl_select_state();
>> gpio pulling;
>> udelay();
>> check state;
>> other hardware behaviors;
>> UNLOCK;
>>
>> So the locking bug fix I have told customer side to fix from their own
>> driver part. Since usually not only a simple pinctrl_select_state call
>> can finish the hardware state transaction.
>>
>> I myself also is fine to have a small per pinctrl lock to only protect
>> the current pinctrl_select_state->pinctrl_commit_state reentrance
>> issues. Pls any pinctrl maintainer help to comment to suggest or not and
>> I can prepare a change as well.
> 
> Luckily I am the pin control maintainer :D
> And I also ha my morning coffee and looked over the patch again.
> 
> So tilting the compiler to generate code that is less prone to race
> conditions with READ_ONCE() isn't really the solution is it? We need
> to introduce a proper lock that stops this from happening if it is
> a problem people are facing.
> 
> Can you try to make a patch that removes READ_ONCE()
> and introduce a lock instead?
> 
> Racing is rarely an issue in pin control for reasons explained
> in another context here:
> https://lore.kernel.org/linux-gpio/CACRpkdZ0cnJpYuzU=47-oW-7N_YGMo2vXpKOeXeNi5PhPY7QMA@mail.gmail.com/
> 
> ...but if people still manage to run into it, we better have a lock
> there. Just make sure it is not just an issue with outoftree code,
> but a real problem?
> 
> The change that changes the code to use the old_state variable
> should stay in, it makes the code more readable, it's just the
> READ_ONCE() macro which is dubious.
Thx for confirm. I am preparing the new change now. :)

READ_ONCE can only avoid the possible infinite loop and not crash the 
whole kernel, while the lock is needed to protect the multi parallel 
call of pinctrl_commit_state api have a consistent atomic hardware 
result as well.
> 
> Yours,
> Linus Walleij
  
Aiqun Yu (Maria) Dec. 1, 2023, 3:34 p.m. UTC | #8
On 12/1/2023 6:06 PM, Aiqun(Maria) Yu wrote:
> On 12/1/2023 4:10 PM, Linus Walleij wrote:
>> Hi Nathan, Nick,
>>
>> (just picking some LLVM compiler people I know of... and trust)
>>
>> Context is this patch:
>> https://lore.kernel.org/linux-gpio/20231115102824.23727-1-quic_aiquny@quicinc.com/
>>
>> On Thu, Nov 30, 2023 at 6:37 AM Aiqun(Maria) Yu 
>> <quic_aiquny@quicinc.com> wrote:
>>> On 11/29/2023 11:08 PM, Linus Walleij wrote:
>>>> On Wed, Nov 29, 2023 at 3:56 PM Linus Torvalds
>>>> <torvalds@linux-foundation.org> wrote:
>>>>> On Wed, 29 Nov 2023 at 04:09, Linus Walleij 
>>>>> <linus.walleij@linaro.org> wrote:
>>>>>>
>>>>>> The most interesting patch is the list iterator fix in the core by 
>>>>>> Maria
>>>>>> Yu, it took a while for me to realize what was going on there.
>>>>>
>>>>> That commit message still doesn't explain what the problem was.
>>>>>
>>>>> Why is p->state volatile there? It seems to be a serious locking bug
>>>>> if p->state can randomly change there, and the READ_ONCE() looks like
>>>>> a "this hides the problem" rather than an actual real fix.
>>>
>>> This is indeed an interesting issue. Thx for the comment, Linus.
>>> **Let me explain how: "p->state becomes volatile in the list iterator",
>>> and "why READ_ONCE is suggested".
>>>
>>> The current critical code is:
>>>     list_for_each_entry(setting, &p->state->settings, node)
>>>
>>> after elaborating the define list_for_each_entry, so above critical code
>>> will be:
>>>     for (setting = list_head(&p->state->settings, typeof(*setting), 
>>> node); \
>>> &setting->node != (&p->state->settings); \
>>> setting = list_next(setting , node))
>>>
>>> The asm code(refer result from Clang version 10.0) can cleared explain
>>> the step of p->state reload actions:
>>> loop:
>>> ldr x22,[x22] ; x22=list_next(setting , node))
>>> add x9,x8,#0x18 ; x9=&p->state->setting
>>> cmp x22,x9 ; setting,x9
>>> b.eq 0xFFFFFF9B24483530
>>>
>>> ldr w9,[x22,#0x10] ; w9,[setting,#16]
>>> cmp w9,#0x2 ; w9,#2
>>> b.ne 0xFFFFFF9B24483504
>>>
>>> mov x0,x22 ; x0,setting
>>> bl 0xFFFFFF9B24486048 ; pinmux_disable_setting
>>>
>>> ldr x8,[x19,#0x28] ; x19=p, x8=[p->state], *reload p->state*
>>> b loop
>>>
>>> The *reload p->state* inside the loop for checking is not needed and can
>>> cause possible infinite loop. So READ_ONCE is highly suggested even if
>>> p->state is not randomly changed. And then unnecessary "ldr
>>> x8,[x19,#0x28]" can be removed from above loop code.
>>>
>>> **Comments about the locking bug:
>>> currently pinctrl_select_state is an export symbol and don't have
>>> effective reentrance protect design. That's why current infinite loop
>>> issue was observed with customer's multi thread call with
>>> pinctrl_select_state without lock protection. pinctrl_select_state
>>> totally rely on driver module user side to ensure the reentrant state.
>>>
>>> Usually the suggested usage from driver side who are using pinctrl 
>>> would be:
>>> LOCK;
>>> pinctrl_select_state();
>>> gpio pulling;
>>> udelay();
>>> check state;
>>> other hardware behaviors;
>>> UNLOCK;
>>>
>>> So the locking bug fix I have told customer side to fix from their own
>>> driver part. Since usually not only a simple pinctrl_select_state call
>>> can finish the hardware state transaction.
>>>
>>> I myself also is fine to have a small per pinctrl lock to only protect
>>> the current pinctrl_select_state->pinctrl_commit_state reentrance
>>> issues. Pls any pinctrl maintainer help to comment to suggest or not and
>>> I can prepare a change as well.
>>
>> Luckily I am the pin control maintainer :D
>> And I also ha my morning coffee and looked over the patch again.
>>
>> So tilting the compiler to generate code that is less prone to race
>> conditions with READ_ONCE() isn't really the solution is it? We need
>> to introduce a proper lock that stops this from happening if it is
>> a problem people are facing.
>>
>> Can you try to make a patch that removes READ_ONCE()
>> and introduce a lock instead?
>>
>> Racing is rarely an issue in pin control for reasons explained
>> in another context here:
>> https://lore.kernel.org/linux-gpio/CACRpkdZ0cnJpYuzU=47-oW-7N_YGMo2vXpKOeXeNi5PhPY7QMA@mail.gmail.com/
>>
>> ...but if people still manage to run into it, we better have a lock
>> there. Just make sure it is not just an issue with outoftree code,
>> but a real problem?
>>
>> The change that changes the code to use the old_state variable
>> should stay in, it makes the code more readable, it's just the
>> READ_ONCE() macro which is dubious.
> Thx for confirm. I am preparing the new change now. :)
change uploaded link here:
https://lore.kernel.org/linux-gpio/20231201152931.31161-1-quic_aiquny@quicinc.com/ 

> 
> READ_ONCE can only avoid the possible infinite loop and not crash the 
> whole kernel, while the lock is needed to protect the multi parallel 
> call of pinctrl_commit_state api have a consistent atomic hardware 
> result as well.
>>
>> Yours,
>> Linus Walleij
>