[v4,1/2] usb: gadget: udc: core: Invoke usb_gadget_connect only when started

Message ID 20230407030741.3163220-1-badhri@google.com
State New
Headers
Series [v4,1/2] usb: gadget: udc: core: Invoke usb_gadget_connect only when started |

Commit Message

Badhri Jagan Sridharan April 7, 2023, 3:07 a.m. UTC
  usb_udc_connect_control does not check to see if the udc has already
been started. This causes gadget->ops->pullup to be called through
usb_gadget_connect when invoked from usb_udc_vbus_handler even before
usb_gadget_udc_start is called. Guard this by checking for udc->started
in usb_udc_connect_control before invoking usb_gadget_connect.

Guarding udc->vbus, udc->started, gadget->connect, gadget->deactivate
related functions with connect_lock. usb_gadget_connect_locked,
usb_gadget_disconnect_locked, usb_udc_connect_control_locked,
usb_gadget_udc_start_locked, usb_gadget_udc_stop_locked are called with
this lock held as they can be simulataneously invoked from different code
paths.

Adding an additional check to make sure udc is started(udc->started)
before pullup callback is invoked.

Cc: stable@vger.kernel.org
Fixes: 628ef0d273a6 ("usb: udc: add usb_udc_vbus_handler")
Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
---
Changes since v3:
* Make internal gadget_connect/disconnect functions static
Changes since v2:
* Added __must_hold marking for connect_lock
Changes since v1:
* Fixed commit message comments.
* Renamed udc_connect_control_lock to connect_lock and made it per
device.
* udc->vbus, udc->started, gadget->connect, gadget->deactivate are all
now guarded by connect_lock.
* Code now checks for udc->started to be set before invoking pullup
callback.
---
 drivers/usb/gadget/udc/core.c | 148 ++++++++++++++++++++++++----------
 1 file changed, 104 insertions(+), 44 deletions(-)


base-commit: d629c0e221cd99198b843d8351a0a9bfec6c0423
  

Comments

Alistair Francis May 16, 2023, 12:53 p.m. UTC | #1
On Fri, 2023-04-07 at 03:07 +0000, Badhri Jagan Sridharan wrote:
> usb_udc_connect_control does not check to see if the udc has already
> been started. This causes gadget->ops->pullup to be called through
> usb_gadget_connect when invoked from usb_udc_vbus_handler even before
> usb_gadget_udc_start is called. Guard this by checking for udc-
> >started
> in usb_udc_connect_control before invoking usb_gadget_connect.
> 
> Guarding udc->vbus, udc->started, gadget->connect, gadget->deactivate
> related functions with connect_lock. usb_gadget_connect_locked,
> usb_gadget_disconnect_locked, usb_udc_connect_control_locked,
> usb_gadget_udc_start_locked, usb_gadget_udc_stop_locked are called
> with
> this lock held as they can be simulataneously invoked from different
> code
> paths.
> 
> Adding an additional check to make sure udc is started(udc->started)
> before pullup callback is invoked.
> 
> Cc: stable@vger.kernel.org
> Fixes: 628ef0d273a6 ("usb: udc: add usb_udc_vbus_handler")
> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>

This patch causes a kernel hang when trying to boot with the
usb/chipidea/udc.c driver.

The call stack below causes the hang:

 - gadget_bind_driver(struct device *dev)
    - mutex_lock(&udc->connect_lock);
    - usb_gadget_udc_start_locked(struct usb_udc *udc)
        - udc->gadget->ops->udc_start(udc->gadget, udc->driver)

At which point we are calling ci_udc_start(..), but with the
connect_lock mutex locked.

ci_udc_start() then calls usb_udc_vbus_handler() which tries to lock
the connect_lock while it's already locked. Resulting in a kernel hang.

Reverting this patch fixes the hang.

Alistair

> ---
> Changes since v3:
> * Make internal gadget_connect/disconnect functions static
> Changes since v2:
> * Added __must_hold marking for connect_lock
> Changes since v1:
> * Fixed commit message comments.
> * Renamed udc_connect_control_lock to connect_lock and made it per
> device.
> * udc->vbus, udc->started, gadget->connect, gadget->deactivate are
> all
> now guarded by connect_lock.
> * Code now checks for udc->started to be set before invoking pullup
> callback.
> ---
>  drivers/usb/gadget/udc/core.c | 148 ++++++++++++++++++++++++--------
> --
>  1 file changed, 104 insertions(+), 44 deletions(-)
> 
> diff --git a/drivers/usb/gadget/udc/core.c
> b/drivers/usb/gadget/udc/core.c
> index 3dcbba739db6..af92c2e8e10c 100644
> --- a/drivers/usb/gadget/udc/core.c
> +++ b/drivers/usb/gadget/udc/core.c
> @@ -37,6 +37,10 @@ static struct bus_type gadget_bus_type;
>   * @vbus: for udcs who care about vbus status, this value is real
> vbus status;
>   * for udcs who do not care about vbus status, this value is always
> true
>   * @started: the UDC's started state. True if the UDC had started.
> + * @connect_lock: protects udc->vbus, udc->started, gadget->connect,
> gadget->deactivate related
> + * functions. usb_gadget_connect_locked,
> usb_gadget_disconnect_locked,
> + * usb_udc_connect_control_locked, usb_gadget_udc_start_locked,
> usb_gadget_udc_stop_locked are
> + * called with this lock held.
>   *
>   * This represents the internal data structure which is used by the
> UDC-class
>   * to hold information about udc driver and gadget together.
> @@ -48,6 +52,7 @@ struct usb_udc {
>         struct list_head                list;
>         bool                            vbus;
>         bool                            started;
> +       struct mutex                    connect_lock;
>  };
>  
>  static struct class *udc_class;
> @@ -687,17 +692,9 @@ int usb_gadget_vbus_disconnect(struct usb_gadget
> *gadget)
>  }
>  EXPORT_SYMBOL_GPL(usb_gadget_vbus_disconnect);
>  
> -/**
> - * usb_gadget_connect - software-controlled connect to USB host
> - * @gadget:the peripheral being connected
> - *
> - * Enables the D+ (or potentially D-) pullup.  The host will start
> - * enumerating this gadget when the pullup is active and a VBUS
> session
> - * is active (the link is powered).
> - *
> - * Returns zero on success, else negative errno.
> - */
> -int usb_gadget_connect(struct usb_gadget *gadget)
> +/* Internal version of usb_gadget_connect needs to be called with
> connect_lock held. */
> +static int usb_gadget_connect_locked(struct usb_gadget *gadget)
> +       __must_hold(&gadget->udc->connect_lock)
>  {
>         int ret = 0;
>  
> @@ -706,10 +703,12 @@ int usb_gadget_connect(struct usb_gadget
> *gadget)
>                 goto out;
>         }
>  
> -       if (gadget->deactivated) {
> +       if (gadget->deactivated || !gadget->udc->started) {
>                 /*
>                  * If gadget is deactivated we only save new state.
>                  * Gadget will be connected automatically after
> activation.
> +                *
> +                * udc first needs to be started before gadget can be
> pulled up.
>                  */
>                 gadget->connected = true;
>                 goto out;
> @@ -724,22 +723,32 @@ int usb_gadget_connect(struct usb_gadget
> *gadget)
>  
>         return ret;
>  }
> -EXPORT_SYMBOL_GPL(usb_gadget_connect);
>  
>  /**
> - * usb_gadget_disconnect - software-controlled disconnect from USB
> host
> - * @gadget:the peripheral being disconnected
> - *
> - * Disables the D+ (or potentially D-) pullup, which the host may
> see
> - * as a disconnect (when a VBUS session is active).  Not all systems
> - * support software pullup controls.
> + * usb_gadget_connect - software-controlled connect to USB host
> + * @gadget:the peripheral being connected
>   *
> - * Following a successful disconnect, invoke the ->disconnect()
> callback
> - * for the current gadget driver so that UDC drivers don't need to.
> + * Enables the D+ (or potentially D-) pullup.  The host will start
> + * enumerating this gadget when the pullup is active and a VBUS
> session
> + * is active (the link is powered).
>   *
>   * Returns zero on success, else negative errno.
>   */
> -int usb_gadget_disconnect(struct usb_gadget *gadget)
> +int usb_gadget_connect(struct usb_gadget *gadget)
> +{
> +       int ret;
> +
> +       mutex_lock(&gadget->udc->connect_lock);
> +       ret = usb_gadget_connect_locked(gadget);
> +       mutex_unlock(&gadget->udc->connect_lock);
> +
> +       return ret;
> +}
> +EXPORT_SYMBOL_GPL(usb_gadget_connect);
> +
> +/* Internal version of usb_gadget_disconnect needs to be called with
> connect_lock held. */
> +static int usb_gadget_disconnect_locked(struct usb_gadget *gadget)
> +       __must_hold(&gadget->udc->connect_lock)
>  {
>         int ret = 0;
>  
> @@ -751,10 +760,12 @@ int usb_gadget_disconnect(struct usb_gadget
> *gadget)
>         if (!gadget->connected)
>                 goto out;
>  
> -       if (gadget->deactivated) {
> +       if (gadget->deactivated || !gadget->udc->started) {
>                 /*
>                  * If gadget is deactivated we only save new state.
>                  * Gadget will stay disconnected after activation.
> +                *
> +                * udc should have been started before gadget being
> pulled down.
>                  */
>                 gadget->connected = false;
>                 goto out;
> @@ -774,6 +785,30 @@ int usb_gadget_disconnect(struct usb_gadget
> *gadget)
>  
>         return ret;
>  }
> +
> +/**
> + * usb_gadget_disconnect - software-controlled disconnect from USB
> host
> + * @gadget:the peripheral being disconnected
> + *
> + * Disables the D+ (or potentially D-) pullup, which the host may
> see
> + * as a disconnect (when a VBUS session is active).  Not all systems
> + * support software pullup controls.
> + *
> + * Following a successful disconnect, invoke the ->disconnect()
> callback
> + * for the current gadget driver so that UDC drivers don't need to.
> + *
> + * Returns zero on success, else negative errno.
> + */
> +int usb_gadget_disconnect(struct usb_gadget *gadget)
> +{
> +       int ret;
> +
> +       mutex_lock(&gadget->udc->connect_lock);
> +       ret = usb_gadget_disconnect_locked(gadget);
> +       mutex_unlock(&gadget->udc->connect_lock);
> +
> +       return ret;
> +}
>  EXPORT_SYMBOL_GPL(usb_gadget_disconnect);
>  
>  /**
> @@ -794,10 +829,11 @@ int usb_gadget_deactivate(struct usb_gadget
> *gadget)
>         if (gadget->deactivated)
>                 goto out;
>  
> +       mutex_lock(&gadget->udc->connect_lock);
>         if (gadget->connected) {
> -               ret = usb_gadget_disconnect(gadget);
> +               ret = usb_gadget_disconnect_locked(gadget);
>                 if (ret)
> -                       goto out;
> +                       goto unlock;
>  
>                 /*
>                  * If gadget was being connected before deactivation,
> we want
> @@ -807,6 +843,8 @@ int usb_gadget_deactivate(struct usb_gadget
> *gadget)
>         }
>         gadget->deactivated = true;
>  
> +unlock:
> +       mutex_unlock(&gadget->udc->connect_lock);
>  out:
>         trace_usb_gadget_deactivate(gadget, ret);
>  
> @@ -830,6 +868,7 @@ int usb_gadget_activate(struct usb_gadget
> *gadget)
>         if (!gadget->deactivated)
>                 goto out;
>  
> +       mutex_lock(&gadget->udc->connect_lock);
>         gadget->deactivated = false;
>  
>         /*
> @@ -837,7 +876,8 @@ int usb_gadget_activate(struct usb_gadget
> *gadget)
>          * while it was being deactivated, we call
> usb_gadget_connect().
>          */
>         if (gadget->connected)
> -               ret = usb_gadget_connect(gadget);
> +               ret = usb_gadget_connect_locked(gadget);
> +       mutex_unlock(&gadget->udc->connect_lock);
>  
>  out:
>         trace_usb_gadget_activate(gadget, ret);
> @@ -1078,12 +1118,13 @@ EXPORT_SYMBOL_GPL(usb_gadget_set_state);
>  
>  /* -----------------------------------------------------------------
> -------- */
>  
> -static void usb_udc_connect_control(struct usb_udc *udc)
> +/* Acquire connect_lock before calling this function. */
> +static void usb_udc_connect_control_locked(struct usb_udc *udc)
> __must_hold(&udc->connect_lock)
>  {
> -       if (udc->vbus)
> -               usb_gadget_connect(udc->gadget);
> +       if (udc->vbus && udc->started)
> +               usb_gadget_connect_locked(udc->gadget);
>         else
> -               usb_gadget_disconnect(udc->gadget);
> +               usb_gadget_disconnect_locked(udc->gadget);
>  }
>  
>  /**
> @@ -1099,10 +1140,12 @@ void usb_udc_vbus_handler(struct usb_gadget
> *gadget, bool status)
>  {
>         struct usb_udc *udc = gadget->udc;
>  
> +       mutex_lock(&udc->connect_lock);
>         if (udc) {
>                 udc->vbus = status;
> -               usb_udc_connect_control(udc);
> +               usb_udc_connect_control_locked(udc);
>         }
> +       mutex_unlock(&udc->connect_lock);
>  }
>  EXPORT_SYMBOL_GPL(usb_udc_vbus_handler);
>  
> @@ -1124,7 +1167,7 @@ void usb_gadget_udc_reset(struct usb_gadget
> *gadget,
>  EXPORT_SYMBOL_GPL(usb_gadget_udc_reset);
>  
>  /**
> - * usb_gadget_udc_start - tells usb device controller to start up
> + * usb_gadget_udc_start_locked - tells usb device controller to
> start up
>   * @udc: The UDC to be started
>   *
>   * This call is issued by the UDC Class driver when it's about
> @@ -1135,8 +1178,11 @@ EXPORT_SYMBOL_GPL(usb_gadget_udc_reset);
>   * necessary to have it powered on.
>   *
>   * Returns zero on success, else negative errno.
> + *
> + * Caller should acquire connect_lock before invoking this function.
>   */
> -static inline int usb_gadget_udc_start(struct usb_udc *udc)
> +static inline int usb_gadget_udc_start_locked(struct usb_udc *udc)
> +       __must_hold(&udc->connect_lock)
>  {
>         int ret;
>  
> @@ -1153,7 +1199,7 @@ static inline int usb_gadget_udc_start(struct
> usb_udc *udc)
>  }
>  
>  /**
> - * usb_gadget_udc_stop - tells usb device controller we don't need
> it anymore
> + * usb_gadget_udc_stop_locked - tells usb device controller we don't
> need it anymore
>   * @udc: The UDC to be stopped
>   *
>   * This call is issued by the UDC Class driver after calling
> @@ -1162,8 +1208,11 @@ static inline int usb_gadget_udc_start(struct
> usb_udc *udc)
>   * The details are implementation specific, but it can go as
>   * far as powering off UDC completely and disable its data
>   * line pullups.
> + *
> + * Caller should acquire connect lock before invoking this function.
>   */
> -static inline void usb_gadget_udc_stop(struct usb_udc *udc)
> +static inline void usb_gadget_udc_stop_locked(struct usb_udc *udc)
> +       __must_hold(&udc->connect_lock)
>  {
>         if (!udc->started) {
>                 dev_err(&udc->dev, "UDC had already stopped\n");
> @@ -1322,6 +1371,7 @@ int usb_add_gadget(struct usb_gadget *gadget)
>  
>         udc->gadget = gadget;
>         gadget->udc = udc;
> +       mutex_init(&udc->connect_lock);
>  
>         udc->started = false;
>  
> @@ -1523,11 +1573,15 @@ static int gadget_bind_driver(struct device
> *dev)
>         if (ret)
>                 goto err_bind;
>  
> -       ret = usb_gadget_udc_start(udc);
> -       if (ret)
> +       mutex_lock(&udc->connect_lock);
> +       ret = usb_gadget_udc_start_locked(udc);
> +       if (ret) {
> +               mutex_unlock(&udc->connect_lock);
>                 goto err_start;
> +       }
>         usb_gadget_enable_async_callbacks(udc);
> -       usb_udc_connect_control(udc);
> +       usb_udc_connect_control_locked(udc);
> +       mutex_unlock(&udc->connect_lock);
>  
>         kobject_uevent(&udc->dev.kobj, KOBJ_CHANGE);
>         return 0;
> @@ -1558,12 +1612,14 @@ static void gadget_unbind_driver(struct
> device *dev)
>  
>         kobject_uevent(&udc->dev.kobj, KOBJ_CHANGE);
>  
> -       usb_gadget_disconnect(gadget);
> +       mutex_lock(&udc->connect_lock);
> +       usb_gadget_disconnect_locked(gadget);
>         usb_gadget_disable_async_callbacks(udc);
>         if (gadget->irq)
>                 synchronize_irq(gadget->irq);
>         udc->driver->unbind(gadget);
> -       usb_gadget_udc_stop(udc);
> +       usb_gadget_udc_stop_locked(udc);
> +       mutex_unlock(&udc->connect_lock);
>  
>         mutex_lock(&udc_lock);
>         driver->is_bound = false;
> @@ -1649,11 +1705,15 @@ static ssize_t soft_connect_store(struct
> device *dev,
>         }
>  
>         if (sysfs_streq(buf, "connect")) {
> -               usb_gadget_udc_start(udc);
> -               usb_gadget_connect(udc->gadget);
> +               mutex_lock(&udc->connect_lock);
> +               usb_gadget_udc_start_locked(udc);
> +               usb_gadget_connect_locked(udc->gadget);
> +               mutex_unlock(&udc->connect_lock);
>         } else if (sysfs_streq(buf, "disconnect")) {
> -               usb_gadget_disconnect(udc->gadget);
> -               usb_gadget_udc_stop(udc);
> +               mutex_lock(&udc->connect_lock);
> +               usb_gadget_disconnect_locked(udc->gadget);
> +               usb_gadget_udc_stop_locked(udc);
> +               mutex_unlock(&udc->connect_lock);
>         } else {
>                 dev_err(dev, "unsupported command '%s'\n", buf);
>                 ret = -EINVAL;
> 
> base-commit: d629c0e221cd99198b843d8351a0a9bfec6c0423
  
Linux regression tracking (Thorsten Leemhuis) May 17, 2023, 10:23 a.m. UTC | #2
[CCing Francesco Dolcini; and the regression list too, as it should be
in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

On 16.05.23 14:53, Alistair wrote:
> On Fri, 2023-04-07 at 03:07 +0000, Badhri Jagan Sridharan wrote:
>> usb_udc_connect_control does not check to see if the udc has already
>> been started. This causes gadget->ops->pullup to be called through
>> usb_gadget_connect when invoked from usb_udc_vbus_handler even before
>> usb_gadget_udc_start is called. Guard this by checking for udc-
>>> started
>> in usb_udc_connect_control before invoking usb_gadget_connect.
> [...]
>> Cc: stable@vger.kernel.org
>> Fixes: 628ef0d273a6 ("usb: udc: add usb_udc_vbus_handler")
>> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
> 
> This patch causes a kernel hang when trying to boot with the
> usb/chipidea/udc.c driver.
> 
> The call stack below causes the hang:
> 
>  - gadget_bind_driver(struct device *dev)
>     - mutex_lock(&udc->connect_lock);
>     - usb_gadget_udc_start_locked(struct usb_udc *udc)
>         - udc->gadget->ops->udc_start(udc->gadget, udc->driver)
> 
> At which point we are calling ci_udc_start(..), but with the
> connect_lock mutex locked.
> 
> ci_udc_start() then calls usb_udc_vbus_handler() which tries to lock
> the connect_lock while it's already locked. Resulting in a kernel hang.
> 
> Reverting this patch fixes the hang.

Not my area of expertise, but I guess it might be the same error as this
one:

https://lore.kernel.org/all/ZF4BvgsOyoKxdPFF@francesco-nb.int.toradex.com/

Francesco sent a revert on Friday, but no reaction from Badhri Jagan
Sridharan or Greg yet afaics.

https://lore.kernel.org/all/20230512131435.205464-1-francesco@dolcini.it/

Ciao, Thorsten
  
Badhri Jagan Sridharan May 17, 2023, 10:32 a.m. UTC | #3
Hi Thorsten,

Francesso  had shared the stack dump as well at
https://lore.kernel.org/all/ZGMm2sxN6wW%2FEWrR@francesco-nb.int.toradex.com/.
I am working on a fix based on that. Going to share it in the next
hour and would be requesting Franceso and others help to see if the
regression goes away.

Thanks,
Badhri


On Wed, May 17, 2023 at 3:23 AM Linux regression tracking (Thorsten
Leemhuis) <regressions@leemhuis.info> wrote:
>
> [CCing Francesco Dolcini; and the regression list too, as it should be
> in the loop for regressions:
> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>
> On 16.05.23 14:53, Alistair wrote:
> > On Fri, 2023-04-07 at 03:07 +0000, Badhri Jagan Sridharan wrote:
> >> usb_udc_connect_control does not check to see if the udc has already
> >> been started. This causes gadget->ops->pullup to be called through
> >> usb_gadget_connect when invoked from usb_udc_vbus_handler even before
> >> usb_gadget_udc_start is called. Guard this by checking for udc-
> >>> started
> >> in usb_udc_connect_control before invoking usb_gadget_connect.
> > [...]
> >> Cc: stable@vger.kernel.org
> >> Fixes: 628ef0d273a6 ("usb: udc: add usb_udc_vbus_handler")
> >> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
> >
> > This patch causes a kernel hang when trying to boot with the
> > usb/chipidea/udc.c driver.
> >
> > The call stack below causes the hang:
> >
> >  - gadget_bind_driver(struct device *dev)
> >     - mutex_lock(&udc->connect_lock);
> >     - usb_gadget_udc_start_locked(struct usb_udc *udc)
> >         - udc->gadget->ops->udc_start(udc->gadget, udc->driver)
> >
> > At which point we are calling ci_udc_start(..), but with the
> > connect_lock mutex locked.
> >
> > ci_udc_start() then calls usb_udc_vbus_handler() which tries to lock
> > the connect_lock while it's already locked. Resulting in a kernel hang.
> >
> > Reverting this patch fixes the hang.
>
> Not my area of expertise, but I guess it might be the same error as this
> one:
>
> https://lore.kernel.org/all/ZF4BvgsOyoKxdPFF@francesco-nb.int.toradex.com/
>
> Francesco sent a revert on Friday, but no reaction from Badhri Jagan
> Sridharan or Greg yet afaics.
>
> https://lore.kernel.org/all/20230512131435.205464-1-francesco@dolcini.it/
>
> Ciao, Thorsten
  
Francesco Dolcini May 17, 2023, 10:35 a.m. UTC | #4
On Wed, May 17, 2023 at 12:23:39PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> [CCing Francesco Dolcini; and the regression list too, as it should be
> in the loop for regressions:
> https://docs.kernel.org/admin-guide/reporting-regressions.html]
> 
> On 16.05.23 14:53, Alistair wrote:
> > On Fri, 2023-04-07 at 03:07 +0000, Badhri Jagan Sridharan wrote:
> >> usb_udc_connect_control does not check to see if the udc has already
> >> been started. This causes gadget->ops->pullup to be called through
> >> usb_gadget_connect when invoked from usb_udc_vbus_handler even before
> >> usb_gadget_udc_start is called. Guard this by checking for udc-
> >>> started
> >> in usb_udc_connect_control before invoking usb_gadget_connect.
> > [...]
> >> Cc: stable@vger.kernel.org
> >> Fixes: 628ef0d273a6 ("usb: udc: add usb_udc_vbus_handler")
> >> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
> > 
> > This patch causes a kernel hang when trying to boot with the
> > usb/chipidea/udc.c driver.
> > 
> > The call stack below causes the hang:
> > 
> >  - gadget_bind_driver(struct device *dev)
> >     - mutex_lock(&udc->connect_lock);
> >     - usb_gadget_udc_start_locked(struct usb_udc *udc)
> >         - udc->gadget->ops->udc_start(udc->gadget, udc->driver)
> > 
> > At which point we are calling ci_udc_start(..), but with the
> > connect_lock mutex locked.
> > 
> > ci_udc_start() then calls usb_udc_vbus_handler() which tries to lock
> > the connect_lock while it's already locked. Resulting in a kernel hang.
> > 
> > Reverting this patch fixes the hang.
> 
> Not my area of expertise, but I guess it might be the same error as this
> one:
> 
> https://lore.kernel.org/all/ZF4BvgsOyoKxdPFF@francesco-nb.int.toradex.com/
> 
> Francesco sent a revert on Friday, but no reaction from Badhri Jagan
> Sridharan or Greg yet afaics.
> 
> https://lore.kernel.org/all/20230512131435.205464-1-francesco@dolcini.it/

Revert patches were applied and are in linux-next. I expect those to
land in Linus tree with the next pull request from Greg.

Francesco
  
Linux regression tracking (Thorsten Leemhuis) May 17, 2023, 10:56 a.m. UTC | #5
On 17.05.23 12:35, Francesco Dolcini wrote:
> On Wed, May 17, 2023 at 12:23:39PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
>> [CCing Francesco Dolcini; and the regression list too, as it should be
>> in the loop for regressions:
>> https://docs.kernel.org/admin-guide/reporting-regressions.html]
>>
>> On 16.05.23 14:53, Alistair wrote:
>>> On Fri, 2023-04-07 at 03:07 +0000, Badhri Jagan Sridharan wrote:
>>>> usb_udc_connect_control does not check to see if the udc has already
>>>> been started. This causes gadget->ops->pullup to be called through
>>>> usb_gadget_connect when invoked from usb_udc_vbus_handler even before
>>>> usb_gadget_udc_start is called. Guard this by checking for udc-
>>>>> started
>>>> in usb_udc_connect_control before invoking usb_gadget_connect.
>>> [...]
>>>> Cc: stable@vger.kernel.org
>>>> Fixes: 628ef0d273a6 ("usb: udc: add usb_udc_vbus_handler")
>>>> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
>>>
>>> This patch causes a kernel hang when trying to boot with the
>>> usb/chipidea/udc.c driver.
>>>
>>> The call stack below causes the hang:
>>>
>>>  - gadget_bind_driver(struct device *dev)
>>>     - mutex_lock(&udc->connect_lock);
>>>     - usb_gadget_udc_start_locked(struct usb_udc *udc)
>>>         - udc->gadget->ops->udc_start(udc->gadget, udc->driver)
>>>
>>> At which point we are calling ci_udc_start(..), but with the
>>> connect_lock mutex locked.
>>>
>>> ci_udc_start() then calls usb_udc_vbus_handler() which tries to lock
>>> the connect_lock while it's already locked. Resulting in a kernel hang.
>>>
>>> Reverting this patch fixes the hang.
>>
>> Not my area of expertise, but I guess it might be the same error as this
>> one:
>>
>> https://lore.kernel.org/all/ZF4BvgsOyoKxdPFF@francesco-nb.int.toradex.com/
>>
>> Francesco sent a revert on Friday, but no reaction from Badhri Jagan
>> Sridharan or Greg yet afaics.
>>
>> https://lore.kernel.org/all/20230512131435.205464-1-francesco@dolcini.it/
> 
> Revert patches were applied and are in linux-next. I expect those to
> land in Linus tree with the next pull request from Greg.

Ha, sorry, I missed that, as I only looked at lore. Should have looked
in my own regression tracking, there it's marked as "fix incoming", as
regzbot noticed the fix in next...

Ciao, Thorsten
  
Badhri Jagan Sridharan May 17, 2023, 5:45 p.m. UTC | #6
Keeping the thread updated. I sent out
https://www.spinics.net/lists/kernel/msg4792009.html few hours earlier
and have requested help from Francesco, Alistair and others who
reported the issue.
Discussing with Alan stern on the feedback he had left.

Thanks for the support,
Badhri

On Wed, May 17, 2023 at 3:57 AM Linux regression tracking (Thorsten
Leemhuis) <regressions@leemhuis.info> wrote:
>
> On 17.05.23 12:35, Francesco Dolcini wrote:
> > On Wed, May 17, 2023 at 12:23:39PM +0200, Linux regression tracking (Thorsten Leemhuis) wrote:
> >> [CCing Francesco Dolcini; and the regression list too, as it should be
> >> in the loop for regressions:
> >> https://docs.kernel.org/admin-guide/reporting-regressions.html]
> >>
> >> On 16.05.23 14:53, Alistair wrote:
> >>> On Fri, 2023-04-07 at 03:07 +0000, Badhri Jagan Sridharan wrote:
> >>>> usb_udc_connect_control does not check to see if the udc has already
> >>>> been started. This causes gadget->ops->pullup to be called through
> >>>> usb_gadget_connect when invoked from usb_udc_vbus_handler even before
> >>>> usb_gadget_udc_start is called. Guard this by checking for udc-
> >>>>> started
> >>>> in usb_udc_connect_control before invoking usb_gadget_connect.
> >>> [...]
> >>>> Cc: stable@vger.kernel.org
> >>>> Fixes: 628ef0d273a6 ("usb: udc: add usb_udc_vbus_handler")
> >>>> Signed-off-by: Badhri Jagan Sridharan <badhri@google.com>
> >>>
> >>> This patch causes a kernel hang when trying to boot with the
> >>> usb/chipidea/udc.c driver.
> >>>
> >>> The call stack below causes the hang:
> >>>
> >>>  - gadget_bind_driver(struct device *dev)
> >>>     - mutex_lock(&udc->connect_lock);
> >>>     - usb_gadget_udc_start_locked(struct usb_udc *udc)
> >>>         - udc->gadget->ops->udc_start(udc->gadget, udc->driver)
> >>>
> >>> At which point we are calling ci_udc_start(..), but with the
> >>> connect_lock mutex locked.
> >>>
> >>> ci_udc_start() then calls usb_udc_vbus_handler() which tries to lock
> >>> the connect_lock while it's already locked. Resulting in a kernel hang.
> >>>
> >>> Reverting this patch fixes the hang.
> >>
> >> Not my area of expertise, but I guess it might be the same error as this
> >> one:
> >>
> >> https://lore.kernel.org/all/ZF4BvgsOyoKxdPFF@francesco-nb.int.toradex.com/
> >>
> >> Francesco sent a revert on Friday, but no reaction from Badhri Jagan
> >> Sridharan or Greg yet afaics.
> >>
> >> https://lore.kernel.org/all/20230512131435.205464-1-francesco@dolcini.it/
> >
> > Revert patches were applied and are in linux-next. I expect those to
> > land in Linus tree with the next pull request from Greg.
>
> Ha, sorry, I missed that, as I only looked at lore. Should have looked
> in my own regression tracking, there it's marked as "fix incoming", as
> regzbot noticed the fix in next...
>
> Ciao, Thorsten
  
Avichal Rakesh May 30, 2023, 9:50 p.m. UTC | #7
On 5/17/23 10:45, Badhri Jagan Sridharan wrote:
> Keeping the thread updated. I sent out
> https://www.spinics.net/lists/kernel/msg4792009.html few hours earlier
> and have requested help from Francesco, Alistair and others who
> reported the issue.
> Discussing with Alan stern on the feedback he had left.
> 

Tangential to the original issues: it looks like patch 2/2 also
breaks gadget functions that sets `bind_deactivated` to true.

When usb_gadget_connect is called while the gadget is deactivated from
functions binding, it sets gadget->connected to true, but does not call
the pullup function. Later, when the gadget function calls
usb_gadget_activate which in turn calls usb_gadget_connect, the pullup
function isn't called because gadget->connected is true.

- Avi
  

Patch

diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index 3dcbba739db6..af92c2e8e10c 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -37,6 +37,10 @@  static struct bus_type gadget_bus_type;
  * @vbus: for udcs who care about vbus status, this value is real vbus status;
  * for udcs who do not care about vbus status, this value is always true
  * @started: the UDC's started state. True if the UDC had started.
+ * @connect_lock: protects udc->vbus, udc->started, gadget->connect, gadget->deactivate related
+ * functions. usb_gadget_connect_locked, usb_gadget_disconnect_locked,
+ * usb_udc_connect_control_locked, usb_gadget_udc_start_locked, usb_gadget_udc_stop_locked are
+ * called with this lock held.
  *
  * This represents the internal data structure which is used by the UDC-class
  * to hold information about udc driver and gadget together.
@@ -48,6 +52,7 @@  struct usb_udc {
 	struct list_head		list;
 	bool				vbus;
 	bool				started;
+	struct mutex			connect_lock;
 };
 
 static struct class *udc_class;
@@ -687,17 +692,9 @@  int usb_gadget_vbus_disconnect(struct usb_gadget *gadget)
 }
 EXPORT_SYMBOL_GPL(usb_gadget_vbus_disconnect);
 
-/**
- * usb_gadget_connect - software-controlled connect to USB host
- * @gadget:the peripheral being connected
- *
- * Enables the D+ (or potentially D-) pullup.  The host will start
- * enumerating this gadget when the pullup is active and a VBUS session
- * is active (the link is powered).
- *
- * Returns zero on success, else negative errno.
- */
-int usb_gadget_connect(struct usb_gadget *gadget)
+/* Internal version of usb_gadget_connect needs to be called with connect_lock held. */
+static int usb_gadget_connect_locked(struct usb_gadget *gadget)
+	__must_hold(&gadget->udc->connect_lock)
 {
 	int ret = 0;
 
@@ -706,10 +703,12 @@  int usb_gadget_connect(struct usb_gadget *gadget)
 		goto out;
 	}
 
-	if (gadget->deactivated) {
+	if (gadget->deactivated || !gadget->udc->started) {
 		/*
 		 * If gadget is deactivated we only save new state.
 		 * Gadget will be connected automatically after activation.
+		 *
+		 * udc first needs to be started before gadget can be pulled up.
 		 */
 		gadget->connected = true;
 		goto out;
@@ -724,22 +723,32 @@  int usb_gadget_connect(struct usb_gadget *gadget)
 
 	return ret;
 }
-EXPORT_SYMBOL_GPL(usb_gadget_connect);
 
 /**
- * usb_gadget_disconnect - software-controlled disconnect from USB host
- * @gadget:the peripheral being disconnected
- *
- * Disables the D+ (or potentially D-) pullup, which the host may see
- * as a disconnect (when a VBUS session is active).  Not all systems
- * support software pullup controls.
+ * usb_gadget_connect - software-controlled connect to USB host
+ * @gadget:the peripheral being connected
  *
- * Following a successful disconnect, invoke the ->disconnect() callback
- * for the current gadget driver so that UDC drivers don't need to.
+ * Enables the D+ (or potentially D-) pullup.  The host will start
+ * enumerating this gadget when the pullup is active and a VBUS session
+ * is active (the link is powered).
  *
  * Returns zero on success, else negative errno.
  */
-int usb_gadget_disconnect(struct usb_gadget *gadget)
+int usb_gadget_connect(struct usb_gadget *gadget)
+{
+	int ret;
+
+	mutex_lock(&gadget->udc->connect_lock);
+	ret = usb_gadget_connect_locked(gadget);
+	mutex_unlock(&gadget->udc->connect_lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(usb_gadget_connect);
+
+/* Internal version of usb_gadget_disconnect needs to be called with connect_lock held. */
+static int usb_gadget_disconnect_locked(struct usb_gadget *gadget)
+	__must_hold(&gadget->udc->connect_lock)
 {
 	int ret = 0;
 
@@ -751,10 +760,12 @@  int usb_gadget_disconnect(struct usb_gadget *gadget)
 	if (!gadget->connected)
 		goto out;
 
-	if (gadget->deactivated) {
+	if (gadget->deactivated || !gadget->udc->started) {
 		/*
 		 * If gadget is deactivated we only save new state.
 		 * Gadget will stay disconnected after activation.
+		 *
+		 * udc should have been started before gadget being pulled down.
 		 */
 		gadget->connected = false;
 		goto out;
@@ -774,6 +785,30 @@  int usb_gadget_disconnect(struct usb_gadget *gadget)
 
 	return ret;
 }
+
+/**
+ * usb_gadget_disconnect - software-controlled disconnect from USB host
+ * @gadget:the peripheral being disconnected
+ *
+ * Disables the D+ (or potentially D-) pullup, which the host may see
+ * as a disconnect (when a VBUS session is active).  Not all systems
+ * support software pullup controls.
+ *
+ * Following a successful disconnect, invoke the ->disconnect() callback
+ * for the current gadget driver so that UDC drivers don't need to.
+ *
+ * Returns zero on success, else negative errno.
+ */
+int usb_gadget_disconnect(struct usb_gadget *gadget)
+{
+	int ret;
+
+	mutex_lock(&gadget->udc->connect_lock);
+	ret = usb_gadget_disconnect_locked(gadget);
+	mutex_unlock(&gadget->udc->connect_lock);
+
+	return ret;
+}
 EXPORT_SYMBOL_GPL(usb_gadget_disconnect);
 
 /**
@@ -794,10 +829,11 @@  int usb_gadget_deactivate(struct usb_gadget *gadget)
 	if (gadget->deactivated)
 		goto out;
 
+	mutex_lock(&gadget->udc->connect_lock);
 	if (gadget->connected) {
-		ret = usb_gadget_disconnect(gadget);
+		ret = usb_gadget_disconnect_locked(gadget);
 		if (ret)
-			goto out;
+			goto unlock;
 
 		/*
 		 * If gadget was being connected before deactivation, we want
@@ -807,6 +843,8 @@  int usb_gadget_deactivate(struct usb_gadget *gadget)
 	}
 	gadget->deactivated = true;
 
+unlock:
+	mutex_unlock(&gadget->udc->connect_lock);
 out:
 	trace_usb_gadget_deactivate(gadget, ret);
 
@@ -830,6 +868,7 @@  int usb_gadget_activate(struct usb_gadget *gadget)
 	if (!gadget->deactivated)
 		goto out;
 
+	mutex_lock(&gadget->udc->connect_lock);
 	gadget->deactivated = false;
 
 	/*
@@ -837,7 +876,8 @@  int usb_gadget_activate(struct usb_gadget *gadget)
 	 * while it was being deactivated, we call usb_gadget_connect().
 	 */
 	if (gadget->connected)
-		ret = usb_gadget_connect(gadget);
+		ret = usb_gadget_connect_locked(gadget);
+	mutex_unlock(&gadget->udc->connect_lock);
 
 out:
 	trace_usb_gadget_activate(gadget, ret);
@@ -1078,12 +1118,13 @@  EXPORT_SYMBOL_GPL(usb_gadget_set_state);
 
 /* ------------------------------------------------------------------------- */
 
-static void usb_udc_connect_control(struct usb_udc *udc)
+/* Acquire connect_lock before calling this function. */
+static void usb_udc_connect_control_locked(struct usb_udc *udc) __must_hold(&udc->connect_lock)
 {
-	if (udc->vbus)
-		usb_gadget_connect(udc->gadget);
+	if (udc->vbus && udc->started)
+		usb_gadget_connect_locked(udc->gadget);
 	else
-		usb_gadget_disconnect(udc->gadget);
+		usb_gadget_disconnect_locked(udc->gadget);
 }
 
 /**
@@ -1099,10 +1140,12 @@  void usb_udc_vbus_handler(struct usb_gadget *gadget, bool status)
 {
 	struct usb_udc *udc = gadget->udc;
 
+	mutex_lock(&udc->connect_lock);
 	if (udc) {
 		udc->vbus = status;
-		usb_udc_connect_control(udc);
+		usb_udc_connect_control_locked(udc);
 	}
+	mutex_unlock(&udc->connect_lock);
 }
 EXPORT_SYMBOL_GPL(usb_udc_vbus_handler);
 
@@ -1124,7 +1167,7 @@  void usb_gadget_udc_reset(struct usb_gadget *gadget,
 EXPORT_SYMBOL_GPL(usb_gadget_udc_reset);
 
 /**
- * usb_gadget_udc_start - tells usb device controller to start up
+ * usb_gadget_udc_start_locked - tells usb device controller to start up
  * @udc: The UDC to be started
  *
  * This call is issued by the UDC Class driver when it's about
@@ -1135,8 +1178,11 @@  EXPORT_SYMBOL_GPL(usb_gadget_udc_reset);
  * necessary to have it powered on.
  *
  * Returns zero on success, else negative errno.
+ *
+ * Caller should acquire connect_lock before invoking this function.
  */
-static inline int usb_gadget_udc_start(struct usb_udc *udc)
+static inline int usb_gadget_udc_start_locked(struct usb_udc *udc)
+	__must_hold(&udc->connect_lock)
 {
 	int ret;
 
@@ -1153,7 +1199,7 @@  static inline int usb_gadget_udc_start(struct usb_udc *udc)
 }
 
 /**
- * usb_gadget_udc_stop - tells usb device controller we don't need it anymore
+ * usb_gadget_udc_stop_locked - tells usb device controller we don't need it anymore
  * @udc: The UDC to be stopped
  *
  * This call is issued by the UDC Class driver after calling
@@ -1162,8 +1208,11 @@  static inline int usb_gadget_udc_start(struct usb_udc *udc)
  * The details are implementation specific, but it can go as
  * far as powering off UDC completely and disable its data
  * line pullups.
+ *
+ * Caller should acquire connect lock before invoking this function.
  */
-static inline void usb_gadget_udc_stop(struct usb_udc *udc)
+static inline void usb_gadget_udc_stop_locked(struct usb_udc *udc)
+	__must_hold(&udc->connect_lock)
 {
 	if (!udc->started) {
 		dev_err(&udc->dev, "UDC had already stopped\n");
@@ -1322,6 +1371,7 @@  int usb_add_gadget(struct usb_gadget *gadget)
 
 	udc->gadget = gadget;
 	gadget->udc = udc;
+	mutex_init(&udc->connect_lock);
 
 	udc->started = false;
 
@@ -1523,11 +1573,15 @@  static int gadget_bind_driver(struct device *dev)
 	if (ret)
 		goto err_bind;
 
-	ret = usb_gadget_udc_start(udc);
-	if (ret)
+	mutex_lock(&udc->connect_lock);
+	ret = usb_gadget_udc_start_locked(udc);
+	if (ret) {
+		mutex_unlock(&udc->connect_lock);
 		goto err_start;
+	}
 	usb_gadget_enable_async_callbacks(udc);
-	usb_udc_connect_control(udc);
+	usb_udc_connect_control_locked(udc);
+	mutex_unlock(&udc->connect_lock);
 
 	kobject_uevent(&udc->dev.kobj, KOBJ_CHANGE);
 	return 0;
@@ -1558,12 +1612,14 @@  static void gadget_unbind_driver(struct device *dev)
 
 	kobject_uevent(&udc->dev.kobj, KOBJ_CHANGE);
 
-	usb_gadget_disconnect(gadget);
+	mutex_lock(&udc->connect_lock);
+	usb_gadget_disconnect_locked(gadget);
 	usb_gadget_disable_async_callbacks(udc);
 	if (gadget->irq)
 		synchronize_irq(gadget->irq);
 	udc->driver->unbind(gadget);
-	usb_gadget_udc_stop(udc);
+	usb_gadget_udc_stop_locked(udc);
+	mutex_unlock(&udc->connect_lock);
 
 	mutex_lock(&udc_lock);
 	driver->is_bound = false;
@@ -1649,11 +1705,15 @@  static ssize_t soft_connect_store(struct device *dev,
 	}
 
 	if (sysfs_streq(buf, "connect")) {
-		usb_gadget_udc_start(udc);
-		usb_gadget_connect(udc->gadget);
+		mutex_lock(&udc->connect_lock);
+		usb_gadget_udc_start_locked(udc);
+		usb_gadget_connect_locked(udc->gadget);
+		mutex_unlock(&udc->connect_lock);
 	} else if (sysfs_streq(buf, "disconnect")) {
-		usb_gadget_disconnect(udc->gadget);
-		usb_gadget_udc_stop(udc);
+		mutex_lock(&udc->connect_lock);
+		usb_gadget_disconnect_locked(udc->gadget);
+		usb_gadget_udc_stop_locked(udc);
+		mutex_unlock(&udc->connect_lock);
 	} else {
 		dev_err(dev, "unsupported command '%s'\n", buf);
 		ret = -EINVAL;