[v9,3/4] tee: optee: support tracking system threads

Message ID 20230517143311.585080-1-sumit.garg@linaro.org
State New
Headers
Series None |

Commit Message

Sumit Garg May 17, 2023, 2:33 p.m. UTC
  From: Etienne Carriere <etienne.carriere@linaro.org>

Adds support in the OP-TEE driver to keep track of reserved system
threads. The optee_cq_*() functions are updated to handle this if
enabled. The SMC ABI part of the driver enables this tracking, but the
FF-A ABI part does not.

The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
sessions. For sake of simplicity, initialization of call queue
management is factorized into new helper function optee_cq_init().

Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---

Disclaimer: Compile tested only

Hi Etienne,

Overall the idea we agreed upon was okay but the implementation looked
complex to me. So I thought it would be harder to explain that via
review and I decided myself to give a try at simplification. I would
like you to test it if this still addresses the SCMI deadlock problem or
not. Also, feel free to include this in your patchset if all goes fine
wrt testing.

-Sumit

Changes since v8:
- Simplified system threads tracking implementation.

 drivers/tee/optee/call.c          | 72 +++++++++++++++++++++++++++++--
 drivers/tee/optee/ffa_abi.c       |  3 +-
 drivers/tee/optee/optee_private.h | 16 +++++++
 drivers/tee/optee/smc_abi.c       | 16 ++++++-
 4 files changed, 99 insertions(+), 8 deletions(-)
  

Comments

Etienne Carriere May 23, 2023, 7:11 a.m. UTC | #1
Hello Sumit,


On Wed, 17 May 2023 at 16:33, Sumit Garg <sumit.garg@linaro.org> wrote:
>
> From: Etienne Carriere <etienne.carriere@linaro.org>
>
> Adds support in the OP-TEE driver to keep track of reserved system
> threads. The optee_cq_*() functions are updated to handle this if
> enabled. The SMC ABI part of the driver enables this tracking, but the
> FF-A ABI part does not.
>
> The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
> sessions. For sake of simplicity, initialization of call queue
> management is factorized into new helper function optee_cq_init().
>
> Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
> Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
> Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
> Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> ---
>
> Disclaimer: Compile tested only
>
> Hi Etienne,
>
> Overall the idea we agreed upon was okay but the implementation looked
> complex to me. So I thought it would be harder to explain that via
> review and I decided myself to give a try at simplification. I would
> like you to test it if this still addresses the SCMI deadlock problem or
> not. Also, feel free to include this in your patchset if all goes fine
> wrt testing.

With these changes, there is no more a specific waiting list for TEE
system threads hence when a waiting queue can complete, we'll pick any
TEE thread, not a TEE system thread first..
Also, as stated in a below answer, these change unconditionally
reserve a TEE thread for TEE system calls even if no TEE client
reserved such.

>
> -Sumit
>
> Changes since v8:
> - Simplified system threads tracking implementation.
>
>  drivers/tee/optee/call.c          | 72 +++++++++++++++++++++++++++++--
>  drivers/tee/optee/ffa_abi.c       |  3 +-
>  drivers/tee/optee/optee_private.h | 16 +++++++
>  drivers/tee/optee/smc_abi.c       | 16 ++++++-
>  4 files changed, 99 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c
> index 42e478ac6ce1..09e824e4dcaf 100644
> --- a/drivers/tee/optee/call.c
> +++ b/drivers/tee/optee/call.c
> @@ -39,9 +39,27 @@ struct optee_shm_arg_entry {
>         DECLARE_BITMAP(map, MAX_ARG_COUNT_PER_ENTRY);
>  };
>
> +void optee_cq_init(struct optee_call_queue *cq, int thread_count)
> +{
> +       mutex_init(&cq->mutex);
> +       INIT_LIST_HEAD(&cq->waiters);
> +       /*
> +        * If cq->total_thread_count is 0 then we're not trying to keep
> +        * track of how many free threads we have, instead we're relying on
> +        * the secure world to tell us when we're out of thread and have to
> +        * wait for another thread to become available.
> +        */
> +       cq->total_thread_count = thread_count;
> +       cq->free_thread_count = thread_count;
> +}
> +
>  void optee_cq_wait_init(struct optee_call_queue *cq,
>                         struct optee_call_waiter *w, bool sys_thread)
>  {
> +       bool need_wait = false;
> +
> +       memset(w, 0, sizeof(*w));
> +
>         /*
>          * We're preparing to make a call to secure world. In case we can't
>          * allocate a thread in secure world we'll end up waiting in
> @@ -53,15 +71,43 @@ void optee_cq_wait_init(struct optee_call_queue *cq,
>         mutex_lock(&cq->mutex);
>
>         /*
> -        * We add ourselves to the queue, but we don't wait. This
> -        * guarantees that we don't lose a completion if secure world
> -        * returns busy and another thread just exited and try to complete
> -        * someone.
> +        * We add ourselves to a queue, but we don't wait. This guarantees
> +        * that we don't lose a completion if secure world returns busy and
> +        * another thread just exited and try to complete someone.
>          */
>         init_completion(&w->c);
>         list_add_tail(&w->list_node, &cq->waiters);
>
> +       if (cq->total_thread_count && sys_thread) {
> +               if (cq->free_thread_count > 0)
> +                       cq->free_thread_count--;
> +               else
> +                       need_wait = true;
> +       } else if (cq->total_thread_count) {
> +               if (cq->free_thread_count > 1)

This unconditionally reserves a TEE thread for TEE system calls, even
if no client has claimed such system thread reservation.


> +                       cq->free_thread_count--;
> +               else
> +                       need_wait = true;
> +       }
> +
>         mutex_unlock(&cq->mutex);
> +
> +       while (need_wait) {
> +               optee_cq_wait_for_completion(cq, w);
> +               mutex_lock(&cq->mutex);
> +               if (sys_thread) {
> +                       if (cq->free_thread_count > 0) {
> +                               cq->free_thread_count--;
> +                               need_wait = false;
> +                       }
> +               } else {
> +                       if (cq->free_thread_count > 1) {
> +                               cq->free_thread_count--;
> +                               need_wait = false;
> +                       }
> +               }
> +               mutex_unlock(&cq->mutex);
> +       }
>  }
>
>  void optee_cq_wait_for_completion(struct optee_call_queue *cq,
> @@ -104,6 +150,8 @@ void optee_cq_wait_final(struct optee_call_queue *cq,
>         /* Get out of the list */
>         list_del(&w->list_node);
>
> +       cq->free_thread_count++;
> +
>         /* Wake up one eventual waiting task */
>         optee_cq_complete_one(cq);
>
> @@ -361,6 +409,22 @@ int optee_open_session(struct tee_context *ctx,
>         return rc;
>  }
>
> +int optee_system_session(struct tee_context *ctx, u32 session)
> +{
> +       struct optee_context_data *ctxdata = ctx->data;
> +       struct optee_session *sess;
> +
> +       mutex_lock(&ctxdata->mutex);
> +
> +       sess = find_session(ctxdata, session);
> +       if (sess && !sess->use_sys_thread)
> +               sess->use_sys_thread = true;
> +
> +       mutex_unlock(&ctxdata->mutex);
> +
> +       return 0;

Nitpicking: should rather return 0 only upon session is valid
(sess!=NULL here)  and system thread reservation is supported
(total_thread_count > 0).
But that's not a big deal I guess, can be addressed.

Etienne




> +}
> +
>  int optee_close_session_helper(struct tee_context *ctx, u32 session,
>                                bool system_thread)
>  {
> diff --git a/drivers/tee/optee/ffa_abi.c b/drivers/tee/optee/ffa_abi.c
> index 5fde9d4100e3..0c9055691343 100644
> --- a/drivers/tee/optee/ffa_abi.c
> +++ b/drivers/tee/optee/ffa_abi.c
> @@ -852,8 +852,7 @@ static int optee_ffa_probe(struct ffa_device *ffa_dev)
>         if (rc)
>                 goto err_unreg_supp_teedev;
>         mutex_init(&optee->ffa.mutex);
> -       mutex_init(&optee->call_queue.mutex);
> -       INIT_LIST_HEAD(&optee->call_queue.waiters);
> +       optee_cq_init(&optee->call_queue, 0);
>         optee_supp_init(&optee->supp);
>         optee_shm_arg_cache_init(optee, arg_cache_flags);
>         ffa_dev_set_drvdata(ffa_dev, optee);
> diff --git a/drivers/tee/optee/optee_private.h b/drivers/tee/optee/optee_private.h
> index b68273051454..6dcecb83c893 100644
> --- a/drivers/tee/optee/optee_private.h
> +++ b/drivers/tee/optee/optee_private.h
> @@ -40,15 +40,29 @@ typedef void (optee_invoke_fn)(unsigned long, unsigned long, unsigned long,
>                                 unsigned long, unsigned long,
>                                 struct arm_smccc_res *);
>
> +/*
> + * struct optee_call_waiter - TEE entry may need to wait for a free TEE thread
> + * @list_node          Reference in waiters list
> + * @c                  Waiting completion reference
> + */
>  struct optee_call_waiter {
>         struct list_head list_node;
>         struct completion c;
>  };
>
> +/*
> + * struct optee_call_queue - OP-TEE call queue management
> + * @mutex                      Serializes access to this struct
> + * @waiters                    List of threads waiting to enter OP-TEE
> + * @total_thread_count         Overall number of thread context in OP-TEE or 0
> + * @free_thread_count          Number of threads context free in OP-TEE
> + */
>  struct optee_call_queue {
>         /* Serializes access to this struct */
>         struct mutex mutex;
>         struct list_head waiters;
> +       int total_thread_count;
> +       int free_thread_count;
>  };
>
>  struct optee_notif {
> @@ -254,6 +268,7 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
>  int optee_open_session(struct tee_context *ctx,
>                        struct tee_ioctl_open_session_arg *arg,
>                        struct tee_param *param);
> +int optee_system_session(struct tee_context *ctx, u32 session);
>  int optee_close_session_helper(struct tee_context *ctx, u32 session,
>                                bool system_thread);
>  int optee_close_session(struct tee_context *ctx, u32 session);
> @@ -303,6 +318,7 @@ static inline void optee_to_msg_param_value(struct optee_msg_param *mp,
>         mp->u.value.c = p->u.value.c;
>  }
>
> +void optee_cq_init(struct optee_call_queue *cq, int thread_count);
>  void optee_cq_wait_init(struct optee_call_queue *cq,
>                         struct optee_call_waiter *w, bool sys_thread);
>  void optee_cq_wait_for_completion(struct optee_call_queue *cq,
> diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c
> index e2763cdcf111..3314ffeb91c8 100644
> --- a/drivers/tee/optee/smc_abi.c
> +++ b/drivers/tee/optee/smc_abi.c
> @@ -1209,6 +1209,7 @@ static const struct tee_driver_ops optee_clnt_ops = {
>         .release = optee_release,
>         .open_session = optee_open_session,
>         .close_session = optee_close_session,
> +       .system_session = optee_system_session,
>         .invoke_func = optee_invoke_func,
>         .cancel_req = optee_cancel_req,
>         .shm_register = optee_shm_register,
> @@ -1356,6 +1357,16 @@ static bool optee_msg_exchange_capabilities(optee_invoke_fn *invoke_fn,
>         return true;
>  }
>
> +static unsigned int optee_msg_get_thread_count(optee_invoke_fn *invoke_fn)
> +{
> +       struct arm_smccc_res res;
> +
> +       invoke_fn(OPTEE_SMC_GET_THREAD_COUNT, 0, 0, 0, 0, 0, 0, 0, &res);
> +       if (res.a0)
> +               return 0;
> +       return res.a1;
> +}
> +
>  static struct tee_shm_pool *
>  optee_config_shm_memremap(optee_invoke_fn *invoke_fn, void **memremaped_shm)
>  {
> @@ -1609,6 +1620,7 @@ static int optee_probe(struct platform_device *pdev)
>         struct optee *optee = NULL;
>         void *memremaped_shm = NULL;
>         unsigned int rpc_param_count;
> +       unsigned int thread_count;
>         struct tee_device *teedev;
>         struct tee_context *ctx;
>         u32 max_notif_value;
> @@ -1636,6 +1648,7 @@ static int optee_probe(struct platform_device *pdev)
>                 return -EINVAL;
>         }
>
> +       thread_count = optee_msg_get_thread_count(invoke_fn);
>         if (!optee_msg_exchange_capabilities(invoke_fn, &sec_caps,
>                                              &max_notif_value,
>                                              &rpc_param_count)) {
> @@ -1725,8 +1738,7 @@ static int optee_probe(struct platform_device *pdev)
>         if (rc)
>                 goto err_unreg_supp_teedev;
>
> -       mutex_init(&optee->call_queue.mutex);
> -       INIT_LIST_HEAD(&optee->call_queue.waiters);
> +       optee_cq_init(&optee->call_queue, thread_count);
>         optee_supp_init(&optee->supp);
>         optee->smc.memremaped_shm = memremaped_shm;
>         optee->pool = pool;
> --
> 2.34.1
>
  
Sumit Garg May 24, 2023, 7:31 a.m. UTC | #2
On Tue, 23 May 2023 at 12:41, Etienne Carriere
<etienne.carriere@linaro.org> wrote:
>
> Hello Sumit,
>
>
> On Wed, 17 May 2023 at 16:33, Sumit Garg <sumit.garg@linaro.org> wrote:
> >
> > From: Etienne Carriere <etienne.carriere@linaro.org>
> >
> > Adds support in the OP-TEE driver to keep track of reserved system
> > threads. The optee_cq_*() functions are updated to handle this if
> > enabled. The SMC ABI part of the driver enables this tracking, but the
> > FF-A ABI part does not.
> >
> > The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
> > sessions. For sake of simplicity, initialization of call queue
> > management is factorized into new helper function optee_cq_init().
> >
> > Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
> > Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
> > Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
> > Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
> > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > ---
> >
> > Disclaimer: Compile tested only
> >
> > Hi Etienne,
> >
> > Overall the idea we agreed upon was okay but the implementation looked
> > complex to me. So I thought it would be harder to explain that via
> > review and I decided myself to give a try at simplification. I would
> > like you to test it if this still addresses the SCMI deadlock problem or
> > not. Also, feel free to include this in your patchset if all goes fine
> > wrt testing.
>
> With these changes, there is no more a specific waiting list for TEE
> system threads hence when a waiting queue can complete, we'll pick any
> TEE thread, not a TEE system thread first..

I had thought about this but I can't see any value in having a
separate wait queue for system threads. Here we only need to provide
an extra privileged thread for system sessions (kernel clients) such
that user-space doesn't contend for that thread. This prevents kernel
client's starvation or deadlock like in the SCMI case.

> Also, as stated in a below answer, these change unconditionally
> reserve a TEE thread for TEE system calls even if no TEE client
> reserved such.

I don't think we should make thread reservations based on the presence
of TEE clients. You never know how much user-space or kernel TEE
clients you are dealing with. And reserving a single privileged thread
unconditionally for system sessions shouldn't be much of a burden for
memory constrained devices too.

Also, this way we would enable every kernel TEE client to leverage
system sessions as it's very likely they wouldn't like to compete with
user-space for thread availability. Two other kernel TEE clients that
are on top of my head are HWRNG and Trusted Keys which can benefit
from this feature.

> >
> > Changes since v8:
> > - Simplified system threads tracking implementation.
> >
> >  drivers/tee/optee/call.c          | 72 +++++++++++++++++++++++++++++--
> >  drivers/tee/optee/ffa_abi.c       |  3 +-
> >  drivers/tee/optee/optee_private.h | 16 +++++++
> >  drivers/tee/optee/smc_abi.c       | 16 ++++++-
> >  4 files changed, 99 insertions(+), 8 deletions(-)
> >
> > diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c
> > index 42e478ac6ce1..09e824e4dcaf 100644
> > --- a/drivers/tee/optee/call.c
> > +++ b/drivers/tee/optee/call.c
> > @@ -39,9 +39,27 @@ struct optee_shm_arg_entry {
> >         DECLARE_BITMAP(map, MAX_ARG_COUNT_PER_ENTRY);
> >  };
> >
> > +void optee_cq_init(struct optee_call_queue *cq, int thread_count)
> > +{
> > +       mutex_init(&cq->mutex);
> > +       INIT_LIST_HEAD(&cq->waiters);
> > +       /*
> > +        * If cq->total_thread_count is 0 then we're not trying to keep
> > +        * track of how many free threads we have, instead we're relying on
> > +        * the secure world to tell us when we're out of thread and have to
> > +        * wait for another thread to become available.
> > +        */
> > +       cq->total_thread_count = thread_count;
> > +       cq->free_thread_count = thread_count;
> > +}
> > +
> >  void optee_cq_wait_init(struct optee_call_queue *cq,
> >                         struct optee_call_waiter *w, bool sys_thread)
> >  {
> > +       bool need_wait = false;
> > +
> > +       memset(w, 0, sizeof(*w));
> > +
> >         /*
> >          * We're preparing to make a call to secure world. In case we can't
> >          * allocate a thread in secure world we'll end up waiting in
> > @@ -53,15 +71,43 @@ void optee_cq_wait_init(struct optee_call_queue *cq,
> >         mutex_lock(&cq->mutex);
> >
> >         /*
> > -        * We add ourselves to the queue, but we don't wait. This
> > -        * guarantees that we don't lose a completion if secure world
> > -        * returns busy and another thread just exited and try to complete
> > -        * someone.
> > +        * We add ourselves to a queue, but we don't wait. This guarantees
> > +        * that we don't lose a completion if secure world returns busy and
> > +        * another thread just exited and try to complete someone.
> >          */
> >         init_completion(&w->c);
> >         list_add_tail(&w->list_node, &cq->waiters);
> >
> > +       if (cq->total_thread_count && sys_thread) {
> > +               if (cq->free_thread_count > 0)
> > +                       cq->free_thread_count--;
> > +               else
> > +                       need_wait = true;
> > +       } else if (cq->total_thread_count) {
> > +               if (cq->free_thread_count > 1)
>
> This unconditionally reserves a TEE thread for TEE system calls, even
> if no client has claimed such system thread reservation.

See my response above.

>
>
> > +                       cq->free_thread_count--;
> > +               else
> > +                       need_wait = true;
> > +       }
> > +
> >         mutex_unlock(&cq->mutex);
> > +
> > +       while (need_wait) {
> > +               optee_cq_wait_for_completion(cq, w);
> > +               mutex_lock(&cq->mutex);
> > +               if (sys_thread) {
> > +                       if (cq->free_thread_count > 0) {
> > +                               cq->free_thread_count--;
> > +                               need_wait = false;
> > +                       }
> > +               } else {
> > +                       if (cq->free_thread_count > 1) {
> > +                               cq->free_thread_count--;
> > +                               need_wait = false;
> > +                       }
> > +               }
> > +               mutex_unlock(&cq->mutex);
> > +       }
> >  }
> >
> >  void optee_cq_wait_for_completion(struct optee_call_queue *cq,
> > @@ -104,6 +150,8 @@ void optee_cq_wait_final(struct optee_call_queue *cq,
> >         /* Get out of the list */
> >         list_del(&w->list_node);
> >
> > +       cq->free_thread_count++;
> > +
> >         /* Wake up one eventual waiting task */
> >         optee_cq_complete_one(cq);
> >
> > @@ -361,6 +409,22 @@ int optee_open_session(struct tee_context *ctx,
> >         return rc;
> >  }
> >
> > +int optee_system_session(struct tee_context *ctx, u32 session)
> > +{
> > +       struct optee_context_data *ctxdata = ctx->data;
> > +       struct optee_session *sess;
> > +
> > +       mutex_lock(&ctxdata->mutex);
> > +
> > +       sess = find_session(ctxdata, session);
> > +       if (sess && !sess->use_sys_thread)
> > +               sess->use_sys_thread = true;
> > +
> > +       mutex_unlock(&ctxdata->mutex);
> > +
> > +       return 0;
>
> Nitpicking: should rather return 0 only upon session is valid
> (sess!=NULL here)  and system thread reservation is supported
> (total_thread_count > 0).
> But that's not a big deal I guess, can be addressed.

Thanks for pointing it out. If this approach works for you then it can
be addressed while integrating in your patch-set.

-Sumit

>
>
>
>
> > +}
> > +
> >  int optee_close_session_helper(struct tee_context *ctx, u32 session,
> >                                bool system_thread)
> >  {
> > diff --git a/drivers/tee/optee/ffa_abi.c b/drivers/tee/optee/ffa_abi.c
> > index 5fde9d4100e3..0c9055691343 100644
> > --- a/drivers/tee/optee/ffa_abi.c
> > +++ b/drivers/tee/optee/ffa_abi.c
> > @@ -852,8 +852,7 @@ static int optee_ffa_probe(struct ffa_device *ffa_dev)
> >         if (rc)
> >                 goto err_unreg_supp_teedev;
> >         mutex_init(&optee->ffa.mutex);
> > -       mutex_init(&optee->call_queue.mutex);
> > -       INIT_LIST_HEAD(&optee->call_queue.waiters);
> > +       optee_cq_init(&optee->call_queue, 0);
> >         optee_supp_init(&optee->supp);
> >         optee_shm_arg_cache_init(optee, arg_cache_flags);
> >         ffa_dev_set_drvdata(ffa_dev, optee);
> > diff --git a/drivers/tee/optee/optee_private.h b/drivers/tee/optee/optee_private.h
> > index b68273051454..6dcecb83c893 100644
> > --- a/drivers/tee/optee/optee_private.h
> > +++ b/drivers/tee/optee/optee_private.h
> > @@ -40,15 +40,29 @@ typedef void (optee_invoke_fn)(unsigned long, unsigned long, unsigned long,
> >                                 unsigned long, unsigned long,
> >                                 struct arm_smccc_res *);
> >
> > +/*
> > + * struct optee_call_waiter - TEE entry may need to wait for a free TEE thread
> > + * @list_node          Reference in waiters list
> > + * @c                  Waiting completion reference
> > + */
> >  struct optee_call_waiter {
> >         struct list_head list_node;
> >         struct completion c;
> >  };
> >
> > +/*
> > + * struct optee_call_queue - OP-TEE call queue management
> > + * @mutex                      Serializes access to this struct
> > + * @waiters                    List of threads waiting to enter OP-TEE
> > + * @total_thread_count         Overall number of thread context in OP-TEE or 0
> > + * @free_thread_count          Number of threads context free in OP-TEE
> > + */
> >  struct optee_call_queue {
> >         /* Serializes access to this struct */
> >         struct mutex mutex;
> >         struct list_head waiters;
> > +       int total_thread_count;
> > +       int free_thread_count;
> >  };
> >
> >  struct optee_notif {
> > @@ -254,6 +268,7 @@ int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
> >  int optee_open_session(struct tee_context *ctx,
> >                        struct tee_ioctl_open_session_arg *arg,
> >                        struct tee_param *param);
> > +int optee_system_session(struct tee_context *ctx, u32 session);
> >  int optee_close_session_helper(struct tee_context *ctx, u32 session,
> >                                bool system_thread);
> >  int optee_close_session(struct tee_context *ctx, u32 session);
> > @@ -303,6 +318,7 @@ static inline void optee_to_msg_param_value(struct optee_msg_param *mp,
> >         mp->u.value.c = p->u.value.c;
> >  }
> >
> > +void optee_cq_init(struct optee_call_queue *cq, int thread_count);
> >  void optee_cq_wait_init(struct optee_call_queue *cq,
> >                         struct optee_call_waiter *w, bool sys_thread);
> >  void optee_cq_wait_for_completion(struct optee_call_queue *cq,
> > diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c
> > index e2763cdcf111..3314ffeb91c8 100644
> > --- a/drivers/tee/optee/smc_abi.c
> > +++ b/drivers/tee/optee/smc_abi.c
> > @@ -1209,6 +1209,7 @@ static const struct tee_driver_ops optee_clnt_ops = {
> >         .release = optee_release,
> >         .open_session = optee_open_session,
> >         .close_session = optee_close_session,
> > +       .system_session = optee_system_session,
> >         .invoke_func = optee_invoke_func,
> >         .cancel_req = optee_cancel_req,
> >         .shm_register = optee_shm_register,
> > @@ -1356,6 +1357,16 @@ static bool optee_msg_exchange_capabilities(optee_invoke_fn *invoke_fn,
> >         return true;
> >  }
> >
> > +static unsigned int optee_msg_get_thread_count(optee_invoke_fn *invoke_fn)
> > +{
> > +       struct arm_smccc_res res;
> > +
> > +       invoke_fn(OPTEE_SMC_GET_THREAD_COUNT, 0, 0, 0, 0, 0, 0, 0, &res);
> > +       if (res.a0)
> > +               return 0;
> > +       return res.a1;
> > +}
> > +
> >  static struct tee_shm_pool *
> >  optee_config_shm_memremap(optee_invoke_fn *invoke_fn, void **memremaped_shm)
> >  {
> > @@ -1609,6 +1620,7 @@ static int optee_probe(struct platform_device *pdev)
> >         struct optee *optee = NULL;
> >         void *memremaped_shm = NULL;
> >         unsigned int rpc_param_count;
> > +       unsigned int thread_count;
> >         struct tee_device *teedev;
> >         struct tee_context *ctx;
> >         u32 max_notif_value;
> > @@ -1636,6 +1648,7 @@ static int optee_probe(struct platform_device *pdev)
> >                 return -EINVAL;
> >         }
> >
> > +       thread_count = optee_msg_get_thread_count(invoke_fn);
> >         if (!optee_msg_exchange_capabilities(invoke_fn, &sec_caps,
> >                                              &max_notif_value,
> >                                              &rpc_param_count)) {
> > @@ -1725,8 +1738,7 @@ static int optee_probe(struct platform_device *pdev)
> >         if (rc)
> >                 goto err_unreg_supp_teedev;
> >
> > -       mutex_init(&optee->call_queue.mutex);
> > -       INIT_LIST_HEAD(&optee->call_queue.waiters);
> > +       optee_cq_init(&optee->call_queue, thread_count);
> >         optee_supp_init(&optee->supp);
> >         optee->smc.memremaped_shm = memremaped_shm;
> >         optee->pool = pool;
> > --
> > 2.34.1
> >
  
Etienne CARRIERE May 25, 2023, 11:48 a.m. UTC | #3
>
> De : Sumit Garg <sumit.garg@linaro.org>
> Envoyé : mercredi 24 mai 2023 09:31
> > On Tue, 23 May 2023 at 12:41, Etienne Carriere
> <etienne.carriere@linaro.org> wrote:
> > Hello Sumit,
> >
> >
> > On Wed, 17 May 2023 at 16:33, Sumit Garg <sumit.garg@linaro.org> wrote:
> > >
> > > From: Etienne Carriere <etienne.carriere@linaro.org>
> > >
> > > Adds support in the OP-TEE driver to keep track of reserved system
> > > threads. The optee_cq_*() functions are updated to handle this if
> > > enabled. The SMC ABI part of the driver enables this tracking, but the
> > > FF-A ABI part does not.
> > >
> > > The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
> > > sessions. For sake of simplicity, initialization of call queue
> > > management is factorized into new helper function optee_cq_init().
> > >
> > > Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
> > > Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
> > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > ---
> > >
> > > Disclaimer: Compile tested only
> > >
> > > Hi Etienne,
> > >
> > > Overall the idea we agreed upon was okay but the implementation looked
> > > complex to me. So I thought it would be harder to explain that via
> > > review and I decided myself to give a try at simplification. I would
> > > like you to test it if this still addresses the SCMI deadlock problem or
> > > not. Also, feel free to include this in your patchset if all goes fine
> > > wrt testing.
> >
> > With these changes, there is no more a specific waiting list for TEE
> > system threads hence when a waiting queue can complete, we'll pick any
> > TEE thread, not a TEE system thread first..
>
> I had thought about this but I can't see any value in having a
> separate wait queue for system threads. Here we only need to provide
> an extra privileged thread for system sessions (kernel clients) such
> that user-space doesn't contend for that thread. This prevents kernel
> client's starvation or deadlock like in the SCMI case.
>
> > Also, as stated in a below answer, these change unconditionally
> > reserve a TEE thread for TEE system calls even if no TEE client
> > reserved such.
>
> I don't think we should make thread reservations based on the presence
> of TEE clients. You never know how much user-space or kernel TEE
> clients you are dealing with. And reserving a single privileged thread
> unconditionally for system sessions shouldn't be much of a burden for
> memory constrained devices too.
>
> Also, this way we would enable every kernel TEE client to leverage
> system sessions as it's very likely they wouldn't like to compete with
> user-space for thread availability. Two other kernel TEE clients that
> are on top of my head are HWRNG and Trusted Keys which can benefit
> from this feature.

Trusted Keys is an interesting use case. When OP-TEE accesses Trusted Keys, it may need to access the eMMC/RPMB using the Linux OS tee-supplicant whichj may repuire an eMMC clock or voltage regulator to be enabled. If that clock or regulator is under an SCMI control, then we need 2 reserved TEE thread: one for invoking the Trusted Key TA and another for the SCMI request to reach the TEE will the Trusted Key TA invocation still consumes a thread.

BR,
Etienne

>
> > >
> > > Changes since v8:
> > > - Simplified system threads tracking implementation.
> > >
> > >  drivers/tee/optee/call.c          | 72 +++++++++++++++++++++++++++++--
> > >  drivers/tee/optee/ffa_abi.c       |  3 +-
> > >  drivers/tee/optee/optee_private.h | 16 +++++++
> > >  drivers/tee/optee/smc_abi.c       | 16 ++++++-
> > >  4 files changed, 99 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c
> > > index 42e478ac6ce1..09e824e4dcaf 100644
> > > --- a/drivers/tee/optee/call.c
> > > +++ b/drivers/tee/optee/call.c
> > > @@ -39,9 +39,27 @@ struct optee_shm_arg_entry {
> > >         DECLARE_BITMAP(map, MAX_ARG_COUNT_PER_ENTRY);
> > >  };
> > >
> > > +void optee_cq_init(struct optee_call_queue *cq, int thread_count)
> > > +{
> > > +       mutex_init(&cq->mutex);
> > > +       INIT_LIST_HEAD(&cq->waiters);
> > > +       /*
> > > +        * If cq->total_thread_count is 0 then we're not trying to keep
> > > +        * track of how many free threads we have, instead we're relying on
> > > +        * the secure world to tell us when we're out of thread and have to
> > > +        * wait for another thread to become available.
> > > +        */
> > > +       cq->total_thread_count = thread_count;
> > > +       cq->free_thread_count = thread_count;
> > > +}
> > > +
> > >  void optee_cq_wait_init(struct optee_call_queue *cq,
> > >                         struct optee_call_waiter *w, bool sys_thread)
> > >  {
> > > +       bool need_wait = false;
> > > +
> > > +       memset(w, 0, sizeof(*w));
> > > +
> > >         /*
> > >          * We're preparing to make a call to secure world. In case we can't
> > >          * allocate a thread in secure world we'll end up waiting in
> > > @@ -53,15 +71,43 @@ void optee_cq_wait_init(struct optee_call_queue *cq,
> > >         mutex_lock(&cq->mutex);
> > >
> > >         /*
> > > -        * We add ourselves to the queue, but we don't wait. This
> > > -        * guarantees that we don't lose a completion if secure world
> > > -        * returns busy and another thread just exited and try to complete
> > > -        * someone.
> > > +        * We add ourselves to a queue, but we don't wait. This guarantees
> > > +        * that we don't lose a completion if secure world returns busy and
> > > +        * another thread just exited and try to complete someone.
> > >          */
> > >         init_completion(&w->c);
> > >         list_add_tail(&w->list_node, &cq->waiters);
> > >
> > > +       if (cq->total_thread_count && sys_thread) {
> > > +               if (cq->free_thread_count > 0)
> > > +                       cq->free_thread_count--;
> > > +               else
> > > +                       need_wait = true;
> > > +       } else if (cq->total_thread_count) {
> > > +               if (cq->free_thread_count > 1)
> >
> > This unconditionally reserves a TEE thread for TEE system calls, even
> > if no client has claimed such system thread reservation.
>
> See my response above.
>
> >
> >
> > > +                       cq->free_thread_count--;
> > > +               else
> > > +                       need_wait = true;
> > > +       }
> > > +
> > >         mutex_unlock(&cq->mutex);
> > > +
> > > +       while (need_wait) {
> > > +               optee_cq_wait_for_completion(cq, w);
> > > +               mutex_lock(&cq->mutex);
> > > +               if (sys_thread) {
> > > +                       if (cq->free_thread_count > 0) {
> > > +                               cq->free_thread_count--;
> > > +                               need_wait = false;
> > > +                       }
> > > +               } else {
> > > +                       if (cq->free_thread_count > 1) {
> > > +                               cq->free_thread_count--;
> > > +                               need_wait = false;
> > > +                       }
> > > +               }
> > > +               mutex_unlock(&cq->mutex);
> > > +       }
> > >  }
> > >
> > >  void optee_cq_wait_for_completion(struct optee_call_queue *cq,
> > > @@ -104,6 +150,8 @@ void optee_cq_wait_final(struct optee_call_queue *cq,
> > >         /* Get out of the list */
> > >         list_del(&w->list_node);
> > >
> > > +       cq->free_thread_count++;
> > > +
> > >         /* Wake up one eventual waiting task */
> > >         optee_cq_complete_one(cq);
> > >
> > > @@ -361,6 +409,22 @@ int optee_open_session(struct tee_context *ctx,
> > >         return rc;
> > >  }
> > >
> > > +int optee_system_session(struct tee_context *ctx, u32 session)
> > > +{
> > > +       struct optee_context_data *ctxdata = ctx->data;
> > > +       struct optee_session *sess;
> > > +
> > > +       mutex_lock(&ctxdata->mutex);
> > > +
> > > +       sess = find_session(ctxdata, session);
> > > +       if (sess && !sess->use_sys_thread)
> > > +               sess->use_sys_thread = true;
> > > +
> > > +       mutex_unlock(&ctxdata->mutex);
> > > +
> > > +       return 0;
> >
> > Nitpicking: should rather return 0 only upon session is valid
> > (sess!=NULL here)  and system thread reservation is supported
> > (total_thread_count > 0).
> > But that's not a big deal I guess, can be addressed.
>
> Thanks for pointing it out. If this approach works for you then it can
> be addressed while integrating in your patch-set.
>
> -Sumit
>
> > (snip)

ST Restricted
  
Jens Wiklander May 25, 2023, 3:20 p.m. UTC | #4
Hi,

On Thu, May 25, 2023 at 1:48 PM Etienne CARRIERE
<etienne.carriere@st.com> wrote:
>
> >
> > De : Sumit Garg <sumit.garg@linaro.org>
> > Envoyé : mercredi 24 mai 2023 09:31
> > > On Tue, 23 May 2023 at 12:41, Etienne Carriere
> > <etienne.carriere@linaro.org> wrote:
> > > Hello Sumit,
> > >
> > >
> > > On Wed, 17 May 2023 at 16:33, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > >
> > > > From: Etienne Carriere <etienne.carriere@linaro.org>
> > > >
> > > > Adds support in the OP-TEE driver to keep track of reserved system
> > > > threads. The optee_cq_*() functions are updated to handle this if
> > > > enabled. The SMC ABI part of the driver enables this tracking, but the
> > > > FF-A ABI part does not.
> > > >
> > > > The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
> > > > sessions. For sake of simplicity, initialization of call queue
> > > > management is factorized into new helper function optee_cq_init().
> > > >
> > > > Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
> > > > Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
> > > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > > ---
> > > >
> > > > Disclaimer: Compile tested only
> > > >
> > > > Hi Etienne,
> > > >
> > > > Overall the idea we agreed upon was okay but the implementation looked
> > > > complex to me. So I thought it would be harder to explain that via
> > > > review and I decided myself to give a try at simplification. I would
> > > > like you to test it if this still addresses the SCMI deadlock problem or
> > > > not. Also, feel free to include this in your patchset if all goes fine
> > > > wrt testing.
> > >
> > > With these changes, there is no more a specific waiting list for TEE
> > > system threads hence when a waiting queue can complete, we'll pick any
> > > TEE thread, not a TEE system thread first..
> >
> > I had thought about this but I can't see any value in having a
> > separate wait queue for system threads. Here we only need to provide
> > an extra privileged thread for system sessions (kernel clients) such
> > that user-space doesn't contend for that thread. This prevents kernel
> > client's starvation or deadlock like in the SCMI case.
> >
> > > Also, as stated in a below answer, these change unconditionally
> > > reserve a TEE thread for TEE system calls even if no TEE client
> > > reserved such.
> >
> > I don't think we should make thread reservations based on the presence
> > of TEE clients. You never know how much user-space or kernel TEE
> > clients you are dealing with. And reserving a single privileged thread
> > unconditionally for system sessions shouldn't be much of a burden for
> > memory constrained devices too.
> >
> > Also, this way we would enable every kernel TEE client to leverage
> > system sessions as it's very likely they wouldn't like to compete with
> > user-space for thread availability. Two other kernel TEE clients that
> > are on top of my head are HWRNG and Trusted Keys which can benefit
> > from this feature.
>
> Trusted Keys is an interesting use case. When OP-TEE accesses Trusted Keys, it may need to access the eMMC/RPMB using the Linux OS tee-supplicant whichj may repuire an eMMC clock or voltage regulator to be enabled. If that clock or regulator is under an SCMI control, then we need 2 reserved TEE thread: one for invoking the Trusted Key TA and another for the SCMI request to reach the TEE will the Trusted Key TA invocation still consumes a thread.

Why would the Trusted Keys session need a system thread? To me, it
seems that the session could use the normal client priority.

Thanks,
Jens

>
> BR,
> Etienne
>
> >
> > > >
> > > > Changes since v8:
> > > > - Simplified system threads tracking implementation.
> > > >
> > > >  drivers/tee/optee/call.c          | 72 +++++++++++++++++++++++++++++--
> > > >  drivers/tee/optee/ffa_abi.c       |  3 +-
> > > >  drivers/tee/optee/optee_private.h | 16 +++++++
> > > >  drivers/tee/optee/smc_abi.c       | 16 ++++++-
> > > >  4 files changed, 99 insertions(+), 8 deletions(-)
> > > >
> > > > diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c
> > > > index 42e478ac6ce1..09e824e4dcaf 100644
> > > > --- a/drivers/tee/optee/call.c
> > > > +++ b/drivers/tee/optee/call.c
> > > > @@ -39,9 +39,27 @@ struct optee_shm_arg_entry {
> > > >         DECLARE_BITMAP(map, MAX_ARG_COUNT_PER_ENTRY);
> > > >  };
> > > >
> > > > +void optee_cq_init(struct optee_call_queue *cq, int thread_count)
> > > > +{
> > > > +       mutex_init(&cq->mutex);
> > > > +       INIT_LIST_HEAD(&cq->waiters);
> > > > +       /*
> > > > +        * If cq->total_thread_count is 0 then we're not trying to keep
> > > > +        * track of how many free threads we have, instead we're relying on
> > > > +        * the secure world to tell us when we're out of thread and have to
> > > > +        * wait for another thread to become available.
> > > > +        */
> > > > +       cq->total_thread_count = thread_count;
> > > > +       cq->free_thread_count = thread_count;
> > > > +}
> > > > +
> > > >  void optee_cq_wait_init(struct optee_call_queue *cq,
> > > >                         struct optee_call_waiter *w, bool sys_thread)
> > > >  {
> > > > +       bool need_wait = false;
> > > > +
> > > > +       memset(w, 0, sizeof(*w));
> > > > +
> > > >         /*
> > > >          * We're preparing to make a call to secure world. In case we can't
> > > >          * allocate a thread in secure world we'll end up waiting in
> > > > @@ -53,15 +71,43 @@ void optee_cq_wait_init(struct optee_call_queue *cq,
> > > >         mutex_lock(&cq->mutex);
> > > >
> > > >         /*
> > > > -        * We add ourselves to the queue, but we don't wait. This
> > > > -        * guarantees that we don't lose a completion if secure world
> > > > -        * returns busy and another thread just exited and try to complete
> > > > -        * someone.
> > > > +        * We add ourselves to a queue, but we don't wait. This guarantees
> > > > +        * that we don't lose a completion if secure world returns busy and
> > > > +        * another thread just exited and try to complete someone.
> > > >          */
> > > >         init_completion(&w->c);
> > > >         list_add_tail(&w->list_node, &cq->waiters);
> > > >
> > > > +       if (cq->total_thread_count && sys_thread) {
> > > > +               if (cq->free_thread_count > 0)
> > > > +                       cq->free_thread_count--;
> > > > +               else
> > > > +                       need_wait = true;
> > > > +       } else if (cq->total_thread_count) {
> > > > +               if (cq->free_thread_count > 1)
> > >
> > > This unconditionally reserves a TEE thread for TEE system calls, even
> > > if no client has claimed such system thread reservation.
> >
> > See my response above.
> >
> > >
> > >
> > > > +                       cq->free_thread_count--;
> > > > +               else
> > > > +                       need_wait = true;
> > > > +       }
> > > > +
> > > >         mutex_unlock(&cq->mutex);
> > > > +
> > > > +       while (need_wait) {
> > > > +               optee_cq_wait_for_completion(cq, w);
> > > > +               mutex_lock(&cq->mutex);
> > > > +               if (sys_thread) {
> > > > +                       if (cq->free_thread_count > 0) {
> > > > +                               cq->free_thread_count--;
> > > > +                               need_wait = false;
> > > > +                       }
> > > > +               } else {
> > > > +                       if (cq->free_thread_count > 1) {
> > > > +                               cq->free_thread_count--;
> > > > +                               need_wait = false;
> > > > +                       }
> > > > +               }
> > > > +               mutex_unlock(&cq->mutex);
> > > > +       }
> > > >  }
> > > >
> > > >  void optee_cq_wait_for_completion(struct optee_call_queue *cq,
> > > > @@ -104,6 +150,8 @@ void optee_cq_wait_final(struct optee_call_queue *cq,
> > > >         /* Get out of the list */
> > > >         list_del(&w->list_node);
> > > >
> > > > +       cq->free_thread_count++;
> > > > +
> > > >         /* Wake up one eventual waiting task */
> > > >         optee_cq_complete_one(cq);
> > > >
> > > > @@ -361,6 +409,22 @@ int optee_open_session(struct tee_context *ctx,
> > > >         return rc;
> > > >  }
> > > >
> > > > +int optee_system_session(struct tee_context *ctx, u32 session)
> > > > +{
> > > > +       struct optee_context_data *ctxdata = ctx->data;
> > > > +       struct optee_session *sess;
> > > > +
> > > > +       mutex_lock(&ctxdata->mutex);
> > > > +
> > > > +       sess = find_session(ctxdata, session);
> > > > +       if (sess && !sess->use_sys_thread)
> > > > +               sess->use_sys_thread = true;
> > > > +
> > > > +       mutex_unlock(&ctxdata->mutex);
> > > > +
> > > > +       return 0;
> > >
> > > Nitpicking: should rather return 0 only upon session is valid
> > > (sess!=NULL here)  and system thread reservation is supported
> > > (total_thread_count > 0).
> > > But that's not a big deal I guess, can be addressed.
> >
> > Thanks for pointing it out. If this approach works for you then it can
> > be addressed while integrating in your patch-set.
> >
> > -Sumit
> >
> > > (snip)
>
> ST Restricted
  
Etienne CARRIERE May 25, 2023, 7:35 p.m. UTC | #5
> De: Jens Wiklander <jens.wiklander@linaro.org>
> Envoyé : jeudi 25 mai 2023 17:20
>
> Hi,
>
> On Thu, May 25, 2023 at 1:48 PM Etienne CARRIERE
> <etienne.carriere@st.com> wrote:>
> >
> > >
> > > De : Sumit Garg <sumit.garg@linaro.org>
> > > Envoyé : mercredi 24 mai 2023 09:31
> > > > On Tue, 23 May 2023 at 12:41, Etienne Carriere
> > > <etienne.carriere@linaro.org> wrote:
> > > > Hello Sumit,
> > > >
> > > >
> > > > On Wed, 17 May 2023 at 16:33, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > > >
> > > > > From: Etienne Carriere <etienne.carriere@linaro.org>
> > > > >
> > > > > Adds support in the OP-TEE driver to keep track of reserved system
> > > > > threads. The optee_cq_*() functions are updated to handle this if
> > > > > enabled. The SMC ABI part of the driver enables this tracking, but the
> > > > > FF-A ABI part does not.
> > > > >
> > > > > The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
> > > > > sessions. For sake of simplicity, initialization of call queue
> > > > > management is factorized into new helper function optee_cq_init().
> > > > >
> > > > > Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > ---
> > > > >
> > > > > Disclaimer: Compile tested only
> > > > >
> > > > > Hi Etienne,
> > > > >
> > > > > Overall the idea we agreed upon was okay but the implementation looked
> > > > > complex to me. So I thought it would be harder to explain that via
> > > > > review and I decided myself to give a try at simplification. I would
> > > > > like you to test it if this still addresses the SCMI deadlock problem or
> > > > > not. Also, feel free to include this in your patchset if all goes fine
> > > > > wrt testing.
> > > >
> > > > With these changes, there is no more a specific waiting list for TEE
> > > > system threads hence when a waiting queue can complete, we'll pick any
> > > > TEE thread, not a TEE system thread first..
> > >
> > > I had thought about this but I can't see any value in having a
> > > separate wait queue for system threads. Here we only need to provide
> > > an extra privileged thread for system sessions (kernel clients) such
> > > that user-space doesn't contend for that thread. This prevents kernel
> > > client's starvation or deadlock like in the SCMI case.
> > >
> > > > Also, as stated in a below answer, these change unconditionally
> > > > reserve a TEE thread for TEE system calls even if no TEE client
> > > > reserved such.
> > >
> > > I don't think we should make thread reservations based on the presence
> > > of TEE clients. You never know how much user-space or kernel TEE
> > > clients you are dealing with. And reserving a single privileged thread
> > > unconditionally for system sessions shouldn't be much of a burden for
> > > memory constrained devices too.
> > >
> > > Also, this way we would enable every kernel TEE client to leverage
> > > system sessions as it's very likely they wouldn't like to compete with
> > > user-space for thread availability. Two other kernel TEE clients that
> > > are on top of my head are HWRNG and Trusted Keys which can benefit
> > > from this feature.
> >
> > Trusted Keys is an interesting use case. When OP-TEE accesses Trusted Keys,
> > it may need to access the eMMC/RPMB using the Linux OS tee-supplicant
> >  whichj may repuire an eMMC clock or voltage regulator to be enabled.
> > If that clock or regulator is under an SCMI control, then we need 2
> > reserved TEE thread: one for invoking the Trusted Key TA and
> > another for the SCMI request to reach the TEE will the Trusted Key
> > TA invocation still consumes a thread.
> >
> Why would the Trusted Keys session need a system thread? To me, it
> seems that the session could use the normal client priority.

I agree. I don't Trusted Key services needs a provisioned TEE system thread.

Etienne

>
> Thanks,
> Jens
>
> > (snip)

ST Restricted
  
Sumit Garg May 29, 2023, 7:11 a.m. UTC | #6
On Fri, 26 May 2023 at 01:05, Etienne CARRIERE <etienne.carriere@st.com> wrote:
>
>
> > De: Jens Wiklander <jens.wiklander@linaro.org>
> > Envoyé : jeudi 25 mai 2023 17:20
> >
> > Hi,
> >
> > On Thu, May 25, 2023 at 1:48 PM Etienne CARRIERE
> > <etienne.carriere@st.com> wrote:>
> > >
> > > >
> > > > De : Sumit Garg <sumit.garg@linaro.org>
> > > > Envoyé : mercredi 24 mai 2023 09:31
> > > > > On Tue, 23 May 2023 at 12:41, Etienne Carriere
> > > > <etienne.carriere@linaro.org> wrote:
> > > > > Hello Sumit,
> > > > >
> > > > >
> > > > > On Wed, 17 May 2023 at 16:33, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > > > >
> > > > > > From: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > >
> > > > > > Adds support in the OP-TEE driver to keep track of reserved system
> > > > > > threads. The optee_cq_*() functions are updated to handle this if
> > > > > > enabled. The SMC ABI part of the driver enables this tracking, but the
> > > > > > FF-A ABI part does not.
> > > > > >
> > > > > > The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
> > > > > > sessions. For sake of simplicity, initialization of call queue
> > > > > > management is factorized into new helper function optee_cq_init().
> > > > > >
> > > > > > Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > > Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > ---
> > > > > >
> > > > > > Disclaimer: Compile tested only
> > > > > >
> > > > > > Hi Etienne,
> > > > > >
> > > > > > Overall the idea we agreed upon was okay but the implementation looked
> > > > > > complex to me. So I thought it would be harder to explain that via
> > > > > > review and I decided myself to give a try at simplification. I would
> > > > > > like you to test it if this still addresses the SCMI deadlock problem or
> > > > > > not. Also, feel free to include this in your patchset if all goes fine
> > > > > > wrt testing.
> > > > >
> > > > > With these changes, there is no more a specific waiting list for TEE
> > > > > system threads hence when a waiting queue can complete, we'll pick any
> > > > > TEE thread, not a TEE system thread first..
> > > >
> > > > I had thought about this but I can't see any value in having a
> > > > separate wait queue for system threads. Here we only need to provide
> > > > an extra privileged thread for system sessions (kernel clients) such
> > > > that user-space doesn't contend for that thread. This prevents kernel
> > > > client's starvation or deadlock like in the SCMI case.
> > > >
> > > > > Also, as stated in a below answer, these change unconditionally
> > > > > reserve a TEE thread for TEE system calls even if no TEE client
> > > > > reserved such.
> > > >
> > > > I don't think we should make thread reservations based on the presence
> > > > of TEE clients. You never know how much user-space or kernel TEE
> > > > clients you are dealing with. And reserving a single privileged thread
> > > > unconditionally for system sessions shouldn't be much of a burden for
> > > > memory constrained devices too.
> > > >
> > > > Also, this way we would enable every kernel TEE client to leverage
> > > > system sessions as it's very likely they wouldn't like to compete with
> > > > user-space for thread availability. Two other kernel TEE clients that
> > > > are on top of my head are HWRNG and Trusted Keys which can benefit
> > > > from this feature.
> > >
> > > Trusted Keys is an interesting use case. When OP-TEE accesses Trusted Keys,
> > > it may need to access the eMMC/RPMB using the Linux OS tee-supplicant
> > >  whichj may repuire an eMMC clock or voltage regulator to be enabled.
> > > If that clock or regulator is under an SCMI control, then we need 2
> > > reserved TEE thread: one for invoking the Trusted Key TA and
> > > another for the SCMI request to reach the TEE will the Trusted Key
> > > TA invocation still consumes a thread.

Trusked keys TA doesn't need access to secure storage (eMMC/RPMB). It
only requires a RNG and access to a key derived from HUK.

> > >
> > Why would the Trusted Keys session need a system thread? To me, it
> > seems that the session could use the normal client priority.

The system thread priority as per my patch is nothing but an extra
thread available in the thread pool for kernel clients as compared to
user-space clients.

Trusted keys use-case was really motivated by: "every kernel TEE
client would like to avoid competing with user-space for thread
availability". However, HWRNG has a real case that user-space
shouldn't starve kernel RNG thread for OP-TEE thread availability.

System thread can be useful for trusted keys in case the disk
encryption key is backed by a trusted key.

-Sumit
  
Jens Wiklander May 29, 2023, 9:40 a.m. UTC | #7
On Mon, May 29, 2023 at 9:11 AM Sumit Garg <sumit.garg@linaro.org> wrote:
>
> On Fri, 26 May 2023 at 01:05, Etienne CARRIERE <etienne.carriere@st.com> wrote:
> >
> >
> > > De: Jens Wiklander <jens.wiklander@linaro.org>
> > > Envoyé : jeudi 25 mai 2023 17:20
> > >
> > > Hi,
> > >
> > > On Thu, May 25, 2023 at 1:48 PM Etienne CARRIERE
> > > <etienne.carriere@st.com> wrote:>
> > > >
> > > > >
> > > > > De : Sumit Garg <sumit.garg@linaro.org>
> > > > > Envoyé : mercredi 24 mai 2023 09:31
> > > > > > On Tue, 23 May 2023 at 12:41, Etienne Carriere
> > > > > <etienne.carriere@linaro.org> wrote:
> > > > > > Hello Sumit,
> > > > > >
> > > > > >
> > > > > > On Wed, 17 May 2023 at 16:33, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > > > > >
> > > > > > > From: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > > >
> > > > > > > Adds support in the OP-TEE driver to keep track of reserved system
> > > > > > > threads. The optee_cq_*() functions are updated to handle this if
> > > > > > > enabled. The SMC ABI part of the driver enables this tracking, but the
> > > > > > > FF-A ABI part does not.
> > > > > > >
> > > > > > > The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
> > > > > > > sessions. For sake of simplicity, initialization of call queue
> > > > > > > management is factorized into new helper function optee_cq_init().
> > > > > > >
> > > > > > > Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > > Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > > Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > > > Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > > ---
> > > > > > >
> > > > > > > Disclaimer: Compile tested only
> > > > > > >
> > > > > > > Hi Etienne,
> > > > > > >
> > > > > > > Overall the idea we agreed upon was okay but the implementation looked
> > > > > > > complex to me. So I thought it would be harder to explain that via
> > > > > > > review and I decided myself to give a try at simplification. I would
> > > > > > > like you to test it if this still addresses the SCMI deadlock problem or
> > > > > > > not. Also, feel free to include this in your patchset if all goes fine
> > > > > > > wrt testing.
> > > > > >
> > > > > > With these changes, there is no more a specific waiting list for TEE
> > > > > > system threads hence when a waiting queue can complete, we'll pick any
> > > > > > TEE thread, not a TEE system thread first..
> > > > >
> > > > > I had thought about this but I can't see any value in having a
> > > > > separate wait queue for system threads. Here we only need to provide
> > > > > an extra privileged thread for system sessions (kernel clients) such
> > > > > that user-space doesn't contend for that thread. This prevents kernel
> > > > > client's starvation or deadlock like in the SCMI case.
> > > > >
> > > > > > Also, as stated in a below answer, these change unconditionally
> > > > > > reserve a TEE thread for TEE system calls even if no TEE client
> > > > > > reserved such.
> > > > >
> > > > > I don't think we should make thread reservations based on the presence
> > > > > of TEE clients. You never know how much user-space or kernel TEE
> > > > > clients you are dealing with. And reserving a single privileged thread
> > > > > unconditionally for system sessions shouldn't be much of a burden for
> > > > > memory constrained devices too.
> > > > >
> > > > > Also, this way we would enable every kernel TEE client to leverage
> > > > > system sessions as it's very likely they wouldn't like to compete with
> > > > > user-space for thread availability. Two other kernel TEE clients that
> > > > > are on top of my head are HWRNG and Trusted Keys which can benefit
> > > > > from this feature.
> > > >
> > > > Trusted Keys is an interesting use case. When OP-TEE accesses Trusted Keys,
> > > > it may need to access the eMMC/RPMB using the Linux OS tee-supplicant
> > > >  whichj may repuire an eMMC clock or voltage regulator to be enabled.
> > > > If that clock or regulator is under an SCMI control, then we need 2
> > > > reserved TEE thread: one for invoking the Trusted Key TA and
> > > > another for the SCMI request to reach the TEE will the Trusted Key
> > > > TA invocation still consumes a thread.
>
> Trusked keys TA doesn't need access to secure storage (eMMC/RPMB). It
> only requires a RNG and access to a key derived from HUK.

Because it's always compiled as an early TA so no rollback protection is used?

>
> > > >
> > > Why would the Trusted Keys session need a system thread? To me, it
> > > seems that the session could use the normal client priority.
>
> The system thread priority as per my patch is nothing but an extra
> thread available in the thread pool for kernel clients as compared to
> user-space clients.
>
> Trusted keys use-case was really motivated by: "every kernel TEE
> client would like to avoid competing with user-space for thread
> availability". However, HWRNG has a real case that user-space
> shouldn't starve kernel RNG thread for OP-TEE thread availability.
>
> System thread can be useful for trusted keys in case the disk
> encryption key is backed by a trusted key.

With well-behaving TAs every TEE client will get its thread in due time.

The system thread feature was originally intended as a way of avoiding
a deadlock. So far we have otherwise assigned threads on a first-come
first-served basis. If we now also need a way of giving priority to
kernel clients for less critical reasons we may need to take a step
back and redesign because reserving a thread for each kernel client
doesn't scale.

Thanks,
Jens
  
Sumit Garg May 29, 2023, 10:17 a.m. UTC | #8
On Mon, 29 May 2023 at 15:10, Jens Wiklander <jens.wiklander@linaro.org> wrote:
>
> On Mon, May 29, 2023 at 9:11 AM Sumit Garg <sumit.garg@linaro.org> wrote:
> >
> > On Fri, 26 May 2023 at 01:05, Etienne CARRIERE <etienne.carriere@st.com> wrote:
> > >
> > >
> > > > De: Jens Wiklander <jens.wiklander@linaro.org>
> > > > Envoyé : jeudi 25 mai 2023 17:20
> > > >
> > > > Hi,
> > > >
> > > > On Thu, May 25, 2023 at 1:48 PM Etienne CARRIERE
> > > > <etienne.carriere@st.com> wrote:>
> > > > >
> > > > > >
> > > > > > De : Sumit Garg <sumit.garg@linaro.org>
> > > > > > Envoyé : mercredi 24 mai 2023 09:31
> > > > > > > On Tue, 23 May 2023 at 12:41, Etienne Carriere
> > > > > > <etienne.carriere@linaro.org> wrote:
> > > > > > > Hello Sumit,
> > > > > > >
> > > > > > >
> > > > > > > On Wed, 17 May 2023 at 16:33, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > > > > > >
> > > > > > > > From: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > > > >
> > > > > > > > Adds support in the OP-TEE driver to keep track of reserved system
> > > > > > > > threads. The optee_cq_*() functions are updated to handle this if
> > > > > > > > enabled. The SMC ABI part of the driver enables this tracking, but the
> > > > > > > > FF-A ABI part does not.
> > > > > > > >
> > > > > > > > The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
> > > > > > > > sessions. For sake of simplicity, initialization of call queue
> > > > > > > > management is factorized into new helper function optee_cq_init().
> > > > > > > >
> > > > > > > > Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > > > Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > > > Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > > > > Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > > > ---
> > > > > > > >
> > > > > > > > Disclaimer: Compile tested only
> > > > > > > >
> > > > > > > > Hi Etienne,
> > > > > > > >
> > > > > > > > Overall the idea we agreed upon was okay but the implementation looked
> > > > > > > > complex to me. So I thought it would be harder to explain that via
> > > > > > > > review and I decided myself to give a try at simplification. I would
> > > > > > > > like you to test it if this still addresses the SCMI deadlock problem or
> > > > > > > > not. Also, feel free to include this in your patchset if all goes fine
> > > > > > > > wrt testing.
> > > > > > >
> > > > > > > With these changes, there is no more a specific waiting list for TEE
> > > > > > > system threads hence when a waiting queue can complete, we'll pick any
> > > > > > > TEE thread, not a TEE system thread first..
> > > > > >
> > > > > > I had thought about this but I can't see any value in having a
> > > > > > separate wait queue for system threads. Here we only need to provide
> > > > > > an extra privileged thread for system sessions (kernel clients) such
> > > > > > that user-space doesn't contend for that thread. This prevents kernel
> > > > > > client's starvation or deadlock like in the SCMI case.
> > > > > >
> > > > > > > Also, as stated in a below answer, these change unconditionally
> > > > > > > reserve a TEE thread for TEE system calls even if no TEE client
> > > > > > > reserved such.
> > > > > >
> > > > > > I don't think we should make thread reservations based on the presence
> > > > > > of TEE clients. You never know how much user-space or kernel TEE
> > > > > > clients you are dealing with. And reserving a single privileged thread
> > > > > > unconditionally for system sessions shouldn't be much of a burden for
> > > > > > memory constrained devices too.
> > > > > >
> > > > > > Also, this way we would enable every kernel TEE client to leverage
> > > > > > system sessions as it's very likely they wouldn't like to compete with
> > > > > > user-space for thread availability. Two other kernel TEE clients that
> > > > > > are on top of my head are HWRNG and Trusted Keys which can benefit
> > > > > > from this feature.
> > > > >
> > > > > Trusted Keys is an interesting use case. When OP-TEE accesses Trusted Keys,
> > > > > it may need to access the eMMC/RPMB using the Linux OS tee-supplicant
> > > > >  whichj may repuire an eMMC clock or voltage regulator to be enabled.
> > > > > If that clock or regulator is under an SCMI control, then we need 2
> > > > > reserved TEE thread: one for invoking the Trusted Key TA and
> > > > > another for the SCMI request to reach the TEE will the Trusted Key
> > > > > TA invocation still consumes a thread.
> >
> > Trusked keys TA doesn't need access to secure storage (eMMC/RPMB). It
> > only requires a RNG and access to a key derived from HUK.
>
> Because it's always compiled as an early TA so no rollback protection is used?
>

Yeah it has to be compiled as an early TA to support Trusted Keys
use-cases. BTW, we don't have enumeration support for REE-FS TAs at
the moment.

> >
> > > > >
> > > > Why would the Trusted Keys session need a system thread? To me, it
> > > > seems that the session could use the normal client priority.
> >
> > The system thread priority as per my patch is nothing but an extra
> > thread available in the thread pool for kernel clients as compared to
> > user-space clients.
> >
> > Trusted keys use-case was really motivated by: "every kernel TEE
> > client would like to avoid competing with user-space for thread
> > availability". However, HWRNG has a real case that user-space
> > shouldn't starve kernel RNG thread for OP-TEE thread availability.
> >
> > System thread can be useful for trusted keys in case the disk
> > encryption key is backed by a trusted key.
>
> With well-behaving TAs every TEE client will get its thread in due time.

We should try to keep distinction among user-space clients vs kernel
clients rather than keeping them in the same bucket. The kernel
clients are more privileged than user-space ones. This is similar
hardening as we have done with respect to session login method (REE
kernel login).

>
> The system thread feature was originally intended as a way of avoiding
> a deadlock.

That's true but doing so can also benefit other (mutual independent)
kernel clients as well.

> So far we have otherwise assigned threads on a first-come
> first-served basis. If we now also need a way of giving priority to
> kernel clients for less critical reasons we may need to take a step
> back and redesign because reserving a thread for each kernel client
> doesn't scale.

No, please have a relook at this patch. We have *only* reserved a
single thread for all the allowed (sess->use_sys_thread = true) kernel
clients to compete for. And user-space has access to all the other
threads on a first-come first-served basis except the one thread
reserved for kernel clients.

-Sumit

>
> Thanks,
> Jens
  
Jens Wiklander May 29, 2023, 12:23 p.m. UTC | #9
On Mon, May 29, 2023 at 12:17 PM Sumit Garg <sumit.garg@linaro.org> wrote:
>
> On Mon, 29 May 2023 at 15:10, Jens Wiklander <jens.wiklander@linaro.org> wrote:
> >
> > On Mon, May 29, 2023 at 9:11 AM Sumit Garg <sumit.garg@linaro.org> wrote:
> > >
> > > On Fri, 26 May 2023 at 01:05, Etienne CARRIERE <etienne.carriere@st.com> wrote:
> > > >
> > > >
> > > > > De: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > Envoyé : jeudi 25 mai 2023 17:20
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Thu, May 25, 2023 at 1:48 PM Etienne CARRIERE
> > > > > <etienne.carriere@st.com> wrote:>
> > > > > >
> > > > > > >
> > > > > > > De : Sumit Garg <sumit.garg@linaro.org>
> > > > > > > Envoyé : mercredi 24 mai 2023 09:31
> > > > > > > > On Tue, 23 May 2023 at 12:41, Etienne Carriere
> > > > > > > <etienne.carriere@linaro.org> wrote:
> > > > > > > > Hello Sumit,
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, 17 May 2023 at 16:33, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > > > > > > >
> > > > > > > > > From: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > > > > >
> > > > > > > > > Adds support in the OP-TEE driver to keep track of reserved system
> > > > > > > > > threads. The optee_cq_*() functions are updated to handle this if
> > > > > > > > > enabled. The SMC ABI part of the driver enables this tracking, but the
> > > > > > > > > FF-A ABI part does not.
> > > > > > > > >
> > > > > > > > > The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
> > > > > > > > > sessions. For sake of simplicity, initialization of call queue
> > > > > > > > > management is factorized into new helper function optee_cq_init().
> > > > > > > > >
> > > > > > > > > Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > > > > Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > > > > Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > > > > > Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > > > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > > > > ---
> > > > > > > > >
> > > > > > > > > Disclaimer: Compile tested only
> > > > > > > > >
> > > > > > > > > Hi Etienne,
> > > > > > > > >
> > > > > > > > > Overall the idea we agreed upon was okay but the implementation looked
> > > > > > > > > complex to me. So I thought it would be harder to explain that via
> > > > > > > > > review and I decided myself to give a try at simplification. I would
> > > > > > > > > like you to test it if this still addresses the SCMI deadlock problem or
> > > > > > > > > not. Also, feel free to include this in your patchset if all goes fine
> > > > > > > > > wrt testing.
> > > > > > > >
> > > > > > > > With these changes, there is no more a specific waiting list for TEE
> > > > > > > > system threads hence when a waiting queue can complete, we'll pick any
> > > > > > > > TEE thread, not a TEE system thread first..
> > > > > > >
> > > > > > > I had thought about this but I can't see any value in having a
> > > > > > > separate wait queue for system threads. Here we only need to provide
> > > > > > > an extra privileged thread for system sessions (kernel clients) such
> > > > > > > that user-space doesn't contend for that thread. This prevents kernel
> > > > > > > client's starvation or deadlock like in the SCMI case.
> > > > > > >
> > > > > > > > Also, as stated in a below answer, these change unconditionally
> > > > > > > > reserve a TEE thread for TEE system calls even if no TEE client
> > > > > > > > reserved such.
> > > > > > >
> > > > > > > I don't think we should make thread reservations based on the presence
> > > > > > > of TEE clients. You never know how much user-space or kernel TEE
> > > > > > > clients you are dealing with. And reserving a single privileged thread
> > > > > > > unconditionally for system sessions shouldn't be much of a burden for
> > > > > > > memory constrained devices too.
> > > > > > >
> > > > > > > Also, this way we would enable every kernel TEE client to leverage
> > > > > > > system sessions as it's very likely they wouldn't like to compete with
> > > > > > > user-space for thread availability. Two other kernel TEE clients that
> > > > > > > are on top of my head are HWRNG and Trusted Keys which can benefit
> > > > > > > from this feature.
> > > > > >
> > > > > > Trusted Keys is an interesting use case. When OP-TEE accesses Trusted Keys,
> > > > > > it may need to access the eMMC/RPMB using the Linux OS tee-supplicant
> > > > > >  whichj may repuire an eMMC clock or voltage regulator to be enabled.
> > > > > > If that clock or regulator is under an SCMI control, then we need 2
> > > > > > reserved TEE thread: one for invoking the Trusted Key TA and
> > > > > > another for the SCMI request to reach the TEE will the Trusted Key
> > > > > > TA invocation still consumes a thread.
> > >
> > > Trusked keys TA doesn't need access to secure storage (eMMC/RPMB). It
> > > only requires a RNG and access to a key derived from HUK.
> >
> > Because it's always compiled as an early TA so no rollback protection is used?
> >
>
> Yeah it has to be compiled as an early TA to support Trusted Keys
> use-cases. BTW, we don't have enumeration support for REE-FS TAs at
> the moment.
>
> > >
> > > > > >
> > > > > Why would the Trusted Keys session need a system thread? To me, it
> > > > > seems that the session could use the normal client priority.
> > >
> > > The system thread priority as per my patch is nothing but an extra
> > > thread available in the thread pool for kernel clients as compared to
> > > user-space clients.
> > >
> > > Trusted keys use-case was really motivated by: "every kernel TEE
> > > client would like to avoid competing with user-space for thread
> > > availability". However, HWRNG has a real case that user-space
> > > shouldn't starve kernel RNG thread for OP-TEE thread availability.
> > >
> > > System thread can be useful for trusted keys in case the disk
> > > encryption key is backed by a trusted key.
> >
> > With well-behaving TAs every TEE client will get its thread in due time.
>
> We should try to keep distinction among user-space clients vs kernel
> clients rather than keeping them in the same bucket. The kernel
> clients are more privileged than user-space ones. This is similar
> hardening as we have done with respect to session login method (REE
> kernel login).
>
> >
> > The system thread feature was originally intended as a way of avoiding
> > a deadlock.
>
> That's true but doing so can also benefit other (mutual independent)
> kernel clients as well.
>
> > So far we have otherwise assigned threads on a first-come
> > first-served basis. If we now also need a way of giving priority to
> > kernel clients for less critical reasons we may need to take a step
> > back and redesign because reserving a thread for each kernel client
> > doesn't scale.
>
> No, please have a relook at this patch. We have *only* reserved a
> single thread for all the allowed (sess->use_sys_thread = true) kernel
> clients to compete for. And user-space has access to all the other
> threads on a first-come first-served basis except the one thread
> reserved for kernel clients.

Thanks, got it now. User space clients may or may not be able to
starve kernel clients enough to matter. However, if all kernel clients
only will be guaranteed one thread then we're still stuck with the
deadlock problem for SCMI. We can't guarantee that none of the kernel
clients will indirectly access eMMC/RPMB.

Cheers,
Jens

>
> -Sumit
>
> >
> > Thanks,
> > Jens
  
Sumit Garg May 29, 2023, 1:52 p.m. UTC | #10
On Mon, 29 May 2023 at 17:53, Jens Wiklander <jens.wiklander@linaro.org> wrote:
>
> On Mon, May 29, 2023 at 12:17 PM Sumit Garg <sumit.garg@linaro.org> wrote:
> >
> > On Mon, 29 May 2023 at 15:10, Jens Wiklander <jens.wiklander@linaro.org> wrote:
> > >
> > > On Mon, May 29, 2023 at 9:11 AM Sumit Garg <sumit.garg@linaro.org> wrote:
> > > >
> > > > On Fri, 26 May 2023 at 01:05, Etienne CARRIERE <etienne.carriere@st.com> wrote:
> > > > >
> > > > >
> > > > > > De: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > Envoyé : jeudi 25 mai 2023 17:20
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Thu, May 25, 2023 at 1:48 PM Etienne CARRIERE
> > > > > > <etienne.carriere@st.com> wrote:>
> > > > > > >
> > > > > > > >
> > > > > > > > De : Sumit Garg <sumit.garg@linaro.org>
> > > > > > > > Envoyé : mercredi 24 mai 2023 09:31
> > > > > > > > > On Tue, 23 May 2023 at 12:41, Etienne Carriere
> > > > > > > > <etienne.carriere@linaro.org> wrote:
> > > > > > > > > Hello Sumit,
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, 17 May 2023 at 16:33, Sumit Garg <sumit.garg@linaro.org> wrote:
> > > > > > > > > >
> > > > > > > > > > From: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > > > > > >
> > > > > > > > > > Adds support in the OP-TEE driver to keep track of reserved system
> > > > > > > > > > threads. The optee_cq_*() functions are updated to handle this if
> > > > > > > > > > enabled. The SMC ABI part of the driver enables this tracking, but the
> > > > > > > > > > FF-A ABI part does not.
> > > > > > > > > >
> > > > > > > > > > The logic allows atleast 1 OP-TEE thread can be reserved to TEE system
> > > > > > > > > > sessions. For sake of simplicity, initialization of call queue
> > > > > > > > > > management is factorized into new helper function optee_cq_init().
> > > > > > > > > >
> > > > > > > > > > Co-developed-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > > > > > Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
> > > > > > > > > > Signed-off-by: Etienne Carriere <etienne.carriere@linaro.org>
> > > > > > > > > > Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > > > > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > > > > > > > > ---
> > > > > > > > > >
> > > > > > > > > > Disclaimer: Compile tested only
> > > > > > > > > >
> > > > > > > > > > Hi Etienne,
> > > > > > > > > >
> > > > > > > > > > Overall the idea we agreed upon was okay but the implementation looked
> > > > > > > > > > complex to me. So I thought it would be harder to explain that via
> > > > > > > > > > review and I decided myself to give a try at simplification. I would
> > > > > > > > > > like you to test it if this still addresses the SCMI deadlock problem or
> > > > > > > > > > not. Also, feel free to include this in your patchset if all goes fine
> > > > > > > > > > wrt testing.
> > > > > > > > >
> > > > > > > > > With these changes, there is no more a specific waiting list for TEE
> > > > > > > > > system threads hence when a waiting queue can complete, we'll pick any
> > > > > > > > > TEE thread, not a TEE system thread first..
> > > > > > > >
> > > > > > > > I had thought about this but I can't see any value in having a
> > > > > > > > separate wait queue for system threads. Here we only need to provide
> > > > > > > > an extra privileged thread for system sessions (kernel clients) such
> > > > > > > > that user-space doesn't contend for that thread. This prevents kernel
> > > > > > > > client's starvation or deadlock like in the SCMI case.
> > > > > > > >
> > > > > > > > > Also, as stated in a below answer, these change unconditionally
> > > > > > > > > reserve a TEE thread for TEE system calls even if no TEE client
> > > > > > > > > reserved such.
> > > > > > > >
> > > > > > > > I don't think we should make thread reservations based on the presence
> > > > > > > > of TEE clients. You never know how much user-space or kernel TEE
> > > > > > > > clients you are dealing with. And reserving a single privileged thread
> > > > > > > > unconditionally for system sessions shouldn't be much of a burden for
> > > > > > > > memory constrained devices too.
> > > > > > > >
> > > > > > > > Also, this way we would enable every kernel TEE client to leverage
> > > > > > > > system sessions as it's very likely they wouldn't like to compete with
> > > > > > > > user-space for thread availability. Two other kernel TEE clients that
> > > > > > > > are on top of my head are HWRNG and Trusted Keys which can benefit
> > > > > > > > from this feature.
> > > > > > >
> > > > > > > Trusted Keys is an interesting use case. When OP-TEE accesses Trusted Keys,
> > > > > > > it may need to access the eMMC/RPMB using the Linux OS tee-supplicant
> > > > > > >  whichj may repuire an eMMC clock or voltage regulator to be enabled.
> > > > > > > If that clock or regulator is under an SCMI control, then we need 2
> > > > > > > reserved TEE thread: one for invoking the Trusted Key TA and
> > > > > > > another for the SCMI request to reach the TEE will the Trusted Key
> > > > > > > TA invocation still consumes a thread.
> > > >
> > > > Trusked keys TA doesn't need access to secure storage (eMMC/RPMB). It
> > > > only requires a RNG and access to a key derived from HUK.
> > >
> > > Because it's always compiled as an early TA so no rollback protection is used?
> > >
> >
> > Yeah it has to be compiled as an early TA to support Trusted Keys
> > use-cases. BTW, we don't have enumeration support for REE-FS TAs at
> > the moment.
> >
> > > >
> > > > > > >
> > > > > > Why would the Trusted Keys session need a system thread? To me, it
> > > > > > seems that the session could use the normal client priority.
> > > >
> > > > The system thread priority as per my patch is nothing but an extra
> > > > thread available in the thread pool for kernel clients as compared to
> > > > user-space clients.
> > > >
> > > > Trusted keys use-case was really motivated by: "every kernel TEE
> > > > client would like to avoid competing with user-space for thread
> > > > availability". However, HWRNG has a real case that user-space
> > > > shouldn't starve kernel RNG thread for OP-TEE thread availability.
> > > >
> > > > System thread can be useful for trusted keys in case the disk
> > > > encryption key is backed by a trusted key.
> > >
> > > With well-behaving TAs every TEE client will get its thread in due time.
> >
> > We should try to keep distinction among user-space clients vs kernel
> > clients rather than keeping them in the same bucket. The kernel
> > clients are more privileged than user-space ones. This is similar
> > hardening as we have done with respect to session login method (REE
> > kernel login).
> >
> > >
> > > The system thread feature was originally intended as a way of avoiding
> > > a deadlock.
> >
> > That's true but doing so can also benefit other (mutual independent)
> > kernel clients as well.
> >
> > > So far we have otherwise assigned threads on a first-come
> > > first-served basis. If we now also need a way of giving priority to
> > > kernel clients for less critical reasons we may need to take a step
> > > back and redesign because reserving a thread for each kernel client
> > > doesn't scale.
> >
> > No, please have a relook at this patch. We have *only* reserved a
> > single thread for all the allowed (sess->use_sys_thread = true) kernel
> > clients to compete for. And user-space has access to all the other
> > threads on a first-come first-served basis except the one thread
> > reserved for kernel clients.
>
> Thanks, got it now. User space clients may or may not be able to
> starve kernel clients enough to matter. However, if all kernel clients
> only will be guaranteed one thread then we're still stuck with the
> deadlock problem for SCMI. We can't guarantee that none of the kernel
> clients will indirectly access eMMC/RPMB.

I think we should be able to address this via disallowing system
sessions for kernel clients whose devices are probed via
PTA_CMD_GET_DEVICES_SUPP. But I can't find a sane way to enforce this
via code but can be taken care of during review as well.

-Sumit

>
> Cheers,
> Jens
>
> >
> > -Sumit
> >
> > >
> > > Thanks,
> > > Jens
  

Patch

diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c
index 42e478ac6ce1..09e824e4dcaf 100644
--- a/drivers/tee/optee/call.c
+++ b/drivers/tee/optee/call.c
@@ -39,9 +39,27 @@  struct optee_shm_arg_entry {
 	DECLARE_BITMAP(map, MAX_ARG_COUNT_PER_ENTRY);
 };
 
+void optee_cq_init(struct optee_call_queue *cq, int thread_count)
+{
+	mutex_init(&cq->mutex);
+	INIT_LIST_HEAD(&cq->waiters);
+	/*
+	 * If cq->total_thread_count is 0 then we're not trying to keep
+	 * track of how many free threads we have, instead we're relying on
+	 * the secure world to tell us when we're out of thread and have to
+	 * wait for another thread to become available.
+	 */
+	cq->total_thread_count = thread_count;
+	cq->free_thread_count = thread_count;
+}
+
 void optee_cq_wait_init(struct optee_call_queue *cq,
 			struct optee_call_waiter *w, bool sys_thread)
 {
+	bool need_wait = false;
+
+	memset(w, 0, sizeof(*w));
+
 	/*
 	 * We're preparing to make a call to secure world. In case we can't
 	 * allocate a thread in secure world we'll end up waiting in
@@ -53,15 +71,43 @@  void optee_cq_wait_init(struct optee_call_queue *cq,
 	mutex_lock(&cq->mutex);
 
 	/*
-	 * We add ourselves to the queue, but we don't wait. This
-	 * guarantees that we don't lose a completion if secure world
-	 * returns busy and another thread just exited and try to complete
-	 * someone.
+	 * We add ourselves to a queue, but we don't wait. This guarantees
+	 * that we don't lose a completion if secure world returns busy and
+	 * another thread just exited and try to complete someone.
 	 */
 	init_completion(&w->c);
 	list_add_tail(&w->list_node, &cq->waiters);
 
+	if (cq->total_thread_count && sys_thread) {
+		if (cq->free_thread_count > 0)
+			cq->free_thread_count--;
+		else
+			need_wait = true;
+	} else if (cq->total_thread_count) {
+		if (cq->free_thread_count > 1)
+			cq->free_thread_count--;
+		else
+			need_wait = true;
+	}
+
 	mutex_unlock(&cq->mutex);
+
+	while (need_wait) {
+		optee_cq_wait_for_completion(cq, w);
+		mutex_lock(&cq->mutex);
+		if (sys_thread) {
+			if (cq->free_thread_count > 0) {
+				cq->free_thread_count--;
+				need_wait = false;
+			}
+		} else {
+			if (cq->free_thread_count > 1) {
+				cq->free_thread_count--;
+				need_wait = false;
+			}
+		}
+		mutex_unlock(&cq->mutex);
+	}
 }
 
 void optee_cq_wait_for_completion(struct optee_call_queue *cq,
@@ -104,6 +150,8 @@  void optee_cq_wait_final(struct optee_call_queue *cq,
 	/* Get out of the list */
 	list_del(&w->list_node);
 
+	cq->free_thread_count++;
+
 	/* Wake up one eventual waiting task */
 	optee_cq_complete_one(cq);
 
@@ -361,6 +409,22 @@  int optee_open_session(struct tee_context *ctx,
 	return rc;
 }
 
+int optee_system_session(struct tee_context *ctx, u32 session)
+{
+	struct optee_context_data *ctxdata = ctx->data;
+	struct optee_session *sess;
+
+	mutex_lock(&ctxdata->mutex);
+
+	sess = find_session(ctxdata, session);
+	if (sess && !sess->use_sys_thread)
+		sess->use_sys_thread = true;
+
+	mutex_unlock(&ctxdata->mutex);
+
+	return 0;
+}
+
 int optee_close_session_helper(struct tee_context *ctx, u32 session,
 			       bool system_thread)
 {
diff --git a/drivers/tee/optee/ffa_abi.c b/drivers/tee/optee/ffa_abi.c
index 5fde9d4100e3..0c9055691343 100644
--- a/drivers/tee/optee/ffa_abi.c
+++ b/drivers/tee/optee/ffa_abi.c
@@ -852,8 +852,7 @@  static int optee_ffa_probe(struct ffa_device *ffa_dev)
 	if (rc)
 		goto err_unreg_supp_teedev;
 	mutex_init(&optee->ffa.mutex);
-	mutex_init(&optee->call_queue.mutex);
-	INIT_LIST_HEAD(&optee->call_queue.waiters);
+	optee_cq_init(&optee->call_queue, 0);
 	optee_supp_init(&optee->supp);
 	optee_shm_arg_cache_init(optee, arg_cache_flags);
 	ffa_dev_set_drvdata(ffa_dev, optee);
diff --git a/drivers/tee/optee/optee_private.h b/drivers/tee/optee/optee_private.h
index b68273051454..6dcecb83c893 100644
--- a/drivers/tee/optee/optee_private.h
+++ b/drivers/tee/optee/optee_private.h
@@ -40,15 +40,29 @@  typedef void (optee_invoke_fn)(unsigned long, unsigned long, unsigned long,
 				unsigned long, unsigned long,
 				struct arm_smccc_res *);
 
+/*
+ * struct optee_call_waiter - TEE entry may need to wait for a free TEE thread
+ * @list_node		Reference in waiters list
+ * @c			Waiting completion reference
+ */
 struct optee_call_waiter {
 	struct list_head list_node;
 	struct completion c;
 };
 
+/*
+ * struct optee_call_queue - OP-TEE call queue management
+ * @mutex			Serializes access to this struct
+ * @waiters			List of threads waiting to enter OP-TEE
+ * @total_thread_count		Overall number of thread context in OP-TEE or 0
+ * @free_thread_count		Number of threads context free in OP-TEE
+ */
 struct optee_call_queue {
 	/* Serializes access to this struct */
 	struct mutex mutex;
 	struct list_head waiters;
+	int total_thread_count;
+	int free_thread_count;
 };
 
 struct optee_notif {
@@ -254,6 +268,7 @@  int optee_supp_send(struct tee_context *ctx, u32 ret, u32 num_params,
 int optee_open_session(struct tee_context *ctx,
 		       struct tee_ioctl_open_session_arg *arg,
 		       struct tee_param *param);
+int optee_system_session(struct tee_context *ctx, u32 session);
 int optee_close_session_helper(struct tee_context *ctx, u32 session,
 			       bool system_thread);
 int optee_close_session(struct tee_context *ctx, u32 session);
@@ -303,6 +318,7 @@  static inline void optee_to_msg_param_value(struct optee_msg_param *mp,
 	mp->u.value.c = p->u.value.c;
 }
 
+void optee_cq_init(struct optee_call_queue *cq, int thread_count);
 void optee_cq_wait_init(struct optee_call_queue *cq,
 			struct optee_call_waiter *w, bool sys_thread);
 void optee_cq_wait_for_completion(struct optee_call_queue *cq,
diff --git a/drivers/tee/optee/smc_abi.c b/drivers/tee/optee/smc_abi.c
index e2763cdcf111..3314ffeb91c8 100644
--- a/drivers/tee/optee/smc_abi.c
+++ b/drivers/tee/optee/smc_abi.c
@@ -1209,6 +1209,7 @@  static const struct tee_driver_ops optee_clnt_ops = {
 	.release = optee_release,
 	.open_session = optee_open_session,
 	.close_session = optee_close_session,
+	.system_session = optee_system_session,
 	.invoke_func = optee_invoke_func,
 	.cancel_req = optee_cancel_req,
 	.shm_register = optee_shm_register,
@@ -1356,6 +1357,16 @@  static bool optee_msg_exchange_capabilities(optee_invoke_fn *invoke_fn,
 	return true;
 }
 
+static unsigned int optee_msg_get_thread_count(optee_invoke_fn *invoke_fn)
+{
+	struct arm_smccc_res res;
+
+	invoke_fn(OPTEE_SMC_GET_THREAD_COUNT, 0, 0, 0, 0, 0, 0, 0, &res);
+	if (res.a0)
+		return 0;
+	return res.a1;
+}
+
 static struct tee_shm_pool *
 optee_config_shm_memremap(optee_invoke_fn *invoke_fn, void **memremaped_shm)
 {
@@ -1609,6 +1620,7 @@  static int optee_probe(struct platform_device *pdev)
 	struct optee *optee = NULL;
 	void *memremaped_shm = NULL;
 	unsigned int rpc_param_count;
+	unsigned int thread_count;
 	struct tee_device *teedev;
 	struct tee_context *ctx;
 	u32 max_notif_value;
@@ -1636,6 +1648,7 @@  static int optee_probe(struct platform_device *pdev)
 		return -EINVAL;
 	}
 
+	thread_count = optee_msg_get_thread_count(invoke_fn);
 	if (!optee_msg_exchange_capabilities(invoke_fn, &sec_caps,
 					     &max_notif_value,
 					     &rpc_param_count)) {
@@ -1725,8 +1738,7 @@  static int optee_probe(struct platform_device *pdev)
 	if (rc)
 		goto err_unreg_supp_teedev;
 
-	mutex_init(&optee->call_queue.mutex);
-	INIT_LIST_HEAD(&optee->call_queue.waiters);
+	optee_cq_init(&optee->call_queue, thread_count);
 	optee_supp_init(&optee->supp);
 	optee->smc.memremaped_shm = memremaped_shm;
 	optee->pool = pool;