[v6,02/21] dt-bindings: Add binding for gunyah hypervisor

Message ID 20221026185846.3983888-3-quic_eberman@quicinc.com
State New
Series Drivers for gunyah hypervisor

Commit Message

Elliot Berman Oct. 26, 2022, 6:58 p.m. UTC
  When Linux is booted as a guest under the Gunyah hypervisor, the Gunyah
Resource Manager applies a devicetree overlay describing the virtual
platform configuration of the guest VM, such as the message queue
capability IDs for communicating with the Resource Manager. This
information is not otherwise discoverable by a VM: the Gunyah hypervisor
core does not provide a direct interface to discover capability IDs, nor
a way to communicate with RM without already knowing the
corresponding message queue capability ID. Add the DT bindings that
Gunyah adheres to for the hypervisor node and message queues.

Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
---
 .../bindings/firmware/gunyah-hypervisor.yaml  | 86 +++++++++++++++++++
 MAINTAINERS                                   |  1 +
 2 files changed, 87 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
  

Comments

Krzysztof Kozlowski Oct. 27, 2022, 7:57 p.m. UTC | #1
On 26/10/2022 14:58, Elliot Berman wrote:
> When Linux is booted as a guest under the Gunyah hypervisor, the Gunyah
> Resource Manager applies a devicetree overlay describing the virtual
> platform configuration of the guest VM, such as the message queue
> capability IDs for communicating with the Resource Manager. This
> information is not otherwise discoverable by a VM: the Gunyah hypervisor
> core does not provide a direct interface to discover capability IDs nor
> a way to communicate with RM without having already known the
> corresponding message queue capability ID. Add the DT bindings that
> Gunyah adheres for the hypervisor node and message queues.
> 
> Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
> ---
>  .../bindings/firmware/gunyah-hypervisor.yaml  | 86 +++++++++++++++++++
>  MAINTAINERS                                   |  1 +
>  2 files changed, 87 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
> 
> diff --git a/Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml b/Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
> new file mode 100644
> index 000000000000..3a8c1c2157a4
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
> @@ -0,0 +1,86 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/firmware/gunyah-hypervisor.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Gunyah Hypervisor
> +
> +maintainers:
> +  - Murali Nalajala <quic_mnalajal@quicinc.com>
> +  - Elliot Berman <quic_eberman@quicinc.com>
> +
> +description: |+
> +  Gunyah virtual machines use this information to determine the capability IDs
> +  of the message queues used to communicate with the Gunyah Resource Manager.
> +  See also: https://github.com/quic/gunyah-resource-manager/blob/develop/src/vm_creation/dto_construct.c
> +
> +properties:
> +  compatible:
> +    items:
> +      - const: gunyah-hypervisor-1.0
> +      - const: gunyah-hypervisor

You are sending the next version while we are still discussing the old
one... and without the necessary changes. Instead, keep discussing the
previous one until we reach consensus.

These compatibles look wrong based on our discussion.

Best regards,
Krzysztof
  
Jassi Brar Oct. 28, 2022, 2:33 a.m. UTC | #2
On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
.....
> +
> +        gunyah-resource-mgr@0 {
> +            compatible = "gunyah-resource-manager-1-0", "gunyah-resource-manager";
> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX full IRQ */
> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX empty IRQ */
> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
> +                  /* TX, RX cap ids */
> +        };
>
All these resources are used only by the mailbox controller driver.
So, this should be the mailbox controller node, rather than the
mailbox user.
One option is to load gunyah-resource-manager as a module that relies
on the gunyah-mailbox provider. That would also avoid the "Allow
direct registration to a channel" hack patch.

thanks.
  
Elliot Berman Nov. 1, 2022, 3:19 a.m. UTC | #3
Hi Jassi,

On 10/27/2022 7:33 PM, Jassi Brar wrote:
 > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
 > .....
 >> +
 >> +        gunyah-resource-mgr@0 {
 >> +            compatible = "gunyah-resource-manager-1-0", "gunyah-resource-manager";
 >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX full IRQ */
 >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX empty IRQ */
 >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
 >> +                  /* TX, RX cap ids */
 >> +        };
 >>
 > All these resources are used only by the mailbox controller driver.
 > So, this should be the mailbox controller node, rather than the
 > mailbox user.
 > One option is to load gunyah-resource-manager as a module that relies
 > on the gunyah-mailbox provider. That would also avoid the "Allow
 > direct registration to a channel" hack patch.

A message queue to another guest VM wouldn't be known at boot time and 
thus couldn't be described in the devicetree. We will need the "Allow 
direct registration to a channel" patch anyway to support those message 
queues. I would like to have one consistent mechanism to set up message 
queues.

- Elliot
  
Jassi Brar Nov. 1, 2022, 4:23 p.m. UTC | #4
On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>
> Hi Jassi,
>
> On 10/27/2022 7:33 PM, Jassi Brar wrote:
>  > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman
> <quic_eberman@quicinc.com> wrote:
>  > .....
>  >> +
>  >> +        gunyah-resource-mgr@0 {
>  >> +            compatible = "gunyah-resource-manager-1-0",
> "gunyah-resource-manager";
>  >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX
> full IRQ */
>  >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX
> empty IRQ */
>  >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
>  >> +                  /* TX, RX cap ids */
>  >> +        };
>  >>
>  > All these resources are used only by the mailbox controller driver.
>  > So, this should be the mailbox controller node, rather than the
>  > mailbox user.> One option is to load gunyah-resource-manager as a
> module that relies
>  > on the gunyah-mailbox provider. That would also avoid the "Allow
>  > direct registration to a channel" hack patch.
>
> A message queue to another guest VM wouldn't be known at boot time and
> thus couldn't be described on the devicetree.
>
I think you need to implement of_xlate() ... or please tell me what
exactly you need to specify in the dt.

thnx.
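
For readers unfamiliar with the of_xlate() suggestion above: a mailbox
controller's translate hook maps the specifier cells a client puts in the
DT to one of the controller's channels. The following is a minimal
userspace sketch of that idea only. The xlate_* types and the function are
hypothetical stand-ins invented for illustration, not the kernel's
definitions, and the single-cell "channel index" specifier is an
assumption.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct xlate_chan {
	unsigned int index;
};

struct xlate_controller {
	struct xlate_chan *chans;
	unsigned int num_chans;
};

/* Specifier cells as a client node might supply them (assumed: one cell). */
struct xlate_spec {
	unsigned int args[1];
	unsigned int args_count;
};

/* Map a one-cell specifier to a channel; NULL if malformed or out of range. */
struct xlate_chan *xlate_index(struct xlate_controller *mbox,
			       const struct xlate_spec *sp)
{
	if (sp->args_count != 1)
		return NULL;
	if (sp->args[0] >= mbox->num_chans)
		return NULL;
	return &mbox->chans[sp->args[0]];
}
```

A translate hook like this only helps when the client has a DT node to
carry the specifier, which is the point under dispute in this thread.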
  
Elliot Berman Nov. 1, 2022, 8:35 p.m. UTC | #5
On 11/1/2022 9:23 AM, Jassi Brar wrote:
> On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>
>> Hi Jassi,
>>
>> On 10/27/2022 7:33 PM, Jassi Brar wrote:
>>   > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman
>> <quic_eberman@quicinc.com> wrote:
>>   > .....
>>   >> +
>>   >> +        gunyah-resource-mgr@0 {
>>   >> +            compatible = "gunyah-resource-manager-1-0",
>> "gunyah-resource-manager";
>>   >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX
>> full IRQ */
>>   >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX
>> empty IRQ */
>>   >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
>>   >> +                  /* TX, RX cap ids */
>>   >> +        };
>>   >>
>>   > All these resources are used only by the mailbox controller driver.
>>   > So, this should be the mailbox controller node, rather than the
>>   > mailbox user.> One option is to load gunyah-resource-manager as a
>> module that relies
>>   > on the gunyah-mailbox provider. That would also avoid the "Allow
>>   > direct registration to a channel" hack patch.
>>
>> A message queue to another guest VM wouldn't be known at boot time and
>> thus couldn't be described on the devicetree.
>>
> I think you need to implement of_xlate() ... or please tell me what
> exactly you need to specify in the dt.

Dynamically created virtual machines can't be known in the DT, so there 
is nothing to specify in the DT. There couldn't be a devicetree node for 
the message queue client because that client only exists once the VM 
is created by userspace.

As a more concrete example, there is QRTR (net/qrtr) virtualization 
support which is implemented with Gunyah message queues. Whether a VM 
needs a QRTR client, and which message queue resource that client should 
use, is only determined when launching the VM. 
Since many VMs could be running on a system, it's not possible to know 
the number of mailbox controllers (i.e. message queues) nor the number 
of mailbox clients (e.g. QRTR) as a static configuration in the DT.

Thanks,
Elliot
  
Jassi Brar Nov. 1, 2022, 9:58 p.m. UTC | #6
On Tue, Nov 1, 2022 at 3:35 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>
>
>
> On 11/1/2022 9:23 AM, Jassi Brar wrote:
> > On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
> >>
> >> Hi Jassi,
> >>
> >> On 10/27/2022 7:33 PM, Jassi Brar wrote:
> >>   > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman
> >> <quic_eberman@quicinc.com> wrote:
> >>   > .....
> >>   >> +
> >>   >> +        gunyah-resource-mgr@0 {
> >>   >> +            compatible = "gunyah-resource-manager-1-0",
> >> "gunyah-resource-manager";
> >>   >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX
> >> full IRQ */
> >>   >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX
> >> empty IRQ */
> >>   >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
> >>   >> +                  /* TX, RX cap ids */
> >>   >> +        };
> >>   >>
> >>   > All these resources are used only by the mailbox controller driver.
> >>   > So, this should be the mailbox controller node, rather than the
> >>   > mailbox user.> One option is to load gunyah-resource-manager as a
> >> module that relies
> >>   > on the gunyah-mailbox provider. That would also avoid the "Allow
> >>   > direct registration to a channel" hack patch.
> >>
> >> A message queue to another guest VM wouldn't be known at boot time and
> >> thus couldn't be described on the devicetree.
> >>
> > I think you need to implement of_xlate() ... or please tell me what
> > exactly you need to specify in the dt.
>
> Dynamically created virtual machines can't be known on the dt, so there
> is nothing to specify in the DT. There couldn't be a devicetree node for
> the message queue client because that client is only exists once the VM
> is created by userspace.
>
The underlying "physical channel" is the synchronous SMC instruction,
which remains 1 irrespective of the number of mailbox instances
created.
So basically you are sharing one resource among users. Why doesn't the
RM request the "smc instruction" channel once and share it among
users?

-j
  
Elliot Berman Nov. 2, 2022, 12:12 a.m. UTC | #7
On 11/1/2022 2:58 PM, Jassi Brar wrote:
> On Tue, Nov 1, 2022 at 3:35 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>
>>
>>
>> On 11/1/2022 9:23 AM, Jassi Brar wrote:
>>> On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>>>
>>>> Hi Jassi,
>>>>
>>>> On 10/27/2022 7:33 PM, Jassi Brar wrote:
>>>>    > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman
>>>> <quic_eberman@quicinc.com> wrote:
>>>>    > .....
>>>>    >> +
>>>>    >> +        gunyah-resource-mgr@0 {
>>>>    >> +            compatible = "gunyah-resource-manager-1-0",
>>>> "gunyah-resource-manager";
>>>>    >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX
>>>> full IRQ */
>>>>    >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX
>>>> empty IRQ */
>>>>    >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
>>>>    >> +                  /* TX, RX cap ids */
>>>>    >> +        };
>>>>    >>
>>>>    > All these resources are used only by the mailbox controller driver.
>>>>    > So, this should be the mailbox controller node, rather than the
>>>>    > mailbox user.> One option is to load gunyah-resource-manager as a
>>>> module that relies
>>>>    > on the gunyah-mailbox provider. That would also avoid the "Allow
>>>>    > direct registration to a channel" hack patch.
>>>>
>>>> A message queue to another guest VM wouldn't be known at boot time and
>>>> thus couldn't be described on the devicetree.
>>>>
>>> I think you need to implement of_xlate() ... or please tell me what
>>> exactly you need to specify in the dt.
>>
>> Dynamically created virtual machines can't be known on the dt, so there
>> is nothing to specify in the DT. There couldn't be a devicetree node for
>> the message queue client because that client is only exists once the VM
>> is created by userspace.
>>
> The underlying "physical channel" is the synchronous SMC instruction,
> which remains 1 irrespective of the number of mailbox instances
> created.

I disagree that the physical channel is the SMC instruction. Regardless 
though, there are num_online_cpus() "physical channels" with this 
perspective.

> So basically you are sharing one resource among users. Why doesn't the
> RM request the "smc instruction" channel once and share it among
> users?

I suppose in this scenario, a single mailbox channel would represent all 
message queues? This would cause Linux to serialize *all* message queue 
hypercalls. Sorry, I can only think of negative implications.

Error handling needs to move into clients: if a TX message queue becomes 
full or an RX message queue becomes empty, then we'll need to return 
an error to the client right away. The clients would need to register 
for the RTS/RTR interrupts to know when to send/receive messages and 
have retry error handling. If the mailbox controller retried for the 
clients as currently proposed, then we could get into a scenario where a 
message queue is never ready to send/receive and the controller is stuck 
forever trying to process that message. The effect here would be that 
the mailbox controller becomes a wrapper around some SMC instructions 
that aren't related at the SMC instruction level.

A single channel would limit performance of SMP systems because only one 
core could send/receive a message. There is no such limitation for 
message queues to behave like this.

Thanks,
Elliot
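
The "return an error to the client right away" behaviour described above
can be illustrated with a small bounded queue whose send fails immediately
when the queue is full and whose receive fails immediately when it is
empty, leaving the retry decision to the caller. This is only a userspace
sketch: the msgq_* names, the queue depth and the message size are
assumptions made for the example and say nothing about how Gunyah
implements its message queues.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MSGQ_DEPTH    4   /* assumed depth, for illustration only */
#define MSGQ_MSG_SIZE 32  /* assumed maximum message size */

struct msgq {
	char slots[MSGQ_DEPTH][MSGQ_MSG_SIZE];
	size_t lens[MSGQ_DEPTH];
	unsigned int head;   /* next slot to read */
	unsigned int count;  /* messages currently queued */
};

/* Returns 0 on success, -EAGAIN if the queue is full, -EMSGSIZE if too big. */
int msgq_send(struct msgq *q, const void *buf, size_t len)
{
	unsigned int tail;

	if (len > MSGQ_MSG_SIZE)
		return -EMSGSIZE;
	if (q->count == MSGQ_DEPTH)
		return -EAGAIN;  /* caller waits for "ready to send" and retries */

	tail = (q->head + q->count) % MSGQ_DEPTH;
	memcpy(q->slots[tail], buf, len);
	q->lens[tail] = len;
	q->count++;
	return 0;
}

/* Returns the message length, -EAGAIN if empty, -EMSGSIZE if buf is small. */
int msgq_recv(struct msgq *q, void *buf, size_t cap)
{
	size_t len;

	if (q->count == 0)
		return -EAGAIN;  /* caller waits for "ready to receive" */

	len = q->lens[q->head];
	if (len > cap)
		return -EMSGSIZE;

	memcpy(buf, q->slots[q->head], len);
	q->head = (q->head + 1) % MSGQ_DEPTH;
	q->count--;
	return (int)len;
}
```

Because neither call blocks or retries internally, one stalled queue cannot
hold up callers that are using a different queue.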
  
Jassi Brar Nov. 2, 2022, 2:01 a.m. UTC | #8
On Tue, Nov 1, 2022 at 7:12 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>
>
>
> On 11/1/2022 2:58 PM, Jassi Brar wrote:
> > On Tue, Nov 1, 2022 at 3:35 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
> >>
> >>
> >>
> >> On 11/1/2022 9:23 AM, Jassi Brar wrote:
> >>> On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
> >>>>
> >>>> Hi Jassi,
> >>>>
> >>>> On 10/27/2022 7:33 PM, Jassi Brar wrote:
> >>>>    > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman
> >>>> <quic_eberman@quicinc.com> wrote:
> >>>>    > .....
> >>>>    >> +
> >>>>    >> +        gunyah-resource-mgr@0 {
> >>>>    >> +            compatible = "gunyah-resource-manager-1-0",
> >>>> "gunyah-resource-manager";
> >>>>    >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX
> >>>> full IRQ */
> >>>>    >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX
> >>>> empty IRQ */
> >>>>    >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
> >>>>    >> +                  /* TX, RX cap ids */
> >>>>    >> +        };
> >>>>    >>
> >>>>    > All these resources are used only by the mailbox controller driver.
> >>>>    > So, this should be the mailbox controller node, rather than the
> >>>>    > mailbox user.> One option is to load gunyah-resource-manager as a
> >>>> module that relies
> >>>>    > on the gunyah-mailbox provider. That would also avoid the "Allow
> >>>>    > direct registration to a channel" hack patch.
> >>>>
> >>>> A message queue to another guest VM wouldn't be known at boot time and
> >>>> thus couldn't be described on the devicetree.
> >>>>
> >>> I think you need to implement of_xlate() ... or please tell me what
> >>> exactly you need to specify in the dt.
> >>
> >> Dynamically created virtual machines can't be known on the dt, so there
> >> is nothing to specify in the DT. There couldn't be a devicetree node for
> >> the message queue client because that client is only exists once the VM
> >> is created by userspace.
> >>
> > The underlying "physical channel" is the synchronous SMC instruction,
> > which remains 1 irrespective of the number of mailbox instances
> > created.
>
> I disagree that the physical channel is the SMC instruction. Regardless
> though, there are num_online_cpus() "physical channels" with this
> perspective.
>
> > So basically you are sharing one resource among users. Why doesn't the
> > RM request the "smc instruction" channel once and share it among
> > users?
>
> I suppose in this scenario, a single mailbox channel would represent all
> message queues? This would cause Linux to serialize *all* message queue
> hypercalls. Sorry, I can only think negative implications.
>
> Error handling needs to move into clients: if a TX message queue becomes
> full or an RX message queue becomes empty, then we'll need to return
> error back to the client right away. The clients would need to register
> for the RTS/RTR interrupts to know when to send/receive messages and
> have retry error handling. If the mailbox controller retried for the
> clients as currently proposed, then we could get into a scenario where a
> message queue could never be ready to send/receive and thus stuck
> forever trying to process that message. The effect here would be that
> the mailbox controller becomes a wrapper to some SMC instructions that
> aren't related at the SMC instruction level.
>
> A single channel would limit performance of SMP systems because only one
> core could send/receive a message. There is no such limitation for
> message queues to behave like this.
>
This is just an illusion. If Gunyah can handle multiple calls from a
VM in parallel, even with the "bind-client-to-channel" hack you can't
make sure different channels run on different cpu cores.  If you are
ok with that, you could simply populate a mailbox controller with N
channels and allocate them in any order the clients ask.

-j
  
Elliot Berman Nov. 2, 2022, 6:05 p.m. UTC | #9
Hi Jassi,

On 11/1/2022 7:01 PM, Jassi Brar wrote:
> On Tue, Nov 1, 2022 at 7:12 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>
>>
>>
>> On 11/1/2022 2:58 PM, Jassi Brar wrote:
>>> On Tue, Nov 1, 2022 at 3:35 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>>>
>>>>
>>>>
>>>> On 11/1/2022 9:23 AM, Jassi Brar wrote:
>>>>> On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>>>>>
>>>>>> Hi Jassi,
>>>>>>
>>>>>> On 10/27/2022 7:33 PM, Jassi Brar wrote:
>>>>>>     > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman
>>>>>> <quic_eberman@quicinc.com> wrote:
>>>>>>     > .....
>>>>>>     >> +
>>>>>>     >> +        gunyah-resource-mgr@0 {
>>>>>>     >> +            compatible = "gunyah-resource-manager-1-0",
>>>>>> "gunyah-resource-manager";
>>>>>>     >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX
>>>>>> full IRQ */
>>>>>>     >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX
>>>>>> empty IRQ */
>>>>>>     >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
>>>>>>     >> +                  /* TX, RX cap ids */
>>>>>>     >> +        };
>>>>>>     >>
>>>>>>     > All these resources are used only by the mailbox controller driver.
>>>>>>     > So, this should be the mailbox controller node, rather than the
>>>>>>     > mailbox user.> One option is to load gunyah-resource-manager as a
>>>>>> module that relies
>>>>>>     > on the gunyah-mailbox provider. That would also avoid the "Allow
>>>>>>     > direct registration to a channel" hack patch.
>>>>>>
>>>>>> A message queue to another guest VM wouldn't be known at boot time and
>>>>>> thus couldn't be described on the devicetree.
>>>>>>
>>>>> I think you need to implement of_xlate() ... or please tell me what
>>>>> exactly you need to specify in the dt.
>>>>
>>>> Dynamically created virtual machines can't be known on the dt, so there
>>>> is nothing to specify in the DT. There couldn't be a devicetree node for
>>>> the message queue client because that client is only exists once the VM
>>>> is created by userspace.
>>>>
>>> The underlying "physical channel" is the synchronous SMC instruction,
>>> which remains 1 irrespective of the number of mailbox instances
>>> created.
>>
>> I disagree that the physical channel is the SMC instruction. Regardless
>> though, there are num_online_cpus() "physical channels" with this
>> perspective.
>>
>>> So basically you are sharing one resource among users. Why doesn't the
>>> RM request the "smc instruction" channel once and share it among
>>> users?
>>
>> I suppose in this scenario, a single mailbox channel would represent all
>> message queues? This would cause Linux to serialize *all* message queue
>> hypercalls. Sorry, I can only think negative implications.
>>
>> Error handling needs to move into clients: if a TX message queue becomes
>> full or an RX message queue becomes empty, then we'll need to return
>> error back to the client right away. The clients would need to register
>> for the RTS/RTR interrupts to know when to send/receive messages and
>> have retry error handling. If the mailbox controller retried for the
>> clients as currently proposed, then we could get into a scenario where a
>> message queue could never be ready to send/receive and thus stuck
>> forever trying to process that message. The effect here would be that
>> the mailbox controller becomes a wrapper to some SMC instructions that
>> aren't related at the SMC instruction level.
>>
>> A single channel would limit performance of SMP systems because only one
>> core could send/receive a message. There is no such limitation for
>> message queues to behave like this.
>>
> This is just an illusion. If Gunyah can handle multiple calls from a
> VM parallely, even with the "bind-client-to-channel" hack you can't
> make sure different channels run on different cpu cores.  If you are
> ok with that, you could simply populate a mailbox controller with N
> channels and allocate them in any order the clients ask.

I wanted to make sure I understood the ask here completely. On what 
basis is N chosen? Who would be the mailbox clients?

Thanks,
Elliot
  
Jassi Brar Nov. 2, 2022, 6:24 p.m. UTC | #10
On Wed, Nov 2, 2022 at 1:06 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>
> Hi Jassi,
>
> On 11/1/2022 7:01 PM, Jassi Brar wrote:
> > On Tue, Nov 1, 2022 at 7:12 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
> >>
> >>
> >>
> >> On 11/1/2022 2:58 PM, Jassi Brar wrote:
> >>> On Tue, Nov 1, 2022 at 3:35 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 11/1/2022 9:23 AM, Jassi Brar wrote:
> >>>>> On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
> >>>>>>
> >>>>>> Hi Jassi,
> >>>>>>
> >>>>>> On 10/27/2022 7:33 PM, Jassi Brar wrote:
> >>>>>>     > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman
> >>>>>> <quic_eberman@quicinc.com> wrote:
> >>>>>>     > .....
> >>>>>>     >> +
> >>>>>>     >> +        gunyah-resource-mgr@0 {
> >>>>>>     >> +            compatible = "gunyah-resource-manager-1-0",
> >>>>>> "gunyah-resource-manager";
> >>>>>>     >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX
> >>>>>> full IRQ */
> >>>>>>     >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX
> >>>>>> empty IRQ */
> >>>>>>     >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
> >>>>>>     >> +                  /* TX, RX cap ids */
> >>>>>>     >> +        };
> >>>>>>     >>
> >>>>>>     > All these resources are used only by the mailbox controller driver.
> >>>>>>     > So, this should be the mailbox controller node, rather than the
> >>>>>>     > mailbox user.> One option is to load gunyah-resource-manager as a
> >>>>>> module that relies
> >>>>>>     > on the gunyah-mailbox provider. That would also avoid the "Allow
> >>>>>>     > direct registration to a channel" hack patch.
> >>>>>>
> >>>>>> A message queue to another guest VM wouldn't be known at boot time and
> >>>>>> thus couldn't be described on the devicetree.
> >>>>>>
> >>>>> I think you need to implement of_xlate() ... or please tell me what
> >>>>> exactly you need to specify in the dt.
> >>>>
> >>>> Dynamically created virtual machines can't be known on the dt, so there
> >>>> is nothing to specify in the DT. There couldn't be a devicetree node for
> >>>> the message queue client because that client is only exists once the VM
> >>>> is created by userspace.
> >>>>
> >>> The underlying "physical channel" is the synchronous SMC instruction,
> >>> which remains 1 irrespective of the number of mailbox instances
> >>> created.
> >>
> >> I disagree that the physical channel is the SMC instruction. Regardless
> >> though, there are num_online_cpus() "physical channels" with this
> >> perspective.
> >>
> >>> So basically you are sharing one resource among users. Why doesn't the
> >>> RM request the "smc instruction" channel once and share it among
> >>> users?
> >>
> >> I suppose in this scenario, a single mailbox channel would represent all
> >> message queues? This would cause Linux to serialize *all* message queue
> >> hypercalls. Sorry, I can only think negative implications.
> >>
> >> Error handling needs to move into clients: if a TX message queue becomes
> >> full or an RX message queue becomes empty, then we'll need to return
> >> error back to the client right away. The clients would need to register
> >> for the RTS/RTR interrupts to know when to send/receive messages and
> >> have retry error handling. If the mailbox controller retried for the
> >> clients as currently proposed, then we could get into a scenario where a
> >> message queue could never be ready to send/receive and thus stuck
> >> forever trying to process that message. The effect here would be that
> >> the mailbox controller becomes a wrapper to some SMC instructions that
> >> aren't related at the SMC instruction level.
> >>
> >> A single channel would limit performance of SMP systems because only one
> >> core could send/receive a message. There is no such limitation for
> >> message queues to behave like this.
> >>
> > This is just an illusion. If Gunyah can handle multiple calls from a
> > VM parallely, even with the "bind-client-to-channel" hack you can't
> > make sure different channels run on different cpu cores.  If you are
> > ok with that, you could simply populate a mailbox controller with N
> > channels and allocate them in any order the clients ask.
>
> I wanted to make sure I understood the ask here completely. On what
> basis is N chosen? Who would be the mailbox clients?
>
A channel structure is cheap, so any number that is not likely to run
out. Say you have 10 possible users in a VM, set N=16. I know ideally
it should be precise and flexible but the gain in simplicity makes the
trade-off very acceptable.

thanks.
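
The "populate a controller with N channels and allocate them in any order"
suggestion can be sketched as a fixed pool with first-free allocation. This
is a userspace illustration only: the pool_* names are hypothetical, N=16
is taken from the example figure in the message above, and a real
controller would also need locking around the in_use flag.

```c
#include <stdbool.h>
#include <stddef.h>

#define POOL_NUM_CHANS 16  /* N from the example: ~10 users, with headroom */

struct pool_chan {
	bool in_use;
	void *client;  /* opaque owner, set when the channel is handed out */
};

struct pool_controller {
	struct pool_chan chans[POOL_NUM_CHANS];
};

/* Hand out the first free channel, or NULL when all N are taken. */
struct pool_chan *pool_chan_request(struct pool_controller *c, void *client)
{
	for (size_t i = 0; i < POOL_NUM_CHANS; i++) {
		if (!c->chans[i].in_use) {
			c->chans[i].in_use = true;
			c->chans[i].client = client;
			return &c->chans[i];
		}
	}
	return NULL;
}

/* Return a channel to the pool so a later client can reuse it. */
void pool_chan_free(struct pool_chan *ch)
{
	ch->in_use = false;
	ch->client = NULL;
}
```

The trade-off named in the message is visible here: the pool is simple and
cheap, but a request fails once more than N clients exist at the same time.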
  
Elliot Berman Nov. 2, 2022, 11:23 p.m. UTC | #11
On 11/2/2022 11:24 AM, Jassi Brar wrote:
> On Wed, Nov 2, 2022 at 1:06 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>
>> Hi Jassi,
>>
>> On 11/1/2022 7:01 PM, Jassi Brar wrote:
>>> On Tue, Nov 1, 2022 at 7:12 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>>>
>>>>
>>>>
>>>> On 11/1/2022 2:58 PM, Jassi Brar wrote:
>>>>> On Tue, Nov 1, 2022 at 3:35 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 11/1/2022 9:23 AM, Jassi Brar wrote:
>>>>>>> On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>>>>>>>
>>>>>>>> Hi Jassi,
>>>>>>>>
>>>>>>>> On 10/27/2022 7:33 PM, Jassi Brar wrote:
>>>>>>>>      > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman
>>>>>>>> <quic_eberman@quicinc.com> wrote:
>>>>>>>>      > .....
>>>>>>>>      >> +
>>>>>>>>      >> +        gunyah-resource-mgr@0 {
>>>>>>>>      >> +            compatible = "gunyah-resource-manager-1-0",
>>>>>>>> "gunyah-resource-manager";
>>>>>>>>      >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX
>>>>>>>> full IRQ */
>>>>>>>>      >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX
>>>>>>>> empty IRQ */
>>>>>>>>      >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
>>>>>>>>      >> +                  /* TX, RX cap ids */
>>>>>>>>      >> +        };
>>>>>>>>      >>
>>>>>>>>      > All these resources are used only by the mailbox controller driver.
>>>>>>>>      > So, this should be the mailbox controller node, rather than the
>>>>>>>>      > mailbox user.> One option is to load gunyah-resource-manager as a
>>>>>>>> module that relies
>>>>>>>>      > on the gunyah-mailbox provider. That would also avoid the "Allow
>>>>>>>>      > direct registration to a channel" hack patch.
>>>>>>>>
>>>>>>>> A message queue to another guest VM wouldn't be known at boot time and
>>>>>>>> thus couldn't be described on the devicetree.
>>>>>>>>
>>>>>>> I think you need to implement of_xlate() ... or please tell me what
>>>>>>> exactly you need to specify in the dt.
>>>>>>
>>>>>> Dynamically created virtual machines can't be known on the dt, so there
>>>>>> is nothing to specify in the DT. There couldn't be a devicetree node for
>>>>>> the message queue client because that client is only exists once the VM
>>>>>> is created by userspace.
>>>>>>
>>>>> The underlying "physical channel" is the synchronous SMC instruction,
>>>>> which remains 1 irrespective of the number of mailbox instances
>>>>> created.
>>>>
>>>> I disagree that the physical channel is the SMC instruction. Regardless
>>>> though, there are num_online_cpus() "physical channels" with this
>>>> perspective.
>>>>
>>>>> So basically you are sharing one resource among users. Why doesn't the
>>>>> RM request the "smc instruction" channel once and share it among
>>>>> users?
>>>>
>>>> I suppose in this scenario, a single mailbox channel would represent all
>>>> message queues? This would cause Linux to serialize *all* message queue
>>>> hypercalls. Sorry, I can only think negative implications.
>>>>
>>>> Error handling needs to move into clients: if a TX message queue becomes
>>>> full or an RX message queue becomes empty, then we'll need to return
>>>> error back to the client right away. The clients would need to register
>>>> for the RTS/RTR interrupts to know when to send/receive messages and
>>>> have retry error handling. If the mailbox controller retried for the
>>>> clients as currently proposed, then we could get into a scenario where a
>>>> message queue could never be ready to send/receive and thus stuck
>>>> forever trying to process that message. The effect here would be that
>>>> the mailbox controller becomes a wrapper to some SMC instructions that
>>>> aren't related at the SMC instruction level.
>>>>
>>>> A single channel would limit performance of SMP systems because only one
>>>> core could send/receive a message. There is no such limitation for
>>>> message queues to behave like this.
>>>>
>>> This is just an illusion. If Gunyah can handle multiple calls from a
>>> VM parallely, even with the "bind-client-to-channel" hack you can't
>>> make sure different channels run on different cpu cores.  If you are
>>> ok with that, you could simply populate a mailbox controller with N
>>> channels and allocate them in any order the clients ask.
>>
>> I wanted to make sure I understood the ask here completely. On what
>> basis is N chosen? Who would be the mailbox clients?
>>
> A channel structure is cheap, so any number that is not likely to run
> out. Say you have 10 possible users in a VM, set N=16. I know ideally
> it should be precise and flexible but the gain in simplicity makes the
> trade-off very acceptable.

I think I get the direction you are thinking now. N is chosen based on 
how many clients there might be. One mailbox controller will 
represent all message queues and each channel will be one message queue. 
There are some limitations that might make this more complex to implement 
than having 1 message queue per controller like I have now.

My interpretation is that a mailbox controller knows the configuration of 
its channels before being bound to a client. For dynamically created 
message queues, the client would need to tell the controller about the 
message queue configuration. I didn't find an example where a client 
provides information about a channel to the controller.

  1. We need a mechanism to allow the client to provide the 
gunyah_resources for the channel (i.e. the irqs and cap ids).
  2. We still need the bind-client-to-channel patch since clients 
aren't real devices and so shouldn't be on DT.
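
The single-controller, N-channel layout discussed above can be modelled
roughly as follows. This is a minimal userspace C sketch under stated
assumptions, not kernel code and not the mailbox framework API: the names
gh_chan_cfg, gh_chan_pool, gh_pool_request and gh_pool_free are
hypothetical, and N=16 is just the figure suggested earlier in the thread.
It only illustrates point 1: the client, not the controller, supplies the
per-channel cap ids and irqs when it claims a slot.

```c
#include <stdbool.h>
#include <stdint.h>

/* N: chosen to comfortably exceed the expected number of users. */
#define GH_NUM_CHANS 16

/* Hypothetical per-channel resources that a client hands to the controller. */
struct gh_chan_cfg {
	uint64_t tx_capid;
	uint64_t rx_capid;
	int tx_irq;
	int rx_irq;
};

struct gh_chan {
	bool in_use;
	struct gh_chan_cfg cfg;
};

/* One controller owning a fixed array of N channels. */
struct gh_chan_pool {
	struct gh_chan chans[GH_NUM_CHANS];
};

/*
 * Claim the first free channel and record the client's configuration.
 * Returns the channel index, or -1 if all N channels are taken.
 */
int gh_pool_request(struct gh_chan_pool *pool, const struct gh_chan_cfg *cfg)
{
	for (int i = 0; i < GH_NUM_CHANS; i++) {
		if (!pool->chans[i].in_use) {
			pool->chans[i].in_use = true;
			pool->chans[i].cfg = *cfg;
			return i;
		}
	}
	return -1;
}

/* Release a channel so that a later client can reuse the slot. */
void gh_pool_free(struct gh_chan_pool *pool, int idx)
{
	if (idx >= 0 && idx < GH_NUM_CHANS)
		pool->chans[idx].in_use = false;
}
```

The fixed array trades precision for simplicity: a 17th concurrent client
simply gets -1 back, which is the trade-off described in the quoted text.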

Thanks,
Elliot
  
Jassi Brar Nov. 3, 2022, 3:21 a.m. UTC | #12
On Wed, Nov 2, 2022 at 6:23 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>
>
>
> On 11/2/2022 11:24 AM, Jassi Brar wrote:
> > On Wed, Nov 2, 2022 at 1:06 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
> >>
> >> Hi Jassi,
> >>
> >> On 11/1/2022 7:01 PM, Jassi Brar wrote:
> >>> On Tue, Nov 1, 2022 at 7:12 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
> >>>>
> >>>>
> >>>>
> >>>> On 11/1/2022 2:58 PM, Jassi Brar wrote:
> >>>>> On Tue, Nov 1, 2022 at 3:35 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 11/1/2022 9:23 AM, Jassi Brar wrote:
> >>>>>>> On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi Jassi,
> >>>>>>>>
> >>>>>>>> On 10/27/2022 7:33 PM, Jassi Brar wrote:
> >>>>>>>>      > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman
> >>>>>>>> <quic_eberman@quicinc.com> wrote:
> >>>>>>>>      > .....
> >>>>>>>>      >> +
> >>>>>>>>      >> +        gunyah-resource-mgr@0 {
> >>>>>>>>      >> +            compatible = "gunyah-resource-manager-1-0",
> >>>>>>>> "gunyah-resource-manager";
> >>>>>>>>      >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX
> >>>>>>>> full IRQ */
> >>>>>>>>      >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX
> >>>>>>>> empty IRQ */
> >>>>>>>>      >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
> >>>>>>>>      >> +                  /* TX, RX cap ids */
> >>>>>>>>      >> +        };
> >>>>>>>>      >>
> >>>>>>>>      > All these resources are used only by the mailbox controller driver.
> >>>>>>>>      > So, this should be the mailbox controller node, rather than the
> >>>>>>>>      > mailbox user.
> >>>>>>>>      > One option is to load gunyah-resource-manager as a module that relies
> >>>>>>>>      > on the gunyah-mailbox provider. That would also avoid the "Allow
> >>>>>>>>      > direct registration to a channel" hack patch.
> >>>>>>>>
> >>>>>>>> A message queue to another guest VM wouldn't be known at boot time and
> >>>>>>>> thus couldn't be described on the devicetree.
> >>>>>>>>
> >>>>>>> I think you need to implement of_xlate() ... or please tell me what
> >>>>>>> exactly you need to specify in the dt.
> >>>>>>
> >>>>>> Dynamically created virtual machines can't be known on the dt, so there
> >>>>>> is nothing to specify in the DT. There couldn't be a devicetree node for
> >>>>>> the message queue client because that client is only exists once the VM
> >>>>>> is created by userspace.
> >>>>>>
> >>>>> The underlying "physical channel" is the synchronous SMC instruction,
> >>>>> which remains 1 irrespective of the number of mailbox instances
> >>>>> created.
> >>>>
> >>>> I disagree that the physical channel is the SMC instruction. Regardless
> >>>> though, there are num_online_cpus() "physical channels" with this
> >>>> perspective.
> >>>>
> >>>>> So basically you are sharing one resource among users. Why doesn't the
> >>>>> RM request the "smc instruction" channel once and share it among
> >>>>> users?
> >>>>
> >>>> I suppose in this scenario, a single mailbox channel would represent all
> >>>> message queues? This would cause Linux to serialize *all* message queue
> >>>> hypercalls. Sorry, I can only think negative implications.
> >>>>
> >>>> Error handling needs to move into clients: if a TX message queue becomes
> >>>> full or an RX message queue becomes empty, then we'll need to return
> >>>> error back to the client right away. The clients would need to register
> >>>> for the RTS/RTR interrupts to know when to send/receive messages and
> >>>> have retry error handling. If the mailbox controller retried for the
> >>>> clients as currently proposed, then we could get into a scenario where a
> >>>> message queue could never be ready to send/receive and thus stuck
> >>>> forever trying to process that message. The effect here would be that
> >>>> the mailbox controller becomes a wrapper to some SMC instructions that
> >>>> aren't related at the SMC instruction level.
> >>>>
> >>>> A single channel would limit performance of SMP systems because only one
> >>>> core could send/receive a message. There is no such limitation for
> >>>> message queues to behave like this.
> >>>>
> >>> This is just an illusion. If Gunyah can handle multiple calls from a
> >>> VM parallely, even with the "bind-client-to-channel" hack you can't
> >>> make sure different channels run on different cpu cores.  If you are
> >>> ok with that, you could simply populate a mailbox controller with N
> >>> channels and allocate them in any order the clients ask.
> >>
> >> I wanted to make sure I understood the ask here completely. On what
> >> basis is N chosen? Who would be the mailbox clients?
> >>
> > A channel structure is cheap, so any number that is not likely to run
> > out. Say you have 10 possible users in a VM, set N=16. I know ideally
> > it should be precise and flexible but the gain in simplicity makes the
> > trade-off very acceptable.
>
> I think I get the direction you are thinking now. N is chosen based off
> of how many clients there might be. One mailbox controller will
> represent all message queues and each channel will be one message queue.
> There are some limitations that might make it more complex to implement
> than having 1 message queue per controller like I have now.
>
> My interpretation is that mailbox controller knows the configuration of
> its channels before being bound to a client. For dynamically created
> message queues, the client would need tell the controller about the
> message queue configuration. I didn't find example where client is
> providing information about a channel to the controller.
>
>   1. need a mechanism to allow the client to provide the
> gunyah_resources for the channel (i.e. the irqs and cap ids).
>
IIUC there is exactly one resource-manager in a VM. Right?
Looking at your code, the TX and RX irqs are used only by the mailbox
driver and are the same for all clients/users. So they should be
properties under the mailbox controller node. Not sure what cap ids are.

>   2. Still need to have bind-client-to-channel patch since clients
> aren't real devices and so shouldn't be on DT.
>
The clients may be virtual (serial, gpio etc.), but the resource-manager
requires some mailbox hardware to communicate, so the resource-manager
should be the mailbox client (that further spawns virtual devices).

thnx.
  
Elliot Berman Nov. 3, 2022, 7:45 p.m. UTC | #13
On 11/2/2022 8:21 PM, Jassi Brar wrote:
> On Wed, Nov 2, 2022 at 6:23 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>
>>
>>
>> On 11/2/2022 11:24 AM, Jassi Brar wrote:
>>> On Wed, Nov 2, 2022 at 1:06 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>>>
>>>> Hi Jassi,
>>>>
>>>> On 11/1/2022 7:01 PM, Jassi Brar wrote:
>>>>> On Tue, Nov 1, 2022 at 7:12 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 11/1/2022 2:58 PM, Jassi Brar wrote:
>>>>>>> On Tue, Nov 1, 2022 at 3:35 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 11/1/2022 9:23 AM, Jassi Brar wrote:
>>>>>>>>> On Mon, Oct 31, 2022 at 10:20 PM Elliot Berman <quic_eberman@quicinc.com> wrote:
>>>>>>>>>>
>>>>>>>>>> Hi Jassi,
>>>>>>>>>>
>>>>>>>>>> On 10/27/2022 7:33 PM, Jassi Brar wrote:
>>>>>>>>>>       > On Wed, Oct 26, 2022 at 1:59 PM Elliot Berman
>>>>>>>>>> <quic_eberman@quicinc.com> wrote:
>>>>>>>>>>       > .....
>>>>>>>>>>       >> +
>>>>>>>>>>       >> +        gunyah-resource-mgr@0 {
>>>>>>>>>>       >> +            compatible = "gunyah-resource-manager-1-0",
>>>>>>>>>> "gunyah-resource-manager";
>>>>>>>>>>       >> +            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX
>>>>>>>>>> full IRQ */
>>>>>>>>>>       >> +                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX
>>>>>>>>>> empty IRQ */
>>>>>>>>>>       >> +            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
>>>>>>>>>>       >> +                  /* TX, RX cap ids */
>>>>>>>>>>       >> +        };
>>>>>>>>>>       >>
>>>>>>>>>>       > All these resources are used only by the mailbox controller driver.
>>>>>>>>>>       > So, this should be the mailbox controller node, rather than the
>>>>>>>>>>       > mailbox user.
>>>>>>>>>>       > One option is to load gunyah-resource-manager as a module that relies
>>>>>>>>>>       > on the gunyah-mailbox provider. That would also avoid the "Allow
>>>>>>>>>>       > direct registration to a channel" hack patch.
>>>>>>>>>>
>>>>>>>>>> A message queue to another guest VM wouldn't be known at boot time and
>>>>>>>>>> thus couldn't be described on the devicetree.
>>>>>>>>>>
>>>>>>>>> I think you need to implement of_xlate() ... or please tell me what
>>>>>>>>> exactly you need to specify in the dt.
>>>>>>>>
>>>>>>>> Dynamically created virtual machines can't be known on the dt, so there
>>>>>>>> is nothing to specify in the DT. There couldn't be a devicetree node for
>>>>>>>> the message queue client because that client is only exists once the VM
>>>>>>>> is created by userspace.
>>>>>>>>
>>>>>>> The underlying "physical channel" is the synchronous SMC instruction,
>>>>>>> which remains 1 irrespective of the number of mailbox instances
>>>>>>> created.
>>>>>>
>>>>>> I disagree that the physical channel is the SMC instruction. Regardless
>>>>>> though, there are num_online_cpus() "physical channels" with this
>>>>>> perspective.
>>>>>>
>>>>>>> So basically you are sharing one resource among users. Why doesn't the
>>>>>>> RM request the "smc instruction" channel once and share it among
>>>>>>> users?
>>>>>>
>>>>>> I suppose in this scenario, a single mailbox channel would represent all
>>>>>> message queues? This would cause Linux to serialize *all* message queue
>>>>>> hypercalls. Sorry, I can only think negative implications.
>>>>>>
>>>>>> Error handling needs to move into clients: if a TX message queue becomes
>>>>>> full or an RX message queue becomes empty, then we'll need to return
>>>>>> error back to the client right away. The clients would need to register
>>>>>> for the RTS/RTR interrupts to know when to send/receive messages and
>>>>>> have retry error handling. If the mailbox controller retried for the
>>>>>> clients as currently proposed, then we could get into a scenario where a
>>>>>> message queue could never be ready to send/receive and thus stuck
>>>>>> forever trying to process that message. The effect here would be that
>>>>>> the mailbox controller becomes a wrapper to some SMC instructions that
>>>>>> aren't related at the SMC instruction level.
>>>>>>
>>>>>> A single channel would limit performance of SMP systems because only one
>>>>>> core could send/receive a message. There is no such limitation for
>>>>>> message queues to behave like this.
>>>>>>
>>>>> This is just an illusion. If Gunyah can handle multiple calls from a
>>>>> VM parallely, even with the "bind-client-to-channel" hack you can't
>>>>> make sure different channels run on different cpu cores.  If you are
>>>>> ok with that, you could simply populate a mailbox controller with N
>>>>> channels and allocate them in any order the clients ask.
>>>>
>>>> I wanted to make sure I understood the ask here completely. On what
>>>> basis is N chosen? Who would be the mailbox clients?
>>>>
>>> A channel structure is cheap, so any number that is not likely to run
>>> out. Say you have 10 possible users in a VM, set N=16. I know ideally
>>> it should be precise and flexible but the gain in simplicity makes the
>>> trade-off very acceptable.
>>
>> I think I get the direction you are thinking now. N is chosen based off
>> of how many clients there might be. One mailbox controller will
>> represent all message queues and each channel will be one message queue.
>> There are some limitations that might make it more complex to implement
>> than having 1 message queue per controller like I have now.
>>
>> My interpretation is that mailbox controller knows the configuration of
>> its channels before being bound to a client. For dynamically created
>> message queues, the client would need tell the controller about the
>> message queue configuration. I didn't find example where client is
>> providing information about a channel to the controller.
>>
>>    1. need a mechanism to allow the client to provide the
>> gunyah_resources for the channel (i.e. the irqs and cap ids).
>>
> IIUC there is exactly one resource-manager in a VM. Right?
> Looking at your code, TX and RX irq are used only by the mailbox
> driver and are the same for all clients/users. So that should be a
> property under the mailbox controller node. Not sure what cap ids are.
> 

Ah -- "message queues" are a generic inter-VM communication mechanism 
offered by Gunyah. One use case for message queues is to communicate 
with the resource-manager, but other message queues can exist between 
other virtual machines. Those other message queues use different TX and 
RX irqs and have different client protocols.

In mailbox terminology, we have one known channel at boot-up time (the 
resource manager). That known channel can inform Linux about other 
channels at runtime. The client (not the controller) decodes received 
data from the channel to discover the new channels.
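
As a rough illustration of that discovery step, the sketch below shows a
client decoding a notification received on the one known channel into the
resources of a new message queue. It is a userspace C model under stated
assumptions: struct new_queue_msg and decode_new_queue are hypothetical,
and the layout is invented for the example. It is NOT the Gunyah resource
manager wire format, and it assumes sender and receiver share endianness.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical notification layout; NOT the real Gunyah RM wire format. */
struct new_queue_msg {
	uint64_t tx_capid;	/* capability ID of the new TX queue */
	uint64_t rx_capid;	/* capability ID of the new RX queue */
	uint32_t tx_irq;	/* interrupt number for the TX queue */
	uint32_t rx_irq;	/* interrupt number for the RX queue */
};

/*
 * Client-side decode of a raw buffer received on the known channel.
 * Returns 0 on success, or -1 if the buffer is too short to hold a
 * complete message.
 */
int decode_new_queue(const uint8_t *buf, size_t len,
		     struct new_queue_msg *out)
{
	if (len < sizeof(*out))
		return -1;
	/* memcpy avoids unaligned access on the raw receive buffer. */
	memcpy(out, buf, sizeof(*out));
	return 0;
}
```

The decoded cap ids and irqs are what the client would then have to pass
to the controller when asking for the new channel, which is why a
client-supplied configuration path is needed.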

One approach we found is in pcc.c, which has its own 
request_channel function (pcc_mbox_request_channel). We could follow 
this approach as well...

>>    2. Still need to have bind-client-to-channel patch since clients
>> aren't real devices and so shouldn't be on DT.
>>
> the clients may be virtual (serial, gpio etc) but the resource-manager
> requires some mailbox hardware to communicate, so the resource-manager
> should be the mailbox client (that further spawns virtual devices)

Yes, this is the design I'm aiming for. I also want to highlight that the 
resource-manager spawns Gunyah virtual devices such as message queue 
channels.

Thanks,
Elliot
  

Patch

diff --git a/Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml b/Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
new file mode 100644
index 000000000000..3a8c1c2157a4
--- /dev/null
+++ b/Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
@@ -0,0 +1,86 @@ 
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/firmware/gunyah-hypervisor.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Gunyah Hypervisor
+
+maintainers:
+  - Murali Nalajala <quic_mnalajal@quicinc.com>
+  - Elliot Berman <quic_eberman@quicinc.com>
+
+description: |+
+  Gunyah virtual machines use this information to determine the capability IDs
+  of the message queues used to communicate with the Gunyah Resource Manager.
+  See also: https://github.com/quic/gunyah-resource-manager/blob/develop/src/vm_creation/dto_construct.c
+
+properties:
+  compatible:
+    items:
+      - const: gunyah-hypervisor-1.0
+      - const: gunyah-hypervisor
+
+  "#address-cells":
+    description: Number of cells needed to represent 64-bit capability IDs.
+    const: 2
+
+  "#size-cells":
+    description: must be 0, because capability IDs are not memory address
+                  ranges and do not have a size.
+    const: 0
+
+patternProperties:
+  "^gunyah-resource-mgr(@.*)?":
+    type: object
+    description:
+      Resource Manager node which is required to communicate to Resource
+      Manager VM using Gunyah Message Queues.
+
+    properties:
+      compatible:
+        items:
+          - const: gunyah-resource-manager-1-0
+          - const: gunyah-resource-manager
+
+      reg:
+        items:
+          - description: Gunyah capability ID of the TX message queue
+          - description: Gunyah capability ID of the RX message queue
+
+      interrupts:
+        items:
+          - description: Interrupt for the TX message queue
+          - description: Interrupt for the RX message queue
+
+    additionalProperties: false
+
+    required:
+      - compatible
+      - reg
+      - interrupts
+
+additionalProperties: false
+
+required:
+  - compatible
+  - "#address-cells"
+  - "#size-cells"
+
+examples:
+  - |
+    #include <dt-bindings/interrupt-controller/arm-gic.h>
+
+    hypervisor {
+        #address-cells = <2>;
+        #size-cells = <0>;
+        compatible = "gunyah-hypervisor-1.0", "gunyah-hypervisor";
+
+        gunyah-resource-mgr@0 {
+            compatible = "gunyah-resource-manager-1-0", "gunyah-resource-manager";
+            interrupts = <GIC_SPI 3 IRQ_TYPE_EDGE_RISING>, /* TX full IRQ */
+                         <GIC_SPI 4 IRQ_TYPE_EDGE_RISING>; /* RX empty IRQ */
+            reg = <0x00000000 0x00000000>, <0x00000000 0x00000001>;
+                  /* TX, RX cap ids */
+        };
+    };
diff --git a/MAINTAINERS b/MAINTAINERS
index 9479cb3054cb..1de8d00dacb2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8940,6 +8940,7 @@  M:	Elliot Berman <quic_eberman@quicinc.com>
 M:	Murali Nalajala <quic_mnalajal@quicinc.com>
 L:	linux-arm-msm@vger.kernel.org
 S:	Supported
+F:	Documentation/devicetree/bindings/firmware/gunyah-hypervisor.yaml
 F:	Documentation/virt/gunyah/
 
 HABANALABS PCI DRIVER