[v2,0/7] Allow dynamic allocation of software IO TLB bounce buffers

Message ID cover.1681898595.git.petr.tesarik.ext@huawei.com
Headers
Series Allow dynamic allocation of software IO TLB bounce buffers |

Message

Petr Tesarik April 19, 2023, 10:03 a.m. UTC
  From: Petr Tesarik <petr.tesarik.ext@huawei.com>

The goal of my work is to provide more flexibility in the sizing of
SWIOTLB.

The software IO TLB was designed with these assumptions:

1. It would not be used much, especially on 64-bit systems.
2. A small fixed memory area (64 MiB by default) is sufficient to
   handle the few cases which require a bounce buffer.
3. 64 MiB is little enough that it has no impact on the rest of the
   system.

First, if SEV is active, all DMA must be done through shared
unencrypted pages, and SWIOTLB is used to make this happen without
changing device drivers. The software IO TLB size is increased to
6% of total memory in sev_setup_arch(), but that is more of an
approximation. The actual requirements may vary depending on the
amount of I/O and which drivers are used. These factors may not be
know at boot time, i.e. when SWIOTLB is allocated.

Second, other colleagues have noticed that they can reliably get
rid of occasional OOM kills on an Arm embedded device by reducing
the SWIOTLB size. This can be achieved with a kernel parameter, but
determining the right value puts additional burden on pre-release
testing, which could be avoided if SWIOTLB is allocated small and
grows only when necessary.

Changes from v1-devel-v7:
- Add comments to acquire/release barriers
- Fix whitespace issues reported by checkpatch.pl

Changes from v1-devel-v6:
- Provide long description of functions
- Fix kernel-doc (Returns: to Return:)
- Rename __lookup_dyn_slot() to lookup_dyn_slot_locked()

Changes from RFC:
- Track dynamic buffers per device instead of per swiotlb
- Use a linked list instead of a maple tree
- Move initialization of swiotlb fields of struct device to a
  helper function
- Rename __lookup_dyn_slot() to lookup_dyn_slot_locked()
- Introduce per-device flag if dynamic buffers are in use
- Add one more user of DMA_ATTR_MAY_SLEEP
- Add kernel-doc comments for new (and some old) code
- Properly escape '*' in dma-attributes.rst

Petr Tesarik (7):
  swiotlb: Use a helper to initialize swiotlb fields in struct device
  swiotlb: Move code around in preparation for dynamic bounce buffers
  dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute
  swiotlb: Dynamically allocated bounce buffers
  swiotlb: Add a boot option to enable dynamic bounce buffers
  drm: Use DMA_ATTR_MAY_SLEEP from process context
  swiotlb: per-device flag if there are dynamically allocated buffers

 .../admin-guide/kernel-parameters.txt         |   6 +-
 Documentation/core-api/dma-attributes.rst     |  10 +
 drivers/base/core.c                           |   4 +-
 drivers/gpu/drm/drm_gem_shmem_helper.c        |   2 +-
 drivers/gpu/drm/drm_prime.c                   |   2 +-
 include/linux/device.h                        |  12 +
 include/linux/dma-mapping.h                   |   6 +
 include/linux/swiotlb.h                       |  54 ++-
 kernel/dma/swiotlb.c                          | 382 ++++++++++++++++--
 9 files changed, 443 insertions(+), 35 deletions(-)
  

Comments

Petr Tesařík April 26, 2023, 12:15 p.m. UTC | #1
Hi,

On Wed, 19 Apr 2023 12:03:52 +0200
Petr Tesarik <petrtesarik@huaweicloud.com> wrote:

> From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> 
> The goal of my work is to provide more flexibility in the sizing of
> SWIOTLB.
> 
> The software IO TLB was designed with these assumptions:
> 
> 1. It would not be used much, especially on 64-bit systems.
> 2. A small fixed memory area (64 MiB by default) is sufficient to
>    handle the few cases which require a bounce buffer.
> 3. 64 MiB is little enough that it has no impact on the rest of the
>    system.
> 
> First, if SEV is active, all DMA must be done through shared
> unencrypted pages, and SWIOTLB is used to make this happen without
> changing device drivers. The software IO TLB size is increased to
> 6% of total memory in sev_setup_arch(), but that is more of an
> approximation. The actual requirements may vary depending on the
> amount of I/O and which drivers are used. These factors may not be
> know at boot time, i.e. when SWIOTLB is allocated.
> 
> Second, other colleagues have noticed that they can reliably get
> rid of occasional OOM kills on an Arm embedded device by reducing
> the SWIOTLB size. This can be achieved with a kernel parameter, but
> determining the right value puts additional burden on pre-release
> testing, which could be avoided if SWIOTLB is allocated small and
> grows only when necessary.

Now that merging into 6.4 has begun, what about this patch series? I'm
eager to get some feedback (positive or negative) and respin the next
version.

Petr T
  
Greg KH April 26, 2023, 12:26 p.m. UTC | #2
On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote:
> Hi,
> 
> On Wed, 19 Apr 2023 12:03:52 +0200
> Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
> 
> > From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> > 
> > The goal of my work is to provide more flexibility in the sizing of
> > SWIOTLB.
> > 
> > The software IO TLB was designed with these assumptions:
> > 
> > 1. It would not be used much, especially on 64-bit systems.
> > 2. A small fixed memory area (64 MiB by default) is sufficient to
> >    handle the few cases which require a bounce buffer.
> > 3. 64 MiB is little enough that it has no impact on the rest of the
> >    system.
> > 
> > First, if SEV is active, all DMA must be done through shared
> > unencrypted pages, and SWIOTLB is used to make this happen without
> > changing device drivers. The software IO TLB size is increased to
> > 6% of total memory in sev_setup_arch(), but that is more of an
> > approximation. The actual requirements may vary depending on the
> > amount of I/O and which drivers are used. These factors may not be
> > know at boot time, i.e. when SWIOTLB is allocated.
> > 
> > Second, other colleagues have noticed that they can reliably get
> > rid of occasional OOM kills on an Arm embedded device by reducing
> > the SWIOTLB size. This can be achieved with a kernel parameter, but
> > determining the right value puts additional burden on pre-release
> > testing, which could be avoided if SWIOTLB is allocated small and
> > grows only when necessary.
> 
> Now that merging into 6.4 has begun, what about this patch series? I'm
> eager to get some feedback (positive or negative) and respin the next
> version.

It's the merge window, we can't add new things that haven't been in
linux-next already.   Please resubmit it after -rc1 is out.

thanks,

greg k-h
  
Petr Tesařík April 26, 2023, 12:44 p.m. UTC | #3
Hi Greg,

On Wed, 26 Apr 2023 14:26:36 +0200
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote:
> > Hi,
> > 
> > On Wed, 19 Apr 2023 12:03:52 +0200
> > Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
> >   
> > > From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> > > 
> > > The goal of my work is to provide more flexibility in the sizing of
> > > SWIOTLB.
> > > 
> > > The software IO TLB was designed with these assumptions:
> > > 
> > > 1. It would not be used much, especially on 64-bit systems.
> > > 2. A small fixed memory area (64 MiB by default) is sufficient to
> > >    handle the few cases which require a bounce buffer.
> > > 3. 64 MiB is little enough that it has no impact on the rest of the
> > >    system.
> > > 
> > > First, if SEV is active, all DMA must be done through shared
> > > unencrypted pages, and SWIOTLB is used to make this happen without
> > > changing device drivers. The software IO TLB size is increased to
> > > 6% of total memory in sev_setup_arch(), but that is more of an
> > > approximation. The actual requirements may vary depending on the
> > > amount of I/O and which drivers are used. These factors may not be
> > > know at boot time, i.e. when SWIOTLB is allocated.
> > > 
> > > Second, other colleagues have noticed that they can reliably get
> > > rid of occasional OOM kills on an Arm embedded device by reducing
> > > the SWIOTLB size. This can be achieved with a kernel parameter, but
> > > determining the right value puts additional burden on pre-release
> > > testing, which could be avoided if SWIOTLB is allocated small and
> > > grows only when necessary.  
> > 
> > Now that merging into 6.4 has begun, what about this patch series? I'm
> > eager to get some feedback (positive or negative) and respin the next
> > version.  
> 
> It's the merge window, we can't add new things that haven't been in
> linux-next already.

This is understood. I'm not asking for immediate inclusion.

>   Please resubmit it after -rc1 is out.

If you can believe that rebasing to -rc1 will be enough, then I will
also try to believe I'm lucky. ;-)

The kind of feedback I really want to get is e.g. about the extra
per-device DMA-specific fields. If they cannot be added to struct
device, then I'd rather start discussing an interim solution, because
getting all existing DMA fields out of that struct will take a lot of
time...

Thanks,
Petr T
  
Greg KH April 26, 2023, 12:53 p.m. UTC | #4
On Wed, Apr 26, 2023 at 02:44:39PM +0200, Petr Tesařík wrote:
> Hi Greg,
> 
> On Wed, 26 Apr 2023 14:26:36 +0200
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
> > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote:
> > > Hi,
> > > 
> > > On Wed, 19 Apr 2023 12:03:52 +0200
> > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
> > >   
> > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> > > > 
> > > > The goal of my work is to provide more flexibility in the sizing of
> > > > SWIOTLB.
> > > > 
> > > > The software IO TLB was designed with these assumptions:
> > > > 
> > > > 1. It would not be used much, especially on 64-bit systems.
> > > > 2. A small fixed memory area (64 MiB by default) is sufficient to
> > > >    handle the few cases which require a bounce buffer.
> > > > 3. 64 MiB is little enough that it has no impact on the rest of the
> > > >    system.
> > > > 
> > > > First, if SEV is active, all DMA must be done through shared
> > > > unencrypted pages, and SWIOTLB is used to make this happen without
> > > > changing device drivers. The software IO TLB size is increased to
> > > > 6% of total memory in sev_setup_arch(), but that is more of an
> > > > approximation. The actual requirements may vary depending on the
> > > > amount of I/O and which drivers are used. These factors may not be
> > > > know at boot time, i.e. when SWIOTLB is allocated.
> > > > 
> > > > Second, other colleagues have noticed that they can reliably get
> > > > rid of occasional OOM kills on an Arm embedded device by reducing
> > > > the SWIOTLB size. This can be achieved with a kernel parameter, but
> > > > determining the right value puts additional burden on pre-release
> > > > testing, which could be avoided if SWIOTLB is allocated small and
> > > > grows only when necessary.  
> > > 
> > > Now that merging into 6.4 has begun, what about this patch series? I'm
> > > eager to get some feedback (positive or negative) and respin the next
> > > version.  
> > 
> > It's the merge window, we can't add new things that haven't been in
> > linux-next already.
> 
> This is understood. I'm not asking for immediate inclusion.
> 
> >   Please resubmit it after -rc1 is out.
> 
> If you can believe that rebasing to -rc1 will be enough, then I will
> also try to believe I'm lucky. ;-)
> 
> The kind of feedback I really want to get is e.g. about the extra
> per-device DMA-specific fields. If they cannot be added to struct
> device, then I'd rather start discussing an interim solution, because
> getting all existing DMA fields out of that struct will take a lot of
> time...

I thought the goal was to get them out of the device and into the bus
instead right?  Or was it the other way around?  I can't remember
anymore, sorry...

greg k-h
  
Petr Tesařík April 26, 2023, 1:16 p.m. UTC | #5
On Wed, 26 Apr 2023 14:53:52 +0200
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> On Wed, Apr 26, 2023 at 02:44:39PM +0200, Petr Tesařík wrote:
> > Hi Greg,
> > 
> > On Wed, 26 Apr 2023 14:26:36 +0200
> > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> >   
> > > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote:  
> > > > Hi,
> > > > 
> > > > On Wed, 19 Apr 2023 12:03:52 +0200
> > > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
> > > >     
> > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> > > > > 
> > > > > The goal of my work is to provide more flexibility in the sizing of
> > > > > SWIOTLB.
> > > > > 
> > > > > The software IO TLB was designed with these assumptions:
> > > > > 
> > > > > 1. It would not be used much, especially on 64-bit systems.
> > > > > 2. A small fixed memory area (64 MiB by default) is sufficient to
> > > > >    handle the few cases which require a bounce buffer.
> > > > > 3. 64 MiB is little enough that it has no impact on the rest of the
> > > > >    system.
> > > > > 
> > > > > First, if SEV is active, all DMA must be done through shared
> > > > > unencrypted pages, and SWIOTLB is used to make this happen without
> > > > > changing device drivers. The software IO TLB size is increased to
> > > > > 6% of total memory in sev_setup_arch(), but that is more of an
> > > > > approximation. The actual requirements may vary depending on the
> > > > > amount of I/O and which drivers are used. These factors may not be
> > > > > know at boot time, i.e. when SWIOTLB is allocated.
> > > > > 
> > > > > Second, other colleagues have noticed that they can reliably get
> > > > > rid of occasional OOM kills on an Arm embedded device by reducing
> > > > > the SWIOTLB size. This can be achieved with a kernel parameter, but
> > > > > determining the right value puts additional burden on pre-release
> > > > > testing, which could be avoided if SWIOTLB is allocated small and
> > > > > grows only when necessary.    
> > > > 
> > > > Now that merging into 6.4 has begun, what about this patch series? I'm
> > > > eager to get some feedback (positive or negative) and respin the next
> > > > version.    
> > > 
> > > It's the merge window, we can't add new things that haven't been in
> > > linux-next already.  
> > 
> > This is understood. I'm not asking for immediate inclusion.
> >   
> > >   Please resubmit it after -rc1 is out.  
> > 
> > If you can believe that rebasing to -rc1 will be enough, then I will
> > also try to believe I'm lucky. ;-)
> > 
> > The kind of feedback I really want to get is e.g. about the extra
> > per-device DMA-specific fields. If they cannot be added to struct
> > device, then I'd rather start discussing an interim solution, because
> > getting all existing DMA fields out of that struct will take a lot of
> > time...  
> 
> I thought the goal was to get them out of the device and into the bus
> instead right?  Or was it the other way around?  I can't remember
> anymore, sorry...

You remember it almost right. The goal is to get them out of struct
device into the bus (or platform device, or whatever holds struct
device). But I'd like to keep this task decoupled from the dynamic
swiotlb.

Thanks,
Petr T
  
Mike Lothian April 28, 2023, 8:53 a.m. UTC | #6
On Wed, 19 Apr 2023 at 11:05, Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
>
> From: Petr Tesarik <petr.tesarik.ext@huawei.com>
>
> The goal of my work is to provide more flexibility in the sizing of
> SWIOTLB.
>
> The software IO TLB was designed with these assumptions:
>
> 1. It would not be used much, especially on 64-bit systems.
> 2. A small fixed memory area (64 MiB by default) is sufficient to
>    handle the few cases which require a bounce buffer.
> 3. 64 MiB is little enough that it has no impact on the rest of the
>    system.
>
> First, if SEV is active, all DMA must be done through shared
> unencrypted pages, and SWIOTLB is used to make this happen without
> changing device drivers. The software IO TLB size is increased to
> 6% of total memory in sev_setup_arch(), but that is more of an
> approximation. The actual requirements may vary depending on the
> amount of I/O and which drivers are used. These factors may not be
> know at boot time, i.e. when SWIOTLB is allocated.
>
> Second, other colleagues have noticed that they can reliably get
> rid of occasional OOM kills on an Arm embedded device by reducing
> the SWIOTLB size. This can be achieved with a kernel parameter, but
> determining the right value puts additional burden on pre-release
> testing, which could be avoided if SWIOTLB is allocated small and
> grows only when necessary.
>
> Changes from v1-devel-v7:
> - Add comments to acquire/release barriers
> - Fix whitespace issues reported by checkpatch.pl
>
> Changes from v1-devel-v6:
> - Provide long description of functions
> - Fix kernel-doc (Returns: to Return:)
> - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked()
>
> Changes from RFC:
> - Track dynamic buffers per device instead of per swiotlb
> - Use a linked list instead of a maple tree
> - Move initialization of swiotlb fields of struct device to a
>   helper function
> - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked()
> - Introduce per-device flag if dynamic buffers are in use
> - Add one more user of DMA_ATTR_MAY_SLEEP
> - Add kernel-doc comments for new (and some old) code
> - Properly escape '*' in dma-attributes.rst
>
> Petr Tesarik (7):
>   swiotlb: Use a helper to initialize swiotlb fields in struct device
>   swiotlb: Move code around in preparation for dynamic bounce buffers
>   dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute
>   swiotlb: Dynamically allocated bounce buffers
>   swiotlb: Add a boot option to enable dynamic bounce buffers
>   drm: Use DMA_ATTR_MAY_SLEEP from process context
>   swiotlb: per-device flag if there are dynamically allocated buffers
>
>  .../admin-guide/kernel-parameters.txt         |   6 +-
>  Documentation/core-api/dma-attributes.rst     |  10 +
>  drivers/base/core.c                           |   4 +-
>  drivers/gpu/drm/drm_gem_shmem_helper.c        |   2 +-
>  drivers/gpu/drm/drm_prime.c                   |   2 +-
>  include/linux/device.h                        |  12 +
>  include/linux/dma-mapping.h                   |   6 +
>  include/linux/swiotlb.h                       |  54 ++-
>  kernel/dma/swiotlb.c                          | 382 ++++++++++++++++--
>  9 files changed, 443 insertions(+), 35 deletions(-)
>
> --
> 2.25.1
>

Hi

Is this a potential fix for
https://bugzilla.kernel.org/show_bug.cgi?id=217310 where I'm manually
setting bigger buffers to keep my wifi working?

Thanks

Mike
  
Petr Tesařík April 28, 2023, 9:07 a.m. UTC | #7
On Fri, 28 Apr 2023 09:53:38 +0100
Mike Lothian <mike@fireburn.co.uk> wrote:

> On Wed, 19 Apr 2023 at 11:05, Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
> >
> > From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> >
> > The goal of my work is to provide more flexibility in the sizing of
> > SWIOTLB.
> >
> > The software IO TLB was designed with these assumptions:
> >
> > 1. It would not be used much, especially on 64-bit systems.
> > 2. A small fixed memory area (64 MiB by default) is sufficient to
> >    handle the few cases which require a bounce buffer.
> > 3. 64 MiB is little enough that it has no impact on the rest of the
> >    system.
> >
> > First, if SEV is active, all DMA must be done through shared
> > unencrypted pages, and SWIOTLB is used to make this happen without
> > changing device drivers. The software IO TLB size is increased to
> > 6% of total memory in sev_setup_arch(), but that is more of an
> > approximation. The actual requirements may vary depending on the
> > amount of I/O and which drivers are used. These factors may not be
> > know at boot time, i.e. when SWIOTLB is allocated.
> >
> > Second, other colleagues have noticed that they can reliably get
> > rid of occasional OOM kills on an Arm embedded device by reducing
> > the SWIOTLB size. This can be achieved with a kernel parameter, but
> > determining the right value puts additional burden on pre-release
> > testing, which could be avoided if SWIOTLB is allocated small and
> > grows only when necessary.
> >
> > Changes from v1-devel-v7:
> > - Add comments to acquire/release barriers
> > - Fix whitespace issues reported by checkpatch.pl
> >
> > Changes from v1-devel-v6:
> > - Provide long description of functions
> > - Fix kernel-doc (Returns: to Return:)
> > - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked()
> >
> > Changes from RFC:
> > - Track dynamic buffers per device instead of per swiotlb
> > - Use a linked list instead of a maple tree
> > - Move initialization of swiotlb fields of struct device to a
> >   helper function
> > - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked()
> > - Introduce per-device flag if dynamic buffers are in use
> > - Add one more user of DMA_ATTR_MAY_SLEEP
> > - Add kernel-doc comments for new (and some old) code
> > - Properly escape '*' in dma-attributes.rst
> >
> > Petr Tesarik (7):
> >   swiotlb: Use a helper to initialize swiotlb fields in struct device
> >   swiotlb: Move code around in preparation for dynamic bounce buffers
> >   dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute
> >   swiotlb: Dynamically allocated bounce buffers
> >   swiotlb: Add a boot option to enable dynamic bounce buffers
> >   drm: Use DMA_ATTR_MAY_SLEEP from process context
> >   swiotlb: per-device flag if there are dynamically allocated buffers
> >
> >  .../admin-guide/kernel-parameters.txt         |   6 +-
> >  Documentation/core-api/dma-attributes.rst     |  10 +
> >  drivers/base/core.c                           |   4 +-
> >  drivers/gpu/drm/drm_gem_shmem_helper.c        |   2 +-
> >  drivers/gpu/drm/drm_prime.c                   |   2 +-
> >  include/linux/device.h                        |  12 +
> >  include/linux/dma-mapping.h                   |   6 +
> >  include/linux/swiotlb.h                       |  54 ++-
> >  kernel/dma/swiotlb.c                          | 382 ++++++++++++++++--
> >  9 files changed, 443 insertions(+), 35 deletions(-)
> >
> > --
> > 2.25.1
> >  
> 
> Hi
> 
> Is this a potential fix for
> https://bugzilla.kernel.org/show_bug.cgi?id=217310 where I'm manually
> setting bigger buffers to keep my wifi working?

Yes. With these patches applied, your system should run just fine with
swiotlb=dynamic. However, keep in mind that this implementation adds a
bit of overhead. In short, it trades a bit of performance for not
having to figure out the optimal swiotlb size at boot time.

Petr T
  
Mike Lothian May 1, 2023, 10:29 a.m. UTC | #8
Hi

I've not had any issues since using this, but I imagine most people
won't know how to set swiotlb=dynamic if they start seeing this (when
it lands)

Any clue as to why this broke last cycle?

Thanks

Mike

On Fri, 28 Apr 2023 at 10:07, Petr Tesařík <petr@tesarici.cz> wrote:
>
> On Fri, 28 Apr 2023 09:53:38 +0100
> Mike Lothian <mike@fireburn.co.uk> wrote:
>
> > On Wed, 19 Apr 2023 at 11:05, Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
> > >
> > > From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> > >
> > > The goal of my work is to provide more flexibility in the sizing of
> > > SWIOTLB.
> > >
> > > The software IO TLB was designed with these assumptions:
> > >
> > > 1. It would not be used much, especially on 64-bit systems.
> > > 2. A small fixed memory area (64 MiB by default) is sufficient to
> > >    handle the few cases which require a bounce buffer.
> > > 3. 64 MiB is little enough that it has no impact on the rest of the
> > >    system.
> > >
> > > First, if SEV is active, all DMA must be done through shared
> > > unencrypted pages, and SWIOTLB is used to make this happen without
> > > changing device drivers. The software IO TLB size is increased to
> > > 6% of total memory in sev_setup_arch(), but that is more of an
> > > approximation. The actual requirements may vary depending on the
> > > amount of I/O and which drivers are used. These factors may not be
> > > know at boot time, i.e. when SWIOTLB is allocated.
> > >
> > > Second, other colleagues have noticed that they can reliably get
> > > rid of occasional OOM kills on an Arm embedded device by reducing
> > > the SWIOTLB size. This can be achieved with a kernel parameter, but
> > > determining the right value puts additional burden on pre-release
> > > testing, which could be avoided if SWIOTLB is allocated small and
> > > grows only when necessary.
> > >
> > > Changes from v1-devel-v7:
> > > - Add comments to acquire/release barriers
> > > - Fix whitespace issues reported by checkpatch.pl
> > >
> > > Changes from v1-devel-v6:
> > > - Provide long description of functions
> > > - Fix kernel-doc (Returns: to Return:)
> > > - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked()
> > >
> > > Changes from RFC:
> > > - Track dynamic buffers per device instead of per swiotlb
> > > - Use a linked list instead of a maple tree
> > > - Move initialization of swiotlb fields of struct device to a
> > >   helper function
> > > - Rename __lookup_dyn_slot() to lookup_dyn_slot_locked()
> > > - Introduce per-device flag if dynamic buffers are in use
> > > - Add one more user of DMA_ATTR_MAY_SLEEP
> > > - Add kernel-doc comments for new (and some old) code
> > > - Properly escape '*' in dma-attributes.rst
> > >
> > > Petr Tesarik (7):
> > >   swiotlb: Use a helper to initialize swiotlb fields in struct device
> > >   swiotlb: Move code around in preparation for dynamic bounce buffers
> > >   dma-mapping: introduce the DMA_ATTR_MAY_SLEEP attribute
> > >   swiotlb: Dynamically allocated bounce buffers
> > >   swiotlb: Add a boot option to enable dynamic bounce buffers
> > >   drm: Use DMA_ATTR_MAY_SLEEP from process context
> > >   swiotlb: per-device flag if there are dynamically allocated buffers
> > >
> > >  .../admin-guide/kernel-parameters.txt         |   6 +-
> > >  Documentation/core-api/dma-attributes.rst     |  10 +
> > >  drivers/base/core.c                           |   4 +-
> > >  drivers/gpu/drm/drm_gem_shmem_helper.c        |   2 +-
> > >  drivers/gpu/drm/drm_prime.c                   |   2 +-
> > >  include/linux/device.h                        |  12 +
> > >  include/linux/dma-mapping.h                   |   6 +
> > >  include/linux/swiotlb.h                       |  54 ++-
> > >  kernel/dma/swiotlb.c                          | 382 ++++++++++++++++--
> > >  9 files changed, 443 insertions(+), 35 deletions(-)
> > >
> > > --
> > > 2.25.1
> > >
> >
> > Hi
> >
> > Is this a potential fix for
> > https://bugzilla.kernel.org/show_bug.cgi?id=217310 where I'm manually
> > setting bigger buffers to keep my wifi working?
>
> Yes. With these patches applied, your system should run just fine with
> swiotlb=dynamic. However, keep in mind that this implementation adds a
> bit of overhead. In short, it trades a bit of performance for not
> having to figure out the optimal swiotlb size at boot time.
>
> Petr T
  
Petr Tesařík May 9, 2023, 7:16 a.m. UTC | #9
On Wed, 26 Apr 2023 14:44:39 +0200
Petr Tesařík <petr@tesarici.cz> wrote:

> Hi Greg,
> 
> On Wed, 26 Apr 2023 14:26:36 +0200
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
> > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote:  
> > > Hi,
> > > 
> > > On Wed, 19 Apr 2023 12:03:52 +0200
> > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
> > >     
> > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> > > > 
> > > > The goal of my work is to provide more flexibility in the sizing of
> > > > SWIOTLB.
> > > > 
> > > > The software IO TLB was designed with these assumptions:
> > > > 
> > > > 1. It would not be used much, especially on 64-bit systems.
> > > > 2. A small fixed memory area (64 MiB by default) is sufficient to
> > > >    handle the few cases which require a bounce buffer.
> > > > 3. 64 MiB is little enough that it has no impact on the rest of the
> > > >    system.
> > > > 
> > > > First, if SEV is active, all DMA must be done through shared
> > > > unencrypted pages, and SWIOTLB is used to make this happen without
> > > > changing device drivers. The software IO TLB size is increased to
> > > > 6% of total memory in sev_setup_arch(), but that is more of an
> > > > approximation. The actual requirements may vary depending on the
> > > > amount of I/O and which drivers are used. These factors may not be
> > > > know at boot time, i.e. when SWIOTLB is allocated.
> > > > 
> > > > Second, other colleagues have noticed that they can reliably get
> > > > rid of occasional OOM kills on an Arm embedded device by reducing
> > > > the SWIOTLB size. This can be achieved with a kernel parameter, but
> > > > determining the right value puts additional burden on pre-release
> > > > testing, which could be avoided if SWIOTLB is allocated small and
> > > > grows only when necessary.    
> > > 
> > > Now that merging into 6.4 has begun, what about this patch series? I'm
> > > eager to get some feedback (positive or negative) and respin the next
> > > version.    
> > 
> > It's the merge window, we can't add new things that haven't been in
> > linux-next already.  
> 
> This is understood. I'm not asking for immediate inclusion.
> 
> >   Please resubmit it after -rc1 is out.  
> 
> If you can believe that rebasing to -rc1 will be enough, then I will
> also try to believe I'm lucky. ;-)
> 
> The kind of feedback I really want to get is e.g. about the extra
> per-device DMA-specific fields. If they cannot be added to struct
> device, then I'd rather start discussing an interim solution, because
> getting all existing DMA fields out of that struct will take a lot of
> time...

All right, 6.4-rc1 is out now. The patch series still applies cleanly.

Any comments what must be changed (if anything) to get it in?

Thanks,
Petr T
  
Greg KH May 9, 2023, 7:32 a.m. UTC | #10
On Tue, May 09, 2023 at 09:16:35AM +0200, Petr Tesařík wrote:
> On Wed, 26 Apr 2023 14:44:39 +0200
> Petr Tesařík <petr@tesarici.cz> wrote:
> 
> > Hi Greg,
> > 
> > On Wed, 26 Apr 2023 14:26:36 +0200
> > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > 
> > > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote:  
> > > > Hi,
> > > > 
> > > > On Wed, 19 Apr 2023 12:03:52 +0200
> > > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
> > > >     
> > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> > > > > 
> > > > > The goal of my work is to provide more flexibility in the sizing of
> > > > > SWIOTLB.
> > > > > 
> > > > > The software IO TLB was designed with these assumptions:
> > > > > 
> > > > > 1. It would not be used much, especially on 64-bit systems.
> > > > > 2. A small fixed memory area (64 MiB by default) is sufficient to
> > > > >    handle the few cases which require a bounce buffer.
> > > > > 3. 64 MiB is little enough that it has no impact on the rest of the
> > > > >    system.
> > > > > 
> > > > > First, if SEV is active, all DMA must be done through shared
> > > > > unencrypted pages, and SWIOTLB is used to make this happen without
> > > > > changing device drivers. The software IO TLB size is increased to
> > > > > 6% of total memory in sev_setup_arch(), but that is more of an
> > > > > approximation. The actual requirements may vary depending on the
> > > > > amount of I/O and which drivers are used. These factors may not be
> > > > > know at boot time, i.e. when SWIOTLB is allocated.
> > > > > 
> > > > > Second, other colleagues have noticed that they can reliably get
> > > > > rid of occasional OOM kills on an Arm embedded device by reducing
> > > > > the SWIOTLB size. This can be achieved with a kernel parameter, but
> > > > > determining the right value puts additional burden on pre-release
> > > > > testing, which could be avoided if SWIOTLB is allocated small and
> > > > > grows only when necessary.    
> > > > 
> > > > Now that merging into 6.4 has begun, what about this patch series? I'm
> > > > eager to get some feedback (positive or negative) and respin the next
> > > > version.    
> > > 
> > > It's the merge window, we can't add new things that haven't been in
> > > linux-next already.  
> > 
> > This is understood. I'm not asking for immediate inclusion.
> > 
> > >   Please resubmit it after -rc1 is out.  
> > 
> > If you can believe that rebasing to -rc1 will be enough, then I will
> > also try to believe I'm lucky. ;-)
> > 
> > The kind of feedback I really want to get is e.g. about the extra
> > per-device DMA-specific fields. If they cannot be added to struct
> > device, then I'd rather start discussing an interim solution, because
> > getting all existing DMA fields out of that struct will take a lot of
> > time...
> 
> All right, 6.4-rc1 is out now. The patch series still applies cleanly.
> 
> Any comments what must be changed (if anything) to get it in?

Try resending it, it's long out of my review queue...

thanks,

greg k-h
  
Mike Lothian July 10, 2023, 10:23 p.m. UTC | #11
Hi

I was hoping this might land for 6.5-rc1, is there a new version that
might apply against 6.5?

Cheers

Mike

On Tue, 9 May 2023 at 08:32, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Tue, May 09, 2023 at 09:16:35AM +0200, Petr Tesařík wrote:
> > On Wed, 26 Apr 2023 14:44:39 +0200
> > Petr Tesařík <petr@tesarici.cz> wrote:
> >
> > > Hi Greg,
> > >
> > > On Wed, 26 Apr 2023 14:26:36 +0200
> > > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > >
> > > > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote:
> > > > > Hi,
> > > > >
> > > > > On Wed, 19 Apr 2023 12:03:52 +0200
> > > > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
> > > > >
> > > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> > > > > >
> > > > > > The goal of my work is to provide more flexibility in the sizing of
> > > > > > SWIOTLB.
> > > > > >
> > > > > > The software IO TLB was designed with these assumptions:
> > > > > >
> > > > > > 1. It would not be used much, especially on 64-bit systems.
> > > > > > 2. A small fixed memory area (64 MiB by default) is sufficient to
> > > > > >    handle the few cases which require a bounce buffer.
> > > > > > 3. 64 MiB is little enough that it has no impact on the rest of the
> > > > > >    system.
> > > > > >
> > > > > > First, if SEV is active, all DMA must be done through shared
> > > > > > unencrypted pages, and SWIOTLB is used to make this happen without
> > > > > > changing device drivers. The software IO TLB size is increased to
> > > > > > 6% of total memory in sev_setup_arch(), but that is more of an
> > > > > > approximation. The actual requirements may vary depending on the
> > > > > > amount of I/O and which drivers are used. These factors may not be
> > > > > > know at boot time, i.e. when SWIOTLB is allocated.
> > > > > >
> > > > > > Second, other colleagues have noticed that they can reliably get
> > > > > > rid of occasional OOM kills on an Arm embedded device by reducing
> > > > > > the SWIOTLB size. This can be achieved with a kernel parameter, but
> > > > > > determining the right value puts additional burden on pre-release
> > > > > > testing, which could be avoided if SWIOTLB is allocated small and
> > > > > > grows only when necessary.
> > > > >
> > > > > Now that merging into 6.4 has begun, what about this patch series? I'm
> > > > > eager to get some feedback (positive or negative) and respin the next
> > > > > version.
> > > >
> > > > It's the merge window, we can't add new things that haven't been in
> > > > linux-next already.
> > >
> > > This is understood. I'm not asking for immediate inclusion.
> > >
> > > >   Please resubmit it after -rc1 is out.
> > >
> > > If you can believe that rebasing to -rc1 will be enough, then I will
> > > also try to believe I'm lucky. ;-)
> > >
> > > The kind of feedback I really want to get is e.g. about the extra
> > > per-device DMA-specific fields. If they cannot be added to struct
> > > device, then I'd rather start discussing an interim solution, because
> > > getting all existing DMA fields out of that struct will take a lot of
> > > time...
> >
> > All right, 6.4-rc1 is out now. The patch series still applies cleanly.
> >
> > Any comments what must be changed (if anything) to get it in?
>
> Try resending it, it's long out of my review queue...
>
> thanks,
>
> greg k-h
  
Petr Tesařík July 11, 2023, 9:03 a.m. UTC | #12
On Mon, 10 Jul 2023 23:23:45 +0100
Mike Lothian <mike@fireburn.co.uk> wrote:

> Hi
> 
> I was hoping this might land for 6.5-rc1, is there a new version that
> might apply against 6.5?

Yes, there is a v3, which is a complete rewrite based on feedback from
various people on this mailing list:

https://lore.kernel.org/linux-iommu/cover.1687859323.git.petr.tesarik.ext@huawei.com/T/

Petr T

> Cheers
> 
> Mike
> 
> On Tue, 9 May 2023 at 08:32, Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Tue, May 09, 2023 at 09:16:35AM +0200, Petr Tesařík wrote:  
> > > On Wed, 26 Apr 2023 14:44:39 +0200
> > > Petr Tesařík <petr@tesarici.cz> wrote:
> > >  
> > > > Hi Greg,
> > > >
> > > > On Wed, 26 Apr 2023 14:26:36 +0200
> > > > Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> > > >  
> > > > > On Wed, Apr 26, 2023 at 02:15:20PM +0200, Petr Tesařík wrote:  
> > > > > > Hi,
> > > > > >
> > > > > > On Wed, 19 Apr 2023 12:03:52 +0200
> > > > > > Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
> > > > > >  
> > > > > > > From: Petr Tesarik <petr.tesarik.ext@huawei.com>
> > > > > > >
> > > > > > > The goal of my work is to provide more flexibility in the sizing of
> > > > > > > SWIOTLB.
> > > > > > >
> > > > > > > The software IO TLB was designed with these assumptions:
> > > > > > >
> > > > > > > 1. It would not be used much, especially on 64-bit systems.
> > > > > > > 2. A small fixed memory area (64 MiB by default) is sufficient to
> > > > > > >    handle the few cases which require a bounce buffer.
> > > > > > > 3. 64 MiB is little enough that it has no impact on the rest of the
> > > > > > >    system.
> > > > > > >
> > > > > > > First, if SEV is active, all DMA must be done through shared
> > > > > > > unencrypted pages, and SWIOTLB is used to make this happen without
> > > > > > > changing device drivers. The software IO TLB size is increased to
> > > > > > > 6% of total memory in sev_setup_arch(), but that is more of an
> > > > > > > approximation. The actual requirements may vary depending on the
> > > > > > > amount of I/O and which drivers are used. These factors may not be
> > > > > > > know at boot time, i.e. when SWIOTLB is allocated.
> > > > > > >
> > > > > > > Second, other colleagues have noticed that they can reliably get
> > > > > > > rid of occasional OOM kills on an Arm embedded device by reducing
> > > > > > > the SWIOTLB size. This can be achieved with a kernel parameter, but
> > > > > > > determining the right value puts additional burden on pre-release
> > > > > > > testing, which could be avoided if SWIOTLB is allocated small and
> > > > > > > grows only when necessary.  
> > > > > >
> > > > > > Now that merging into 6.4 has begun, what about this patch series? I'm
> > > > > > eager to get some feedback (positive or negative) and respin the next
> > > > > > version.  
> > > > >
> > > > > It's the merge window, we can't add new things that haven't been in
> > > > > linux-next already.  
> > > >
> > > > This is understood. I'm not asking for immediate inclusion.
> > > >  
> > > > >   Please resubmit it after -rc1 is out.  
> > > >
> > > > If you can believe that rebasing to -rc1 will be enough, then I will
> > > > also try to believe I'm lucky. ;-)
> > > >
> > > > The kind of feedback I really want to get is e.g. about the extra
> > > > per-device DMA-specific fields. If they cannot be added to struct
> > > > device, then I'd rather start discussing an interim solution, because
> > > > getting all existing DMA fields out of that struct will take a lot of
> > > > time...  
> > >
> > > All right, 6.4-rc1 is out now. The patch series still applies cleanly.
> > >
> > > Any comments what must be changed (if anything) to get it in?  
>  [...]
  
Mike Lothian July 11, 2023, 11:29 a.m. UTC | #13
On Tue, 11 Jul 2023 at 10:03, Petr Tesařík <petr@tesarici.cz> wrote:
>
> On Mon, 10 Jul 2023 23:23:45 +0100
> Mike Lothian <mike@fireburn.co.uk> wrote:
>
> > Hi
> >
> > I was hoping this might land for 6.5-rc1, is there a new version that
> > might apply against 6.5?
>
> Yes, there is a v3, which is a complete rewrite based on feedback from
> various people on this mailing list:
>
> https://lore.kernel.org/linux-iommu/cover.1687859323.git.petr.tesarik.ext@huawei.com/T/
>
> Petr T
>


Patch 2 doesn't apply cleanly for me on 6.5-rc1
  
Petr Tesařík July 11, 2023, 1:21 p.m. UTC | #14
On Tue, 11 Jul 2023 12:29:44 +0100
Mike Lothian <mike@fireburn.co.uk> wrote:

> On Tue, 11 Jul 2023 at 10:03, Petr Tesařík <petr@tesarici.cz> wrote:
> >
> > On Mon, 10 Jul 2023 23:23:45 +0100
> > Mike Lothian <mike@fireburn.co.uk> wrote:
> >  
> > > Hi
> > >
> > > I was hoping this might land for 6.5-rc1, is there a new version that
> > > might apply against 6.5?  
> >
> > Yes, there is a v3, which is a complete rewrite based on feedback from
> > various people on this mailing list:
> >
> > https://lore.kernel.org/linux-iommu/cover.1687859323.git.petr.tesarik.ext@huawei.com/T/
> >
> > Petr T
> >  
> 
> 
> Patch 2 doesn't apply cleanly for me on 6.5-rc1

Ah, right. I'm going to rebase the series and include a few other
suggested changes.

I'm a bit worried that Christoph and all other maintainers (all taken
back into Cc) have stayed silent about the v3 series.

@Christoph: Are uncomfortable with something in the idea itself, or are
you just busy with other things?

Petr T
  
Mike Lothian July 11, 2023, 2:40 p.m. UTC | #15
On Tue, 11 Jul 2023 at 14:21, Petr Tesařík <petr@tesarici.cz> wrote:
>
> On Tue, 11 Jul 2023 12:29:44 +0100
> Mike Lothian <mike@fireburn.co.uk> wrote:
>
> > On Tue, 11 Jul 2023 at 10:03, Petr Tesařík <petr@tesarici.cz> wrote:
> > >
> > > On Mon, 10 Jul 2023 23:23:45 +0100
> > > Mike Lothian <mike@fireburn.co.uk> wrote:
> > >
> > > > Hi
> > > >
> > > > I was hoping this might land for 6.5-rc1, is there a new version that
> > > > might apply against 6.5?
> > >
> > > Yes, there is a v3, which is a complete rewrite based on feedback from
> > > various people on this mailing list:
> > >
> > > https://lore.kernel.org/linux-iommu/cover.1687859323.git.petr.tesarik.ext@huawei.com/T/
> > >
> > > Petr T
> > >
> >
> >
> > Patch 2 doesn't apply cleanly for me on 6.5-rc1
>
> Ah, right. I'm going to rebase the series and include a few other
> suggested changes.
>
> I'm a bit worried that Christoph and all other maintainers (all taken
> back into Cc) have stayed silent about the v3 series.
>
> @Christoph: Are uncomfortable with something in the idea itself, or are
> you just busy with other things?
>
> Petr T

I imagine once 6.4 starts appearing in distros there will be more bugs
appearing when people's wifi disconnects

If this is being delayed until 6.6 would it be better to increase the
default number, perhaps based on how much memory the system has and
maybe back port the fix?