KVM: Destroy target device if coalesced MMIO unregistration fails

Message ID 20221219171924.67989-1-seanjc@google.com
State New
Series KVM: Destroy target device if coalesced MMIO unregistration fails

Commit Message

Sean Christopherson Dec. 19, 2022, 5:19 p.m. UTC
  Destroy and free the target coalesced MMIO device if unregistering said
device fails.  As clearly noted in the code, kvm_io_bus_unregister_dev()
does not destroy the target device.

  BUG: memory leak
  unreferenced object 0xffff888112a54880 (size 64):
    comm "syz-executor.2", pid 5258, jiffies 4297861402 (age 14.129s)
    hex dump (first 32 bytes):
      38 c7 67 15 00 c9 ff ff 38 c7 67 15 00 c9 ff ff  8.g.....8.g.....
      e0 c7 e1 83 ff ff ff ff 00 30 67 15 00 c9 ff ff  .........0g.....
    backtrace:
      [<0000000006995a8a>] kmalloc include/linux/slab.h:556 [inline]
      [<0000000006995a8a>] kzalloc include/linux/slab.h:690 [inline]
      [<0000000006995a8a>] kvm_vm_ioctl_register_coalesced_mmio+0x8e/0x3d0 arch/x86/kvm/../../../virt/kvm/coalesced_mmio.c:150
      [<00000000022550c2>] kvm_vm_ioctl+0x47d/0x1600 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3323
      [<000000008a75102f>] vfs_ioctl fs/ioctl.c:46 [inline]
      [<000000008a75102f>] file_ioctl fs/ioctl.c:509 [inline]
      [<000000008a75102f>] do_vfs_ioctl+0xbab/0x1160 fs/ioctl.c:696
      [<0000000080e3f669>] ksys_ioctl+0x76/0xa0 fs/ioctl.c:713
      [<0000000059ef4888>] __do_sys_ioctl fs/ioctl.c:720 [inline]
      [<0000000059ef4888>] __se_sys_ioctl fs/ioctl.c:718 [inline]
      [<0000000059ef4888>] __x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:718
      [<000000006444fa05>] do_syscall_64+0x9f/0x4e0 arch/x86/entry/common.c:290
      [<000000009a4ed50b>] entry_SYSCALL_64_after_hwframe+0x49/0xbe

  BUG: leak checking failed

Fixes: 5d3c4c79384a ("KVM: Stop looking for coalesced MMIO zones if the bus is destroyed")
Cc: stable@vger.kernel.org
Reported-by: 柳菁峰 <liujingfeng@qianxin.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 virt/kvm/coalesced_mmio.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)


base-commit: 9d75a3251adfbcf444681474511b58042a364863
  

Comments

Wang, Wei W Dec. 20, 2022, 3:04 a.m. UTC | #1
On Tuesday, December 20, 2022 1:19 AM, Sean Christopherson wrote:
> diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
> index 0be80c213f7f..5ef88f5a0864 100644
> --- a/virt/kvm/coalesced_mmio.c
> +++ b/virt/kvm/coalesced_mmio.c
> @@ -187,15 +187,17 @@ int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
>  			r = kvm_io_bus_unregister_dev(kvm,
>  				zone->pio ? KVM_PIO_BUS : KVM_MMIO_BUS, &dev->dev);
> 
> +			kvm_iodevice_destructor(&dev->dev);
> +
>  			/*
>  			 * On failure, unregister destroys all devices on the
>  			 * bus _except_ the target device, i.e. coalesced_zones
> -			 * has been modified.  No need to restart the walk as
> -			 * there aren't any zones left.
> +			 * has been modified.  Bail after destroying the target
> +			 * device, there's no need to restart the walk as there
> +			 * aren't any zones left.
>  			 */
>  			if (r)
>  				break;
> -			kvm_iodevice_destructor(&dev->dev);
>  		}
>  	}

Another option is to let kvm_io_bus_unregister_dev() handle this, so that callers
no longer need to make the extra kvm_iodevice_destructor() call. This simplifies
usage for callers (e.g. fewer lines of code and no leaks like this one):

diff --git a/include/kvm/iodev.h b/include/kvm/iodev.h
index d75fc4365746..56619e33251e 100644
--- a/include/kvm/iodev.h
+++ b/include/kvm/iodev.h
@@ -55,10 +55,4 @@ static inline int kvm_iodevice_write(struct kvm_vcpu *vcpu,
                                 : -EOPNOTSUPP;
 }

-static inline void kvm_iodevice_destructor(struct kvm_io_device *dev)
-{
-       if (dev->ops->destructor)
-               dev->ops->destructor(dev);
-}
-
 #endif /* __KVM_IODEV_H__ */
diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 0be80c213f7f..d7135a5e76f8 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -195,7 +195,6 @@ int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
                         */
                        if (r)
                                break;
-                       kvm_iodevice_destructor(&dev->dev);
                }
        }

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 2a3ed401ce46..1b277afb545b 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -898,7 +898,6 @@ kvm_deassign_ioeventfd_idx(struct kvm *kvm, enum kvm_bus bus_idx,
                bus = kvm_get_bus(kvm, bus_idx);
                if (bus)
                        bus->ioeventfd_count--;
-               ioeventfd_release(p);
                ret = 0;
                break;
        }
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 13e88297f999..582757ebdce6 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5200,6 +5200,12 @@ static struct notifier_block kvm_reboot_notifier = {
        .priority = 0,
 };

+static void kvm_iodevice_destructor(struct kvm_io_device *dev)
+{
+       if (dev->ops->destructor)
+               dev->ops->destructor(dev);
+}
+
 static void kvm_io_bus_destroy(struct kvm_io_bus *bus)
 {
        int i;
@@ -5423,7 +5429,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
 int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
                              struct kvm_io_device *dev)
 {
-       int i, j;
+       int i;
        struct kvm_io_bus *new_bus, *bus;

        lockdep_assert_held(&kvm->slots_lock);
@@ -5453,18 +5459,18 @@ int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
        rcu_assign_pointer(kvm->buses[bus_idx], new_bus);
        synchronize_srcu_expedited(&kvm->srcu);

-       /* Destroy the old bus _after_ installing the (null) bus. */
+       /*
+        * If (null) bus is installed, destroy the old bus, including all the
+        * attached devices. Otherwise, destroy the caller's device only.
+        */
        if (!new_bus) {
                pr_err("kvm: failed to shrink bus, removing it completely\n");
-               for (j = 0; j < bus->dev_count; j++) {
-                       if (j == i)
-                               continue;
-                       kvm_iodevice_destructor(bus->range[j].dev);
-               }
+               kvm_io_bus_destroy(bus);
+               return -ENOMEM;
        }

-       kfree(bus);
-       return new_bus ? 0 : -ENOMEM;
+       kvm_iodevice_destructor(dev);
+       return 0;
 }

 struct kvm_io_device *kvm_io_bus_get_dev(struct kvm *kvm, enum kvm_bus bus_idx,
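
To make the proposed ownership rule easier to follow, here is a minimal userspace sketch, not kernel code. The types `io_device` and `io_bus`, the `live_devices` counter, and the `simulate_enomem` flag are all hypothetical stand-ins invented for illustration. It assumes the contract described in the draft comment above: if the bus cannot be shrunk, every attached device is destroyed, including the target; otherwise only the caller's device is destroyed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for kvm_io_device / kvm_io_bus. */
struct io_device {
	void (*destructor)(struct io_device *dev);
};

#define MAX_DEVS 4

struct io_bus {
	int dev_count;
	struct io_device *devs[MAX_DEVS];
};

static int live_devices;	/* allocated but not yet destroyed */

static void device_free(struct io_device *dev)
{
	live_devices--;
	free(dev);
}

static struct io_device *device_new(void)
{
	struct io_device *dev = calloc(1, sizeof(*dev));

	assert(dev);
	dev->destructor = device_free;
	live_devices++;
	return dev;
}

static void iodevice_destructor(struct io_device *dev)
{
	if (dev->destructor)
		dev->destructor(dev);
}

/* Destroy every device on the bus and leave the bus empty. */
static void bus_destroy_devices(struct io_bus *bus)
{
	int i;

	for (i = 0; i < bus->dev_count; i++)
		iodevice_destructor(bus->devs[i]);
	bus->dev_count = 0;
}

/*
 * Unregister owns destruction of the target. simulate_enomem stands in
 * for a failed allocation of the shrunken bus. If the device is not on
 * the bus, nothing is destroyed and the caller still owns it.
 */
static int bus_unregister_and_destroy(struct io_bus *bus,
				      struct io_device *dev,
				      int simulate_enomem)
{
	int i, j;

	for (i = 0; i < bus->dev_count; i++) {
		if (bus->devs[i] == dev)
			break;
	}
	if (i == bus->dev_count)
		return 0;

	if (simulate_enomem) {
		bus_destroy_devices(bus);	/* includes the target */
		return -1;
	}

	for (j = i; j < bus->dev_count - 1; j++)
		bus->devs[j] = bus->devs[j + 1];
	bus->dev_count--;
	iodevice_destructor(dev);
	return 0;
}
```

With this shape, a caller never touches the destructor after a call that found the device, so the leak in the original report cannot be reintroduced at a call site. The model keeps the bus as a fixed array and so does not cover replacing and freeing the old bus allocation; note that the draft hunk above also drops the `kfree(bus)` that previously ran on the success path, which is worth checking against the real code.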
  
Binbin Wu Dec. 20, 2022, 6:05 a.m. UTC | #2
On 12/20/2022 11:04 AM, Wang, Wei W wrote:
> On Tuesday, December 20, 2022 1:19 AM, Sean Christopherson wrote:
>> diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
>> index 0be80c213f7f..5ef88f5a0864 100644
>> --- a/virt/kvm/coalesced_mmio.c
>> +++ b/virt/kvm/coalesced_mmio.c
>> @@ -187,15 +187,17 @@ int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
>>   			r = kvm_io_bus_unregister_dev(kvm,
>>   				zone->pio ? KVM_PIO_BUS : KVM_MMIO_BUS, &dev->dev);
>>
>> +			kvm_iodevice_destructor(&dev->dev);
>> +
>>   			/*
>>   			 * On failure, unregister destroys all devices on the
>>   			 * bus _except_ the target device, i.e. coalesced_zones
>> -			 * has been modified.  No need to restart the walk as
>> -			 * there aren't any zones left.
>> +			 * has been modified.  Bail after destroying the target
>> +			 * device, there's no need to restart the walk as there
>> +			 * aren't any zones left.
>>   			 */
>>   			if (r)
>>   				break;
>> -			kvm_iodevice_destructor(&dev->dev);
>>   		}
>>   	}
> Another option is to let kvm_io_bus_unregister_dev handle this, and no need for callers
> to make the extra kvm_iodevice_destructor() call. This simplifies the usage for callers
> (e.g. reducing LOCs and no leakages like this):

One vote for this option : )


  
Paolo Bonzini Dec. 23, 2022, 5:13 p.m. UTC | #3
On 12/20/22 04:04, Wang, Wei W wrote:
> Another option is to let kvm_io_bus_unregister_dev handle this, and no need for callers
> to make the extra kvm_iodevice_destructor() call. This simplifies the usage for callers
> (e.g. reducing LOCs and no leakages like this):

Can you send this as a patch?  Thanks!

Paolo

  
Wang, Wei W Dec. 26, 2022, 10:17 a.m. UTC | #4
On Saturday, December 24, 2022 1:14 AM, Paolo Bonzini wrote:
> On 12/20/22 04:04, Wang, Wei W wrote:
> > Another option is to let kvm_io_bus_unregister_dev handle this, and no
> > need for callers to make the extra kvm_iodevice_destructor() call.
> > This simplifies the usage for callers (e.g. reducing LOCs and no leakages like
> this):
> 
> Can you send this as a patch?  Thanks!

Sure, I can do it. I'm also fine with Sean taking over the code if he is
interested (or is there anything I should do to preserve his credit for the original fix?)
  
Sean Christopherson Jan. 3, 2023, 4:33 p.m. UTC | #5
On Mon, Dec 26, 2022, Wang, Wei W wrote:
> On Saturday, December 24, 2022 1:14 AM, Paolo Bonzini wrote:
> > On 12/20/22 04:04, Wang, Wei W wrote:
> > > Another option is to let kvm_io_bus_unregister_dev handle this, and no
> > > need for callers to make the extra kvm_iodevice_destructor() call.
> > > This simplifies the usage for callers (e.g. reducing LOCs and no leakages like
> > this):
> > 
> > Can you send this as a patch?  Thanks!
> 
> Sure. I can do it. I'm also fine if Sean would be interested in taking over the
> code 

No thanks.

> (or anything I should do to keep his credits for the original fixing?)

No need.  If anything, take my patch first so that the fix for stable kernels is
trivial.  That's Paolo's call though.
  
Sean Christopherson Feb. 1, 2023, 10:37 p.m. UTC | #6
On Mon, 19 Dec 2022 17:19:24 +0000, Sean Christopherson wrote:
> Destroy and free the target coalesced MMIO device if unregistering said
> device fails.  As clearly noted in the code, kvm_io_bus_unregister_dev()
> does not destroy the target device.
> 
>   BUG: memory leak
>   unreferenced object 0xffff888112a54880 (size 64):
>     comm "syz-executor.2", pid 5258, jiffies 4297861402 (age 14.129s)
>     hex dump (first 32 bytes):
>       38 c7 67 15 00 c9 ff ff 38 c7 67 15 00 c9 ff ff  8.g.....8.g.....
>       e0 c7 e1 83 ff ff ff ff 00 30 67 15 00 c9 ff ff  .........0g.....
>     backtrace:
>       [<0000000006995a8a>] kmalloc include/linux/slab.h:556 [inline]
>       [<0000000006995a8a>] kzalloc include/linux/slab.h:690 [inline]
>       [<0000000006995a8a>] kvm_vm_ioctl_register_coalesced_mmio+0x8e/0x3d0 arch/x86/kvm/../../../virt/kvm/coalesced_mmio.c:150
>       [<00000000022550c2>] kvm_vm_ioctl+0x47d/0x1600 arch/x86/kvm/../../../virt/kvm/kvm_main.c:3323
>       [<000000008a75102f>] vfs_ioctl fs/ioctl.c:46 [inline]
>       [<000000008a75102f>] file_ioctl fs/ioctl.c:509 [inline]
>       [<000000008a75102f>] do_vfs_ioctl+0xbab/0x1160 fs/ioctl.c:696
>       [<0000000080e3f669>] ksys_ioctl+0x76/0xa0 fs/ioctl.c:713
>       [<0000000059ef4888>] __do_sys_ioctl fs/ioctl.c:720 [inline]
>       [<0000000059ef4888>] __se_sys_ioctl fs/ioctl.c:718 [inline]
>       [<0000000059ef4888>] __x64_sys_ioctl+0x6f/0xb0 fs/ioctl.c:718
>       [<000000006444fa05>] do_syscall_64+0x9f/0x4e0 arch/x86/entry/common.c:290
>       [<000000009a4ed50b>] entry_SYSCALL_64_after_hwframe+0x49/0xbe
> 
> [...]

Applied to kvm-x86 generic, the plan is that Wei will send the bigger
cleanup on top.  Thanks!

[1/1] KVM: Destroy target device if coalesced MMIO unregistration fails
      https://github.com/kvm-x86/linux/commit/b1cb1fac22ab

--
https://github.com/kvm-x86/linux/tree/next
https://github.com/kvm-x86/linux/tree/fixes
  

Patch

diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index 0be80c213f7f..5ef88f5a0864 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -187,15 +187,17 @@  int kvm_vm_ioctl_unregister_coalesced_mmio(struct kvm *kvm,
 			r = kvm_io_bus_unregister_dev(kvm,
 				zone->pio ? KVM_PIO_BUS : KVM_MMIO_BUS, &dev->dev);
 
+			kvm_iodevice_destructor(&dev->dev);
+
 			/*
 			 * On failure, unregister destroys all devices on the
 			 * bus _except_ the target device, i.e. coalesced_zones
-			 * has been modified.  No need to restart the walk as
-			 * there aren't any zones left.
+			 * has been modified.  Bail after destroying the target
+			 * device, there's no need to restart the walk as there
+			 * aren't any zones left.
 			 */
 			if (r)
 				break;
-			kvm_iodevice_destructor(&dev->dev);
 		}
 	}
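
The behavioral change in this patch can be illustrated with a small userspace model, offered as a sketch rather than as kernel code. The names `io_device`, `io_bus`, `live_devices`, `simulate_enomem`, `unregister_leaky` and `unregister_fixed` are hypothetical and exist only for this example. It assumes the contract stated in the code comment above: when unregistration fails, every device on the bus except the target is destroyed, so the caller must destroy the target itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for kvm_io_device / kvm_io_bus. */
struct io_device {
	void (*destructor)(struct io_device *dev);
};

#define MAX_DEVS 4

struct io_bus {
	int dev_count;
	struct io_device *devs[MAX_DEVS];
};

static int live_devices;	/* allocated but not yet destroyed */

static void device_free(struct io_device *dev)
{
	live_devices--;
	free(dev);
}

static struct io_device *device_new(void)
{
	struct io_device *dev = calloc(1, sizeof(*dev));

	assert(dev);
	dev->destructor = device_free;
	live_devices++;
	return dev;
}

static void iodevice_destructor(struct io_device *dev)
{
	if (dev->destructor)
		dev->destructor(dev);
}

/*
 * Models the contract this patch works with: the target is never
 * destroyed here. On simulated failure all *other* devices are
 * destroyed and the bus is emptied.
 */
static int bus_unregister_dev(struct io_bus *bus, struct io_device *dev,
			      int simulate_enomem)
{
	int i, j;

	for (i = 0; i < bus->dev_count; i++) {
		if (bus->devs[i] == dev)
			break;
	}
	if (i == bus->dev_count)
		return 0;

	if (simulate_enomem) {
		for (j = 0; j < bus->dev_count; j++) {
			if (j != i)
				iodevice_destructor(bus->devs[j]);
		}
		bus->dev_count = 0;
		return -1;
	}

	for (j = i; j < bus->dev_count - 1; j++)
		bus->devs[j] = bus->devs[j + 1];
	bus->dev_count--;
	return 0;
}

/* Caller before the patch: bails on failure, leaking the target. */
static int unregister_leaky(struct io_bus *bus, struct io_device *dev,
			    int simulate_enomem)
{
	int r = bus_unregister_dev(bus, dev, simulate_enomem);

	if (r)
		return r;
	iodevice_destructor(dev);
	return 0;
}

/* Caller after the patch: destroy the target, then check the result. */
static int unregister_fixed(struct io_bus *bus, struct io_device *dev,
			    int simulate_enomem)
{
	int r = bus_unregister_dev(bus, dev, simulate_enomem);

	iodevice_destructor(dev);
	return r;
}
```

Registering two devices and forcing a failure leaves one live device with `unregister_leaky()` and none with `unregister_fixed()`, which mirrors the single unreferenced object in the leak report.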