um: time-travel: fix time going backwards

Message ID 20231020-uml-time-backwards-v1-1-90b776fc6dfd@axis.com
State New
Headers
Series um: time-travel: fix time going backwards |

Commit Message

Vincent Whitchurch Oct. 20, 2023, 2:47 p.m. UTC
  In basic time travel mode, I sometimes see "time goes backwards" panics
like the one below:

 Kernel panic: time-travel: time goes backwards 161689340000492 -> 161689339869814
 Call Trace:
  panic+0x1a1/0x3d7
  time_travel_update_time.cold+0xe9/0x133
  timer_read+0xc1/0x100
  ktime_get+0x10c/0x200
  copy_process+0x1899/0x2230
  kernel_clone+0x57/0x7a0
  kernel_thread+0x4a/0x50
  kthreadd+0x116/0x190

The problem is a race between time_travel_handle_real_alarm() and
timer_read().  time_travel_handle_real_alarm() changes the time after
time_read() reads the current time but before time_travel_update_time()
has had a chance to add the end event.

Fix this by doing the time read and event add atomically with respect to
time_travel_handle_real_alarm().

Signed-off-by: Vincent Whitchurch <vincent.whitchurch@axis.com>
---
 arch/um/kernel/time.c | 48 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 14 deletions(-)


---
base-commit: 58720809f52779dc0f08e53e54b014209d13eebb
change-id: 20231020-uml-time-backwards-552c81aedd23

Best regards,
  

Comments

Vincent Whitchurch Oct. 23, 2023, 7:08 a.m. UTC | #1
On Fri, 2023-10-20 at 16:47 +0200, Vincent Whitchurch wrote:
> In basic time travel mode, I sometimes see "time goes backwards" panics
> like the one below:
> 
>  Kernel panic: time-travel: time goes backwards 161689340000492 -> 161689339869814
>  Call Trace:
>   panic+0x1a1/0x3d7
>   time_travel_update_time.cold+0xe9/0x133
>   timer_read+0xc1/0x100
>   ktime_get+0x10c/0x200
>   copy_process+0x1899/0x2230
>   kernel_clone+0x57/0x7a0
>   kernel_thread+0x4a/0x50
>   kthreadd+0x116/0x190
> 
> The problem is a race between time_travel_handle_real_alarm() and
> timer_read().  time_travel_handle_real_alarm() changes the time after
> time_read() reads the current time but before time_travel_update_time()
> has had a chance to add the end event.
> 
> Fix this by doing the time read and event add atomically with respect to
> time_travel_handle_real_alarm().

Further testing resulted in hitting the BUG_ON(time_travel_time !=
e->time) so looks like this needs some more work.  

 BUG: failure at arch/um/kernel/time.c:409/time_travel_update_to_event()!
 Kernel panic - not syncing: BUG!
 CPU: 0 PID: 1 Comm: swapper Not tainted 6.5.0+ #13
 Stack:
  624db908 60a06c60 41b58ab3 60ae2398
  6003fb90 10000c1e577c 41b58ab3 60b419ae
  607f0dc0 00000000 00000000 6005b270
 Call Trace:
  [<607fad5d>] show_stack.cold+0xc0/0x149
  [<60841364>] dump_stack_lvl+0x75/0x94
  [<6084139d>] dump_stack+0x1a/0x1c
  [<607fca1d>] panic+0x27e/0x490
  [<607fb093>] time_travel_update_to_event.cold+0x89/0x18b
  [<60041317>] time_travel_forward_time.constprop.0+0xf7/0x160
  [<600414f1>] timer_read+0x171/0x1b0
  [<60140652>] random_get_entropy_fallback+0x42/0x60
  [<60545943>] add_device_randomness+0x73/0xf0
  [<6003a685>] do_one_initcall+0x145/0x3b0
  [<60002598>] 0x60002598
  [<60843ac5>] kernel_init+0x4a/0x195
  [<6003d111>] new_thread_handler+0x141/0x1a0
  
Johannes Berg Oct. 23, 2023, 7:33 a.m. UTC | #2
On Mon, 2023-10-23 at 07:08 +0000, Vincent Whitchurch wrote:
> On Fri, 2023-10-20 at 16:47 +0200, Vincent Whitchurch wrote:
> > In basic time travel mode, I sometimes see "time goes backwards" panics
> > like the one below:
> > 
> >  Kernel panic: time-travel: time goes backwards 161689340000492 -> 161689339869814

Ouch.

> >  Call Trace:
> >   panic+0x1a1/0x3d7
> >   time_travel_update_time.cold+0xe9/0x133
> >   timer_read+0xc1/0x100
> >   ktime_get+0x10c/0x200
> >   copy_process+0x1899/0x2230
> >   kernel_clone+0x57/0x7a0
> >   kernel_thread+0x4a/0x50
> >   kthreadd+0x116/0x190
> > 
> > The problem is a race between time_travel_handle_real_alarm() and
> > timer_read().  time_travel_handle_real_alarm() changes the time after
> > time_read() reads the current time but before time_travel_update_time()
> > has had a chance to add the end event.
> > 
> > Fix this by doing the time read and event add atomically with respect to
> > time_travel_handle_real_alarm().
> 
> Further testing resulted in hitting the BUG_ON(time_travel_time !=
> e->time) so looks like this needs some more work.  
> 

Yeah this is a tricky area, I fought with it for quite a while too,
seems we're not done yet ;-)

We mostly use time-travel=ext mode these days, so our system may not be
as susceptible to it? But not sure, in some cases it runs with just a
single instance, and that should be pretty much the same due to the
free-until information.

Do you have a specific workload that tends to reproduce this?

johannes
  
Vincent Whitchurch Oct. 25, 2023, 11:55 a.m. UTC | #3
On Mon, 2023-10-23 at 09:33 +0200, Johannes Berg wrote:
> Do you have a specific workload that tends to reproduce this?

I've been seeing it when running roadtest, but it's easily reproducible
without that by using the attached config and the following program as
init.

  cp repro.config .config
  make ARCH=um olddefconfig all
  gcc -Wall -static -o repro repro.c
  ./linux time-travel init=$PWD/repro rootfstype=hostfs

With the above commands I usually see the panic in a few seconds.  This
is on an unmodified v6.6-rc7.

----8<-------
#include <time.h>
#include <pthread.h>
#include <sys/types.h>
#include <unistd.h>
#include <stdio.h>

int main(void)
{
        if (fork() == 0) {
                while (1) {
                        nanosleep(&(struct timespec){.tv_nsec = 330 * 1000}, NULL);
                }
                return 0;
        }
        
        while (1) {
                struct timespec ts;
                clock_gettime(CLOCK_MONOTONIC, &ts);
                nanosleep(&(struct timespec){.tv_nsec = 22 * 1000}, NULL);
        }
        
        return 0;
}
# CONFIG_LOCALVERSION_AUTO is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_LOG_BUF_SHIFT=18
CONFIG_EXPERT=y
CONFIG_HOSTFS=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_UML_TIME_TRAVEL_SUPPORT=y
CONFIG_SSL=y
CONFIG_NULL_CHAN=y
CONFIG_PORT_CHAN=y
CONFIG_PTY_CHAN=y
CONFIG_TTY_CHAN=y
CONFIG_XTERM_CHAN=y
CONFIG_CON_CHAN="pts"
CONFIG_SSL_CHAN="pts"
CONFIG_VIRTIO_UML=y
CONFIG_UML_PCI_OVER_VIRTIO=y
CONFIG_UML_PCI_OVER_VIRTIO_DEVICE_ID=1234
CONFIG_GCOV_KERNEL=y
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_BINFMT_MISC=m
# CONFIG_COMPACTION is not set
CONFIG_NET=y
CONFIG_UNIX=y
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_OF=y
# CONFIG_INPUT is not set
CONFIG_LEGACY_PTY_COUNT=32
CONFIG_SERIAL_SC16IS7XX=m
CONFIG_GOLDFISH_TTY=m
CONFIG_SERIAL_DEV_BUS=y
CONFIG_UML_RANDOM=y
# CONFIG_I2C_COMPAT is not set
CONFIG_I2C_CHARDEV=y
CONFIG_I2C_VIRTIO=y
CONFIG_I2C_STUB=m
CONFIG_SPI=y
CONFIG_SPI_SC18IS602=y
CONFIG_SPI_MUX=y
CONFIG_PINCTRL=y
CONFIG_GPIOLIB=y
CONFIG_GPIO_SYSFS=y
CONFIG_GPIO_GENERIC_PLATFORM=m
CONFIG_GPIO_MOCKUP=y
CONFIG_GPIO_VIRTIO=y
CONFIG_POWER_SUPPLY=y
CONFIG_CHARGER_BQ256XX=m
CONFIG_PMBUS=y
CONFIG_SENSORS_LTC2978=m
CONFIG_SENSORS_LTC2978_REGULATOR=y
CONFIG_SENSORS_PWM_FAN=m
CONFIG_SENSORS_TMP401=m
CONFIG_REGULATOR=y
CONFIG_REGULATOR_DEBUG=y
CONFIG_REGULATOR_FIXED_VOLTAGE=y
CONFIG_REGULATOR_VIRTUAL_CONSUMER=m
CONFIG_REGULATOR_GPIO=m
CONFIG_REGULATOR_TPS6286X=m
CONFIG_REGULATOR_TPS6287X=m
CONFIG_DRM=y
CONFIG_TINYDRM_ILI9341=m
CONFIG_NEW_LEDS=y
CONFIG_LEDS_CLASS=y
CONFIG_LEDS_GPIO=y
CONFIG_LEDS_TRIGGERS=y
CONFIG_LEDS_TRIGGER_HEARTBEAT=y
CONFIG_RTC_CLASS=y
# CONFIG_RTC_HCTOSYS is not set
# CONFIG_RTC_SYSTOHC is not set
CONFIG_RTC_DEBUG=y
CONFIG_RTC_DRV_PCF8523=m
CONFIG_RTC_DRV_PCF8563=m
CONFIG_GOLDFISH=y
CONFIG_COMMON_CLK=y
CONFIG_COMMON_CLK_SI5351=m
CONFIG_IIO=y
CONFIG_IIO_BUFFER_CB=m
CONFIG_IIO_SW_TRIGGER=y
CONFIG_MCP320X=m
CONFIG_TI_ADC081C=m
CONFIG_TI_ADC084S021=m
CONFIG_TI_ADS7924=m
CONFIG_PMS7003=m
CONFIG_OPT3001=m
CONFIG_OPT4001=m
CONFIG_VCNL4000=m
CONFIG_IIO_HRTIMER_TRIGGER=y
CONFIG_IIO_SYSFS_TRIGGER=y
CONFIG_IRSD200=m
CONFIG_PWM=y
CONFIG_PWM_PCA9685=m
CONFIG_MUX_GPIO=y
CONFIG_QUOTA=y
CONFIG_PROC_KCORE=y
CONFIG_TMPFS=y
CONFIG_NLS=y
CONFIG_CRYPTO=y
CONFIG_CRYPTO_CRC32C=y
CONFIG_CRYPTO_JITTERENTROPY=y
CONFIG_CRC16=y
CONFIG_PRINTK_TIME=y
CONFIG_PRINTK_CALLER=y
CONFIG_DYNAMIC_DEBUG=y
CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y
CONFIG_GDB_SCRIPTS=y
CONFIG_FRAME_WARN=1024
CONFIG_READABLE_ASM=y
CONFIG_DEBUG_FS=y
CONFIG_UBSAN=y
CONFIG_PAGE_EXTENSION=y
CONFIG_DEBUG_OBJECTS=y
CONFIG_DEBUG_OBJECTS_FREE=y
CONFIG_DEBUG_OBJECTS_TIMERS=y
CONFIG_DEBUG_OBJECTS_WORK=y
CONFIG_KASAN=y
CONFIG_PROVE_LOCKING=y
CONFIG_ENABLE_DEFAULT_TRACERS=y
CONFIG_FAULT_INJECTION=y
CONFIG_FAILSLAB=y
CONFIG_FAULT_INJECTION_DEBUG_FS=y
CONFIG_FAULT_INJECTION_STACKTRACE_FILTER=y
  
Johannes Berg Oct. 25, 2023, 7:51 p.m. UTC | #4
On Wed, 2023-10-25 at 11:55 +0000, Vincent Whitchurch wrote:
> On Mon, 2023-10-23 at 09:33 +0200, Johannes Berg wrote:
> > Do you have a specific workload that tends to reproduce this?
> 
> I've been seeing it when running roadtest, but it's easily reproducible
> without that by using the attached config and the following program as
> init.
> 
>   cp repro.config .config
>   make ARCH=um olddefconfig all
>   gcc -Wall -static -o repro repro.c
>   ./linux time-travel init=$PWD/repro rootfstype=hostfs
> 
> With the above commands I usually see the panic in a few seconds.  This
> is on an unmodified v6.6-rc7.
> 

Yes, I can reproduce it with your test, thanks. I'm on 6.4-rc still for
$reasons (6.5 we skipped during vacations, and 6.6 EEVDF scheduler broke
everything for hwsim tests in hostap), but this code didn't really
change anyway.

I'll poke at this.

(And I'm still amazed that anyone other than me even uses this stuff :P)

johannes
  
Johannes Berg Oct. 25, 2023, 8:02 p.m. UTC | #5
On Wed, 2023-10-25 at 21:51 +0200, Johannes Berg wrote:
> On Wed, 2023-10-25 at 11:55 +0000, Vincent Whitchurch wrote:
> > On Mon, 2023-10-23 at 09:33 +0200, Johannes Berg wrote:
> > > Do you have a specific workload that tends to reproduce this?
> > 
> > I've been seeing it when running roadtest, but it's easily reproducible
> > without that by using the attached config and the following program as
> > init.
> > 
> >   cp repro.config .config
> >   make ARCH=um olddefconfig all
> >   gcc -Wall -static -o repro repro.c
> >   ./linux time-travel init=$PWD/repro rootfstype=hostfs

Ohhh.

Pure "time-travel" is actually something I hardly think about these
days, we mostly use time-travel=inf-cpu (or =ext).

That makes sense, here you actually *can* get interrupted. I'll need to
dig into what happens though.

johannes
  
Johannes Berg Oct. 25, 2023, 8:41 p.m. UTC | #6
On Wed, 2023-10-25 at 22:02 +0200, Johannes Berg wrote:
> On Wed, 2023-10-25 at 21:51 +0200, Johannes Berg wrote:
> > On Wed, 2023-10-25 at 11:55 +0000, Vincent Whitchurch wrote:
> > > On Mon, 2023-10-23 at 09:33 +0200, Johannes Berg wrote:
> > > > Do you have a specific workload that tends to reproduce this?
> > > 
> > > I've been seeing it when running roadtest, but it's easily reproducible
> > > without that by using the attached config and the following program as
> > > init.
> > > 
> > >   cp repro.config .config
> > >   make ARCH=um olddefconfig all
> > >   gcc -Wall -static -o repro repro.c
> > >   ./linux time-travel init=$PWD/repro rootfstype=hostfs
> 
> Ohhh.
> 
> Pure "time-travel" is actually something I hardly think about these
> days, we mostly use time-travel=inf-cpu (or =ext).
> 
> That makes sense, here you actually *can* get interrupted. I'll need to
> dig into what happens though.

I just sent a patch, please take a look. It does seem to fix it - at
least I got bored of waiting after running your test program for a few
minutes :)

johannes
  

Patch

diff --git a/arch/um/kernel/time.c b/arch/um/kernel/time.c
index fddd1dec27e6..38a94dd41b8f 100644
--- a/arch/um/kernel/time.c
+++ b/arch/um/kernel/time.c
@@ -392,17 +392,11 @@  bool time_travel_del_event(struct time_travel_event *e)
 	return true;
 }
 
-static void time_travel_update_time(unsigned long long next, bool idle)
+static void time_travel_update_to_event(struct time_travel_event *ne, bool idle)
 {
-	struct time_travel_event ne = {
-		.onstack = true,
-	};
 	struct time_travel_event *e;
 	bool finished = idle;
 
-	/* add it without a handler - we deal with that specifically below */
-	__time_travel_add_event(&ne, next);
-
 	do {
 		e = time_travel_first_event();
 
@@ -414,7 +408,7 @@  static void time_travel_update_time(unsigned long long next, bool idle)
 			BUG_ON(!time_travel_del_event(e));
 			BUG_ON(time_travel_time != e->time);
 
-			if (e == &ne) {
+			if (e == ne) {
 				finished = true;
 			} else {
 				if (e->onstack)
@@ -427,14 +421,38 @@  static void time_travel_update_time(unsigned long long next, bool idle)
 		e = time_travel_first_event();
 		if (e)
 			time_travel_ext_update_request(e->time);
-	} while (ne.pending && !finished);
+	} while (ne->pending && !finished);
+
+	time_travel_del_event(ne);
+}
 
-	time_travel_del_event(&ne);
+static void time_travel_update_time(unsigned long long next, bool idle)
+{
+	struct time_travel_event ne = {
+		.onstack = true,
+	};
+
+	__time_travel_add_event(&ne, next);
+	time_travel_update_to_event(&ne, idle);
+}
+
+static void time_travel_forward_time(unsigned long nsec, bool idle)
+{
+	struct time_travel_event ne = {
+		.onstack = true,
+	};
+	unsigned long flags;
+
+	local_irq_save(flags);
+	__time_travel_add_event(&ne, time_travel_time + nsec);
+	local_irq_restore(flags);
+
+	time_travel_update_to_event(&ne, idle);
 }
 
 void time_travel_ndelay(unsigned long nsec)
 {
-	time_travel_update_time(time_travel_time + nsec, false);
+	time_travel_forward_time(nsec, false);
 }
 EXPORT_SYMBOL(time_travel_ndelay);
 
@@ -572,6 +590,10 @@  static inline void time_travel_update_time(unsigned long long ns, bool retearly)
 {
 }
 
+static void time_travel_forward_time(unsigned long nsec, bool idle)
+{
+}
+
 static inline void time_travel_handle_real_alarm(void)
 {
 }
@@ -720,9 +742,7 @@  static u64 timer_read(struct clocksource *cs)
 		 */
 		if (!irqs_disabled() && !in_interrupt() && !in_softirq() &&
 		    !time_travel_ext_waiting)
-			time_travel_update_time(time_travel_time +
-						TIMER_MULTIPLIER,
-						false);
+			time_travel_forward_time(TIMER_MULTIPLIER, false);
 		return time_travel_time / TIMER_MULTIPLIER;
 	}