[committed,OG12] amdgcn, openmp: Fix concurrency in low-latency allocator
Commit Message
I've committed this to the devel/omp/gcc-12 branch.
The patch fixes a concurrency issue where the spin-locks didn't work
well if many GPU threads tried to free low-latency memory all at once.
Adding a short sleep instruction is enough for the hardware thread to
yield and allow another to proceed. The alloc routine already had this
feature, so this just corrects an accidental omission.
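For readers unfamiliar with the pattern, here is a minimal host-side sketch of a spin loop that yields on every failed attempt. It is not the libgomp code: `demo_heap`, `demo_lock`, `demo_unlock`, `demo_free` and `DEMO_YIELD` are hypothetical names, and `sched_yield` only stands in for the short sleep instruction that the GPU build would use.

```c
#include <assert.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical host stand-in for BASIC_ALLOC_YIELD.  On a GPU target the
   macro would expand to a short sleep instruction instead.  */
#define DEMO_YIELD sched_yield ()

/* Hypothetical heap header: a lock word followed by a byte counter.  */
struct demo_heap
{
  atomic_uint lock;     /* 0 = unlocked, 1 = locked.  */
  size_t bytes_in_use;  /* Protected by LOCK.  */
};

/* Acquire the lock, yielding after every failed attempt so that the
   current holder gets a chance to run and release it.  */
static void
demo_lock (struct demo_heap *heap)
{
  do
    {
      unsigned int expected = 0;
      if (atomic_compare_exchange_weak_explicit (&heap->lock, &expected, 1,
                                                 memory_order_acquire,
                                                 memory_order_relaxed))
        break;
      /* Spin.  */
      DEMO_YIELD;
    }
  while (1);
}

/* Release the lock taken by demo_lock.  */
static void
demo_unlock (struct demo_heap *heap)
{
  atomic_store_explicit (&heap->lock, 0, memory_order_release);
}

/* Return SIZE bytes to the heap accounting and report the new total.  */
static size_t
demo_free (struct demo_heap *heap, size_t size)
{
  demo_lock (heap);
  assert (heap->bytes_in_use >= size);
  heap->bytes_in_use -= size;
  size_t remaining = heap->bytes_in_use;
  demo_unlock (heap);
  return remaining;
}
```

Without the yield, a waiter that keeps retrying can starve the thread that holds the lock when both share the same execution resources, which is the kind of slowdown described above.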
This patch will get folded into the previous OG12 patch series when I
repost it for mainline.
Andrew
amdgcn, openmp: Fix concurrency in low-latency allocator
The previous code works fine on Fiji and Vega 10 devices, but bogs down in the
spin locks on Vega 20 or newer. Adding the sleep instructions fixes the
problem.
libgomp/ChangeLog:
* basic-allocator.c (basic_alloc_free): Use BASIC_ALLOC_YIELD.
(basic_alloc_realloc): Use BASIC_ALLOC_YIELD.
@@ -188,6 +188,7 @@ basic_alloc_free (char *heap, void *addr, size_t size)
break;
}
/* Spin. */
+ BASIC_ALLOC_YIELD;
}
while (1);
@@ -267,6 +268,7 @@ basic_alloc_realloc (char *heap, void *addr, size_t oldsize,
break;
}
/* Spin. */
+ BASIC_ALLOC_YIELD;
}
while (1);