libstdc++: Add missing constexpr to simd

Message ID: 116291317.nniJfEyVGO@minbar
State: Accepted
Series: libstdc++: Add missing constexpr to simd

Checks

Context                Check    Description
snail/gcc-patch-check  success  Github commit url

Commit Message

Matthias Kretz May 22, 2023, 3:36 p.m. UTC
  OK for trunk and backporting?

regtested on x86_64-linux and aarch64-linux

The constexpr API is only available with -std=gnu++XX (and is proposed for
C++26). The proposal is to make the complete simd API usable in constant
expressions.
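
A minimal example of the kind of constant-expression usage this enables
(illustrative only, not the new testsuite file; assumes the patch is
applied and -std=gnu++17 or later):

  #include <experimental/simd>
  namespace stdx = std::experimental;

  constexpr int
  sum4()
  {
    stdx::fixed_size_simd<int, 4> v([](int i) { return i; }); // {0, 1, 2, 3}
    return stdx::reduce(v); // sums the elements
  }
  static_assert(sum4() == 6);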

This patch resolves several issues with using simd in constant
expressions.

Issues why constant_evaluated branches are necessary (see the sketch after
the list):
* subscripting vector builtins is not allowed in constant expressions
* if the implementation needs/uses memcpy
* if the implementation would otherwise call SIMD intrinsics/builtins
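
A minimal sketch of the dispatch pattern these branches use (simplified
for illustration; __store and its signature are placeholders, not the
library's exact code):

  #include <cstddef>

  template <typename _Tp, std::size_t _Np>
    constexpr void
    __store(const _Tp (&__v)[_Np], _Tp* __mem)
    {
      if (__builtin_is_constant_evaluated())
	{
	  // memcpy is not a constant expression, so copy element-wise
	  for (std::size_t __i = 0; __i < _Np; ++__i)
	    __mem[__i] = __v[__i];
	}
      else
	__builtin_memcpy(__mem, __v, sizeof(__v)); // runtime fast path
    }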

Signed-off-by: Matthias Kretz <m.kretz@gsi.de>

libstdc++-v3/ChangeLog:

	PR libstdc++/109261
	* include/experimental/bits/simd.h (_SimdWrapper::_M_set):
	Avoid vector builtin subscripting in constant expressions.
	(resizing_simd_cast): Avoid memcpy if constant_evaluated.
	(const_where_expression, where_expression, where)
	(__extract_part, simd_mask, _SimdIntOperators, simd): Add either
	_GLIBCXX_SIMD_CONSTEXPR (on public APIs), or constexpr (on
	internal APIs).
	* include/experimental/bits/simd_builtin.h (__vector_permute)
	(__vector_shuffle, __extract_part, _GnuTraits::_SimdCastType1)
	(_GnuTraits::_SimdCastType2, _SimdImplBuiltin)
	(_MaskImplBuiltin::_S_store): Add constexpr.
	(_CommonImplBuiltin::_S_store_bool_array)
	(_SimdImplBuiltin::_S_load, _SimdImplBuiltin::_S_store)
	(_SimdImplBuiltin::_S_reduce, _MaskImplBuiltin::_S_load): Add
	constant_evaluated case.
	* include/experimental/bits/simd_fixed_size.h
	(_S_masked_load): Reword comment.
	(__tuple_element_meta, __make_meta, _SimdTuple::_M_apply_r)
	(_SimdTuple::_M_subscript_read, _SimdTuple::_M_subscript_write)
	(__make_simd_tuple, __optimize_simd_tuple, __extract_part)
	(__autocvt_to_simd, _Fixed::__traits::_SimdBase)
	(_Fixed::__traits::_SimdCastType, _SimdImplFixedSize): Add
	constexpr.
	(_SimdTuple::operator[], _M_set): Add constexpr and add
	constant_evaluated case.
	(_MaskImplFixedSize::_S_load): Add constant_evaluated case.
	* include/experimental/bits/simd_scalar.h: Add constexpr.
	* include/experimental/bits/simd_x86.h (_CommonImplX86): Add
	constexpr and add constant_evaluated case.
	(_SimdImplX86::_S_equal_to, _S_not_equal_to, _S_less)
	(_S_less_equal): Value-initialize to satisfy constexpr
	evaluation.
	(_MaskImplX86::_S_load): Add constant_evaluated case.
	(_MaskImplX86::_S_store): Add constexpr and constant_evaluated
	case. Value-initialize local variables.
	(_MaskImplX86::_S_logical_and, _S_logical_or, _S_bit_not)
	(_S_bit_and, _S_bit_or, _S_bit_xor): Add constant_evaluated
	case.
	* testsuite/experimental/simd/pr109261_constexpr_simd.cc: New
	test.
---
 libstdc++-v3/include/experimental/bits/simd.h | 153 ++++++++-------
 .../include/experimental/bits/simd_builtin.h  | 100 ++++++----
 .../experimental/bits/simd_fixed_size.h       | 177 +++++++++---------
 .../include/experimental/bits/simd_scalar.h   |  78 ++++----
 .../include/experimental/bits/simd_x86.h      |  68 +++++--
 .../simd/pr109261_constexpr_simd.cc           | 109 +++++++++++
 6 files changed, 437 insertions(+), 248 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/experimental/simd/pr109261_constexpr_simd.cc


--
──────────────────────────────────────────────────────────────────────────
 Dr. Matthias Kretz                           https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research               https://gsi.de
 stdₓ::simd
──────────────────────────────────────────────────────────────────────────
  

Comments

Jonathan Wakely May 22, 2023, 4:25 p.m. UTC | #1
On Mon, 22 May 2023 at 16:36, Matthias Kretz via Libstdc++ <
libstdc++@gcc.gnu.org> wrote:

> OK for trunk and backporting?
>
> regtested on x86_64-linux and aarch64-linux
>
> The constexpr API is only available with -std=gnu++XX (and is proposed for
> C++26). The proposal is to make the complete simd API usable in constant
> expressions.
>
> This patch resolves several issues with using simd in constant
> expressions.
>
> Issues why constant_evaluated branches are necessary:
>

I note that using if (not __builtin_is_constant_evaluated()) will fail if
compiled with -fno-operator-names, which is why we don't use 'not', 'and',
etc. elsewhere in libstdc++. I don't know if (or why) anybody uses that
option though, so I don't think you need to change anything in stdx::simd.
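
For illustration: under -fno-operator-names the alternative tokens are just
ordinary identifiers, so the difference is only the spelling of the operator:

  if (not __builtin_is_constant_evaluated()) // parse error under -fno-operator-names
  if (!__builtin_is_constant_evaluated())    // accepted with or without the option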




> * subscripting vector builtins is not allowed in constant expressions
>

Is that just because nobody made it work (yet)?


> * if the implementation needs/uses memcpy
> * if the implementation would otherwise call SIMD intrinsics/builtins
>


The indentation looks off here and in the _M_set member function following
it:

     operator[](size_t __i) const noexcept
     {
       if constexpr (_S_tuple_size == 1)
	return _M_subscript_read(__i);
       else
-	{
 #ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
-	  return reinterpret_cast<const __may_alias<_Tp>*>(this)[__i];
-#else
-	  if constexpr (__is_scalar_abi<_Abi0>())
-	    {
-	      const _Tp* ptr = &first;
-	      return ptr[__i];
-	    }
-	  else
-	    return __i < simd_size_v<_Tp, _Abi0>
-		     ? _M_subscript_read(__i)
-		     : second[__i - simd_size_v<_Tp, _Abi0>];
+	if (not __builtin_is_constant_evaluated())
+	return reinterpret_cast<const __may_alias<_Tp>*>(this)[__i];
+      else
 #endif
+	if constexpr (__is_scalar_abi<_Abi0>())
+	{
+	  const _Tp* ptr = &first;
+	  return ptr[__i];
	}
+      else
+	return __i < simd_size_v<_Tp, _Abi0> ? _M_subscript_read(__i)
+					     : second[__i - simd_size_v<_Tp, _Abi0>];
     }


Are the copyright years on
testsuite/experimental/simd/pr109261_constexpr_simd.cc correct, or just
copy&paste?
  
Matthias Kretz May 22, 2023, 8:27 p.m. UTC | #2
On Monday, 22 May 2023 18:25:15 CEST Jonathan Wakely wrote:
> I note that using if (not __builtin_is_constant_evaluated()) will fail if
> compiled with -fno-operator-names, which is why we don't use 'not', 'and',
> etc. elsewhere in libstdc++. I don't know if (or why) anybody uses that
> option though, so I don't think you need to change anything in stdx::simd.

Ah, I just recently convinced myself that "operator-names" are more readable 
(=> easier to maintain). But OTOH a mix isn't necessarily better. I'm fine 
with keeping it consistent.

> > * subscripting vector builtins is not allowed in constant expressions
> 
> Is that just because nobody made it work (yet)?

That is a good question. I guess I should open a PR.

> > * if the implementation needs/uses memcpy
> 
> > * if the implementation would otherwise call SIMD intrinsics/builtins
> 
> The indentation looks off here and in the _M_set member function following
> it:

Yes. I had to put an #if between an else and an if. Looks like this:

  else
#ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
    if (not __builtin_is_constant_evaluated())
    return reinterpret_cast<const __may_alias<_Tp>*>(this)[__i];
  else
#endif
    if constexpr (__is_scalar_abi<_Abi0>())

Should the `if` be aligned to the `else` instead?

> Are the copyright years on
> testsuite/experimental/simd/pr109261_constexpr_simd.cc correct, or just
> copy&paste?

Right, copy&paste. Should I simply remove the complete header?

- Matthias
  
Jonathan Wakely May 22, 2023, 8:51 p.m. UTC | #3
On Mon, 22 May 2023 at 21:27, Matthias Kretz <m.kretz@gsi.de> wrote:

> On Monday, 22 May 2023 18:25:15 CEST Jonathan Wakely wrote:
> > I note that using if (not __builtin_is_constant_evaluated()) will fail if
> > compiled with -fno-operator-names, which is why we don't use 'not', 'and',
> > etc. elsewhere in libstdc++. I don't know if (or why) anybody uses that
> > option though, so I don't think you need to change anything in stdx::simd.
>
> Ah, I just recently convinced myself that "operator-names" are more
> readable
> (=> easier to maintain).


I tend to agree, but every time I decide to start using them some testcases
start to fail and I remember why we don't use them :-(



> But OTOH a mix isn't necessarily better. I'm fine
> with keeping it consistent.
>
> > > * subscripting vector builtins is not allowed in constant expressions
> >
> > Is that just because nobody made it work (yet)?
>
> That is a good question. I guess I should open a PR.
>
> > > * if the implementation needs/uses memcpy
> >
> > > * if the implementation would otherwise call SIMD intrinsics/builtins
> >
> > The indentation looks off here and in the _M_set member function
> following
> > it:
>
> Yes. I had to put an #if between an else and an if. Looks like this:
>
>   else
> #ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
>     if (not __builtin_is_constant_evaluated())
>     return reinterpret_cast<const __may_alias<_Tp>*>(this)[__i];
>   else
> #endif
>     if constexpr (__is_scalar_abi<_Abi0>())
>
>
Ah yes, so the if is indented two spaces from the else above it.
What looks wrong to me is that the return is at the same indentation as
the if controlling it.



> Should the `if` be aligned to the `else` instead?
>

How about moving the two else tokens?

 #ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
   else if (not __builtin_is_constant_evaluated())
     return reinterpret_cast<const __may_alias<_Tp>*>(this)[__i];
 #endif
   else if constexpr (__is_scalar_abi<_Abi0>())

I think that avoids the issue.



>
> > Are the copyright years on
> > testsuite/experimental/simd/pr109261_constexpr_simd.cc correct, or just
> > copy&paste?
>
> Right, copy&paste. Should I simply remove the complete header?
>
>
You could do. I don't think there's much in that test that's novel or worth
asserting copyright over - but if you disagree and want to assign whatever
is copyrightable to the FSF, keep the header but fix the years. Either way
is fine by me.

OK for trunk and backports, with the comments above suitably resolved.
  
Marc Glisse May 22, 2023, 8:56 p.m. UTC | #4
On Mon, 22 May 2023, Jonathan Wakely via Libstdc++ wrote:

>> * subscripting vector builtins is not allowed in constant expressions
>
> Is that just because nobody made it work (yet)?

Yes.

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101651 and others.
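
A minimal reproducer of that limitation, using GCC's vector_size extension
(names are illustrative):

  using V = int __attribute__((vector_size(16)));
  constexpr V v = {1, 2, 3, 4};
  static_assert(v[0] == 1); // rejected today: vector subscript in a constant expression (PR 101651)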

>> * if the implementation would otherwise call SIMD intrinsics/builtins

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80517 and others.

Makes sense to work around them for now.
  

Patch

diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index 224153ffbaf..b0571ca26c4 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -2675,7 +2675,14 @@  _SimdWrapper(_V __x)
 
     _GLIBCXX_SIMD_INTRINSIC constexpr void
     _M_set(size_t __i, _Tp __x)
-    { _M_data[__i] = __x; }
+    {
+      if (__builtin_is_constant_evaluated())
+	_M_data = __generate_from_n_evaluations<_Width, _BuiltinType>([&](auto __j) {
+		    return __j == __i ? __x : _M_data[__j()];
+		  });
+      else
+	_M_data[__i] = __x;
+    }
 
     _GLIBCXX_SIMD_INTRINSIC
     constexpr bool
@@ -3186,6 +3193,10 @@  resizing_simd_cast(const simd<_Up, _Ap>& __x)
   {
     if constexpr (is_same_v<typename _Tp::abi_type, _Ap>)
       return __x;
+    else if (__builtin_is_constant_evaluated())
+      return _Tp([&](auto __i) constexpr {
+	       return __i < simd_size_v<_Up, _Ap> ? __x[__i] : _Up();
+	     });
     else if constexpr (simd_size_v<_Up, _Ap> == 1)
       {
 	_Tp __r{};
@@ -3321,10 +3332,11 @@  __get_lvalue(const const_where_expression& __x)
 
     const_where_expression& operator=(const const_where_expression&) = delete;
 
-    _GLIBCXX_SIMD_INTRINSIC const_where_expression(const _M& __kk, const _Tp& dd)
-      : _M_k(__kk), _M_value(const_cast<_Tp&>(dd)) {}
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    const_where_expression(const _M& __kk, const _Tp& dd)
+    : _M_k(__kk), _M_value(const_cast<_Tp&>(dd)) {}
 
-    _GLIBCXX_SIMD_INTRINSIC _V
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _V
     operator-() const&&
     {
       return {__private_init,
@@ -3333,7 +3345,7 @@  __get_lvalue(const const_where_expression& __x)
     }
 
     template <typename _Up, typename _Flags>
-      [[nodiscard]] _GLIBCXX_SIMD_INTRINSIC _V
+      [[nodiscard]] _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _V
       copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _IsSimdFlagType<_Flags>) const&&
       {
 	return {__private_init,
@@ -3342,7 +3354,7 @@  __get_lvalue(const const_where_expression& __x)
       }
 
     template <typename _Up, typename _Flags>
-      _GLIBCXX_SIMD_INTRINSIC void
+      _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
       copy_to(_LoadStorePtr<_Up, value_type>* __mem, _IsSimdFlagType<_Flags>) const&&
       {
 	_Impl::_S_masked_store(__data(_M_value),
@@ -3381,19 +3393,21 @@  __get_lvalue(const const_where_expression& __x)
     const_where_expression(const const_where_expression&) = delete;
     const_where_expression& operator=(const const_where_expression&) = delete;
 
-    _GLIBCXX_SIMD_INTRINSIC const_where_expression(const bool __kk, const _Tp& dd)
-      : _M_k(__kk), _M_value(const_cast<_Tp&>(dd)) {}
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    const_where_expression(const bool __kk, const _Tp& dd)
+    : _M_k(__kk), _M_value(const_cast<_Tp&>(dd)) {}
 
-    _GLIBCXX_SIMD_INTRINSIC _V operator-() const&&
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _V
+    operator-() const&&
     { return _M_k ? -_M_value : _M_value; }
 
     template <typename _Up, typename _Flags>
-      [[nodiscard]] _GLIBCXX_SIMD_INTRINSIC _V
+      [[nodiscard]] _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR _V
       copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _IsSimdFlagType<_Flags>) const&&
       { return _M_k ? static_cast<_V>(__mem[0]) : _M_value; }
 
     template <typename _Up, typename _Flags>
-      _GLIBCXX_SIMD_INTRINSIC void
+      _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
       copy_to(_LoadStorePtr<_Up, value_type>* __mem, _IsSimdFlagType<_Flags>) const&&
       {
 	if (_M_k)
@@ -3419,18 +3433,21 @@  static_assert(
       is_same<typename _M::abi_type, typename _Tp::abi_type>::value, "");
     static_assert(_M::size() == _Tp::size(), "");
 
-    _GLIBCXX_SIMD_INTRINSIC friend _Tp& __get_lvalue(where_expression& __x)
+    _GLIBCXX_SIMD_INTRINSIC friend constexpr _Tp&
+    __get_lvalue(where_expression& __x)
     { return __x._M_value; }
 
   public:
     where_expression(const where_expression&) = delete;
     where_expression& operator=(const where_expression&) = delete;
 
-    _GLIBCXX_SIMD_INTRINSIC where_expression(const _M& __kk, _Tp& dd)
-      : const_where_expression<_M, _Tp>(__kk, dd) {}
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+    where_expression(const _M& __kk, _Tp& dd)
+    : const_where_expression<_M, _Tp>(__kk, dd) {}
 
     template <typename _Up>
-      _GLIBCXX_SIMD_INTRINSIC void operator=(_Up&& __x) &&
+      _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
+      operator=(_Up&& __x) &&
       {
 	_Impl::_S_masked_assign(__data(_M_k), __data(_M_value),
 				__to_value_type_or_member_type<_Tp>(
@@ -3439,7 +3456,8 @@  static_assert(
 
 #define _GLIBCXX_SIMD_OP_(__op, __name)                                        \
   template <typename _Up>                                                      \
-    _GLIBCXX_SIMD_INTRINSIC void operator __op##=(_Up&& __x)&&                 \
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void                       \
+    operator __op##=(_Up&& __x)&&                                              \
     {                                                                          \
       _Impl::template _S_masked_cassign(                                       \
 	__data(_M_k), __data(_M_value),                                        \
@@ -3461,28 +3479,28 @@  static_assert(
     _GLIBCXX_SIMD_OP_(>>, _S_shift_right);
 #undef _GLIBCXX_SIMD_OP_
 
-    _GLIBCXX_SIMD_INTRINSIC void
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
     operator++() &&
     {
       __data(_M_value)
 	= _Impl::template _S_masked_unary<__increment>(__data(_M_k), __data(_M_value));
     }
 
-    _GLIBCXX_SIMD_INTRINSIC void
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
     operator++(int) &&
     {
       __data(_M_value)
 	= _Impl::template _S_masked_unary<__increment>(__data(_M_k), __data(_M_value));
     }
 
-    _GLIBCXX_SIMD_INTRINSIC void
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
     operator--() &&
     {
       __data(_M_value)
 	= _Impl::template _S_masked_unary<__decrement>(__data(_M_k), __data(_M_value));
     }
 
-    _GLIBCXX_SIMD_INTRINSIC void
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
     operator--(int) &&
     {
       __data(_M_value)
@@ -3491,7 +3509,7 @@  static_assert(
 
     // intentionally hides const_where_expression::copy_from
     template <typename _Up, typename _Flags>
-      _GLIBCXX_SIMD_INTRINSIC void
+      _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
       copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _IsSimdFlagType<_Flags>) &&
       {
 	__data(_M_value) = _Impl::_S_masked_load(__data(_M_value), __data(_M_k),
@@ -3513,13 +3531,13 @@  class where_expression<bool, _Tp>
     where_expression(const where_expression&) = delete;
     where_expression& operator=(const where_expression&) = delete;
 
-    _GLIBCXX_SIMD_INTRINSIC
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
     where_expression(const _M& __kk, _Tp& dd)
     : const_where_expression<_M, _Tp>(__kk, dd) {}
 
 #define _GLIBCXX_SIMD_OP_(__op)                                                \
     template <typename _Up>                                                    \
-      _GLIBCXX_SIMD_INTRINSIC void                                             \
+      _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void                     \
       operator __op(_Up&& __x)&&                                               \
       { if (_M_k) _M_value __op static_cast<_Up&&>(__x); }
 
@@ -3536,68 +3554,71 @@  where_expression(const _M& __kk, _Tp& dd)
     _GLIBCXX_SIMD_OP_(>>=)
   #undef _GLIBCXX_SIMD_OP_
 
-    _GLIBCXX_SIMD_INTRINSIC void
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
     operator++() &&
     { if (_M_k) ++_M_value; }
 
-    _GLIBCXX_SIMD_INTRINSIC void
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
     operator++(int) &&
     { if (_M_k) ++_M_value; }
 
-    _GLIBCXX_SIMD_INTRINSIC void
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
     operator--() &&
     { if (_M_k) --_M_value; }
 
-    _GLIBCXX_SIMD_INTRINSIC void
+    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
     operator--(int) &&
     { if (_M_k) --_M_value; }
 
     // intentionally hides const_where_expression::copy_from
     template <typename _Up, typename _Flags>
-      _GLIBCXX_SIMD_INTRINSIC void
+      _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR void
       copy_from(const _LoadStorePtr<_Up, value_type>* __mem, _IsSimdFlagType<_Flags>) &&
       { if (_M_k) _M_value = __mem[0]; }
   };
 
 // where {{{1
 template <typename _Tp, typename _Ap>
-  _GLIBCXX_SIMD_INTRINSIC where_expression<simd_mask<_Tp, _Ap>, simd<_Tp, _Ap>>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
+  where_expression<simd_mask<_Tp, _Ap>, simd<_Tp, _Ap>>
   where(const typename simd<_Tp, _Ap>::mask_type& __k, simd<_Tp, _Ap>& __value)
   { return {__k, __value}; }
 
 template <typename _Tp, typename _Ap>
-  _GLIBCXX_SIMD_INTRINSIC
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
   const_where_expression<simd_mask<_Tp, _Ap>, simd<_Tp, _Ap>>
   where(const typename simd<_Tp, _Ap>::mask_type& __k, const simd<_Tp, _Ap>& __value)
   { return {__k, __value}; }
 
 template <typename _Tp, typename _Ap>
-  _GLIBCXX_SIMD_INTRINSIC
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
   where_expression<simd_mask<_Tp, _Ap>, simd_mask<_Tp, _Ap>>
   where(const remove_const_t<simd_mask<_Tp, _Ap>>& __k, simd_mask<_Tp, _Ap>& __value)
   { return {__k, __value}; }
 
 template <typename _Tp, typename _Ap>
-  _GLIBCXX_SIMD_INTRINSIC
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR
   const_where_expression<simd_mask<_Tp, _Ap>, simd_mask<_Tp, _Ap>>
   where(const remove_const_t<simd_mask<_Tp, _Ap>>& __k, const simd_mask<_Tp, _Ap>& __value)
   { return {__k, __value}; }
 
 template <typename _Tp>
-  _GLIBCXX_SIMD_INTRINSIC where_expression<bool, _Tp>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR where_expression<bool, _Tp>
   where(_ExactBool __k, _Tp& __value)
   { return {__k, __value}; }
 
 template <typename _Tp>
-  _GLIBCXX_SIMD_INTRINSIC const_where_expression<bool, _Tp>
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR const_where_expression<bool, _Tp>
   where(_ExactBool __k, const _Tp& __value)
   { return {__k, __value}; }
 
 template <typename _Tp, typename _Ap>
-  void where(bool __k, simd<_Tp, _Ap>& __value) = delete;
+  _GLIBCXX_SIMD_CONSTEXPR void
+  where(bool __k, simd<_Tp, _Ap>& __value) = delete;
 
 template <typename _Tp, typename _Ap>
-  void where(bool __k, const simd<_Tp, _Ap>& __value) = delete;
+  _GLIBCXX_SIMD_CONSTEXPR void
+  where(bool __k, const simd<_Tp, _Ap>& __value) = delete;
 
 // proposed mask iterations {{{1
 namespace __proposed {
@@ -3820,12 +3841,12 @@  clamp(const simd<_Tp, _Ap>& __v, const simd<_Tp, _Ap>& __lo, const simd<_Tp, _Ap
 
 // __extract_part {{{
 template <int _Index, int _Total, int _Combine = 1, typename _Tp, size_t _Np>
-  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST constexpr
   _SimdWrapper<_Tp, _Np / _Total * _Combine>
   __extract_part(const _SimdWrapper<_Tp, _Np> __x);
 
 template <int _Index, int _Parts, int _Combine = 1, typename _Tp, typename _A0, typename... _As>
-  _GLIBCXX_SIMD_INTRINSIC auto
+  _GLIBCXX_SIMD_INTRINSIC constexpr auto
   __extract_part(const _SimdTuple<_Tp, _A0, _As...>& __x);
 
 // }}}
@@ -4551,7 +4572,7 @@  class simd_mask
 
     // }}}
     // access to internal representation (optional feature) {{{
-    _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR explicit
     simd_mask(typename _Traits::_MaskCastType __init)
     : _M_data{__init} {}
     // conversions to internal type is done in _MaskBase
@@ -4562,11 +4583,11 @@  class simd_mask
     // Conversion of simd_mask to and from bitset makes it much easier to
     // interface with other facilities. I suggest adding `static
     // simd_mask::from_bitset` and `simd_mask::to_bitset`.
-    _GLIBCXX_SIMD_ALWAYS_INLINE static simd_mask
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR static simd_mask
     __from_bitset(bitset<size()> bs)
     { return {__bitset_init, bs}; }
 
-    _GLIBCXX_SIMD_ALWAYS_INLINE bitset<size()>
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR bitset<size()>
     __to_bitset() const
     { return _Impl::_S_to_bits(_M_data)._M_to_bitset(); }
 
@@ -4591,7 +4612,7 @@  simd_mask(const simd_mask<_Up, _A2>& __x)
     template <typename _Up, typename = enable_if_t<conjunction<
 			      is_same<abi_type, simd_abi::fixed_size<size()>>,
 			      is_same<_Up, _Up>>::value>>
-      _GLIBCXX_SIMD_ALWAYS_INLINE
+      _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR
       simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
       : _M_data(_Impl::_S_from_bitmask(__data(__x), _S_type_tag)) {}
   #endif
@@ -4599,12 +4620,12 @@  simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
     // }}}
     // load constructor {{{
     template <typename _Flags>
-      _GLIBCXX_SIMD_ALWAYS_INLINE
+      _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR
       simd_mask(const value_type* __mem, _IsSimdFlagType<_Flags>)
       : _M_data(_Impl::template _S_load<_Ip>(_Flags::template _S_apply<simd_mask>(__mem))) {}
 
     template <typename _Flags>
-      _GLIBCXX_SIMD_ALWAYS_INLINE
+      _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR
       simd_mask(const value_type* __mem, simd_mask __k, _IsSimdFlagType<_Flags>)
       : _M_data{}
       {
@@ -4615,20 +4636,20 @@  simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
     // }}}
     // loads [simd_mask.load] {{{
     template <typename _Flags>
-      _GLIBCXX_SIMD_ALWAYS_INLINE void
+      _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR void
       copy_from(const value_type* __mem, _IsSimdFlagType<_Flags>)
       { _M_data = _Impl::template _S_load<_Ip>(_Flags::template _S_apply<simd_mask>(__mem)); }
 
     // }}}
     // stores [simd_mask.store] {{{
     template <typename _Flags>
-      _GLIBCXX_SIMD_ALWAYS_INLINE void
+      _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR void
       copy_to(value_type* __mem, _IsSimdFlagType<_Flags>) const
       { _Impl::_S_store(_M_data, _Flags::template _S_apply<simd_mask>(__mem)); }
 
     // }}}
     // scalar access {{{
-    _GLIBCXX_SIMD_ALWAYS_INLINE reference
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR reference
     operator[](size_t __i)
     {
       if (__i >= size())
@@ -4636,7 +4657,7 @@  simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
       return {_M_data, int(__i)};
     }
 
-    _GLIBCXX_SIMD_ALWAYS_INLINE value_type
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR value_type
     operator[](size_t __i) const
     {
       if (__i >= size())
@@ -4649,7 +4670,7 @@  simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
 
     // }}}
     // negation {{{
-    _GLIBCXX_SIMD_ALWAYS_INLINE simd_mask
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR simd_mask
     operator!() const
     { return {__private_init, _Impl::_S_bit_not(_M_data)}; }
 
@@ -4659,7 +4680,7 @@  simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
     // simd_mask<int> && simd_mask<uint> needs disambiguation
     template <typename _Up, typename _A2,
 	      typename = enable_if_t<is_convertible_v<simd_mask<_Up, _A2>, simd_mask>>>
-      _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+      _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
       operator&&(const simd_mask& __x, const simd_mask<_Up, _A2>& __y)
       {
 	return {__private_init,
@@ -4668,7 +4689,7 @@  simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
 
     template <typename _Up, typename _A2,
 	      typename = enable_if_t<is_convertible_v<simd_mask<_Up, _A2>, simd_mask>>>
-      _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+      _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
       operator||(const simd_mask& __x, const simd_mask<_Up, _A2>& __y)
       {
 	return {__private_init,
@@ -4676,41 +4697,41 @@  simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
       }
   #endif // _GLIBCXX_SIMD_ENABLE_IMPLICIT_MASK_CAST
 
-    _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
     operator&&(const simd_mask& __x, const simd_mask& __y)
     { return {__private_init, _Impl::_S_logical_and(__x._M_data, __y._M_data)}; }
 
-    _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
     operator||(const simd_mask& __x, const simd_mask& __y)
     { return {__private_init, _Impl::_S_logical_or(__x._M_data, __y._M_data)}; }
 
-    _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
     operator&(const simd_mask& __x, const simd_mask& __y)
     { return {__private_init, _Impl::_S_bit_and(__x._M_data, __y._M_data)}; }
 
-    _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
     operator|(const simd_mask& __x, const simd_mask& __y)
     { return {__private_init, _Impl::_S_bit_or(__x._M_data, __y._M_data)}; }
 
-    _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask
     operator^(const simd_mask& __x, const simd_mask& __y)
     { return {__private_init, _Impl::_S_bit_xor(__x._M_data, __y._M_data)}; }
 
-    _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask&
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask&
     operator&=(simd_mask& __x, const simd_mask& __y)
     {
       __x._M_data = _Impl::_S_bit_and(__x._M_data, __y._M_data);
       return __x;
     }
 
-    _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask&
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask&
     operator|=(simd_mask& __x, const simd_mask& __y)
     {
       __x._M_data = _Impl::_S_bit_or(__x._M_data, __y._M_data);
       return __x;
     }
 
-    _GLIBCXX_SIMD_ALWAYS_INLINE friend simd_mask&
+    _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR friend simd_mask&
     operator^=(simd_mask& __x, const simd_mask& __y)
     {
       __x._M_data = _Impl::_S_bit_xor(__x._M_data, __y._M_data);
@@ -4747,7 +4768,8 @@  simd_mask(const simd_mask<_Up, simd_abi::fixed_size<size()>>& __x)
 
     // }}}
     // bitset_init ctor {{{
-    _GLIBCXX_SIMD_INTRINSIC simd_mask(_BitsetInit, bitset<size()> __init)
+    _GLIBCXX_SIMD_INTRINSIC constexpr
+    simd_mask(_BitsetInit, bitset<size()> __init)
     : _M_data(_Impl::_S_from_bitmask(_SanitizedBitMask<size()>(__init), _S_type_tag))
     {}
 
@@ -5015,7 +5037,8 @@  class _SimdIntOperators<_V, _Tp, _Abi, true>
   {
     using _Impl = typename _SimdTraits<_Tp, _Abi>::_SimdImpl;
 
-    _GLIBCXX_SIMD_INTRINSIC const _V& __derived() const
+    _GLIBCXX_SIMD_INTRINSIC constexpr const _V&
+    __derived() const
     { return *static_cast<const _V*>(this); }
 
     template <typename _Up>
@@ -5235,7 +5258,7 @@  simd(const simd<_Up, _A2>& __x)
 
     // load constructor
     template <typename _Up, typename _Flags>
-      _GLIBCXX_SIMD_ALWAYS_INLINE
+      _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR
       simd(const _Up* __mem, _IsSimdFlagType<_Flags>)
       : _M_data(
 	  _Impl::_S_load(_Flags::template _S_apply<simd>(__mem), _S_type_tag))
@@ -5243,7 +5266,7 @@  simd(const simd<_Up, _A2>& __x)
 
     // loads [simd.load]
     template <typename _Up, typename _Flags>
-      _GLIBCXX_SIMD_ALWAYS_INLINE void
+      _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR void
       copy_from(const _Vectorizable<_Up>* __mem, _IsSimdFlagType<_Flags>)
       {
 	_M_data = static_cast<decltype(_M_data)>(
@@ -5252,7 +5275,7 @@  simd(const simd<_Up, _A2>& __x)
 
     // stores [simd.store]
     template <typename _Up, typename _Flags>
-      _GLIBCXX_SIMD_ALWAYS_INLINE void
+      _GLIBCXX_SIMD_ALWAYS_INLINE _GLIBCXX_SIMD_CONSTEXPR void
       copy_to(_Vectorizable<_Up>* __mem, _IsSimdFlagType<_Flags>) const
       {
 	_Impl::_S_store(_M_data, _Flags::template _S_apply<simd>(__mem),
@@ -5424,7 +5447,7 @@  _M_is_constprop() const
     }
 
   private:
-    _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_SIMD_CONSTEXPR static mask_type
+    _GLIBCXX_SIMD_INTRINSIC static constexpr mask_type
     _S_make_mask(typename mask_type::_MemberType __k)
     { return {__private_init, __k}; }
 
diff --git a/libstdc++-v3/include/experimental/bits/simd_builtin.h b/libstdc++-v3/include/experimental/bits/simd_builtin.h
index 3d52bc6c96a..8337fa2d9a6 100644
--- a/libstdc++-v3/include/experimental/bits/simd_builtin.h
+++ b/libstdc++-v3/include/experimental/bits/simd_builtin.h
@@ -52,7 +52,7 @@ 
 // Index == -1 requests zeroing of the output element
 template <int... _Indices, typename _Tp, typename _TVT = _VectorTraits<_Tp>,
 	  typename = __detail::__odr_helper>
-  _Tp
+  constexpr _Tp
   __vector_permute(_Tp __x)
   {
     static_assert(sizeof...(_Indices) == _TVT::_S_full_size);
@@ -65,7 +65,7 @@  __vector_permute(_Tp __x)
 // Index == -1 requests zeroing of the output element
 template <int... _Indices, typename _Tp, typename _TVT = _VectorTraits<_Tp>,
 	  typename = __detail::__odr_helper>
-  _Tp
+  constexpr _Tp
   __vector_shuffle(_Tp __x, _Tp __y)
   {
     return _Tp{(_Indices == -1 ? 0
@@ -205,7 +205,7 @@  __shift_elements_right(_Tp __v)
 // }}}
 // __extract_part(_SimdWrapper<_Tp, _Np>) {{{
 template <int _Index, int _Total, int _Combine, typename _Tp, size_t _Np>
-  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST
+  _GLIBCXX_SIMD_INTRINSIC _GLIBCXX_CONST constexpr
   _SimdWrapper<_Tp, _Np / _Total * _Combine>
   __extract_part(const _SimdWrapper<_Tp, _Np> __x)
   {
@@ -905,10 +905,10 @@  class _SimdCastType1
       _SimdMember _M_data;
 
     public:
-      _GLIBCXX_SIMD_ALWAYS_INLINE
+      _GLIBCXX_SIMD_ALWAYS_INLINE constexpr
       _SimdCastType1(_Ap __a) : _M_data(__vector_bitcast<_Tp>(__a)) {}
 
-      _GLIBCXX_SIMD_ALWAYS_INLINE
+      _GLIBCXX_SIMD_ALWAYS_INLINE constexpr
       operator _SimdMember() const { return _M_data; }
     };
 
@@ -919,13 +919,13 @@  class _SimdCastType2
       _SimdMember _M_data;
 
     public:
-      _GLIBCXX_SIMD_ALWAYS_INLINE
+      _GLIBCXX_SIMD_ALWAYS_INLINE constexpr
       _SimdCastType2(_Ap __a) : _M_data(__vector_bitcast<_Tp>(__a)) {}
 
-      _GLIBCXX_SIMD_ALWAYS_INLINE
+      _GLIBCXX_SIMD_ALWAYS_INLINE constexpr
       _SimdCastType2(_Bp __b) : _M_data(__b) {}
 
-      _GLIBCXX_SIMD_ALWAYS_INLINE
+      _GLIBCXX_SIMD_ALWAYS_INLINE constexpr
       operator _SimdMember() const { return _M_data; }
     };
 
@@ -1345,6 +1345,11 @@  _S_store_bool_array(_BitMask<_Np, _Sanitized> __x, bool* __mem)
     {
       if constexpr (_Np == 1)
 	__mem[0] = __x[0];
+      else if (__builtin_is_constant_evaluated())
+	{
+	  for (size_t __i = 0; __i < _Np; ++__i)
+	    __mem[__i] = __x[__i];
+	}
       else if constexpr (_Np == 2)
 	{
 	  short __bool2 = (__x._M_to_bits() * 0x81) & 0x0101;
@@ -1424,12 +1429,12 @@  struct _SimdImplBuiltin
 
     // _M_make_simd(_SimdWrapper/__intrinsic_type_t) {{{2
     template <typename _Tp, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static simd<_Tp, _Abi>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr simd<_Tp, _Abi>
       _M_make_simd(_SimdWrapper<_Tp, _Np> __x)
       { return {__private_init, __x}; }
 
     template <typename _Tp, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static simd<_Tp, _Abi>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr simd<_Tp, _Abi>
       _M_make_simd(__intrinsic_type_t<_Tp, _Np> __x)
       { return {__private_init, __vector_bitcast<_Tp>(__x)}; }
 
@@ -1455,7 +1460,7 @@  _S_broadcast(_Tp __x) noexcept
 
     // _S_load {{{2
     template <typename _Tp, typename _Up>
-      _GLIBCXX_SIMD_INTRINSIC static _SimdMember<_Tp>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdMember<_Tp>
       _S_load(const _Up* __mem, _TypeTag<_Tp>) noexcept
       {
 	constexpr size_t _Np = _S_size<_Tp>;
@@ -1464,7 +1469,12 @@  _S_broadcast(_Tp __x) noexcept
 	    : (is_floating_point_v<_Up> && __have_avx) || __have_avx2 ? 32
 								      : 16;
 	constexpr size_t __bytes_to_load = sizeof(_Up) * _Np;
-	if constexpr (sizeof(_Up) > 8)
+	if (__builtin_is_constant_evaluated())
+	  return __generate_vector<_Tp, _S_full_size<_Tp>>(
+		   [&](auto __i) constexpr {
+		     return static_cast<_Tp>(__i < _Np ? __mem[__i] : 0);
+		   });
+	else if constexpr (sizeof(_Up) > 8)
 	  return __generate_vector<_Tp, _SimdMember<_Tp>::_S_full_size>(
 		   [&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 		     return static_cast<_Tp>(__i < _Np ? __mem[__i] : 0);
@@ -1511,7 +1521,7 @@  _S_broadcast(_Tp __x) noexcept
 
     // _S_masked_load {{{2
     template <typename _Tp, size_t _Np, typename _Up>
-      static inline _SimdWrapper<_Tp, _Np>
+      static constexpr inline _SimdWrapper<_Tp, _Np>
       _S_masked_load(_SimdWrapper<_Tp, _Np> __merge, _MaskMember<_Tp> __k,
 		     const _Up* __mem) noexcept
       {
@@ -1524,14 +1534,19 @@  _S_masked_load(_SimdWrapper<_Tp, _Np> __merge, _MaskMember<_Tp> __k,
 
     // _S_store {{{2
     template <typename _Tp, typename _Up>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_store(_SimdMember<_Tp> __v, _Up* __mem, _TypeTag<_Tp>) noexcept
       {
 	// TODO: converting int -> "smaller int" can be optimized with AVX512
 	constexpr size_t _Np = _S_size<_Tp>;
 	constexpr size_t __max_store_size
 	  = _SuperImpl::template _S_max_store_size<_Up>;
-	if constexpr (sizeof(_Up) > 8)
+	if (__builtin_is_constant_evaluated())
+	  {
+	    for (size_t __i = 0; __i < _Np; ++__i)
+	      __mem[__i] = __v[__i];
+	  }
+	else if constexpr (sizeof(_Up) > 8)
 	  __execute_n_times<_Np>([&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 	    __mem[__i] = __v[__i];
 	  });
@@ -1562,7 +1577,7 @@  _S_masked_load(_SimdWrapper<_Tp, _Np> __merge, _MaskMember<_Tp> __k,
 
     // _S_masked_store_nocvt {{{2
     template <typename _Tp, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem, _MaskMember<_Tp> __k)
       {
 	_BitOps::_S_bit_iteration(
@@ -1575,7 +1590,7 @@  _S_masked_store_nocvt(_SimdWrapper<_Tp, _Np> __v, _Tp* __mem, _MaskMember<_Tp> _
     // _S_masked_store {{{2
     template <typename _TW, typename _TVT = _VectorTraits<_TW>,
 	      typename _Tp = typename _TVT::value_type, typename _Up>
-      static inline void
+      static constexpr inline void
       _S_masked_store(const _TW __v, _Up* __mem, const _MaskMember<_Tp> __k) noexcept
       {
 	constexpr size_t _TV_size = _S_size<_Tp>;
@@ -1803,7 +1818,7 @@  _S_minmax(_SimdWrapper<_Tp, _Np> __a, _SimdWrapper<_Tp, _Np> __b)
     // reductions {{{2
     template <size_t _Np, size_t... _Is, size_t... _Zeros, typename _Tp,
 	      typename _BinaryOperation>
-      _GLIBCXX_SIMD_INTRINSIC static _Tp
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
       _S_reduce_partial(index_sequence<_Is...>, index_sequence<_Zeros...>,
 			simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
       {
@@ -1833,6 +1848,13 @@  _S_reduce(simd<_Tp, _Abi> __x, _BinaryOperation&& __binary_op)
 	else if constexpr (_Np == 2)
 	  return __binary_op(simd<_Tp, simd_abi::scalar>(__x[0]),
 			     simd<_Tp, simd_abi::scalar>(__x[1]))[0];
+	else if (__builtin_is_constant_evaluated())
+	  {
+	    simd<_Tp, simd_abi::scalar> __acc = __x[0];
+	    for (size_t __i = 1; __i < _Np; ++__i)
+	      __acc = __binary_op(__acc, simd<_Tp, simd_abi::scalar>(__x[__i]));
+	    return __acc[0];
+	  }
 	else if constexpr (_Abi::template _S_is_partial<_Tp>) //{{{
 	  {
 	    [[maybe_unused]] constexpr auto __full_size
@@ -2445,24 +2467,24 @@  _S_fpclassify(_SimdWrapper<_Tp, _Np> __x)
 
     // _S_increment & _S_decrement{{{2
     template <typename _Tp, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_increment(_SimdWrapper<_Tp, _Np>& __x)
       { __x = __x._M_data + 1; }
 
     template <typename _Tp, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_decrement(_SimdWrapper<_Tp, _Np>& __x)
       { __x = __x._M_data - 1; }
 
     // smart_reference access {{{2
     template <typename _Tp, size_t _Np, typename _Up>
-      _GLIBCXX_SIMD_INTRINSIC constexpr static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_set(_SimdWrapper<_Tp, _Np>& __v, int __i, _Up&& __x) noexcept
       { __v._M_set(__i, static_cast<_Up&&>(__x)); }
 
     // _S_masked_assign{{{2
     template <typename _Tp, typename _K, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_masked_assign(_SimdWrapper<_K, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
 		       __type_identity_t<_SimdWrapper<_Tp, _Np>> __rhs)
       {
@@ -2475,7 +2497,7 @@  _S_masked_assign(_SimdWrapper<_K, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
       }
 
     template <typename _Tp, typename _K, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_masked_assign(_SimdWrapper<_K, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
 		       __type_identity_t<_Tp> __rhs)
       {
@@ -2503,7 +2525,7 @@  _S_masked_assign(_SimdWrapper<_K, _Np> __k, _SimdWrapper<_Tp, _Np>& __lhs,
 
     // _S_masked_cassign {{{2
     template <typename _Op, typename _Tp, typename _K, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_masked_cassign(const _SimdWrapper<_K, _Np> __k,
 			_SimdWrapper<_Tp, _Np>& __lhs,
 			const __type_identity_t<_SimdWrapper<_Tp, _Np>> __rhs,
@@ -2519,7 +2541,7 @@  _S_masked_cassign(const _SimdWrapper<_K, _Np> __k,
       }
 
     template <typename _Op, typename _Tp, typename _K, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_masked_cassign(const _SimdWrapper<_K, _Np> __k,
 			_SimdWrapper<_Tp, _Np>& __lhs,
 			const __type_identity_t<_Tp> __rhs, _Op __op)
@@ -2528,7 +2550,7 @@  _S_masked_cassign(const _SimdWrapper<_K, _Np> __k,
     // _S_masked_unary {{{2
     template <template <typename> class _Op, typename _Tp, typename _K,
 	      size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static _SimdWrapper<_Tp, _Np>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdWrapper<_Tp, _Np>
       _S_masked_unary(const _SimdWrapper<_K, _Np> __k,
 		      const _SimdWrapper<_Tp, _Np> __v)
       {
@@ -2704,18 +2726,18 @@  _S_broadcast(bool __x)
       _S_load(const bool* __mem)
       {
 	using _I = __int_for_sizeof_t<_Tp>;
-	if constexpr (sizeof(_Tp) == sizeof(bool))
-	  {
-	    const auto __bools
-	      = _CommonImpl::template _S_load<_I, _S_size<_Tp>>(__mem);
-	    // bool is {0, 1}, everything else is UB
-	    return __bools > 0;
-	  }
-	else
-	  return __generate_vector<_I, _S_size<_Tp>>(
-		   [&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
-		     return __mem[__i] ? ~_I() : _I();
-		   });
+	if (not __builtin_is_constant_evaluated())
+	  if constexpr (sizeof(_Tp) == sizeof(bool))
+	    {
+	      const auto __bools
+		= _CommonImpl::template _S_load<_I, _S_size<_Tp>>(__mem);
+	      // bool is {0, 1}, everything else is UB
+	      return __bools > 0;
+	    }
+	return __generate_vector<_I, _S_size<_Tp>>(
+		 [&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
+		   return __mem[__i] ? ~_I() : _I();
+		 });
       }
 
     // }}}
@@ -2797,7 +2819,7 @@  _S_masked_load(_SimdWrapper<_Tp, _Np> __merge,
 
     // _S_store {{{2
     template <typename _Tp, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_store(_SimdWrapper<_Tp, _Np> __v, bool* __mem) noexcept
       {
 	__execute_n_times<_Np>([&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
diff --git a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
index 123e714b528..287f34f5dd8 100644
--- a/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
+++ b/libstdc++-v3/include/experimental/bits/simd_fixed_size.h
@@ -166,25 +166,25 @@  struct __tuple_element_meta
     static constexpr _MaskImpl _S_mask_impl = {};
 
     template <size_t _Np, bool _Sanitized>
-      _GLIBCXX_SIMD_INTRINSIC static auto
+      _GLIBCXX_SIMD_INTRINSIC static constexpr auto
       _S_submask(_BitMask<_Np, _Sanitized> __bits)
       { return __bits.template _M_extract<_Offset, _S_size()>(); }
 
     template <size_t _Np, bool _Sanitized>
-      _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
       _S_make_mask(_BitMask<_Np, _Sanitized> __bits)
       {
 	return _MaskImpl::template _S_convert<_Tp>(
 	  __bits.template _M_extract<_Offset, _S_size()>()._M_sanitized());
       }
 
-    _GLIBCXX_SIMD_INTRINSIC static _ULLong
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _ULLong
     _S_mask_to_shifted_ullong(_MaskMember __k)
     { return _MaskImpl::_S_to_bits(__k).to_ullong() << _Offset; }
   };
 
 template <size_t _Offset, typename _Tp, typename _Abi, typename... _As>
-  _GLIBCXX_SIMD_INTRINSIC
+  _GLIBCXX_SIMD_INTRINSIC constexpr
   __tuple_element_meta<_Tp, _Abi, _Offset>
   __make_meta(const _SimdTuple<_Tp, _Abi, _As...>&)
   { return {}; }
@@ -535,7 +535,7 @@  _M_assign_front(const _SimdTuple<_Tp, _As...>& __x) &
       }
 
     template <typename _R = _Tp, typename _Fp, typename... _More>
-      _GLIBCXX_SIMD_INTRINSIC auto
+      _GLIBCXX_SIMD_INTRINSIC constexpr auto
       _M_apply_r(_Fp&& __fun, const _More&... __more) const
       {
 	auto&& __first = __fun(__tuple_element_meta<_Tp, _Abi0, 0>(), first,
@@ -573,50 +573,47 @@  _M_assign_front(const _SimdTuple<_Tp, _As...>& __x) &
 	  return second[integral_constant<_Up, _I - simd_size_v<_Tp, _Abi0>>()];
       }
 
-    _GLIBCXX_SIMD_INTRINSIC _Tp
+    _GLIBCXX_SIMD_INTRINSIC constexpr _Tp
     operator[](size_t __i) const noexcept
     {
       if constexpr (_S_tuple_size == 1)
 	return _M_subscript_read(__i);
       else
-	{
 #ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
-	  return reinterpret_cast<const __may_alias<_Tp>*>(this)[__i];
-#else
-	  if constexpr (__is_scalar_abi<_Abi0>())
-	    {
-	      const _Tp* ptr = &first;
-	      return ptr[__i];
-	    }
-	  else
-	    return __i < simd_size_v<_Tp, _Abi0>
-		     ? _M_subscript_read(__i)
-		     : second[__i - simd_size_v<_Tp, _Abi0>];
+	if (not __builtin_is_constant_evaluated())
+	return reinterpret_cast<const __may_alias<_Tp>*>(this)[__i];
+      else
 #endif
+	if constexpr (__is_scalar_abi<_Abi0>())
+	{
+	  const _Tp* ptr = &first;
+	  return ptr[__i];
 	}
+      else
+	return __i < simd_size_v<_Tp, _Abi0> ? _M_subscript_read(__i)
+					     : second[__i - simd_size_v<_Tp, _Abi0>];
     }
 
-    _GLIBCXX_SIMD_INTRINSIC void
+    _GLIBCXX_SIMD_INTRINSIC constexpr void
     _M_set(size_t __i, _Tp __val) noexcept
     {
       if constexpr (_S_tuple_size == 1)
 	return _M_subscript_write(__i, __val);
       else
-	{
 #ifdef _GLIBCXX_SIMD_USE_ALIASING_LOADS
-	  reinterpret_cast<__may_alias<_Tp>*>(this)[__i] = __val;
-#else
-	  if (__i < simd_size_v<_Tp, _Abi0>)
-	    _M_subscript_write(__i, __val);
-	  else
-	    second._M_set(__i - simd_size_v<_Tp, _Abi0>, __val);
+	if (not __builtin_is_constant_evaluated())
+	reinterpret_cast<__may_alias<_Tp>*>(this)[__i] = __val;
+      else
 #endif
-	}
+	if (__i < simd_size_v<_Tp, _Abi0>)
+	_M_subscript_write(__i, __val);
+      else
+	second._M_set(__i - simd_size_v<_Tp, _Abi0>, __val);
     }
 
   private:
     // _M_subscript_read/_write {{{
-    _GLIBCXX_SIMD_INTRINSIC _Tp
+    _GLIBCXX_SIMD_INTRINSIC constexpr _Tp
     _M_subscript_read([[maybe_unused]] size_t __i) const noexcept
     {
       if constexpr (__is_vectorizable_v<_FirstType>)
@@ -625,7 +622,7 @@  _M_set(size_t __i, _Tp __val) noexcept
 	return first[__i];
     }
 
-    _GLIBCXX_SIMD_INTRINSIC void
+    _GLIBCXX_SIMD_INTRINSIC constexpr void
     _M_subscript_write([[maybe_unused]] size_t __i, _Tp __y) noexcept
     {
       if constexpr (__is_vectorizable_v<_FirstType>)
@@ -639,22 +636,22 @@  _M_set(size_t __i, _Tp __val) noexcept
 
 // __make_simd_tuple {{{1
 template <typename _Tp, typename _A0>
-  _GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp, _A0>
   __make_simd_tuple(simd<_Tp, _A0> __x0)
   { return {__data(__x0)}; }
 
 template <typename _Tp, typename _A0, typename... _As>
-  _GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0, _As...>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp, _A0, _As...>
   __make_simd_tuple(const simd<_Tp, _A0>& __x0, const simd<_Tp, _As>&... __xs)
   { return {__data(__x0), __make_simd_tuple(__xs...)}; }
 
 template <typename _Tp, typename _A0>
-  _GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp, _A0>
   __make_simd_tuple(const typename _SimdTraits<_Tp, _A0>::_SimdMember& __arg0)
   { return {__arg0}; }
 
 template <typename _Tp, typename _A0, typename _A1, typename... _Abis>
-  _GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp, _A0, _A1, _Abis...>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp, _A0, _A1, _Abis...>
   __make_simd_tuple(
     const typename _SimdTraits<_Tp, _A0>::_SimdMember& __arg0,
     const typename _SimdTraits<_Tp, _A1>::_SimdMember& __arg1,
@@ -797,19 +794,19 @@  __to_simd_tuple_sized(
 
 // __optimize_simd_tuple {{{1
 template <typename _Tp>
-  _GLIBCXX_SIMD_INTRINSIC _SimdTuple<_Tp>
+  _GLIBCXX_SIMD_INTRINSIC constexpr _SimdTuple<_Tp>
   __optimize_simd_tuple(const _SimdTuple<_Tp>)
   { return {}; }
 
 template <typename _Tp, typename _Ap>
-  _GLIBCXX_SIMD_INTRINSIC const _SimdTuple<_Tp, _Ap>&
+  _GLIBCXX_SIMD_INTRINSIC constexpr const _SimdTuple<_Tp, _Ap>&
   __optimize_simd_tuple(const _SimdTuple<_Tp, _Ap>& __x)
   { return __x; }
 
 template <typename _Tp, typename _A0, typename _A1, typename... _Abis,
 	  typename _R = __fixed_size_storage_t<
 	    _Tp, _SimdTuple<_Tp, _A0, _A1, _Abis...>::_S_size()>>
-  _GLIBCXX_SIMD_INTRINSIC _R
+  _GLIBCXX_SIMD_INTRINSIC constexpr _R
   __optimize_simd_tuple(const _SimdTuple<_Tp, _A0, _A1, _Abis...>& __x)
   {
     using _Tup = _SimdTuple<_Tp, _A0, _A1, _Abis...>;
@@ -916,7 +913,7 @@  __for_each(const _SimdTuple<_Tp, _A0, _A1, _As...>& __a,
 // }}}1
 // __extract_part(_SimdTuple) {{{
 template <int _Index, int _Total, int _Combine, typename _Tp, typename _A0, typename... _As>
-  _GLIBCXX_SIMD_INTRINSIC auto // __vector_type_t or _SimdTuple
+  _GLIBCXX_SIMD_INTRINSIC constexpr auto // __vector_type_t or _SimdTuple
   __extract_part(const _SimdTuple<_Tp, _A0, _As...>& __x)
   {
     // worst cases:
@@ -1017,11 +1014,11 @@  struct __autocvt_to_simd
     _Tp _M_data;
     using _TT = __remove_cvref_t<_Tp>;
 
-    _GLIBCXX_SIMD_INTRINSIC
+    _GLIBCXX_SIMD_INTRINSIC constexpr
     operator _TT()
     { return _M_data; }
 
-    _GLIBCXX_SIMD_INTRINSIC
+    _GLIBCXX_SIMD_INTRINSIC constexpr
     operator _TT&()
     {
       static_assert(is_lvalue_reference<_Tp>::value, "");
@@ -1029,7 +1026,7 @@  struct __autocvt_to_simd
       return _M_data;
     }
 
-    _GLIBCXX_SIMD_INTRINSIC
+    _GLIBCXX_SIMD_INTRINSIC constexpr
     operator _TT*()
     {
       static_assert(is_lvalue_reference<_Tp>::value, "");
@@ -1041,17 +1038,17 @@  struct __autocvt_to_simd
     __autocvt_to_simd(_Tp dd) : _M_data(dd) {}
 
     template <typename _Abi>
-      _GLIBCXX_SIMD_INTRINSIC
+      _GLIBCXX_SIMD_INTRINSIC constexpr
       operator simd<typename _TT::value_type, _Abi>()
       { return {__private_init, _M_data}; }
 
     template <typename _Abi>
-      _GLIBCXX_SIMD_INTRINSIC
+      _GLIBCXX_SIMD_INTRINSIC constexpr
       operator simd<typename _TT::value_type, _Abi>&()
       { return *reinterpret_cast<simd<typename _TT::value_type, _Abi>*>(&_M_data); }
 
     template <typename _Abi>
-      _GLIBCXX_SIMD_INTRINSIC
+      _GLIBCXX_SIMD_INTRINSIC constexpr
       operator simd<typename _TT::value_type, _Abi>*()
       { return reinterpret_cast<simd<typename _TT::value_type, _Abi>*>(&_M_data); }
   };
@@ -1073,11 +1070,11 @@  struct __autocvt_to_simd<_Tp, true>
     ~__autocvt_to_simd()
     { _M_data = __data(_M_fd).first; }
 
-    _GLIBCXX_SIMD_INTRINSIC
+    _GLIBCXX_SIMD_INTRINSIC constexpr
     operator fixed_size_simd<_TT, 1>()
     { return _M_fd; }
 
-    _GLIBCXX_SIMD_INTRINSIC
+    _GLIBCXX_SIMD_INTRINSIC constexpr
     operator fixed_size_simd<_TT, 1> &()
     {
       static_assert(is_lvalue_reference<_Tp>::value, "");
@@ -1085,7 +1082,7 @@  struct __autocvt_to_simd<_Tp, true>
       return _M_fd;
     }
 
-    _GLIBCXX_SIMD_INTRINSIC
+    _GLIBCXX_SIMD_INTRINSIC constexpr
     operator fixed_size_simd<_TT, 1> *()
     {
       static_assert(is_lvalue_reference<_Tp>::value, "");
@@ -1162,15 +1159,16 @@  struct _SimdBase
 	{
 	  // The following ensures, function arguments are passed via the stack.
 	  // This is important for ABI compatibility across TU boundaries
-	  _GLIBCXX_SIMD_ALWAYS_INLINE
+	  _GLIBCXX_SIMD_ALWAYS_INLINE constexpr
 	  _SimdBase(const _SimdBase&) {}
+
 	  _SimdBase() = default;
 
-	  _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+	  _GLIBCXX_SIMD_ALWAYS_INLINE constexpr explicit
 	  operator const _SimdMember &() const
 	  { return static_cast<const simd<_Tp, _Fixed>*>(this)->_M_data; }
 
-	  _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+	  _GLIBCXX_SIMD_ALWAYS_INLINE constexpr explicit
 	  operator array<_Tp, _Np>() const
 	  {
 	    array<_Tp, _Np> __r;
@@ -1191,13 +1189,13 @@  struct _MaskBase
 	// _SimdCastType {{{
 	struct _SimdCastType
 	{
-	  _GLIBCXX_SIMD_ALWAYS_INLINE
+	  _GLIBCXX_SIMD_ALWAYS_INLINE constexpr
 	  _SimdCastType(const array<_Tp, _Np>&);
 
-	  _GLIBCXX_SIMD_ALWAYS_INLINE
+	  _GLIBCXX_SIMD_ALWAYS_INLINE constexpr
 	  _SimdCastType(const _SimdMember& dd) : _M_data(dd) {}
 
-	  _GLIBCXX_SIMD_ALWAYS_INLINE explicit
+	  _GLIBCXX_SIMD_ALWAYS_INLINE constexpr explicit
 	  operator const _SimdMember &() const { return _M_data; }
 
 	private:
@@ -1282,7 +1280,7 @@  _S_broadcast(_Tp __x) noexcept
 
     // _S_load {{{2
     template <typename _Tp, typename _Up>
-      _GLIBCXX_SIMD_INTRINSIC static _SimdMember<_Tp>
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _SimdMember<_Tp>
       _S_load(const _Up* __mem, _TypeTag<_Tp>) noexcept
       {
 	return _SimdMember<_Tp>::_S_generate(
@@ -1301,10 +1299,10 @@  _S_masked_load(const _SimdTuple<_Tp, _As...>& __old,
 	__for_each(__merge, [&](auto __meta, auto& __native) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 	  if (__meta._S_submask(__bits).any())
 #pragma GCC diagnostic push
-	  // __mem + __mem._S_offset could be UB ([expr.add]/4.3, but it punts
-	  // the responsibility for avoiding UB to the caller of the masked load
-	  // via the mask. Consequently, the compiler may assume this branch is
-	  // unreachable, if the pointer arithmetic is UB.
+	    // Dereferencing __mem + __meta._S_offset could be UB ([expr.add]/4.3).
+	    // It is the responsibility of the caller of the masked load (via the mask's value) to
+	    // avoid UB. Consequently, the compiler may assume this branch is unreachable, if the
+	    // pointer arithmetic is UB.
 #pragma GCC diagnostic ignored "-Warray-bounds"
 	    __native
 	      = __meta._S_masked_load(__native, __meta._S_make_mask(__bits),
@@ -1316,7 +1314,7 @@  _S_masked_load(const _SimdTuple<_Tp, _As...>& __old,
 
     // _S_store {{{2
     template <typename _Tp, typename _Up>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_store(const _SimdMember<_Tp>& __v, _Up* __mem, _TypeTag<_Tp>) noexcept
       {
 	__for_each(__v, [&](auto __meta, auto __native) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
@@ -1346,7 +1344,7 @@  _S_masked_store(const _SimdTuple<_Tp, _As...>& __v, _Up* __mem,
 
     // negation {{{2
     template <typename _Tp, typename... _As>
-      static inline _MaskMember
+      static constexpr inline _MaskMember
       _S_negate(const _SimdTuple<_Tp, _As...>& __x) noexcept
       {
 	_MaskMember __bits = 0;
@@ -1699,7 +1697,7 @@  __for_each(
     // compares {{{2
 #define _GLIBCXX_SIMD_CMP_OPERATIONS(__cmp)                                    \
     template <typename _Tp, typename... _As>                                   \
-      _GLIBCXX_SIMD_INTRINSIC constexpr static _MaskMember                     \
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember                     \
       __cmp(const _SimdTuple<_Tp, _As...>& __x,                                \
 	    const _SimdTuple<_Tp, _As...>& __y)                                \
       {                                                                        \
@@ -1723,13 +1721,13 @@  __for_each(
 
     // smart_reference access {{{2
     template <typename _Tp, typename... _As, typename _Up>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_set(_SimdTuple<_Tp, _As...>& __v, int __i, _Up&& __x) noexcept
       { __v._M_set(__i, static_cast<_Up&&>(__x)); }
 
     // _S_masked_assign {{{2
     template <typename _Tp, typename... _As>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_masked_assign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
 		       const __type_identity_t<_SimdTuple<_Tp, _As...>>& __rhs)
       {
@@ -1745,7 +1743,7 @@  _S_masked_assign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
     // Optimization for the case where the RHS is a scalar. No need to broadcast
     // the scalar to a simd first.
     template <typename _Tp, typename... _As>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_masked_assign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
 		       const __type_identity_t<_Tp> __rhs)
       {
@@ -1758,7 +1756,7 @@  __for_each(
 
     // _S_masked_cassign {{{2
     template <typename _Op, typename _Tp, typename... _As>
-      static inline void
+      static constexpr inline void
       _S_masked_cassign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
 			const _SimdTuple<_Tp, _As...>& __rhs, _Op __op)
       {
@@ -1774,7 +1772,7 @@  _S_masked_cassign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
     // Optimization for the case where the RHS is a scalar. No need to broadcast
     // the scalar to a simd first.
     template <typename _Op, typename _Tp, typename... _As>
-      static inline void
+      static constexpr inline void
       _S_masked_cassign(const _MaskMember __bits, _SimdTuple<_Tp, _As...>& __lhs,
 			const _Tp& __rhs, _Op __op)
       {
@@ -1787,7 +1785,7 @@  __for_each(
 
     // _S_masked_unary {{{2
     template <template <typename> class _Op, typename _Tp, typename... _As>
-      static inline _SimdTuple<_Tp, _As...>
+      static constexpr inline _SimdTuple<_Tp, _As...>
       _S_masked_unary(const _MaskMember __bits, const _SimdTuple<_Tp, _As...>& __v)
       {
 	return __v._M_apply_wrapped([&__bits](auto __meta,
@@ -1834,6 +1832,13 @@  _S_broadcast(bool __x)
       _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
       _S_load(const bool* __mem)
       {
+	if (__builtin_is_constant_evaluated())
+	  {
+	    _MaskMember __r{};
+	    for (size_t __i = 0; __i < _Np; ++__i)
+	      __r.set(__i, __mem[__i]);
+	    return __r;
+	  }
 	using _Ip = __int_for_sizeof_t<bool>;
 	// the following load uses element_aligned and relies on __mem already
 	// carrying alignment information from when this load function was
@@ -1869,12 +1874,12 @@  _S_convert(simd_mask<_Up, _UAbi> __x)
     // }}}
     // _S_from_bitmask {{{2
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+      _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
       _S_from_bitmask(_MaskMember __bits, _TypeTag<_Tp>) noexcept
       { return __bits; }
 
     // _S_load {{{2
-    static inline _MaskMember
+    static constexpr inline _MaskMember
     _S_load(const bool* __mem) noexcept
     {
       // TODO: _UChar is not necessarily the best type to use here. For smaller
@@ -1890,7 +1895,7 @@  _S_load(const bool* __mem) noexcept
     }
 
     // _S_masked_load {{{2
-    static inline _MaskMember
+    static constexpr inline _MaskMember
     _S_masked_load(_MaskMember __merge, _MaskMember __mask, const bool* __mem) noexcept
     {
       _BitOps::_S_bit_iteration(__mask.to_ullong(),
@@ -1901,7 +1906,7 @@  _S_load(const bool* __mem) noexcept
     }
 
     // _S_store {{{2
-    static inline void
+    static constexpr inline void
     _S_store(const _MaskMember __bitmask, bool* __mem) noexcept
     {
       if constexpr (_Np == 1)
@@ -1911,7 +1916,7 @@  _S_store(const _MaskMember __bitmask, bool* __mem) noexcept
     }
 
     // _S_masked_store {{{2
-    static inline void
+    static constexpr inline void
     _S_masked_store(const _MaskMember __v, bool* __mem, const _MaskMember __k) noexcept
     {
       _BitOps::_S_bit_iteration(
@@ -1919,11 +1924,11 @@  _S_store(const _MaskMember __bitmask, bool* __mem) noexcept
     }
 
     // logical and bitwise operators {{{2
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
     _S_logical_and(const _MaskMember& __x, const _MaskMember& __y) noexcept
     { return __x & __y; }
 
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
     _S_logical_or(const _MaskMember& __x, const _MaskMember& __y) noexcept
     { return __x | __y; }
 
@@ -1931,30 +1936,30 @@  _S_logical_or(const _MaskMember& __x, const _MaskMember& __y) noexcept
     _S_bit_not(const _MaskMember& __x) noexcept
     { return ~__x; }
 
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
     _S_bit_and(const _MaskMember& __x, const _MaskMember& __y) noexcept
     { return __x & __y; }
 
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
     _S_bit_or(const _MaskMember& __x, const _MaskMember& __y) noexcept
     { return __x | __y; }
 
-    _GLIBCXX_SIMD_INTRINSIC static _MaskMember
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _MaskMember
     _S_bit_xor(const _MaskMember& __x, const _MaskMember& __y) noexcept
     { return __x ^ __y; }
 
     // smart_reference access {{{2
-    _GLIBCXX_SIMD_INTRINSIC static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_set(_MaskMember& __k, int __i, bool __x) noexcept
     { __k.set(__i, __x); }
 
     // _S_masked_assign {{{2
-    _GLIBCXX_SIMD_INTRINSIC static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_masked_assign(const _MaskMember __k, _MaskMember& __lhs, const _MaskMember __rhs)
     { __lhs = (__lhs & ~__k) | (__rhs & __k); }
 
     // Optimization for the case where the RHS is a scalar.
-    _GLIBCXX_SIMD_INTRINSIC static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_masked_assign(const _MaskMember __k, _MaskMember& __lhs, const bool __rhs)
     {
       if (__rhs)
@@ -1966,28 +1971,28 @@  _S_bit_xor(const _MaskMember& __x, const _MaskMember& __y) noexcept
     // }}}2
     // _S_all_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool
+      _GLIBCXX_SIMD_INTRINSIC static constexpr bool
       _S_all_of(simd_mask<_Tp, _Abi> __k)
       { return __data(__k).all(); }
 
     // }}}
     // _S_any_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool
+      _GLIBCXX_SIMD_INTRINSIC static constexpr bool
       _S_any_of(simd_mask<_Tp, _Abi> __k)
       { return __data(__k).any(); }
 
     // }}}
     // _S_none_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool
+      _GLIBCXX_SIMD_INTRINSIC static constexpr bool
       _S_none_of(simd_mask<_Tp, _Abi> __k)
       { return __data(__k).none(); }
 
     // }}}
     // _S_some_of {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static bool
+      _GLIBCXX_SIMD_INTRINSIC static constexpr bool
       _S_some_of([[maybe_unused]] simd_mask<_Tp, _Abi> __k)
       {
 	if constexpr (_Np == 1)
@@ -1999,21 +2004,21 @@  _S_none_of(simd_mask<_Tp, _Abi> __k)
     // }}}
     // _S_popcount {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static int
+      _GLIBCXX_SIMD_INTRINSIC static constexpr int
       _S_popcount(simd_mask<_Tp, _Abi> __k)
       { return __data(__k).count(); }
 
     // }}}
     // _S_find_first_set {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static int
+      _GLIBCXX_SIMD_INTRINSIC static constexpr int
       _S_find_first_set(simd_mask<_Tp, _Abi> __k)
       { return std::__countr_zero(__data(__k).to_ullong()); }
 
     // }}}
     // _S_find_last_set {{{
     template <typename _Tp>
-      _GLIBCXX_SIMD_INTRINSIC static int
+      _GLIBCXX_SIMD_INTRINSIC static constexpr int
       _S_find_last_set(simd_mask<_Tp, _Abi> __k)
       { return std::__bit_width(__data(__k).to_ullong()) - 1; }
 
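To illustrate the constant-evaluated fallback in _S_load above: the bitmask is
built element by element because memcpy-style type punning and vector builtins
are not usable in constant expressions. A minimal standalone sketch of the
same idea (load_bools is hypothetical, not part of the patch):

  #include <cstddef>

  // Hypothetical sketch (not part of the patch): a bool-array load into a
  // bitmask that is valid in constant expressions -- no memcpy, no vector
  // builtins, just an element-wise loop like the _S_load fallback above.
  template <std::size_t N>
    constexpr unsigned long long
    load_bools(const bool* mem)
    {
      unsigned long long r = 0;
      for (std::size_t i = 0; i < N; ++i)
        r |= static_cast<unsigned long long>(mem[i]) << i;
      return r;
    }

  constexpr bool bs[4] = {true, false, true, true};
  static_assert(load_bools<4>(bs) == 0b1101);
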
diff --git a/libstdc++-v3/include/experimental/bits/simd_scalar.h b/libstdc++-v3/include/experimental/bits/simd_scalar.h
index 1a1cc46fbe0..b88e13ff8bc 100644
--- a/libstdc++-v3/include/experimental/bits/simd_scalar.h
+++ b/libstdc++-v3/include/experimental/bits/simd_scalar.h
@@ -152,13 +152,13 @@  _S_broadcast(_Tp __x) noexcept
 
   // _S_load {{{2
   template <typename _Tp, typename _Up>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
     _S_load(const _Up* __mem, _TypeTag<_Tp>) noexcept
     { return static_cast<_Tp>(__mem[0]); }
 
   // _S_masked_load {{{2
   template <typename _Tp, typename _Up>
-    _GLIBCXX_SIMD_INTRINSIC static _Tp
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
     _S_masked_load(_Tp __merge, bool __k, const _Up* __mem) noexcept
     {
       if (__k)
@@ -168,13 +168,13 @@  _S_broadcast(_Tp __x) noexcept
 
   // _S_store {{{2
   template <typename _Tp, typename _Up>
-    _GLIBCXX_SIMD_INTRINSIC static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_store(_Tp __v, _Up* __mem, _TypeTag<_Tp>) noexcept
     { __mem[0] = static_cast<_Up>(__v); }
 
   // _S_masked_store {{{2
   template <typename _Tp, typename _Up>
-    _GLIBCXX_SIMD_INTRINSIC static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_masked_store(const _Tp __v, _Up* __mem, const bool __k) noexcept
     { if (__k) __mem[0] = __v; }
 
@@ -572,101 +572,101 @@  _S_fmin(_Tp __x, _Tp __y)
     { return std::remquo(__x, __y, &__z->first); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static _ST<int>
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _ST<int>
     _S_fpclassify(_Tp __x)
     { return {std::fpclassify(__x)}; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_isfinite(_Tp __x)
     { return std::isfinite(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_isinf(_Tp __x)
     { return std::isinf(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_isnan(_Tp __x)
     { return std::isnan(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_isnormal(_Tp __x)
     { return std::isnormal(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_signbit(_Tp __x)
     { return std::signbit(__x); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_isgreater(_Tp __x, _Tp __y)
     { return std::isgreater(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_isgreaterequal(_Tp __x, _Tp __y)
     { return std::isgreaterequal(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_isless(_Tp __x, _Tp __y)
     { return std::isless(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_islessequal(_Tp __x, _Tp __y)
     { return std::islessequal(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_islessgreater(_Tp __x, _Tp __y)
     { return std::islessgreater(__x, __y); }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_isunordered(_Tp __x, _Tp __y)
     { return std::isunordered(__x, __y); }
 
   // _S_increment & _S_decrement{{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_increment(_Tp& __x)
     { ++__x; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_decrement(_Tp& __x)
     { --__x; }
 
 
   // compares {{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_equal_to(_Tp __x, _Tp __y)
     { return __x == __y; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_not_equal_to(_Tp __x, _Tp __y)
     { return __x != __y; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_less(_Tp __x, _Tp __y)
     { return __x < __y; }
 
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_less_equal(_Tp __x, _Tp __y)
     { return __x <= __y; }
 
   // smart_reference access {{{2
   template <typename _Tp, typename _Up>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_set(_Tp& __v, [[maybe_unused]] int __i, _Up&& __x) noexcept
     {
       _GLIBCXX_DEBUG_ASSERT(__i == 0);
@@ -675,19 +675,19 @@  _S_less_equal(_Tp __x, _Tp __y)
 
   // _S_masked_assign {{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_masked_assign(bool __k, _Tp& __lhs, _Tp __rhs)
     { if (__k) __lhs = __rhs; }
 
   // _S_masked_cassign {{{2
   template <typename _Op, typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_masked_cassign(const bool __k, _Tp& __lhs, const _Tp __rhs, _Op __op)
     { if (__k) __lhs = __op(_SimdImplScalar{}, __lhs, __rhs); }
 
   // _S_masked_unary {{{2
   template <template <typename> class _Op, typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static _Tp
+    _GLIBCXX_SIMD_INTRINSIC static constexpr _Tp
     _S_masked_unary(const bool __k, const _Tp __v)
     { return static_cast<_Tp>(__k ? _Op<_Tp>{}(__v) : __v); }
 
@@ -737,12 +737,12 @@  _S_convert(simd_mask<_Up, _UAbi> __x)
   // }}}
   // _S_from_bitmask {{{2
   template <typename _Tp>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_from_bitmask(_SanitizedBitMask<1> __bits, _TypeTag<_Tp>) noexcept
     { return __bits[0]; }
 
   // _S_masked_load {{{2
-  _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+  _GLIBCXX_SIMD_INTRINSIC static constexpr bool
   _S_masked_load(bool __merge, bool __mask, const bool* __mem) noexcept
   {
     if (__mask)
@@ -751,12 +751,12 @@  _S_convert(simd_mask<_Up, _UAbi> __x)
   }
 
   // _S_store {{{2
-  _GLIBCXX_SIMD_INTRINSIC static void
+  _GLIBCXX_SIMD_INTRINSIC static constexpr void
   _S_store(bool __v, bool* __mem) noexcept
   { __mem[0] = __v; }
 
   // _S_masked_store {{{2
-  _GLIBCXX_SIMD_INTRINSIC static void
+  _GLIBCXX_SIMD_INTRINSIC static constexpr void
   _S_masked_store(const bool __v, bool* __mem, const bool __k) noexcept
   {
     if (__k)
@@ -789,7 +789,7 @@  _S_bit_xor(bool __x, bool __y)
   { return __x != __y; }
 
   // smart_reference access {{{2
-  _GLIBCXX_SIMD_INTRINSIC constexpr static void
+  _GLIBCXX_SIMD_INTRINSIC static constexpr void
   _S_set(bool& __k, [[maybe_unused]] int __i, bool __x) noexcept
   {
     _GLIBCXX_DEBUG_ASSERT(__i == 0);
@@ -797,7 +797,7 @@  _S_bit_xor(bool __x, bool __y)
   }
 
   // _S_masked_assign {{{2
-  _GLIBCXX_SIMD_INTRINSIC static void
+  _GLIBCXX_SIMD_INTRINSIC static constexpr void
   _S_masked_assign(bool __k, bool& __lhs, bool __rhs)
   {
     if (__k)
@@ -807,49 +807,49 @@  _S_bit_xor(bool __x, bool __y)
   // }}}2
   // _S_all_of {{{
   template <typename _Tp, typename _Abi>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_all_of(simd_mask<_Tp, _Abi> __k)
     { return __k._M_data; }
 
   // }}}
   // _S_any_of {{{
   template <typename _Tp, typename _Abi>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_any_of(simd_mask<_Tp, _Abi> __k)
     { return __k._M_data; }
 
   // }}}
   // _S_none_of {{{
   template <typename _Tp, typename _Abi>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_none_of(simd_mask<_Tp, _Abi> __k)
     { return !__k._M_data; }
 
   // }}}
   // _S_some_of {{{
   template <typename _Tp, typename _Abi>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static bool
+    _GLIBCXX_SIMD_INTRINSIC static constexpr bool
     _S_some_of(simd_mask<_Tp, _Abi>)
     { return false; }
 
   // }}}
   // _S_popcount {{{
   template <typename _Tp, typename _Abi>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static int
+    _GLIBCXX_SIMD_INTRINSIC static constexpr int
     _S_popcount(simd_mask<_Tp, _Abi> __k)
     { return __k._M_data; }
 
   // }}}
   // _S_find_first_set {{{
   template <typename _Tp, typename _Abi>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static int
+    _GLIBCXX_SIMD_INTRINSIC static constexpr int
     _S_find_first_set(simd_mask<_Tp, _Abi>)
     { return 0; }
 
   // }}}
   // _S_find_last_set {{{
   template <typename _Tp, typename _Abi>
-    _GLIBCXX_SIMD_INTRINSIC constexpr static int
+    _GLIBCXX_SIMD_INTRINSIC static constexpr int
     _S_find_last_set(simd_mask<_Tp, _Abi>)
     { return 0; }
 
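The simd_x86.h changes below all follow the same pattern: a plain
`if (__builtin_is_constant_evaluated())` in front of the existing
`if constexpr` intrinsic dispatch. It must be a plain `if`: in an
`if constexpr` condition the call is manifestly constant-evaluated and
therefore always returns true, which would discard the intrinsic branches
even for runtime calls. A minimal sketch of the pattern (mask_and is
hypothetical, not part of the patch):

  // Hypothetical sketch (not part of the patch) of the dispatch pattern
  // added throughout simd_x86.h below.
  constexpr unsigned char
  mask_and(unsigned char x, unsigned char y)
  {
    if (__builtin_is_constant_evaluated())
      return x & y;   // constant-expression-safe fallback
    else
      return x & y;   // runtime path; the patch calls intrinsics such as
                      // _kand_mask8 here instead
  }

  static_assert(mask_and(0b0110, 0b0011) == 0b0010); // constant evaluation
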
diff --git a/libstdc++-v3/include/experimental/bits/simd_x86.h b/libstdc++-v3/include/experimental/bits/simd_x86.h
index fc3e96d696c..77d2f84ab71 100644
--- a/libstdc++-v3/include/experimental/bits/simd_x86.h
+++ b/libstdc++-v3/include/experimental/bits/simd_x86.h
@@ -510,12 +510,14 @@  _S_converts_via_decomposition()
   using _CommonImplBuiltin::_S_store;
 
   template <typename _Tp, size_t _Np>
-    _GLIBCXX_SIMD_INTRINSIC static void
+    _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_store(_SimdWrapper<_Tp, _Np> __x, void* __addr)
     {
       constexpr size_t _Bytes = _Np * sizeof(_Tp);
 
-      if constexpr ((_Bytes & (_Bytes - 1)) != 0 && __have_avx512bw_vl)
+      if (__builtin_is_constant_evaluated())
+	_CommonImplBuiltin::_S_store(__x, __addr);
+      else if constexpr ((_Bytes & (_Bytes - 1)) != 0 && __have_avx512bw_vl)
 	{
 	  const auto __v = __to_intrin(__x);
 
@@ -581,7 +583,9 @@  static_assert(
     _GLIBCXX_SIMD_INTRINSIC static constexpr void
     _S_store_bool_array(const _BitMask<_Np, _Sanitized> __x, bool* __mem)
     {
-      if constexpr (__have_avx512bw_vl) // don't care for BW w/o VL
+      if (__builtin_is_constant_evaluated())
+	_CommonImplBuiltin::_S_store_bool_array(__x, __mem);
+      else if constexpr (__have_avx512bw_vl) // don't care for BW w/o VL
 	_S_store<_Np>(1 & __vector_bitcast<_UChar, _Np>(
 			    [=]() constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 			      if constexpr (_Np <= 16)
@@ -2319,14 +2323,14 @@  _S_equal_to(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
 	  } // }}}
 	else if (__builtin_is_constant_evaluated())
 	  return _Base::_S_equal_to(__x, __y);
-	else if constexpr (sizeof(__x) == 8) // {{{
+	else if constexpr (sizeof(__x) == 8)
 	  {
 	    const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
 				== __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
-	    _MaskMember<_Tp> __r64;
+	    _MaskMember<_Tp> __r64{};
 	    __builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
 	    return __r64;
-	  } // }}}
+	  }
 	else
 	  return _Base::_S_equal_to(__x, __y);
       }
@@ -2397,7 +2401,7 @@  _S_not_equal_to(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
 	  {
 	    const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
 				!= __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
-	    _MaskMember<_Tp> __r64;
+	    _MaskMember<_Tp> __r64{};
 	    __builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
 	    return __r64;
 	  }
@@ -2505,7 +2509,7 @@  _S_less(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
 	  {
 	    const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
 				< __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
-	    _MaskMember<_Tp> __r64;
+	    _MaskMember<_Tp> __r64{};
 	    __builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
 	    return __r64;
 	  }
@@ -2613,7 +2617,7 @@  _S_less_equal(_SimdWrapper<_Tp, _Np> __x, _SimdWrapper<_Tp, _Np> __y)
 	  {
 	    const auto __r128 = __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__x)
 				<= __vector_bitcast<_Tp, 16 / sizeof(_Tp)>(__y);
-	    _MaskMember<_Tp> __r64;
+	    _MaskMember<_Tp> __r64{};
 	    __builtin_memcpy(&__r64._M_data, &__r128, sizeof(__r64));
 	    return __r64;
 	  }
@@ -4409,7 +4413,19 @@  _S_broadcast(bool __x)
       _S_load(const bool* __mem)
       {
 	static_assert(is_same_v<_Tp, __int_for_sizeof_t<_Tp>>);
-	if constexpr (__have_avx512bw)
+	if (__builtin_is_constant_evaluated())
+	  {
+	    if constexpr (__is_avx512_abi<_Abi>())
+	      {
+		_MaskMember<_Tp> __r{};
+		for (size_t __i = 0; __i < _S_size<_Tp>; ++__i)
+		  __r._M_data |= _ULLong(__mem[__i]) << __i;
+		return __r;
+	      }
+	    else
+	      return _Base::template _S_load<_Tp>(__mem);
+	  }
+	else if constexpr (__have_avx512bw)
 	  {
 	    const auto __to_vec_or_bits
 	      = [](auto __bits) _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA -> decltype(auto) {
@@ -4677,10 +4693,12 @@  _mm256_cvtepi8_epi64(
 
     // _S_store {{{2
     template <typename _Tp, size_t _Np>
-      _GLIBCXX_SIMD_INTRINSIC static void
+      _GLIBCXX_SIMD_INTRINSIC static constexpr void
       _S_store(_SimdWrapper<_Tp, _Np> __v, bool* __mem) noexcept
       {
-	if constexpr (__is_avx512_abi<_Abi>())
+	if (__builtin_is_constant_evaluated())
+	  _Base::_S_store(__v, __mem);
+	else if constexpr (__is_avx512_abi<_Abi>())
 	  {
 	    if constexpr (__have_avx512bw_vl)
 	      _CommonImplX86::_S_store<_Np>(
@@ -4762,7 +4780,7 @@  _mm512_mask_cvtepi32_storeu_epi8(
 	    if constexpr (_Np <= 4 && sizeof(_Tp) == 8)
 	      {
 		auto __k = __intrin_bitcast<__m256i>(__to_intrin(__v));
-		int __bool4;
+		int __bool4{};
 		if constexpr (__have_avx2)
 		  __bool4 = _mm256_movemask_epi8(__k);
 		else
@@ -4846,7 +4864,9 @@  _S_logical_and(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>&
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
-	    if constexpr (__have_avx512dq && _Np <= 8)
+	    if (__builtin_is_constant_evaluated())
+	      return __x._M_data & __y._M_data;
+	    else if constexpr (__have_avx512dq && _Np <= 8)
 	      return _kand_mask8(__x._M_data, __y._M_data);
 	    else if constexpr (_Np <= 16)
 	      return _kand_mask16(__x._M_data, __y._M_data);
@@ -4867,7 +4887,9 @@  _S_logical_or(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& _
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
-	    if constexpr (__have_avx512dq && _Np <= 8)
+	    if (__builtin_is_constant_evaluated())
+	      return __x._M_data | __y._M_data;
+	    else if constexpr (__have_avx512dq && _Np <= 8)
 	      return _kor_mask8(__x._M_data, __y._M_data);
 	    else if constexpr (_Np <= 16)
 	      return _kor_mask16(__x._M_data, __y._M_data);
@@ -4888,7 +4910,9 @@  _S_bit_not(const _SimdWrapper<_Tp, _Np>& __x)
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
-	    if constexpr (__have_avx512dq && _Np <= 8)
+	    if (__builtin_is_constant_evaluated())
+	      return __x._M_data ^ _Abi::template __implicit_mask_n<_Np>();
+	    else if constexpr (__have_avx512dq && _Np <= 8)
 	      return _kandn_mask8(__x._M_data,
 				  _Abi::template __implicit_mask_n<_Np>());
 	    else if constexpr (_Np <= 16)
@@ -4913,7 +4937,9 @@  _S_bit_and(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
-	    if constexpr (__have_avx512dq && _Np <= 8)
+	    if (__builtin_is_constant_evaluated())
+	      return __x._M_data & __y._M_data;
+	    else if constexpr (__have_avx512dq && _Np <= 8)
 	      return _kand_mask8(__x._M_data, __y._M_data);
 	    else if constexpr (_Np <= 16)
 	      return _kand_mask16(__x._M_data, __y._M_data);
@@ -4934,7 +4960,9 @@  _S_bit_or(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
-	    if constexpr (__have_avx512dq && _Np <= 8)
+	    if (__builtin_is_constant_evaluated())
+	      return __x._M_data | __y._M_data;
+	    else if constexpr (__have_avx512dq && _Np <= 8)
 	      return _kor_mask8(__x._M_data, __y._M_data);
 	    else if constexpr (_Np <= 16)
 	      return _kor_mask16(__x._M_data, __y._M_data);
@@ -4955,7 +4983,9 @@  _S_bit_xor(const _SimdWrapper<_Tp, _Np>& __x, const _SimdWrapper<_Tp, _Np>& __y)
       {
 	if constexpr (is_same_v<_Tp, bool>)
 	  {
-	    if constexpr (__have_avx512dq && _Np <= 8)
+	    if (__builtin_is_constant_evaluated())
+	      return __x._M_data ^ __y._M_data;
+	    else if constexpr (__have_avx512dq && _Np <= 8)
 	      return _kxor_mask8(__x._M_data, __y._M_data);
 	    else if constexpr (_Np <= 16)
 	      return _kxor_mask16(__x._M_data, __y._M_data);
diff --git a/libstdc++-v3/testsuite/experimental/simd/pr109261_constexpr_simd.cc b/libstdc++-v3/testsuite/experimental/simd/pr109261_constexpr_simd.cc
new file mode 100644
index 00000000000..f1ce39000d4
--- /dev/null
+++ b/libstdc++-v3/testsuite/experimental/simd/pr109261_constexpr_simd.cc
@@ -0,0 +1,109 @@ 
+// { dg-options "-std=gnu++17" }
+// { dg-do compile { target c++17 } }
+// { dg-require-cmath "" }
+
+// Copyright (C) 2020-2023 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+#include <experimental/simd>
+
+namespace stdx = std::experimental;
+
+template <typename T, typename V>
+  void
+  test01()
+  {
+    constexpr T data[V::size()] = {};
+    constexpr auto a = V(data, stdx::element_aligned);
+
+    constexpr auto b = []() constexpr {
+      V x = T(1);
+      where(x > T(), x) = T();
+      where(x < T(), x) += T();
+      where(x >= T(), x) -= T();
+      where(x <= T(), x) *= T();
+      where(x == T(), x) /= T(1);
+      where(x != T(), x) += T(1);
+      return x;
+    }();
+
+    constexpr T c = V()[0];
+
+    constexpr auto d = !V() && !!V() || !V() & !V() | !V() ^ !V();
+
+    constexpr auto e = []() constexpr {
+      T data[V::size()] = {};
+      V(T(1)).copy_to(data, stdx::element_aligned);
+      V x = T();
+      x[0] = T(1);
+      x.copy_from(data, stdx::element_aligned);
+      bool mask[V::size()] = {};
+      auto k = hmin(x + x - x * x) == x / x;
+      k.copy_to(mask, stdx::element_aligned);
+      mask[0] = false;
+      using M = typename V::mask_type;
+      return M(mask, stdx::element_aligned);
+    }();
+
+    static_assert(not e[0]);
+    static_assert(popcount(e) == V::size() - 1);
+
+    static_assert(all_of(V(T(1)) == []() constexpr {
+      float data[V::size()] = {};
+      V(T(1)).copy_to(data, stdx::element_aligned);
+      V x = T();
+      x.copy_from(data, stdx::element_aligned);
+      return x;
+    }()));
+
+    static_assert(hmin(V()) == T());
+    static_assert(hmax(V()) == T());
+    static_assert(reduce(V(1)) == T(V::size()));
+  }
+
+template <typename T>
+  void
+  iterate_abis()
+  {
+    test01<T, stdx::simd<T, stdx::simd_abi::scalar>>();
+    test01<T, stdx::simd<T>>();
+    test01<T, stdx::native_simd<T>>();
+    test01<T, stdx::fixed_size_simd<T, 3>>();
+    test01<T, stdx::fixed_size_simd<T, stdx::simd_abi::max_fixed_size<T> - 4>>();
+  }
+
+int main()
+{
+  iterate_abis<char>();
+  iterate_abis<wchar_t>();
+  iterate_abis<char16_t>();
+  iterate_abis<char32_t>();
+
+  iterate_abis<signed char>();
+  iterate_abis<unsigned char>();
+  iterate_abis<short>();
+  iterate_abis<unsigned short>();
+  iterate_abis<int>();
+  iterate_abis<unsigned int>();
+  iterate_abis<long>();
+  iterate_abis<unsigned long>();
+  iterate_abis<long long>();
+  iterate_abis<unsigned long long>();
+  iterate_abis<float>();
+  iterate_abis<double>();
+  iterate_abis<long double>();
+}
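
For reference, a minimal example of what the patch enables, in the same
spirit as the test above (assumes -std=gnu++17 or later):

  #include <experimental/simd>

  namespace stdx = std::experimental;

  constexpr stdx::fixed_size_simd<int, 4> v = 1; // constexpr broadcast
  static_assert(reduce(v) == 4);                 // constexpr reduction
  static_assert(all_of(v == 1));                 // constexpr compare + mask query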