[v2] aarch64: Fix +nopredres, +nols64 and +nomops
Checks
Commit Message
On Sat, Dec 09, 2023 at 07:22:49PM +0000, Richard Sandiford wrote:
> Andrew Carlotti <andrew.carlotti@arm.com> writes:
> > ...
>
> This is the only use of native_detect_p, so it'd be good to remove
> the field itself.
Done
> > ...
> >
> > @@ -447,6 +451,13 @@ host_detect_local_cpu (int argc, const char **argv)
> > if (tune)
> > return res;
> >
> > + if (!processed_exts)
> > + goto not_found;
>
> Could you explain this part? It seems like more of a parsing change
> (i.e. being more strict about what we accept).
>
> If that's the intention, it probably belongs in:
>
> if (n_cores == 0
> || n_cores > 2
> || (n_cores == 1 && n_variants != 1)
> || imp == INVALID_IMP)
> goto not_found;
>
> But maybe it should be a separate patch.
I added it because I realised that the parsing behaviour didn't make sense in
that case, and my patch happens to change the behaviour as well (the outcome
without the check would be no enabled features, whereas previously it would
enable only the features with no native detection).
I agree that it makes sense to put it with the original check, so I've made that change.
> Looks good otherwise, thanks.
>
> Richard
New patch version below, ok for master?
---
For native cpu feature detection, certain features have no entry in
/proc/cpuinfo, so have to be assumed to be present whenever the detected
cpu is supposed to support that feature.
However, the logic for this was mistakenly implemented by excluding
these features from part of aarch64_get_extension_string_for_isa_flags.
This function is also used elsewhere when canonicalising explicit
feature sets, which may require removing features that are normally
implied by the specified architecture version.
This change reenables generation of +nopredres, +nols64 and +nomops
during canonicalisation, by relocating the misplaced native cpu
detection logic.
gcc/ChangeLog:
* common/config/aarch64/aarch64-common.cc
(struct aarch64_option_extension): Remove unused field.
(all_extensions): Ditto.
(aarch64_get_extension_string_for_isa_flags): Remove filtering
of features without native detection.
* config/aarch64/driver-aarch64.cc (host_detect_local_cpu):
Explicitly add expected features that lack cpuinfo detection.
gcc/testsuite/ChangeLog:
* gcc.target/aarch64/options_set_28.c: New test.
Comments
Andrew Carlotti <andrew.carlotti@arm.com> writes:
> On Sat, Dec 09, 2023 at 07:22:49PM +0000, Richard Sandiford wrote:
>> Andrew Carlotti <andrew.carlotti@arm.com> writes:
>> > ...
>>
>> This is the only use of native_detect_p, so it'd be good to remove
>> the field itself.
>
> Done
>
>> > ...
>> >
>> > @@ -447,6 +451,13 @@ host_detect_local_cpu (int argc, const char **argv)
>> > if (tune)
>> > return res;
>> >
>> > + if (!processed_exts)
>> > + goto not_found;
>>
>> Could you explain this part? It seems like more of a parsing change
>> (i.e. being more strict about what we accept).
>>
>> If that's the intention, it probably belongs in:
>>
>> if (n_cores == 0
>> || n_cores > 2
>> || (n_cores == 1 && n_variants != 1)
>> || imp == INVALID_IMP)
>> goto not_found;
>>
>> But maybe it should be a separate patch.
>
> I added it because I realised that the parsing behaviour didn't make sense in
> that case, and my patch happens to change the behaviour as well (the outcome
> without the check would be no enabled features, whereas previously it would
> enable only the features with no native detection).
Ah, OK, thanks for the explanation.
> I agree that it makes sense to put it with the original check, so I've made that change.
>
>> Looks good otherwise, thanks.
>>
>> Richard
>
> New patch version below, ok for master?
>
> ---
>
> For native cpu feature detection, certain features have no entry in
> /proc/cpuinfo, so have to be assumed to be present whenever the detected
> cpu is supposed to support that feature.
>
> However, the logic for this was mistakenly implemented by excluding
> these features from part of aarch64_get_extension_string_for_isa_flags.
> This function is also used elsewhere when canonicalising explicit
> feature sets, which may require removing features that are normally
> implied by the specified architecture version.
>
> This change reenables generation of +nopredres, +nols64 and +nomops
> during canonicalisation, by relocating the misplaced native cpu
> detection logic.
>
> gcc/ChangeLog:
>
> * common/config/aarch64/aarch64-common.cc
> (struct aarch64_option_extension): Remove unused field.
> (all_extensions): Ditto.
> (aarch64_get_extension_string_for_isa_flags): Remove filtering
> of features without native detection.
> * config/aarch64/driver-aarch64.cc (host_detect_local_cpu):
> Explicitly add expected features that lack cpuinfo detection.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/options_set_28.c: New test.
OK, thanks.
Richard
> diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc
> index c2a6d357c0bc17996a25ea5c3a40f69d745c7931..4d0431d3a2cad5414790646bce0c09877c0366b2 100644
> --- a/gcc/common/config/aarch64/aarch64-common.cc
> +++ b/gcc/common/config/aarch64/aarch64-common.cc
> @@ -149,9 +149,6 @@ struct aarch64_option_extension
> aarch64_feature_flags flags_on;
> /* If this feature is turned off, these bits also need to be turned off. */
> aarch64_feature_flags flags_off;
> - /* Indicates whether this feature is taken into account during native cpu
> - detection. */
> - bool native_detect_p;
> };
>
> /* ISA extensions in AArch64. */
> @@ -159,10 +156,9 @@ static constexpr aarch64_option_extension all_extensions[] =
> {
> #define AARCH64_OPT_EXTENSION(NAME, IDENT, C, D, E, FEATURE_STRING) \
> {NAME, AARCH64_FL_##IDENT, feature_deps::IDENT ().explicit_on, \
> - feature_deps::get_flags_off (feature_deps::root_off_##IDENT), \
> - FEATURE_STRING[0]},
> + feature_deps::get_flags_off (feature_deps::root_off_##IDENT)},
> #include "config/aarch64/aarch64-option-extensions.def"
> - {NULL, 0, 0, 0, false}
> + {NULL, 0, 0, 0}
> };
>
> struct processor_name_to_arch
> @@ -358,8 +354,7 @@ aarch64_get_extension_string_for_isa_flags
> /* If either crypto flag needs removing here, then both do. */
> flags = flags_crypto;
>
> - if (opt.native_detect_p
> - && (flags & current_flags & ~isa_flags))
> + if (flags & current_flags & ~isa_flags)
> {
> current_flags &= ~opt.flags_off;
> outstr += "+no";
> diff --git a/gcc/config/aarch64/driver-aarch64.cc b/gcc/config/aarch64/driver-aarch64.cc
> index 8e318892b10aa2288421fad418844744a2f5a3b4..c18f065aa41e7328d71b45a53c82a3b703ae44d5 100644
> --- a/gcc/config/aarch64/driver-aarch64.cc
> +++ b/gcc/config/aarch64/driver-aarch64.cc
> @@ -262,6 +262,7 @@ host_detect_local_cpu (int argc, const char **argv)
> unsigned int n_variants = 0;
> bool processed_exts = false;
> aarch64_feature_flags extension_flags = 0;
> + aarch64_feature_flags unchecked_extension_flags = 0;
> aarch64_feature_flags default_flags = 0;
> std::string buf;
> size_t sep_pos = -1;
> @@ -348,7 +349,10 @@ host_detect_local_cpu (int argc, const char **argv)
> /* If the feature contains no HWCAPS string then ignore it for the
> auto detection. */
> if (val.empty ())
> - continue;
> + {
> + unchecked_extension_flags |= aarch64_extensions[i].flag;
> + continue;
> + }
>
> bool enabled = true;
>
> @@ -380,7 +384,8 @@ host_detect_local_cpu (int argc, const char **argv)
> if (n_cores == 0
> || n_cores > 2
> || (n_cores == 1 && n_variants != 1)
> - || imp == INVALID_IMP)
> + || imp == INVALID_IMP
> + || !processed_exts)
> goto not_found;
>
> /* Simple case, one core type or just looking for the arch. */
> @@ -447,6 +452,10 @@ host_detect_local_cpu (int argc, const char **argv)
> if (tune)
> return res;
>
> + /* Add any features that should be be present, but can't be verified using
> + the /proc/cpuinfo "Features" list. */
> + extension_flags |= unchecked_extension_flags & default_flags;
> +
> {
> std::string extension
> = aarch64_get_extension_string_for_isa_flags (extension_flags,
> diff --git a/gcc/testsuite/gcc.target/aarch64/options_set_28.c b/gcc/testsuite/gcc.target/aarch64/options_set_28.c
> new file mode 100644
> index 0000000000000000000000000000000000000000..9e63768581e9d429e9408863942051b1b04761ac
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/options_set_28.c
> @@ -0,0 +1,9 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-march=armv9.3-a+nopredres+nols64+nomops" } */
> +
> +int main ()
> +{
> + return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times {\.arch armv9\.3\-a\+crc\+nopredres\+nols64\+nomops\n} 1 } } */
@@ -149,9 +149,6 @@ struct aarch64_option_extension
aarch64_feature_flags flags_on;
/* If this feature is turned off, these bits also need to be turned off. */
aarch64_feature_flags flags_off;
- /* Indicates whether this feature is taken into account during native cpu
- detection. */
- bool native_detect_p;
};
/* ISA extensions in AArch64. */
@@ -159,10 +156,9 @@ static constexpr aarch64_option_extension all_extensions[] =
{
#define AARCH64_OPT_EXTENSION(NAME, IDENT, C, D, E, FEATURE_STRING) \
{NAME, AARCH64_FL_##IDENT, feature_deps::IDENT ().explicit_on, \
- feature_deps::get_flags_off (feature_deps::root_off_##IDENT), \
- FEATURE_STRING[0]},
+ feature_deps::get_flags_off (feature_deps::root_off_##IDENT)},
#include "config/aarch64/aarch64-option-extensions.def"
- {NULL, 0, 0, 0, false}
+ {NULL, 0, 0, 0}
};
struct processor_name_to_arch
@@ -358,8 +354,7 @@ aarch64_get_extension_string_for_isa_flags
/* If either crypto flag needs removing here, then both do. */
flags = flags_crypto;
- if (opt.native_detect_p
- && (flags & current_flags & ~isa_flags))
+ if (flags & current_flags & ~isa_flags)
{
current_flags &= ~opt.flags_off;
outstr += "+no";
@@ -262,6 +262,7 @@ host_detect_local_cpu (int argc, const char **argv)
unsigned int n_variants = 0;
bool processed_exts = false;
aarch64_feature_flags extension_flags = 0;
+ aarch64_feature_flags unchecked_extension_flags = 0;
aarch64_feature_flags default_flags = 0;
std::string buf;
size_t sep_pos = -1;
@@ -348,7 +349,10 @@ host_detect_local_cpu (int argc, const char **argv)
/* If the feature contains no HWCAPS string then ignore it for the
auto detection. */
if (val.empty ())
- continue;
+ {
+ unchecked_extension_flags |= aarch64_extensions[i].flag;
+ continue;
+ }
bool enabled = true;
@@ -380,7 +384,8 @@ host_detect_local_cpu (int argc, const char **argv)
if (n_cores == 0
|| n_cores > 2
|| (n_cores == 1 && n_variants != 1)
- || imp == INVALID_IMP)
+ || imp == INVALID_IMP
+ || !processed_exts)
goto not_found;
/* Simple case, one core type or just looking for the arch. */
@@ -447,6 +452,10 @@ host_detect_local_cpu (int argc, const char **argv)
if (tune)
return res;
+ /* Add any features that should be be present, but can't be verified using
+ the /proc/cpuinfo "Features" list. */
+ extension_flags |= unchecked_extension_flags & default_flags;
+
{
std::string extension
= aarch64_get_extension_string_for_isa_flags (extension_flags,
new file mode 100644
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=armv9.3-a+nopredres+nols64+nomops" } */
+
+int main ()
+{
+ return 0;
+}
+
+/* { dg-final { scan-assembler-times {\.arch armv9\.3\-a\+crc\+nopredres\+nols64\+nomops\n} 1 } } */