[12/14] platform/x86/intel/ifs: Add current_batch sysfs entry
Commit Message
Initial implementation assumed a single IFS test image file with a
fixed name ff-mm-ss.scan. (where ff, mm, ss refers to family, model
and stepping of the core)
Subsequently, it became evident that supporting more than one
test image file is needed to provide more comprehensive
test coverage. (Test coverage in this scenario refers to testing
more transistors in the core to identify faults)
The other alternative of increasing the size of a single scan test image
file would not work as the upper bound is limited by the size of memory
area reserved by BIOS for loading IFS test image.
Introduce "current_batch" file which accepts a number. Writing a
number to the current_batch file would load the test image file by name
ff-mm-ss-<xy>.scan, where <xy> is the number written to the
"current_batch" file in hex. Range check of the input is done to verify
it not greater than 0xff.
For e.g if the scan test image comprises of 6 files, they would be named
as show below:
06-8f-06-01.scan
06-8f-06-02.scan
06-8f-06-03.scan
06-8f-06-04.scan
06-8f-06-05.scan
06-8f-06-06.scan
And writing 3 to current_batch would result in loading 06-8f-06-03.scan
in the above e.g. The file can also be read to know the currently loaded
file.
And testing a system looks like:
for each scan file
do
load the IFS test image file (write to the batch file)
for each core
do
test the core with this set of tests
done
done
Qualify few error messages with the test image file suffix to
provide better context.
Reviewed-by: Tony Luck <tony.luck@intel.com>
Signed-off-by: Jithu Joseph <jithu.joseph@intel.com>
---
drivers/platform/x86/intel/ifs/ifs.h | 21 +++++++++----
drivers/platform/x86/intel/ifs/core.c | 1 +
drivers/platform/x86/intel/ifs/load.c | 14 +++++----
drivers/platform/x86/intel/ifs/runtest.c | 10 ++++---
drivers/platform/x86/intel/ifs/sysfs.c | 38 ++++++++++++++++++++++++
5 files changed, 70 insertions(+), 14 deletions(-)
Comments
On 10/21/2022 1:34 PM, Jithu Joseph wrote:
> Initial implementation assumed a single IFS test image file with a
> fixed name ff-mm-ss.scan. (where ff, mm, ss refers to family, model
> and stepping of the core)
>
> Subsequently, it became evident that supporting more than one
> test image file is needed to provide more comprehensive
> test coverage. (Test coverage in this scenario refers to testing
> more transistors in the core to identify faults)
>
> The other alternative of increasing the size of a single scan test image
> file would not work as the upper bound is limited by the size of memory
> area reserved by BIOS for loading IFS test image.
>
> Introduce "current_batch" file which accepts a number. Writing a
> number to the current_batch file would load the test image file by name
> ff-mm-ss-<xy>.scan, where <xy> is the number written to the
> "current_batch" file in hex.
Any specific reasoning why the name "current_batch" was chosen? To me,
batch seems to suggest multiple or a group of files. But in reality only
one test file is loaded at a time.
Naming can sometimes be quite subjective so it might be useful to get
multiple opinions here.
As per my understanding, there is sysfs file called run_test which runs
a loaded test. Instead of current_batch how about the name load_test (or
maybe current_test)?
load_test - Write a number less than or equal to 0xff to load an IFS
test image. (Description as-is from the documentation patch)
> * Running tests
> * -------------
> @@ -207,6 +217,7 @@ struct ifs_data {
> int status;
> u64 scan_details;
> int cur_batch;
> + int test_num;
> };
>
> struct ifs_work {
> diff --git a/drivers/platform/x86/intel/ifs/core.c b/drivers/platform/x86/intel/ifs/core.c
> index 5fb7f655c291..1f040837e8eb 100644
> --- a/drivers/platform/x86/intel/ifs/core.c
> +++ b/drivers/platform/x86/intel/ifs/core.c
> @@ -22,6 +22,7 @@ MODULE_DEVICE_TABLE(x86cpu, ifs_cpu_ids);
> static struct ifs_device ifs_device = {
> .data = {
> .integrity_cap_bit = MSR_INTEGRITY_CAPS_PERIODIC_BIST_BIT,
> + .test_num = 0,
Is this initialization really needed? Wouldn't it default to 0?
Maybe if you explain what does test_num refer to it might answer the above?
> +static ssize_t current_batch_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct ifs_data *ifsd = ifs_get_data(dev);
> +
> + if (!ifsd->loaded)
> + return sysfs_emit(buf, "%s\n", "none");
Why not:
sysfs_emit(buf, "none\n");
> + else
> + return sysfs_emit(buf, "0x%02x\n", ifsd->cur_batch);
> +}
Sohil
On 11/1/2022 3:26 PM, Sohil Mehta wrote:
> On 10/21/2022 1:34 PM, Jithu Joseph wrote:
>> Initial implementation assumed a single IFS test image file with a
>> fixed name ff-mm-ss.scan. (where ff, mm, ss refers to family, model
>> and stepping of the core)
>>
>> Subsequently, it became evident that supporting more than one
>> test image file is needed to provide more comprehensive
>> test coverage. (Test coverage in this scenario refers to testing
>> more transistors in the core to identify faults)
>>
>> The other alternative of increasing the size of a single scan test image
>> file would not work as the upper bound is limited by the size of memory
>> area reserved by BIOS for loading IFS test image.
>>
>> Introduce "current_batch" file which accepts a number. Writing a
>> number to the current_batch file would load the test image file by name
>> ff-mm-ss-<xy>.scan, where <xy> is the number written to the
>> "current_batch" file in hex.
>
> Any specific reasoning why the name "current_batch" was chosen? To me, batch seems to suggest multiple or a group of files. But in reality only one test file is loaded at a time.
There would be a batch of test files (6 in the e.g described in the commit message). At a time you can load one by writing its suffix into "current_batch". So "current_batch" was more like a short form for "current_file_in_the_batch_of_files"
>
> Naming can sometimes be quite subjective so it might be useful to get multiple opinions here.
>
> As per my understanding, there is sysfs file called run_test which runs a loaded test. Instead of current_batch how about the name load_test (or maybe current_test)?
Each file would contain multiple tests. All these tests contained within a file would be executed when you write 1 to "runtest".
Given this load_test too would be confusing (more appropriate than "load_test" would be "load_test_file" or "load_file" or "current_file")
>
> load_test - Write a number less than or equal to 0xff to load an IFS test image. (Description as-is from the documentation patch)
>
>
>> * Running tests
>> * -------------
>> @@ -207,6 +217,7 @@ struct ifs_data {
>> int status;
>> u64 scan_details;
>> int cur_batch;
>> + int test_num;
>> };
>> struct ifs_work {
>> diff --git a/drivers/platform/x86/intel/ifs/core.c b/drivers/platform/x86/intel/ifs/core.c
>> index 5fb7f655c291..1f040837e8eb 100644
>> --- a/drivers/platform/x86/intel/ifs/core.c
>> +++ b/drivers/platform/x86/intel/ifs/core.c
>> @@ -22,6 +22,7 @@ MODULE_DEVICE_TABLE(x86cpu, ifs_cpu_ids);
>> static struct ifs_device ifs_device = {
>> .data = {
>> .integrity_cap_bit = MSR_INTEGRITY_CAPS_PERIODIC_BIST_BIT,
>> + .test_num = 0,
>
> Is this initialization really needed? Wouldn't it default to 0?
>
Not strictly needed when we have a single test
> Maybe if you explain what does test_num refer to it might answer the above?
It is currently used to form the path in ifs_load_firmware() for the test image to be loaded
>
>
>> +static ssize_t current_batch_show(struct device *dev,
>> + struct device_attribute *attr, char *buf)
>> +{
>> + struct ifs_data *ifsd = ifs_get_data(dev);
>> +
>> + if (!ifsd->loaded)
>> + return sysfs_emit(buf, "%s\n", "none");
>
> Why not:
>
> sysfs_emit(buf, "none\n");
Will change
Jithu
On 11/1/2022 4:27 PM, Joseph, Jithu wrote:
> Each file would contain multiple tests. All these tests contained within a file would be executed when you write 1 to "runtest".
> Given this load_test too would be confusing (more appropriate than "load_test" would be "load_test_file" or "load_file" or "current_file")
>
Yeah, not sure if any of them are more suitable in that case. Maybe you
can leave it as is until someone suggests a better name or has a strong
preference for one of the above.
(Probably you can add "load_batch" to mix as well.)
Sohil
@@ -33,13 +33,23 @@
* The driver loads the tests into memory reserved BIOS local to each CPU
* socket in a two step process using writes to MSRs to first load the
* SHA hashes for the test. Then the tests themselves. Status MSRs provide
- * feedback on the success/failure of these steps. When a new test file
- * is installed it can be loaded by writing to the driver reload file::
+ * feedback on the success/failure of these steps.
*
- * # echo 1 > /sys/devices/virtual/misc/intel_ifs_0/reload
+ * The test files are kept in a fixed location: /lib/firmware/intel/ifs_0/
+ * For e.g if there are 3 test files, they would be named in the following
+ * fashion:
+ * ff-mm-ss-01.scan
+ * ff-mm-ss-02.scan
+ * ff-mm-ss-03.scan
+ * (where ff refers to family, mm indicates model and ss indicates stepping)
*
- * Similar to microcode, the current version of the scan tests is stored
- * in a fixed location: /lib/firmware/intel/ifs.0/family-model-stepping.scan
+ * A different testfile can be loaded by writing the numerical portion
+ * (e.g 1, 2 or 3 in the above scenario) into the curent_batch file.
+ * To load ff-mm-ss-02.scan, the following command can be used::
+ *
+ * # echo 2 > /sys/devices/virtual/misc/intel_ifs_0/current_batch
+ *
+ * The above file can also be read to know the currently loaded image.
*
* Running tests
* -------------
@@ -207,6 +217,7 @@ struct ifs_data {
int status;
u64 scan_details;
int cur_batch;
+ int test_num;
};
struct ifs_work {
@@ -22,6 +22,7 @@ MODULE_DEVICE_TABLE(x86cpu, ifs_cpu_ids);
static struct ifs_device ifs_device = {
.data = {
.integrity_cap_bit = MSR_INTEGRITY_CAPS_PERIODIC_BIST_BIT,
+ .test_num = 0,
},
.misc = {
.name = "intel_ifs_0",
@@ -225,17 +225,18 @@ static int ifs_image_sanity_check(struct device *dev, const struct microcode_hea
/*
* Load ifs image. Before loading ifs module, the ifs image must be located
- * in /lib/firmware/intel/ifs and named as {family/model/stepping}.{testname}.
+ * in /lib/firmware/intel/ifs_x/ and named as family-model-stepping-02x.{testname}.
*/
int ifs_load_firmware(struct device *dev)
{
struct ifs_data *ifsd = ifs_get_data(dev);
const struct firmware *fw;
- char scan_path[32];
- int ret;
+ char scan_path[64];
+ int ret = -EINVAL;
- snprintf(scan_path, sizeof(scan_path), "intel/ifs/%02x-%02x-%02x.scan",
- boot_cpu_data.x86, boot_cpu_data.x86_model, boot_cpu_data.x86_stepping);
+ snprintf(scan_path, sizeof(scan_path), "intel/ifs_%d/%02x-%02x-%02x-%02x.scan",
+ ifsd->test_num, boot_cpu_data.x86, boot_cpu_data.x86_model,
+ boot_cpu_data.x86_stepping, ifsd->cur_batch);
ret = request_firmware_direct(&fw, scan_path, dev);
if (ret) {
@@ -251,6 +252,9 @@ int ifs_load_firmware(struct device *dev)
ifs_hash_ptr = (u64)(ifs_header_ptr + 1);
ret = scan_chunks_sanity_check(dev);
+ if (ret)
+ dev_err(dev, "Load failure for batch: %02x\n", ifsd->cur_batch);
+
release:
release_firmware(fw);
done:
@@ -78,14 +78,16 @@ static void message_not_tested(struct device *dev, int cpu, union ifs_status sta
static void message_fail(struct device *dev, int cpu, union ifs_status status)
{
+ struct ifs_data *ifsd = ifs_get_data(dev);
+
/*
* control_error is set when the microcode runs into a problem
* loading the image from the reserved BIOS memory, or it has
* been corrupted. Reloading the image may fix this issue.
*/
if (status.control_error) {
- dev_err(dev, "CPU(s) %*pbl: could not execute from loaded scan image\n",
- cpumask_pr_args(cpu_smt_mask(cpu)));
+ dev_err(dev, "CPU(s) %*pbl: could not execute from loaded scan image. Batch: %02x version: 0x%x\n",
+ cpumask_pr_args(cpu_smt_mask(cpu)), ifsd->cur_batch, ifsd->loaded_version);
}
/*
@@ -96,8 +98,8 @@ static void message_fail(struct device *dev, int cpu, union ifs_status status)
* the core being tested.
*/
if (status.signature_error) {
- dev_err(dev, "CPU(s) %*pbl: test signature incorrect.\n",
- cpumask_pr_args(cpu_smt_mask(cpu)));
+ dev_err(dev, "CPU(s) %*pbl: test signature incorrect. Batch: %02x version: 0x%x\n",
+ cpumask_pr_args(cpu_smt_mask(cpu)), ifsd->cur_batch, ifsd->loaded_version);
}
}
@@ -87,6 +87,43 @@ static ssize_t run_test_store(struct device *dev,
static DEVICE_ATTR_WO(run_test);
+static ssize_t current_batch_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct ifs_data *ifsd = ifs_get_data(dev);
+ int cur_batch;
+ int rc;
+
+ rc = kstrtouint(buf, 0, &cur_batch);
+ if (rc < 0 || cur_batch > 0xff)
+ return -EINVAL;
+
+ if (down_interruptible(&ifs_sem))
+ return -EINTR;
+
+ ifsd->cur_batch = cur_batch;
+
+ rc = ifs_load_firmware(dev);
+
+ up(&ifs_sem);
+
+ return (rc == 0) ? count : rc;
+}
+
+static ssize_t current_batch_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct ifs_data *ifsd = ifs_get_data(dev);
+
+ if (!ifsd->loaded)
+ return sysfs_emit(buf, "%s\n", "none");
+ else
+ return sysfs_emit(buf, "0x%02x\n", ifsd->cur_batch);
+}
+
+static DEVICE_ATTR_RW(current_batch);
+
/*
* Display currently loaded IFS image version.
*/
@@ -108,6 +145,7 @@ static struct attribute *plat_ifs_attrs[] = {
&dev_attr_details.attr,
&dev_attr_status.attr,
&dev_attr_run_test.attr,
+ &dev_attr_current_batch.attr,
&dev_attr_image_version.attr,
NULL
};