[RESEND,1/2] thermal: intel: Prevent accidental clearing of HFI status

Message ID 20221116025417.2590275-1-srinivas.pandruvada@linux.intel.com
State New

Commit Message

srinivas pandruvada Nov. 16, 2022, 2:54 a.m. UTC
When a package thermal interrupt arrives with the PROCHOT log bit set,
the driver processes it and clears the log bit. An HFI event status may
be active at that moment, either about to be processed or in the middle
of being processed. Clearing the PROCHOT log bit currently clears the
HFI status bit as well, which tells the hardware that it is free to
update the HFI memory.

When clearing a package thermal interrupt, some processors generate a
"general protection fault" if any read-only bit is set to 1. To avoid
that, the driver maintains a mask of all read-write bits that may be
set. This mask doesn't include the HFI status bit, so the bit is assumed
to be read-only and is cleared along with the PROCHOT log. So, add HFI
status bit 26 to the mask.

Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
---
Email address was wrong, so sending again.

 drivers/thermal/intel/therm_throt.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
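
For readers following the reasoning above, here is a minimal user-space sketch of the masking step. It is not the driver code. It assumes the clear routine reads the status register, keeps only the bits in the read-write mask, zeroes the PROCHOT log bit, and writes the result back, and that writing 0 to a status bit clears it. The names value_written_on_clear, PKG_HFI_STATUS, PKG_MASK_OLD and PKG_MASK_NEW are made up for illustration; only the bit positions come from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n) (UINT64_C(1) << (n))

#define THERM_STATUS_PROCHOT_LOG BIT(1)
#define PKG_HFI_STATUS           BIT(26) /* hypothetical name for HFI status bit 26 */

/* Package mask before and after the patch (macro names are illustrative). */
#define PKG_MASK_OLD (BIT(1) | BIT(3) | BIT(5) | BIT(7) | BIT(9) | BIT(11))
#define PKG_MASK_NEW (PKG_MASK_OLD | BIT(26))

/*
 * Model of the read-modify-write, assuming the sequence described in the
 * lead-in. Returns the value that would be written back to the register
 * when clearing the PROCHOT log bit.
 */
static uint64_t value_written_on_clear(uint64_t msr_val, uint64_t rw_mask)
{
	/* The bit being cleared is expected to be one of the read-write bits. */
	assert(rw_mask & THERM_STATUS_PROCHOT_LOG);

	/* Bits outside the mask are written as 0 so no read-only bit is set. */
	msr_val &= rw_mask;

	/* Write 0 to the PROCHOT log bit only; other masked bits keep their value. */
	return msr_val & ~THERM_STATUS_PROCHOT_LOG;
}
```

With both the PROCHOT log and HFI status bits set in the value read, the old mask yields a write with bit 26 at 0 (the HFI status is lost), while the new mask yields a write with bit 26 still at 1.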
  

Comments

Rafael J. Wysocki Nov. 18, 2022, 5:54 p.m. UTC | #1
On Wed, Nov 16, 2022 at 3:54 AM Srinivas Pandruvada
<srinivas.pandruvada@linux.intel.com> wrote:
>
> When there is a package thermal interrupt with PROCHOT log, it will be
> processed and cleared. It is possible that there is an active HFI event
> status, which is about to get processed or getting processed. While
> clearing PROCHOT log bit, it will also clear HFI status bit. This means
> that hardware is free to update HFI memory.
>
> When clearing a package thermal interrupt, some processors will generate
> a "general protection fault" when any of the read only bit is set to 1.
> The driver maintains a mask of all read-write bits which can be set.
> This mask doesn't include HFI status bit. This bit will also be cleared,
> as it will be assumed read-only bit. So, add HFI status bit 26 to the
> mask.
>
> Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
> Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>

Is a Fixes tag missing here?

Also, do you want it in 6.1-rc7 or would 6.2 suffice?

> ---
> Email address was wrong, so sending again.
>
>  drivers/thermal/intel/therm_throt.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/thermal/intel/therm_throt.c b/drivers/thermal/intel/therm_throt.c
> index 8352083b87c7..9e8ab31d756e 100644
> --- a/drivers/thermal/intel/therm_throt.c
> +++ b/drivers/thermal/intel/therm_throt.c
> @@ -197,7 +197,7 @@ static const struct attribute_group thermal_attr_group = {
>  #define THERM_STATUS_PROCHOT_LOG       BIT(1)
>
>  #define THERM_STATUS_CLEAR_CORE_MASK (BIT(1) | BIT(3) | BIT(5) | BIT(7) | BIT(9) | BIT(11) | BIT(13) | BIT(15))
> -#define THERM_STATUS_CLEAR_PKG_MASK  (BIT(1) | BIT(3) | BIT(5) | BIT(7) | BIT(9) | BIT(11))
> +#define THERM_STATUS_CLEAR_PKG_MASK  (BIT(1) | BIT(3) | BIT(5) | BIT(7) | BIT(9) | BIT(11) | BIT(26))
>
>  static void clear_therm_status_log(int level)
>  {
> --
> 2.31.1
>
  
srinivas pandruvada Nov. 18, 2022, 7:35 p.m. UTC | #2
On Fri, 2022-11-18 at 18:54 +0100, Rafael J. Wysocki wrote:
> On Wed, Nov 16, 2022 at 3:54 AM Srinivas Pandruvada
> <srinivas.pandruvada@linux.intel.com> wrote:
> > 
> > When there is a package thermal interrupt with PROCHOT log, it will
> > be
> > processed and cleared. It is possible that there is an active HFI
> > event
> > status, which is about to get processed or getting processed. While
> > clearing PROCHOT log bit, it will also clear HFI status bit. This
> > means
> > that hardware is free to update HFI memory.
> > 
> > When clearing a package thermal interrupt, some processors will
> > generate
> > a "general protection fault" when any of the read only bit is set
> > to 1.
> > The driver maintains a mask of all read-write bits which can be
> > set.
> > This mask doesn't include HFI status bit. This bit will also be
> > cleared,
> > as it will be assumed read-only bit. So, add HFI status bit 26 to
> > the
> > mask.
> > 
> > Signed-off-by: Srinivas Pandruvada
> > <srinivas.pandruvada@linux.intel.com>
> > Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> 
> Is a Fixes tag missing here?
While adding the following change, this should have been taken care of:
ab09b0744a99 ("thermal: intel: hfi: Enable notification interrupt")

But the above change didn't add this line, which this patch is
changing. We can add:

Fixes: ab09b0744a99 ("thermal: intel: hfi: Enable notification interrupt")

Do you want me to send another patch with the Fixes tag?

> 
> Also, do you want it in 6.1-rc7 or would 6.2 suffice?
Not urgent. 6.2 should be fine.

Thanks,
Srinivas

> 
> > ---
> > Email address was wrong, so sending again.
> > 
> >  drivers/thermal/intel/therm_throt.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/thermal/intel/therm_throt.c
> > b/drivers/thermal/intel/therm_throt.c
> > index 8352083b87c7..9e8ab31d756e 100644
> > --- a/drivers/thermal/intel/therm_throt.c
> > +++ b/drivers/thermal/intel/therm_throt.c
> > @@ -197,7 +197,7 @@ static const struct attribute_group
> > thermal_attr_group = {
> >  #define THERM_STATUS_PROCHOT_LOG       BIT(1)
> > 
> >  #define THERM_STATUS_CLEAR_CORE_MASK (BIT(1) | BIT(3) | BIT(5) |
> > BIT(7) | BIT(9) | BIT(11) | BIT(13) | BIT(15))
> > -#define THERM_STATUS_CLEAR_PKG_MASK  (BIT(1) | BIT(3) | BIT(5) |
> > BIT(7) | BIT(9) | BIT(11))
> > +#define THERM_STATUS_CLEAR_PKG_MASK  (BIT(1) | BIT(3) | BIT(5) |
> > BIT(7) | BIT(9) | BIT(11) | BIT(26))
> > 
> >  static void clear_therm_status_log(int level)
> >  {
> > --
> > 2.31.1
> >
  
Rafael J. Wysocki Nov. 18, 2022, 7:49 p.m. UTC | #3
On Fri, Nov 18, 2022 at 8:38 PM srinivas pandruvada
<srinivas.pandruvada@linux.intel.com> wrote:
>
> On Fri, 2022-11-18 at 18:54 +0100, Rafael J. Wysocki wrote:
> > On Wed, Nov 16, 2022 at 3:54 AM Srinivas Pandruvada
> > <srinivas.pandruvada@linux.intel.com> wrote:
> > >
> > > When there is a package thermal interrupt with PROCHOT log, it will
> > > be
> > > processed and cleared. It is possible that there is an active HFI
> > > event
> > > status, which is about to get processed or getting processed. While
> > > clearing PROCHOT log bit, it will also clear HFI status bit. This
> > > means
> > > that hardware is free to update HFI memory.
> > >
> > > When clearing a package thermal interrupt, some processors will
> > > generate
> > > a "general protection fault" when any of the read only bit is set
> > > to 1.
> > > The driver maintains a mask of all read-write bits which can be
> > > set.
> > > This mask doesn't include HFI status bit. This bit will also be
> > > cleared,
> > > as it will be assumed read-only bit. So, add HFI status bit 26 to
> > > the
> > > mask.
> > >
> > > Signed-off-by: Srinivas Pandruvada
> > > <srinivas.pandruvada@linux.intel.com>
> > > Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> >
> > Is a Fixes tag missing here?
> While adding the following change, this should have been taken care of:
> ab09b0744a99 ("thermal: intel: hfi: Enable notification interrupt")
>
> But the above change didn't add this line, which this patch is
> changing. We can add:
>
> Fixes: ab09b0744a99 ("thermal: intel: hfi: Enable notification
> interrupt")

OK

> Do you want me to send another patch with the Fixes tag?

No, I can take care of this.

> >
> > Also, do you want it in 6.1-rc7 or would 6.2 suffice?
> Not urgent. 6.2 should be fine.

OK, thanks!
  
Rafael J. Wysocki Nov. 23, 2022, 7:10 p.m. UTC | #4
On Fri, Nov 18, 2022 at 8:49 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Nov 18, 2022 at 8:38 PM srinivas pandruvada
> <srinivas.pandruvada@linux.intel.com> wrote:
> >
> > On Fri, 2022-11-18 at 18:54 +0100, Rafael J. Wysocki wrote:
> > > On Wed, Nov 16, 2022 at 3:54 AM Srinivas Pandruvada
> > > <srinivas.pandruvada@linux.intel.com> wrote:
> > > >
> > > > When there is a package thermal interrupt with PROCHOT log, it will
> > > > be
> > > > processed and cleared. It is possible that there is an active HFI
> > > > event
> > > > status, which is about to get processed or getting processed. While
> > > > clearing PROCHOT log bit, it will also clear HFI status bit. This
> > > > means
> > > > that hardware is free to update HFI memory.
> > > >
> > > > When clearing a package thermal interrupt, some processors will
> > > > generate
> > > > a "general protection fault" when any of the read only bit is set
> > > > to 1.
> > > > The driver maintains a mask of all read-write bits which can be
> > > > set.
> > > > This mask doesn't include HFI status bit. This bit will also be
> > > > cleared,
> > > > as it will be assumed read-only bit. So, add HFI status bit 26 to
> > > > the
> > > > mask.
> > > >
> > > > Signed-off-by: Srinivas Pandruvada
> > > > <srinivas.pandruvada@linux.intel.com>
> > > > Reviewed-by: Ricardo Neri <ricardo.neri-calderon@linux.intel.com>
> > >
> > > Is a Fixes tag missing here?
> > While adding the following change, this should have been taken care of:
> > ab09b0744a99 ("thermal: intel: hfi: Enable notification interrupt")
> >
> > But the above change didn't add this line, which this patch is
> > changing. We can add:
> >
> > Fixes: ab09b0744a99 ("thermal: intel: hfi: Enable notification
> > interrupt")
>
> OK
>
> > Do you want me to send another patch with the Fixes tag?
>
> No, I can take care of this.
>
> > >
> > > Also, do you want it in 6.1-rc7 or would 6.2 suffice?
> > Not urgent. 6.2 should be fine.
>
> OK, thanks!

Both applied as 6.2 material now, thanks!
  

Patch

diff --git a/drivers/thermal/intel/therm_throt.c b/drivers/thermal/intel/therm_throt.c
index 8352083b87c7..9e8ab31d756e 100644
--- a/drivers/thermal/intel/therm_throt.c
+++ b/drivers/thermal/intel/therm_throt.c
@@ -197,7 +197,7 @@  static const struct attribute_group thermal_attr_group = {
 #define THERM_STATUS_PROCHOT_LOG	BIT(1)
 
 #define THERM_STATUS_CLEAR_CORE_MASK (BIT(1) | BIT(3) | BIT(5) | BIT(7) | BIT(9) | BIT(11) | BIT(13) | BIT(15))
-#define THERM_STATUS_CLEAR_PKG_MASK  (BIT(1) | BIT(3) | BIT(5) | BIT(7) | BIT(9) | BIT(11))
+#define THERM_STATUS_CLEAR_PKG_MASK  (BIT(1) | BIT(3) | BIT(5) | BIT(7) | BIT(9) | BIT(11) | BIT(26))
 
 static void clear_therm_status_log(int level)
 {