[41/58] x86/apic: Add max_apic_id member

Message ID 20230717223225.515238528@linutronix.de
State New
Headers
Series x86/apic: Decrapification and static calls |

Commit Message

Thomas Gleixner July 17, 2023, 11:15 p.m. UTC
  There is really no point to have a callback which compares numbers.

Add a field which allows each APIC to store the maximum APIC ID supported
and fill it in for all APIC incarnations.

The next step will remove the callback.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/apic.h           |    3 +++
 arch/x86/kernel/apic/apic_flat_64.c   |    2 ++
 arch/x86/kernel/apic/apic_noop.c      |    1 +
 arch/x86/kernel/apic/apic_numachip.c  |    2 ++
 arch/x86/kernel/apic/bigsmp_32.c      |    1 +
 arch/x86/kernel/apic/probe_32.c       |    1 +
 arch/x86/kernel/apic/x2apic_cluster.c |    1 +
 arch/x86/kernel/apic/x2apic_phys.c    |    1 +
 arch/x86/kernel/apic/x2apic_uv_x.c    |    1 +
 arch/x86/xen/apic.c                   |    1 +
 10 files changed, 14 insertions(+)
  

Comments

Linus Torvalds July 18, 2023, 12:19 a.m. UTC | #1
So all of your patches make sense to me, but the whole apic_flat case
confuses me.

On Mon, 17 Jul 2023 at 16:15, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> --- a/arch/x86/kernel/apic/apic_flat_64.c
> +++ b/arch/x86/kernel/apic/apic_flat_64.c
> @@ -94,6 +94,7 @@ static struct apic apic_flat __ro_after_
>         .cpu_present_to_apicid          = default_cpu_present_to_apicid,
>         .phys_pkg_id                    = flat_phys_pkg_id,
>
> +       .max_apic_id                    = 0xFE,
>         .get_apic_id                    = flat_get_apic_id,
>         .set_apic_id                    = set_apic_id,

flat_send_IPI_mask() can only deal with a single word mask. How the
heck can the max apic ID be more than 64?

I'm probably very confused.

Which is what the APIC code would do to anybody, which is why I'm
cheering your patch series on despite (or maybe _because_) it confuses
me.

            Linus
  
Thomas Gleixner July 18, 2023, 7:47 a.m. UTC | #2
Linus!

On Mon, Jul 17 2023 at 17:19, Linus Torvalds wrote:
> So all of your patches make sense to me, but the whole apic_flat case
> confuses me.
>
> On Mon, 17 Jul 2023 at 16:15, Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> --- a/arch/x86/kernel/apic/apic_flat_64.c
>> +++ b/arch/x86/kernel/apic/apic_flat_64.c
>> @@ -94,6 +94,7 @@ static struct apic apic_flat __ro_after_
>>         .cpu_present_to_apicid          = default_cpu_present_to_apicid,
>>         .phys_pkg_id                    = flat_phys_pkg_id,
>>
>> +       .max_apic_id                    = 0xFE,
>>         .get_apic_id                    = flat_get_apic_id,
>>         .set_apic_id                    = set_apic_id,
>
> flat_send_IPI_mask() can only deal with a single word mask. How the
> heck can the max apic ID be more than 64?
>
> I'm probably very confused.

APIC is doing that to people.

The confusing part here is the physical APIC ID vs. the destination
mode.

Physical APIC ID is always a unique number per CPU (APIC) and on XAPIC
ranging from 0x0 to 0xFE. That's what is actually checked with that
max_apic_id entry.

Destination mode is a different story. APIC has two destination modes
for actually sending IPIs or messages from IO/APIC or PCI/MSI: Physical
and logical.

Logical is a bitmap of 8 bits, where each bit represents one CPU. So the
maximum number of CPUs addressable in logical mode is 8.

You can have a system with 8 CPUs where the physical APIC IDs are
0x20-0x27 and use logical destination mode by setting the LDR register
to the bit which represents the CPU, i.e. 1 << CPU#.

So in that 8 CPU case LDR is 0x1, 0x2, ... 0x80 on CPU0 - 7. When
sending an IPI then the destination mode in the ICR is set to logical
and the destination field is written with the bit representing the
target, i.e. 1 << CPU#. The destination field can have multiple bits set
to send the IPI to several CPUs (up to 8) with a single ICR write
operation.

Once the machine has more than 8 CPUs we are forced to use physical
destination mode. Physical destination mode is using the physical APIC
ID of the target: 0-254 are the valid addresses, which fit into one
byte. 255 (0xff) is a broadcast address. Sending IPIs to multiple
targets needs one ICR write per target, which is obviously more
expensive as between each write the ICR register has to be read back and
the ICR busy bit needs to go back to zero before the next ICR write can
happen.

X2APIC is similar. It just has a wider physical APIC ID space (full
32bit). X2APIC has a logical destination mode too which is more
useful. The logical destination is 32bit wide and split into two areas:

  [cluster ID] [logical ID]

Each being 16 bit wide. The logical ID is again one bit per CPU, the
cluster ID is a number. So we can send IPIs to multiple targets (up to
16) within a cluster with one ICR write. If the IPI targets are in
different clusters then obviously we need one write per cluster.

Physical destination mode on X2APIC is the same as with XAPIC but
cheaper as it does not need the ICR readback.

I'm sure you are now even more confused.

Thanks,

        tglx
  
Thomas Gleixner July 18, 2023, 3:54 p.m. UTC | #3
On Tue, Jul 18 2023 at 09:47, Thomas Gleixner wrote:
> On Mon, Jul 17 2023 at 17:19, Linus Torvalds wrote:
>> So all of your patches make sense to me, but the whole apic_flat case
>> confuses me.
>>
>> On Mon, 17 Jul 2023 at 16:15, Thomas Gleixner <tglx@linutronix.de> wrote:
>>>
>>> --- a/arch/x86/kernel/apic/apic_flat_64.c
>>> +++ b/arch/x86/kernel/apic/apic_flat_64.c
>>> @@ -94,6 +94,7 @@ static struct apic apic_flat __ro_after_
>>>         .cpu_present_to_apicid          = default_cpu_present_to_apicid,
>>>         .phys_pkg_id                    = flat_phys_pkg_id,
>>>
>>> +       .max_apic_id                    = 0xFE,
>>>         .get_apic_id                    = flat_get_apic_id,
>>>         .set_apic_id                    = set_apic_id,
>>
>> flat_send_IPI_mask() can only deal with a single word mask. How the
>> heck can the max apic ID be more than 64?
>>
>> I'm probably very confused.
>
> APIC is doing that to people.
>
> The confusing part here is the physical APIC ID vs. the destination
> mode.
>
> Physical APIC ID is always a unique number per CPU (APIC) and on XAPIC
> ranging from 0x0 to 0xFE. That's what is actually checked with that
> max_apic_id entry.
>
> Destination mode is a different story. APIC has two destination modes
> for actually sending IPIs or messages from IO/APIC or PCI/MSI: Physical
> and logical.
>
> Logical is a bitmap of 8 bits, where each bit represents one CPU. So the
> maximum number of CPUs addressable in logical mode is 8.
>
> You can have a system with 8 CPUs where the physical APIC IDs are
> 0x20-0x27 and use logical destination mode by setting the LDR register
> to the bit which represents the CPU, i.e. 1 << CPU#.
>
> So in that 8 CPU case LDR is 0x1, 0x2, ... 0x80 on CPU0 - 7. When
> sending an IPI then the destination mode in the ICR is set to logical
> and the destination field is written with the bit representing the
> target, i.e. 1 << CPU#. The destination field can have multiple bits set
> to send the IPI to several CPUs (up to 8) with a single ICR write
> operation.
>
> Once the machine has more than 8 CPUs we are forced to use physical
> destination mode. Physical destination mode is using the physical APIC
> ID of the target: 0-254 are the valid addresses, which fit into one
> byte. 255 (0xff) is a broadcast address. Sending IPIs to multiple
> targets needs one ICR write per target, which is obviously more
> expensive as between each write the ICR register has to be read back and
> the ICR busy bit needs to go back to zero before the next ICR write can
> happen.
>
> X2APIC is similar. It just has a wider physical APIC ID space (full
> 32bit). X2APIC has a logical destination mode too which is more
> useful. The logical destination is 32bit wide and split into two areas:
>
>   [cluster ID] [logical ID]
>
> Each being 16 bit wide. The logical ID is again one bit per CPU, the
> cluster ID is a number. So we can send IPIs to multiple targets (up to
> 16) within a cluster with one ICR write. If the IPI targets are in
> different clusters then obviously we need one write per cluster.
>
> Physical destination mode on X2APIC is the same as with XAPIC but
> cheaper as it does not need the ICR readback.

Just for making the confusion complete. XAPIC has a clustered logical
mode too:

The cluster ID is in the topmost 4 bits with a range from 0-14 and the
logical ID bits are the lower 4 bits. That means it works up to 4 * 15 =
60 CPUs.

The kernel never implemented that mode and with anything modern having
X2APIC I don't think it's worth the trouble to do so.

Thanks,

        tglx

---
"Confusion now hath made his masterpiece." - Shakespeare, Macbeth
  
Linus Torvalds July 18, 2023, 4:06 p.m. UTC | #4
On Tue, 18 Jul 2023 at 00:47, Thomas Gleixner <tglx@linutronix.de> wrote:
>
> The confusing part here is the physical APIC ID vs. the destination
> mode.

Actually, no, what confused me here ended up being that I didn't see
any other limit checking at all for the flat mode, and then I was
"this cannot possibly work up to that limit".

But it turns out that the limit checking appears to be in the
"physflat" case, not in the simple flat case.

IOW, the physflat probe function says "I'll take it" whenever
num_possible_cpus() > 8", and that seems to be what then limits the
flat mode to a max of 8 cpus. So the limit was just in another place
than I expected.

            Linus
  
Thomas Gleixner July 18, 2023, 6:59 p.m. UTC | #5
On Tue, Jul 18 2023 at 09:06, Linus Torvalds wrote:
> On Tue, 18 Jul 2023 at 00:47, Thomas Gleixner <tglx@linutronix.de> wrote:
>>
>> The confusing part here is the physical APIC ID vs. the destination
>> mode.
>
> Actually, no, what confused me here ended up being that I didn't see
> any other limit checking at all for the flat mode, and then I was
> "this cannot possibly work up to that limit".
>
> But it turns out that the limit checking appears to be in the
> "physflat" case, not in the simple flat case.
>
> IOW, the physflat probe function says "I'll take it" whenever
> num_possible_cpus() > 8", and that seems to be what then limits the
> flat mode to a max of 8 cpus. So the limit was just in another place
> than I expected.

Right. And obviously you managed to confuse me too :)
  

Patch

--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -277,6 +277,9 @@  struct apic {
 	u64	(*icr_read)(void);
 	void	(*icr_write)(u32 low, u32 high);
 
+	/* The limit of the APIC ID space. */
+	u32	max_apic_id;
+
 	/* Probe, setup and smpboot functions */
 	int	(*probe)(void);
 	int	(*acpi_madt_oem_check)(char *oem_id, char *oem_table_id);
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -94,6 +94,7 @@  static struct apic apic_flat __ro_after_
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.phys_pkg_id			= flat_phys_pkg_id,
 
+	.max_apic_id			= 0xFE,
 	.get_apic_id			= flat_get_apic_id,
 	.set_apic_id			= set_apic_id,
 
@@ -170,6 +171,7 @@  static struct apic apic_physflat __ro_af
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.phys_pkg_id			= flat_phys_pkg_id,
 
+	.max_apic_id			= 0xFE,
 	.get_apic_id			= flat_get_apic_id,
 	.set_apic_id			= set_apic_id,
 
--- a/arch/x86/kernel/apic/apic_noop.c
+++ b/arch/x86/kernel/apic/apic_noop.c
@@ -80,6 +80,7 @@  struct apic apic_noop __ro_after_init =
 
 	.phys_pkg_id			= noop_phys_pkg_id,
 
+	.max_apic_id			= 0xFE,
 	.get_apic_id			= noop_get_apic_id,
 	.set_apic_id			= NULL,
 
--- a/arch/x86/kernel/apic/apic_numachip.c
+++ b/arch/x86/kernel/apic/apic_numachip.c
@@ -239,6 +239,7 @@  static const struct apic apic_numachip1
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.phys_pkg_id			= numachip_phys_pkg_id,
 
+	.max_apic_id			= UINT_MAX,
 	.get_apic_id			= numachip1_get_apic_id,
 	.set_apic_id			= numachip1_set_apic_id,
 
@@ -278,6 +279,7 @@  static const struct apic apic_numachip2
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.phys_pkg_id			= numachip_phys_pkg_id,
 
+	.max_apic_id			= UINT_MAX,
 	.get_apic_id			= numachip2_get_apic_id,
 	.set_apic_id			= numachip2_set_apic_id,
 
--- a/arch/x86/kernel/apic/bigsmp_32.c
+++ b/arch/x86/kernel/apic/bigsmp_32.c
@@ -91,6 +91,7 @@  static struct apic apic_bigsmp __ro_afte
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.phys_pkg_id			= bigsmp_phys_pkg_id,
 
+	.max_apic_id			= 0xFE,
 	.get_apic_id			= bigsmp_get_apic_id,
 	.set_apic_id			= NULL,
 
--- a/arch/x86/kernel/apic/probe_32.c
+++ b/arch/x86/kernel/apic/probe_32.c
@@ -47,6 +47,7 @@  static struct apic apic_default __ro_aft
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.phys_pkg_id			= default_phys_pkg_id,
 
+	.max_apic_id			= 0xFE,
 	.get_apic_id			= default_get_apic_id,
 	.set_apic_id			= NULL,
 
--- a/arch/x86/kernel/apic/x2apic_cluster.c
+++ b/arch/x86/kernel/apic/x2apic_cluster.c
@@ -239,6 +239,7 @@  static struct apic apic_x2apic_cluster _
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.phys_pkg_id			= x2apic_phys_pkg_id,
 
+	.max_apic_id			= UINT_MAX,
 	.get_apic_id			= x2apic_get_apic_id,
 	.set_apic_id			= x2apic_set_apic_id,
 
--- a/arch/x86/kernel/apic/x2apic_phys.c
+++ b/arch/x86/kernel/apic/x2apic_phys.c
@@ -163,6 +163,7 @@  static struct apic apic_x2apic_phys __ro
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.phys_pkg_id			= x2apic_phys_pkg_id,
 
+	.max_apic_id			= UINT_MAX,
 	.get_apic_id			= x2apic_get_apic_id,
 	.set_apic_id			= x2apic_set_apic_id,
 
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -822,6 +822,7 @@  static struct apic apic_x2apic_uv_x __ro
 	.cpu_present_to_apicid		= default_cpu_present_to_apicid,
 	.phys_pkg_id			= uv_phys_pkg_id,
 
+	.max_apic_id			= UINT_MAX,
 	.get_apic_id			= x2apic_get_apic_id,
 	.set_apic_id			= set_apic_id,
 
--- a/arch/x86/xen/apic.c
+++ b/arch/x86/xen/apic.c
@@ -138,6 +138,7 @@  static struct apic xen_pv_apic = {
 	.cpu_present_to_apicid		= xen_cpu_present_to_apicid,
 	.phys_pkg_id			= xen_phys_pkg_id, /* detect_ht */
 
+	.max_apic_id			= UINT_MAX,
 	.get_apic_id 			= xen_get_apic_id,
 	.set_apic_id 			= xen_set_apic_id, /* Can be NULL on 32-bit. */