[0/3] entry: inline syscall enter/exit functions

Message ID 20231205133015.752543-1-svens@linux.ibm.com

Message

Sven Schnelle Dec. 5, 2023, 1:30 p.m. UTC
Hi List,

Looking into the performance of syscall entry/exit after s390 switched
to generic entry showed that there is considerable overhead in calling
some of the entry/exit work functions even when there is nothing to do.
This patchset moves the entry and exit functions to entry-common.h, so
that non-inlined code only gets called when there is some work pending.
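
The idea can be sketched in plain C as follows. This is only an
illustration, not the code from the series: the flag bits, the mask and
both function names are hypothetical, and the GCC/Clang extensions
__builtin_expect and __attribute__((noinline)) are assumed. The cheap
flag test lives in a static inline function that would sit in a header,
and the out-of-line function is reached only when a work bit is set.

```c
/* Hypothetical work bits; the kernel keeps its real flags elsewhere. */
#define SKETCH_WORK_SIGPENDING   (1UL << 0)
#define SKETCH_WORK_NEED_RESCHED (1UL << 1)
#define SKETCH_EXIT_WORK_MASK \
	(SKETCH_WORK_SIGPENDING | SKETCH_WORK_NEED_RESCHED)

/* Counts slow-path calls so the effect is observable in this sketch. */
static unsigned long sketch_slow_path_calls;

/*
 * Out-of-line slow path (hypothetical): pretends to handle all pending
 * work and returns the flags with the work bits cleared.
 */
__attribute__((noinline))
unsigned long sketch_exit_work_slowpath(unsigned long flags)
{
	sketch_slow_path_calls++;
	return flags & ~SKETCH_EXIT_WORK_MASK;
}

/*
 * Inline fast path (hypothetical): when no work bit is set, this is
 * just a test and a branch, with no function call at all.
 */
static inline unsigned long sketch_exit_prepare(unsigned long flags)
{
	if (__builtin_expect(!!(flags & SKETCH_EXIT_WORK_MASK), 0))
		flags = sketch_exit_work_slowpath(flags);
	return flags;
}
```

With no work bit set, sketch_exit_prepare() returns without ever
entering the out-of-line function, which is the case this series tries
to make cheap.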

I wrote a small program that just issues invalid syscalls in a loop.
On an s390 machine, this results in the following numbers:

without this series:

# ./syscall 1000000000
runtime: 94.886581s / per-syscall 9.488658e-08s

with this series:

# ./syscall 1000000000
runtime: 84.732391s / per-syscall 8.473239e-08s

so the time required for one syscall dropped from 94.9ns to
84.7ns, which is a drop of about 11%.
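
The test program itself is not part of this posting. A minimal sketch of
such a benchmark could look like the following. It assumes that syscall
number -1 is invalid and fails quickly with ENOSYS, and that
CLOCK_MONOTONIC is an adequate clock; the function name and the macro
are made up for illustration.

```c
#define _GNU_SOURCE
#include <time.h>
#include <unistd.h>

/* Assumption: -1 is not a valid syscall number, so the call fails fast. */
#define SKETCH_INVALID_SYSCALL_NR (-1L)

/*
 * Issue `count` invalid syscalls back to back and return the elapsed
 * wall-clock time in seconds. The return value of each syscall is
 * ignored on purpose; only the entry/exit cost is of interest.
 */
double sketch_time_invalid_syscalls(unsigned long count)
{
	struct timespec start, end;
	unsigned long i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < count; i++)
		syscall(SKETCH_INVALID_SYSCALL_NR);
	clock_gettime(CLOCK_MONOTONIC, &end);

	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

A main() around this would parse the iteration count from argv[1], call
the function once, and print the total runtime together with the
runtime divided by the count, which gives the per-syscall figure.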

Sven Schnelle (3):
  entry: move exit to usermode functions to header file
  move enter_from_user_mode() to header file
  entry: move syscall_enter_from_user_mode() to header file

 include/linux/entry-common.h | 137 ++++++++++++++++++++++++++++++++-
 kernel/entry/common.c        | 145 ++---------------------------------
 2 files changed, 138 insertions(+), 144 deletions(-)
  

Comments

Peter Zijlstra Dec. 6, 2023, 11:02 a.m. UTC | #1
On Tue, Dec 05, 2023 at 02:30:12PM +0100, Sven Schnelle wrote:
> Hi List,
> 
> looking into the performance of syscall entry/exit after s390 switched
> to generic entry showed that there's quite some overhead calling some
> of the entry/exit work functions even when there's nothing to do.
> This patchset moves the entry and exit function to entry-common.h, so
> non inlined code gets only called when there is some work pending.

So per that logic you wouldn't need to inline exit_to_user_mode_loop(),
for example; that's only called when there is an EXIT_TO_USER_MODE_WORK
bit set.

That is, I'm just being pedantic here and pointing out that your
justification doesn't cover the extent of the changes.

> I wrote a small program that just issues invalid syscalls in a loop.
> On an s390 machine, this results in the following numbers:
> 
> without this series:
> 
> # ./syscall 1000000000
> runtime: 94.886581s / per-syscall 9.488658e-08s
> 
> with this series:
> 
> ./syscall 1000000000
> runtime: 84.732391s / per-syscall 8.473239e-08s
> 
> so the time required for one syscall dropped from 94.8ns to
> 84.7ns, which is a drop of about 11%.

That is obviously very nice, and I don't immediately see anything wrong
with moving the lot to header based inlines.

Thomas?
  
Sven Schnelle Dec. 14, 2023, 8:24 a.m. UTC | #2
Peter Zijlstra <peterz@infradead.org> writes:

> On Tue, Dec 05, 2023 at 02:30:12PM +0100, Sven Schnelle wrote:
>> Hi List,
>> 
>> looking into the performance of syscall entry/exit after s390 switched
>> to generic entry showed that there's quite some overhead calling some
>> of the entry/exit work functions even when there's nothing to do.
>> This patchset moves the entry and exit function to entry-common.h, so
>> non inlined code gets only called when there is some work pending.
>
> So per that logic you wouldn't need to inline exit_to_user_mode_loop()
> for example, that's only called when there is a EXIT_TO_USER_MODE_WORK
> bit set.
>
> That is, I'm just being pedantic here and pointing out that your
> justification doesn't cover the extent of the changes.
>
>> I wrote a small program that just issues invalid syscalls in a loop.
>> On an s390 machine, this results in the following numbers:
>> 
>> without this series:
>> 
>> # ./syscall 1000000000
>> runtime: 94.886581s / per-syscall 9.488658e-08s
>> 
>> with this series:
>> 
>> ./syscall 1000000000
>> runtime: 84.732391s / per-syscall 8.473239e-08s
>> 
>> so the time required for one syscall dropped from 94.8ns to
>> 84.7ns, which is a drop of about 11%.
>
> That is obviously very nice, and I don't immediately see anything wrong
> with moving the lot to header based inlines.
>
> Thomas?

Thomas, any opinion on this change?
  
Thomas Gleixner Dec. 15, 2023, 7:06 p.m. UTC | #3
On Thu, Dec 14 2023 at 09:24, Sven Schnelle wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
>>> so the time required for one syscall dropped from 94.8ns to
>>> 84.7ns, which is a drop of about 11%.
>>
>> That is obviously very nice, and I don't immediately see anything wrong
>> with moving the lot to header based inlines.
>>
>> Thomas?

No objections in principle. Let me look at the lot.