author | Juergen Christ <jchrist@linux.ibm.com> | 2025-08-11 17:22:53 +0200 |
---|---|---|
committer | Alexander Gordeev <agordeev@linux.ibm.com> | 2025-08-20 16:38:24 +0200 |
commit | 669bc57e7016cf9d1a9eedb2a984c4fb4fd67f3d (patch) | |
tree | e6bd0371a4f4c41ae7e8f505198d552f52ef1b05 /scripts/bash-completion | |
parent | de88e74889a30bd9ff4047726021cde857348b4b (diff) |
s390/bitops: Optimize inlining
GCC inlining heuristics prevent code growth due to inlining into cold
paths. This causes GCC to emit a partially specialized version of
__flogr for non-constant input for all occurrences on cold paths.
This happens since the overhead seen during inlining includes setting
up a union register_pair, calling flogr, and extracting and casting
the result. This overhead is not removed until the function is
lowered into RTL. But this happens after inlining.
For -ftrivial-auto-var-init=zero builds, an additional initialization
of the union register_pair adds another statement to be inlined.
This is unneeded since the even register is initialized anyway and the
odd register is not an input register. It is only marked as such
since the whole pair has to be marked as a read/write output register.
Mark the union register_pair as uninitialized to get rid of this
statement. This does not change the generated code, however, since the
initialization happens when part of the register pair is written.
Nevertheless, GCC function size approximation during inlining is
reduced by one statement.
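
The effect of the uninitialized marking can be sketched in portable C.
This is only an illustration, not the patch itself: the union
sketch_register_pair, the SKETCH_UNINITIALIZED macro and sketch_flogr()
are hypothetical names, and __builtin_clzll stands in for the s390
instruction so that the sketch builds on any GCC or Clang target.

```c
#include <assert.h>
#include <stdint.h>

/* Expand to nothing on compilers that lack the attribute. */
#ifndef __has_attribute
#define __has_attribute(x) 0
#endif
#if __has_attribute(uninitialized)
#define SKETCH_UNINITIALIZED __attribute__((uninitialized))
#else
#define SKETCH_UNINITIALIZED
#endif

/* Hypothetical stand-in for union register_pair: an even/odd pair. */
union sketch_register_pair {
	struct {
		uint64_t even;
		uint64_t odd;
	};
};

unsigned int sketch_flogr(uint64_t word)
{
	/*
	 * With -ftrivial-auto-var-init=zero the compiler would normally
	 * add a statement zeroing rp here. The attribute suppresses it.
	 */
	union sketch_register_pair rp SKETCH_UNINITIALIZED;

	/* The even half is always written before it is read. */
	rp.even = word;
	/* Stand-in for the instruction: leading zero count, 64 for zero. */
	rp.even = rp.even ? (uint64_t)__builtin_clzll(rp.even) : 64;
	return (unsigned int)rp.even;
}

/* Small self-check of the stand-in semantics assumed above. */
void sketch_flogr_selfcheck(void)
{
	assert(sketch_flogr(1) == 63);
}
```

The odd half is never read in this sketch, which mirrors the argument
above that zeroing the whole pair up front is redundant.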
Force inlining of flogr and also flatten some other functions that
should be leaf functions but are called from cold contexts, such as
__init functions.
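
A minimal sketch of the two attributes involved, assuming GCC or Clang.
All names here (sketch_always_inline, sketch_leading_zeros,
sketch_init_fls64) are hypothetical, the cold attribute only
approximates an __init-style context, and __builtin_clzll is a portable
stand-in for flogr.

```c
#include <assert.h>
#include <stdint.h>

/* Force inlining regardless of the inliner's size estimate. */
#define sketch_always_inline inline __attribute__((always_inline))

/* Hypothetical leaf helper: leading zero count, 64 for a zero input. */
static sketch_always_inline unsigned int sketch_leading_zeros(uint64_t word)
{
	return word ? (unsigned int)__builtin_clzll(word) : 64;
}

/*
 * flatten asks the compiler to inline every call in the body where
 * possible, even though the function itself is marked cold.
 */
__attribute__((flatten, cold))
unsigned int sketch_init_fls64(uint64_t mask)
{
	/* 1-based index of the highest set bit, 0 if no bit is set. */
	return 64 - sketch_leading_zeros(mask);
}

/* Small self-check of the helper's assumed semantics. */
void sketch_fls64_selfcheck(void)
{
	assert(sketch_init_fls64(1) == 1);
}
```

always_inline fits a tiny helper that should never be emitted out of
line, while flatten is applied at the caller when the whole call tree
below it should collapse into one function.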
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Juergen Christ <jchrist@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Diffstat (limited to 'scripts/bash-completion')
0 files changed, 0 insertions, 0 deletions