diff options
| author | David Laight <david.laight.linux@gmail.com> | 2025-11-05 20:10:33 +0000 |
|---|---|---|
| committer | Andrew Morton <akpm@linux-foundation.org> | 2025-11-20 14:03:42 -0800 |
| commit | 630f96a687def5616d6fa7f069adcea158320909 (patch) | |
| tree | 0642e5dcd327c15f1272dc63653dcb83674406e6 /scripts/gdb/linux/proc.py | |
| parent | f0bff2eb04686f13386b2af97bc7aaa09f020f35 (diff) | |
lib: mul_u64_u64_div_u64(): optimise multiply on 32bit x86
gcc generates horrid code for both ((u64)u32_a * u32_b) and (u64_a +
u32_b). As well as the extra instructions it can generate a lot of spills
to stack (including spills of constant zeros and even multiplies by
constant zero).
mul_u32_u32() already exists to optimise the multiply. Add a similar
add_u64_32() for the addition. Disable both for clang - it generates
better code without them.
Move the 64x64 => 128 multiply into a static inline helper function for
code clarity. No need for the a/b_hi/lo variables, the implicit casts on
the function calls do the work for us. Should have minimal effect on the
generated code.
Use mul_u32_u32() and add_u64_u32() in the 64x64 => 128 multiply in
mul_u64_add_u64_div_u64().
Link: https://lkml.kernel.org/r/20251105201035.64043-8-david.laight.linux@gmail.com
Signed-off-by: David Laight <david.laight.linux@gmail.com>
Reviewed-by: Nicolas Pitre <npitre@baylibre.com>
Cc: Biju Das <biju.das.jz@bp.renesas.com>
Cc: Borislav Betkov <bp@alien8.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Li RongQing <lirongqing@baidu.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleinxer <tglx@linutronix.de>
Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'scripts/gdb/linux/proc.py')
0 files changed, 0 insertions, 0 deletions
