summaryrefslogtreecommitdiff
path: root/scripts/patch-kernel
diff options
context:
space:
mode:
authorDavid Laight <david.laight.linux@gmail.com>2025-11-05 20:10:34 +0000
committerAndrew Morton <akpm@linux-foundation.org>2025-11-20 14:03:42 -0800
commitd10bb374c41e4c4dced04ae7d2fe2d782a5858a0 (patch)
tree7457021ad4e2c8af84bad8216525a3a8d000ca01 /scripts/patch-kernel
parent630f96a687def5616d6fa7f069adcea158320909 (diff)
lib: mul_u64_u64_div_u64(): optimise the divide code
Replace the bit by bit algorithm with one that generates 16 bits per iteration on 32bit architectures and 32 bits on 64bit ones. On my zen 5 this reduces the time for the tests (using the generic code) from ~3350ns to ~1000ns. Running the 32bit algorithm on 64bit x86 takes ~1500ns. It'll be slightly slower on a real 32bit system, mostly due to register pressure. The savings for 32bit x86 are much higher (tested in userspace). The worst case (lots of bits in the quotient) drops from ~900 clocks to ~130 (pretty much independant of the arguments). Other 32bit architectures may see better savings. It is possibly to optimise for divisors that span less than __LONG_WIDTH__/2 bits. However I suspect they don't happen that often and it doesn't remove any slow cpu divide instructions which dominate the result. Typical improvements for 64bit random divides: old new sandy bridge: 470 150 haswell: 400 144 piledriver: 960 467 I think rdpmc is very slow. zen5: 244 80 (Timing is 'rdpmc; mul_div(); rdpmc' with the multiply depending on the first rdpmc and the second rdpmc depending on the quotient.) Object code (64bit x86 test program): old 0x173 new 0x141. Link: https://lkml.kernel.org/r/20251105201035.64043-9-david.laight.linux@gmail.com Signed-off-by: David Laight <david.laight.linux@gmail.com> Reviewed-by: Nicolas Pitre <npitre@baylibre.com> Cc: Biju Das <biju.das.jz@bp.renesas.com> Cc: Borislav Betkov <bp@alien8.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Li RongQing <lirongqing@baidu.com> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Uwe Kleine-König <u.kleine-koenig@baylibre.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'scripts/patch-kernel')
0 files changed, 0 insertions, 0 deletions