path: root/tools/perf/scripts/python/task-analyzer.py
author Steve Wahl <steve.wahl@hpe.com> 2025-03-04 10:08:43 -0600
committer Peter Zijlstra <peterz@infradead.org> 2025-04-08 20:55:51 +0200
commit f55dac1dafb3334be1d5b54bf385e8cfaa0ab3b3 (patch)
tree 544565d56d5a0b6f1b8d8ea5090b2985458fee9c /tools/perf/scripts/python/task-analyzer.py
parent 8feb053d53194382fcfb68231296fdc220497ea6 (diff)
sched/topology: improve topology_span_sane speed
Use a different approach to topology_span_sane(), that checks for the same constraint of no partial overlaps for any two CPU sets for non-NUMA topology levels, but does so in a way that is O(N) rather than O(N^2).

Instead of comparing with all other masks to detect collisions, keep one mask that includes all CPUs seen so far and detect collisions with a single cpumask_intersects test.

If the current mask has no collisions with previously seen masks, it should be a new mask, which can be uniquely identified by the lowest bit set in this mask. Keep a pointer to this mask for future reference (in an array indexed by the lowest bit set), and add the CPUs in this mask to the list of those seen.

If the current mask does collide with previously seen masks, it should be exactly equal to a mask seen before, looked up in the same array indexed by the lowest bit set in the mask, a single comparison.

Move the topology_span_sane() check out of the existing topology level loop, let it use its own loop so that the array allocation can be done only once, shared across levels.

On a system with 1920 processors (16 sockets, 60 cores, 2 threads), the average time to take one processor offline is reduced from 2.18 seconds to 1.01 seconds. (Off-lining 959 of 1920 processors took 34m49.765s without this change, 16m10.038s with this change in place.)

Signed-off-by: Steve Wahl <steve.wahl@hpe.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Reviewed-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Tested-by: Valentin Schneider <vschneid@redhat.com>
Tested-by: Madadi Vineeth Reddy <vineethr@linux.ibm.com>
Link: https://lore.kernel.org/r/20250304160844.75373-2-steve.wahl@hpe.com
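
The single-pass check described in this commit message can be modelled outside the kernel. What follows is a minimal Python sketch, not the kernel's C code: plain integers stand in for cpumasks (bit i set means CPU i is in the span), a dict stands in for the array indexed by lowest set bit, and the function name spans_sane is hypothetical. Skipping empty masks is an assumption of this sketch.

```python
def spans_sane(masks):
    """Return True if no two masks in `masks` partially overlap.

    Each mask is an int used as a bitmask of CPU ids. Any two masks
    must be either identical or disjoint for the result to be True.
    """
    seen = 0            # union of every CPU covered by masks accepted so far
    by_lowest_bit = {}  # index of lowest set bit -> mask first seen with it

    for mask in masks:
        if mask == 0:
            continue  # assumption of this sketch: empty spans are ignored

        # Index of the lowest set bit; it identifies a mask uniquely
        # as long as all accepted masks are pairwise disjoint.
        lowest = (mask & -mask).bit_length() - 1

        if mask & seen == 0:
            # No collision: this is a new mask. Remember it and mark
            # its CPUs as seen.
            by_lowest_bit[lowest] = mask
            seen |= mask
        elif by_lowest_bit.get(lowest) != mask:
            # Collision, but not an exact repeat of an earlier mask:
            # this is a partial overlap.
            return False

    return True
```

Each mask costs one intersection test plus at most one lookup and comparison, so the pass is linear in the number of masks. For example, spans_sane([0b0011, 0b0011, 0b1100, 0b1100]) accepts two disjoint spans that each appear twice, while spans_sane([0b0011, 0b0110]) rejects the second mask because it shares CPU 1 with the first without being equal to it.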
Diffstat (limited to 'tools/perf/scripts/python/task-analyzer.py')
0 files changed, 0 insertions, 0 deletions