| author | Nadav Amit <namit@vmware.com> | 2021-02-20 15:17:04 -0800 | 
|---|---|---|
| committer | Ingo Molnar <mingo@kernel.org> | 2021-03-06 12:59:09 +0100 | 
| commit | a32a4d8a815c4eb6dc64b8962dc13a9dfae70868 (patch) | |
| tree | 48c6cc9e7a5ad613eaae0b9b4632182f5f63c979 /drivers/fpga/fpga-bridge.c | |
| parent | a38fd8748464831584a19438cbb3082b5a2dab15 (diff) | |
smp: Run functions concurrently in smp_call_function_many_cond()
Currently, on_each_cpu() and similar functions do not exploit the
potential of concurrency: the function is first executed remotely and
only then is it executed locally. Functions such as a TLB flush can take
considerable time, so this provides an opportunity for performance
optimization.
To do so, modify smp_call_function_many_cond() to allow callers to
provide a function that should be executed (remotely/locally), and run
the remote and local invocations concurrently. Keep the semantics of
smp_call_function_many() as they are today for backward compatibility:
in this case the called function is not executed locally.
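
The ordering described above can be illustrated with a userspace analogy. This is only a sketch under stated assumptions, not the kernel implementation: threads stand in for remote CPUs, pthread_create() stands in for sending IPIs, and the names call_many_sketch(), remote_entry(), count_call() and MAX_REMOTE are hypothetical. The run_local flag mirrors the two behaviours in the message: run the function locally while the remote calls are in flight, or skip the local call entirely.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace analogy; none of these names exist in the kernel. */
typedef void (*call_func_t)(void *info);

struct call_req {
	call_func_t func;
	void *info;
};

#define MAX_REMOTE 16	/* arbitrary limit for this sketch */

/* Thread entry point: plays the role of the remote CPU's IPI handler. */
static void *remote_entry(void *p)
{
	struct call_req *req = p;

	req->func(req->info);
	return NULL;
}

/*
 * Start func on nremote "remote CPUs" (threads), optionally run it on the
 * caller as well while the remote calls are still in flight, then wait for
 * all remote calls to finish. Returns 0 on success, -1 on failure.
 */
static int call_many_sketch(call_func_t func, void *info, size_t nremote,
			    bool run_local)
{
	pthread_t tids[MAX_REMOTE];
	struct call_req req = { .func = func, .info = info };
	size_t started = 0;
	int ret = 0;

	assert(func != NULL);
	if (nremote > MAX_REMOTE)
		return -1;

	/* Step 1: kick off the remote work without waiting for it. */
	for (size_t i = 0; i < nremote; i++) {
		if (pthread_create(&tids[i], NULL, remote_entry, &req) != 0) {
			ret = -1;
			break;
		}
		started++;
	}

	/* Step 2: do the local work concurrently with the remote work. */
	if (run_local && ret == 0)
		func(info);

	/* Step 3: only now wait for the remote side to complete. */
	for (size_t i = 0; i < started; i++)
		pthread_join(tids[i], NULL);

	return ret;
}

/* Example callback: counts how many times it was invoked. */
static void count_call(void *info)
{
	atomic_fetch_add((atomic_int *)info, 1);
}
```

With four remote targets, passing run_local as true invokes count_call() five times in total, and passing false invokes it four times, which corresponds to the backward-compatible behaviour where the called function is not executed locally.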
smp_call_function_many_cond() does not use the optimized version for a
single remote target that smp_call_function_single() implements. For
synchronous function calls, smp_call_function_single() keeps a
call_single_data (which is used for synchronization) on the stack.
Interestingly, it seems that not using this optimization provides
greater performance improvements (greater speedup with a single remote
target than with multiple ones). Presumably, holding data structures
that are intended for synchronization on the stack can introduce
overheads due to TLB misses and false sharing when the stack is used for
other purposes.
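
The false-sharing concern can be made concrete with a small layout sketch. It is an illustration under assumptions, not code from this patch: a 64-byte cache line is assumed, and the names CACHE_LINE_SIZE, struct mixed_frame, struct sync_slot and same_cache_line() are hypothetical. The first struct mimics a synchronization word that sits next to unrelated data, as it might on a stack frame; the second gives each synchronization word its own cache line.

```c
#include <stdalign.h>
#include <stdatomic.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64	/* assumption: typical x86-64 line size */

/*
 * Stack-like layout: the synchronization word shares a cache line with
 * unrelated data, so a remote CPU writing sync_flags can invalidate the
 * line that the local CPU is using for "unrelated".
 */
struct mixed_frame {
	alignas(CACHE_LINE_SIZE) atomic_uint sync_flags;
	unsigned int unrelated;
};

/*
 * Dedicated layout: each synchronization word is aligned to, and padded
 * out to, a full cache line, so neighbouring slots never share a line.
 */
struct sync_slot {
	alignas(CACHE_LINE_SIZE) atomic_uint sync_flags;
};

/* Returns nonzero if both addresses fall within the same cache line. */
static int same_cache_line(const void *a, const void *b)
{
	return (uintptr_t)a / CACHE_LINE_SIZE == (uintptr_t)b / CACHE_LINE_SIZE;
}
```

In this sketch the two fields of struct mixed_frame always land on the same line, while adjacent elements of an array of struct sync_slot never do. Whether this effect explains the measured speedup is the commit message's own conjecture.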
Signed-off-by: Nadav Amit <namit@vmware.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Dave Hansen <dave.hansen@linux.intel.com>
Link: https://lore.kernel.org/r/20210220231712.2475218-2-namit@vmware.com
Diffstat (limited to 'drivers/fpga/fpga-bridge.c')
0 files changed, 0 insertions, 0 deletions
