summaryrefslogtreecommitdiff
path: root/scripts/gdb/linux/tasks.py
diff options
context:
space:
mode:
authorJacob Moroni <jmoroni@google.com>2025-02-20 17:56:12 +0000
committerJason Gunthorpe <jgg@nvidia.com>2025-04-07 15:38:09 -0300
commit4dab26bed543584577b64b36aadb8b5b165bf44f (patch)
treec3ed6cd0ec864f6fb3a9a0b1d3c669da894caba4 /scripts/gdb/linux/tasks.py
parent3aadd652c2c816a908abe55079e2590e3259e8ba (diff)
IB/cm: use rwlock for MAD agent lock
In workloads where there are many processes establishing connections using RDMA CM in parallel (large scale MPI), there can be heavy contention for mad_agent_lock in cm_alloc_msg. This contention can occur while inside of a spin_lock_irq region, leading to interrupts being disabled for extended durations on many cores. Furthermore, it leads to the serialization of rdma_create_ah calls, which has negative performance impacts for NICs which are capable of processing multiple address handle creations in parallel. The end result is the machine becoming unresponsive, hung task warnings, netdev TX timeouts, etc. Since the lock appears to be only for protection from cm_remove_one, it can be changed to a rwlock to resolve these issues. Reproducer: Server: for i in $(seq 1 512); do ucmatose -c 32 -p $((i + 5000)) & done Client: for i in $(seq 1 512); do ucmatose -c 32 -p $((i + 5000)) -s 10.2.0.52 & done Fixes: 76039ac9095f ("IB/cm: Protect cm_dev, cm_ports and mad_agent with kref and lock") Link: https://patch.msgid.link/r/20250220175612.2763122-1-jmoroni@google.com Signed-off-by: Jacob Moroni <jmoroni@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
Diffstat (limited to 'scripts/gdb/linux/tasks.py')
0 files changed, 0 insertions, 0 deletions