summaryrefslogtreecommitdiff
path: root/net
diff options
context:
space:
mode:
authorJakub Kicinski <kuba@kernel.org>2025-04-12 16:30:11 -0700
committerJakub Kicinski <kuba@kernel.org>2025-04-14 12:48:44 -0700
commitf0433eea468810aebd61d0b9d095e9acd6bea2ed (patch)
treed3d55ed8680a466c54e475e3fc7d2eaf6f2ec361 /net
parent8c941f14a694b40a91d381e77bcd334622aa7196 (diff)
net: don't mix device locking in dev_close_many() calls
Lockdep found the following dependency: &dev_instance_lock_key#3 --> &rdev->wiphy.mtx --> &net->xdp.lock --> &xs->mutex --> &dev_instance_lock_key#3 The first dependency is the problem. wiphy mutex should be outside the instance locks. The problem happens in notifiers (as always) for CLOSE. We only hold the instance lock for ops locked devices during CLOSE, and WiFi netdevs are not ops locked. Unfortunately, when we dev_close_many() during netns dismantle we may be holding the instance lock of _another_ netdev when issuing a CLOSE for a WiFi device. Lockdep's "Possible unsafe locking scenario" only prints 3 locks and we have 4, plus I think we'd need 3 CPUs, like this: CPU0 CPU1 CPU2 ---- ---- ---- lock(&xs->mutex); lock(&dev_instance_lock_key#3); lock(&rdev->wiphy.mtx); lock(&net->xdp.lock); lock(&xs->mutex); lock(&rdev->wiphy.mtx); lock(&dev_instance_lock_key#3); Tho, I don't think that's possible as CPU1 and CPU2 would be under rtnl_lock. Even if we have per-netns rtnl_lock and wiphy can span network namespaces - CPU0 and CPU1 must be in the same netns to see dev_instance_lock, so CPU0 can't be installing a socket as CPU1 is tearing the netns down. Regardless, our expected lock ordering is that wiphy lock is taken before instance locks, so let's fix this. Go over the ops locked and non-locked devices separately. Note that calling dev_close_many() on an empty list is perfectly fine. All processing (including RCU syncs) are conditional on the list not being empty, already. Fixes: 7e4d784f5810 ("net: hold netdev instance lock during rtnetlink operations") Reported-by: syzbot+6f588c78bf765b62b450@syzkaller.appspotmail.com Acked-by: Stanislav Fomichev <sdf@fomichev.me> Link: https://patch.msgid.link/20250412233011.309762-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Diffstat (limited to 'net')
-rw-r--r--net/core/dev.c17
1 files changed, 13 insertions, 4 deletions
diff --git a/net/core/dev.c b/net/core/dev.c
index 75e104322ad5..5fcbc66d865e 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11932,15 +11932,24 @@ void unregister_netdevice_many_notify(struct list_head *head,
BUG_ON(dev->reg_state != NETREG_REGISTERED);
}
- /* If device is running, close it first. */
+ /* If device is running, close it first. Start with ops locked... */
list_for_each_entry(dev, head, unreg_list) {
- list_add_tail(&dev->close_list, &close_head);
- netdev_lock_ops(dev);
+ if (netdev_need_ops_lock(dev)) {
+ list_add_tail(&dev->close_list, &close_head);
+ netdev_lock(dev);
+ }
+ }
+ dev_close_many(&close_head, true);
+ /* ... now unlock them and go over the rest. */
+ list_for_each_entry(dev, head, unreg_list) {
+ if (netdev_need_ops_lock(dev))
+ netdev_unlock(dev);
+ else
+ list_add_tail(&dev->close_list, &close_head);
}
dev_close_many(&close_head, true);
list_for_each_entry(dev, head, unreg_list) {
- netdev_unlock_ops(dev);
/* And unlink it from device chain. */
unlist_netdevice(dev);
netdev_lock(dev);