summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
27 hoursMerge tag 'nfsd-6.19-3' of ↵HEADmasterLinus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "A set of NFSD fixes for stable that arrived after the merge window: - Remove an invalid NFS status code - Fix an fstests failure when using pNFS - Fix a UAF in v4_end_grace() - Fix the administrative interface used to revoke NFSv4 state - Fix a memory leak reported by syzbot" * tag 'nfsd-6.19-3' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: net ref data still needs to be freed even if net hasn't startup nfsd: check that server is running in unlock_filesystem nfsd: use correct loop termination in nfsd4_revoke_states() nfsd: provide locking for v4_end_grace NFSD: Fix permission check for read access to executable-only files NFSD: Remove NFSERR_EAGAIN
47 hoursMerge tag 'for-6.19-rc4-tag' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs fixes from David Sterba: - fix potential deadlock due to mismatching transaction states when waiting for the current transaction - fix squota accounting with nested snapshots - fix quota inheritance of qgroups with multiple parent qgroups - fix NULL inode pointer in evict tracepoint - fix writes beyond end of file on systems with 64K page size and 4K block size - fix logging of inodes after exchange rename - fix use after free when using ref_tracker feature - space reservation fixes * tag 'for-6.19-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: btrfs: fix reservation leak in some error paths when inserting inline extent btrfs: do not free data reservation in fallback from inline due to -ENOSPC btrfs: fix use-after-free warning in btrfs_get_or_create_delayed_node() btrfs: always detect conflicting inodes when logging inode refs btrfs: fix beyond-EOF write handling btrfs: fix deadlock in wait_current_trans() due to ignored transaction type btrfs: fix NULL dereference on root when tracing inode eviction btrfs: qgroup: update all parent qgroups when doing quick inherit btrfs: fix qgroup_snapshot_quick_inherit() squota bug
5 daysNFSD: net ref data still needs to be freed even if net hasn't startupEdward Adam Davis
When the NFSD instance doesn't to startup, the net ref data memory is not properly reclaimed, which triggers the memory leak issue reported by syzbot [1]. To avoid the problem reported in [1], the net ref data memory reclamation action is moved outside of nfsd_net_up when the net is shutdown. [1] unreferenced object 0xffff88812a39dfc0 (size 64): backtrace (crc a2262fc6): percpu_ref_init+0x94/0x1e0 lib/percpu-refcount.c:76 nfsd_create_serv+0xbe/0x260 fs/nfsd/nfssvc.c:605 nfsd_nl_listener_set_doit+0x62/0xb00 fs/nfsd/nfsctl.c:1882 genl_family_rcv_msg_doit+0x11e/0x190 net/netlink/genetlink.c:1115 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x2fd/0x440 net/netlink/genetlink.c:1210 BUG: memory leak Reported-by: syzbot+6ee3b889bdeada0a6226@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6ee3b889bdeada0a6226 Fixes: 39972494e318 ("nfsd: update percpu_ref to manage references on nfsd_net") Cc: stable@vger.kernel.org Signed-off-by: Edward Adam Davis <eadavis@qq.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
5 daysnfsd: check that server is running in unlock_filesystemOlga Kornievskaia
If we are trying to unlock the filesystem via an administrative interface and nfsd isn't running, it crashes the server. This happens currently because nfsd4_revoke_states() access state structures (eg., conf_id_hashtbl) that has been freed as a part of the server shutdown. [ 59.465072] Call trace: [ 59.465308] nfsd4_revoke_states+0x1b4/0x898 [nfsd] (P) [ 59.465830] write_unlock_fs+0x258/0x440 [nfsd] [ 59.466278] nfsctl_transaction_write+0xb0/0x120 [nfsd] [ 59.466780] vfs_write+0x1f0/0x938 [ 59.467088] ksys_write+0xfc/0x1f8 [ 59.467395] __arm64_sys_write+0x74/0xb8 [ 59.467746] invoke_syscall.constprop.0+0xdc/0x1e8 [ 59.468177] do_el0_svc+0x154/0x1d8 [ 59.468489] el0_svc+0x40/0xe0 [ 59.468767] el0t_64_sync_handler+0xa0/0xe8 [ 59.469138] el0t_64_sync+0x1ac/0x1b0 Ensure this can't happen by taking the nfsd_mutex and checking that the server is still up, and then holding the mutex across the call to nfsd4_revoke_states(). Reviewed-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Fixes: 1ac3629bf0125 ("nfsd: prepare for supporting admin-revocation of state") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
5 daysnfsd: use correct loop termination in nfsd4_revoke_states()NeilBrown
The loop in nfsd4_revoke_states() stops one too early because the end value given is CLIENT_HASH_MASK where it should be CLIENT_HASH_SIZE. This means that an admin request to drop all locks for a filesystem will miss locks held by clients which hash to the maximum possible hash value. Fixes: 1ac3629bf012 ("nfsd: prepare for supporting admin-revocation of state") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
5 daysnfsd: provide locking for v4_end_graceNeilBrown
Writing to v4_end_grace can race with server shutdown and result in memory being accessed after it was freed - reclaim_str_hashtbl in particularly. We cannot hold nfsd_mutex across the nfsd4_end_grace() call as that is held while client_tracking_op->init() is called and that can wait for an upcall to nfsdcltrack which can write to v4_end_grace, resulting in a deadlock. nfsd4_end_grace() is also called by the landromat work queue and this doesn't require locking as server shutdown will stop the work and wait for it before freeing anything that nfsd4_end_grace() might access. However, we must be sure that writing to v4_end_grace doesn't restart the work item after shutdown has already waited for it. For this we add a new flag protected with nn->client_lock. It is set only while it is safe to make client tracking calls, and v4_end_grace only schedules work while the flag is set with the spinlock held. So this patch adds a nfsd_net field "client_tracking_active" which is set as described. Another field "grace_end_forced", is set when v4_end_grace is written. After this is set, and providing client_tracking_active is set, the laundromat is scheduled. This "grace_end_forced" field bypasses other checks for whether the grace period has finished. This resolves a race which can result in use-after-free. Reported-by: Li Lingfeng <lilingfeng3@huawei.com> Closes: https://lore.kernel.org/linux-nfs/20250623030015.2353515-1-neil@brown.name/T/#t Fixes: 7f5ef2e900d9 ("nfsd: add a v4_end_grace file to /proc/fs/nfsd") Cc: stable@vger.kernel.org Signed-off-by: NeilBrown <neil@brown.name> Tested-by: Li Lingfeng <lilingfeng3@huawei.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
5 daysNFSD: Fix permission check for read access to executable-only filesScott Mayhew
Commit abc02e5602f7 ("NFSD: Support write delegations in LAYOUTGET") added NFSD_MAY_OWNER_OVERRIDE to the access flags passed from nfsd4_layoutget() to fh_verify(). This causes LAYOUTGET to fail for executable-only files, and causes xfstests generic/126 to fail on pNFS SCSI. To allow read access to executable-only files, what we really want is: 1. The "permissions" portion of the access flags (the lower 6 bits) must be exactly NFSD_MAY_READ 2. The "hints" portion of the access flags (the upper 26 bits) can contain any combination of NFSD_MAY_OWNER_OVERRIDE and NFSD_MAY_READ_IF_EXEC Fixes: abc02e5602f7 ("NFSD: Support write delegations in LAYOUTGET") Cc: stable@vger.kernel.org # v6.6+ Signed-off-by: Scott Mayhew <smayhew@redhat.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: NeilBrown <neil@brown.name> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
5 daysNFSD: Remove NFSERR_EAGAINChuck Lever
I haven't found an NFSERR_EAGAIN in RFCs 1094, 1813, 7530, or 8881. None of these RFCs have an NFS status code that match the numeric value "11". Based on the meaning of the EAGAIN errno, I presume the use of this status in NFSD means NFS4ERR_DELAY. So replace the one usage of nfserr_eagain, and remove it from NFSD's NFS status conversion tables. As far as I can tell, NFSERR_EAGAIN has existed since the pre-git era, but was not actually used by any code until commit f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed."), at which time it become possible for NFSD to return a status code of 11 (which is not valid NFS protocol). Fixes: f4e44b393389 ("NFSD: delay unmount source's export after inter-server copy completed.") Cc: stable@vger.kernel.org Reviewed-by: NeilBrown <neil@brown.name> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
5 daysMerge tag 'v6.19-rc3-smb3-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fixes from Steve French: - Fix memory leak - Fix two refcount leaks - Fix error path in create_smb2_pipe * tag 'v6.19-rc3-smb3-server-fixes' of git://git.samba.org/ksmbd: smb/server: fix refcount leak in smb2_open() smb/server: fix refcount leak in parse_durable_handle_context() smb/server: call ksmbd_session_rpc_close() on error path in create_smb2_pipe() ksmbd: Fix memory leak in get_file_all_info()
5 daysMerge tag 'v6.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - Fix array out of bounds error in copy_file_range - Add tracepoint to help debug ioctl failures * tag 'v6.19-rc3-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix UBSAN array-index-out-of-bounds in smb2_copychunk_range smb3 client: add missing tracepoint for unsupported ioctls
8 daysMerge tag 'nfsd-6.19-2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "A set of NFSD fixes that arrived just a bit late for the 6.19 merge window. Regression fix: - Avoid unnecessarily breaking a timestamp delegation Stable fixes: - Fix a crasher in nlm4svc_proc_test() - Fix nfsd_file reference leak during write delegation - Fix error flow in client_states_open()" * tag 'nfsd-6.19-2' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: nfsd: Drop the client reference in client_states_open() nfsd: use ATTR_DELEG in nfsd4_finalize_deleg_timestamps() nfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg() lockd: fix vfs_test_lock() calls
8 dayssmb: client: fix UBSAN array-index-out-of-bounds in smb2_copychunk_rangeHenrique Carvalho
struct copychunk_ioctl_req::ChunkCount is annotated with __counted_by_le() as the number of elements in Chunks[]. smb2_copychunk_range reuses ChunkCount to store the number of chunks sent in the current iteration. If a later iteration populates more chunks than a previous one, the stale smaller value trips UBSAN. Set ChunkCount to chunk_count (allocated capacity) before populating Chunks[]. Fixes: cc26f593dc19 ("smb: move copychunk definitions to common/smb2pdu.h") Link: https://lore.kernel.org/linux-cifs/CAH2r5ms9AWLy8WZ04Cpq5XOeVK64tcrUQ6__iMW+yk1VPzo1BA@mail.gmail.com Tested-by: Youling Tang <tangyouling@kylinos.cn> Acked-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
8 dayssmb3 client: add missing tracepoint for unsupported ioctlsSteve French
In debugging a recent problem with an xfstest, noticed that we weren't tracing cases where the ioctl was not supported. Add dynamic tracepoint: "trace-cmd record -e smb3_unsupported_ioctl" and then after running an app which calls unsupported ioctl, "trace-cmd show"would display e.g. xfs_io-7289 [012] ..... 1205.137765: smb3_unsupported_ioctl: xid=19 fid=0x4535bb84 ioctl cmd=0x801c581f Acked-by: Bharath SM <bharathsm@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
9 dayssmb/server: fix refcount leak in smb2_open()ZhangGuoDong
When ksmbd_vfs_getattr() fails, the reference count of ksmbd_file must be released. Suggested-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 dayssmb/server: fix refcount leak in parse_durable_handle_context()ZhangGuoDong
When the command is a replay operation and -ENOEXEC is returned, the refcount of ksmbd_file must be released. Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 dayssmb/server: call ksmbd_session_rpc_close() on error path in create_smb2_pipe()ZhangGuoDong
When ksmbd_iov_pin_rsp() fails, we should call ksmbd_session_rpc_close(). Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
9 daysksmbd: Fix memory leak in get_file_all_info()Zilin Guan
In get_file_all_info(), if vfs_getattr() fails, the function returns immediately without freeing the allocated filename, leading to a memory leak. Fix this by freeing the filename before returning in this error case. Fixes: 5614c8c487f6a ("ksmbd: replace generic_fillattr with vfs_getattr") Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
12 daysMerge tag 'v6.19-rc2-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fix from Steve French: - Fix potential memory leak * tag 'v6.19-rc2-smb3-client-fix' of git://git.samba.org/sfrench/cifs-2.6: cifs: Fix memory and information leak in smb3_reconfigure()
12 daysMerge tag 'driver-core-6.19-rc3' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core Pull driver core fixes from Danilo Krummrich: - Introduce DMA Rust helpers to avoid build errors when !CONFIG_HAS_DMA - Remove unnecessary (and hence incorrect) endian conversion in the Rust PCI driver sample code - Fix memory leak in the unwind path of debugfs_change_name() - Support non-const struct software_node pointers in SOFTWARE_NODE_REFERENCE(), after introducing _Generic() - Avoid NULL pointer dereference in the unwind path of simple_xattrs_free() * tag 'driver-core-6.19-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core: fs/kernfs: null-ptr deref in simple_xattrs_free() software node: Also support referencing non-constant software nodes debugfs: Fix memleak in debugfs_change_name(). samples: rust: fix endianness issue in rust_driver_pci rust: dma: add helpers for architectures without CONFIG_HAS_DMA
12 daysMerge tag 'v6.19-rc2-smb3-server-fixes' of git://git.samba.org/ksmbdLinus Torvalds
Pull smb server fixes from Steve French: - Fix parsing of SMB1 negotiate request by adjusting offsets affected by the removal of the RFC1002 length field from the SMB header - Update minimum PDU size macros for both SMB1 and SMB2 - Rename smb2_get_msg function to smb_get_msg to better reflect its role in handling both SMB1 and SMB2 requests * tag 'v6.19-rc2-smb3-server-fixes' of git://git.samba.org/ksmbd: smb/server: fix minimum SMB2 PDU size smb/server: fix minimum SMB1 PDU size ksmbd: rename smb2_get_msg to smb_get_msg ksmbd: Fix to handle removal of rfc1002 header from smb_hdr
14 daysnfsd: Drop the client reference in client_states_open()Haoxiang Li
In error path, call drop_client() to drop the reference obtained by get_nfsdfs_clp(). Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Haoxiang Li <lihaoxiang@isrc.iscas.ac.cn> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
14 daysnfsd: use ATTR_DELEG in nfsd4_finalize_deleg_timestamps()Jeff Layton
When finalizing timestamps that have never been updated and preparing to release the delegation lease, the notify_change() call can trigger a delegation break, and fail to update the timestamps. When this happens, there will be messages like this in dmesg: [ 2709.375785] Unable to update timestamps on inode 00:39:263: -11 Since this code is going to release the lease just after updating the timestamps, breaking the delegation is undesirable. Fix this by setting ATTR_DELEG in ia_valid, in order to avoid the delegation break. Fixes: e5e9b24ab8fa ("nfsd: freeze c/mtime updates with outstanding WRITE_ATTRS delegation") Cc: stable@vger.kernel.org Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
14 daysnfsd: fix nfsd_file reference leak in nfsd4_add_rdaccess_to_wrdeleg()Chuck Lever
nfsd4_add_rdaccess_to_wrdeleg() unconditionally overwrites fp->fi_fds[O_RDONLY] with a newly acquired nfsd_file. However, if the client already has a SHARE_ACCESS_READ open from a previous OPEN operation, this action overwrites the existing pointer without releasing its reference, orphaning the previous reference. Additionally, the function originally stored the same nfsd_file pointer in both fp->fi_fds[O_RDONLY] and fp->fi_rdeleg_file with only a single reference. When put_deleg_file() runs, it clears fi_rdeleg_file and calls nfs4_file_put_access() to release the file. However, nfs4_file_put_access() only releases fi_fds[O_RDONLY] when the fi_access[O_RDONLY] counter drops to zero. If another READ open exists on the file, the counter remains elevated and the nfsd_file reference from the delegation is never released. This potentially causes open conflicts on that file. Then, on server shutdown, these leaks cause __nfsd_file_cache_purge() to encounter files with an elevated reference count that cannot be cleaned up, ultimately triggering a BUG() in kmem_cache_destroy() because there are still nfsd_file objects allocated in that cache. Fixes: e7a8ebc305f2 ("NFSD: Offer write delegation for OPEN with OPEN4_SHARE_ACCESS_WRITE") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
14 dayslockd: fix vfs_test_lock() callsNeilBrown
Usage of vfs_test_lock() is somewhat confused. Documentation suggests it is given a "lock" but this is not the case. It is given a struct file_lock which contains some details of the sort of lock it should be looking for. In particular passing a "file_lock" containing fl_lmops or fl_ops is meaningless and possibly confusing. This is particularly problematic in lockd. nlmsvc_testlock() receives an initialised "file_lock" from xdr-decode, including manager ops and an owner. It then mistakenly passes this to vfs_test_lock() which might replace the owner and the ops. This can lead to confusion when freeing the lock. The primary role of the 'struct file_lock' passed to vfs_test_lock() is to report a conflicting lock that was found, so it makes more sense for nlmsvc_testlock() to pass "conflock", which it uses for returning the conflicting lock. With this change, freeing of the lock is not confused and code in __nlm4svc_proc_test() and __nlmsvc_proc_test() can be simplified. Documentation for vfs_test_lock() is improved to reflect its real purpose, and a WARN_ON_ONCE() is added to avoid a similar problem in the future. Reported-by: Olga Kornievskaia <okorniev@redhat.com> Closes: https://lore.kernel.org/all/20251021130506.45065-1-okorniev@redhat.com Signed-off-by: NeilBrown <neil@brown.name> Fixes: 20fa19027286 ("nfs: add export operations") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-24Merge tag 'nfsd-6.19-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux Pull nfsd fixes from Chuck Lever: "A set of NFSD fixes that arrived just a bit late for the 6.19 merge window. Regression fixes: - Mark variable __maybe_unused to avoid W=1 build break Stable fixes: - NFSv4 file creation neglects setting ACL - Clear TIME_DELEG in the suppattr_exclcreat bitmap - Clear SECLABEL in the suppattr_exclcreat bitmap - Fix memory leak in nfsd_create_serv error paths - Bound check rq_pages index in inline path - Return 0 on success from svc_rdma_copy_inline_range - Use rc_pageoff for memcpy byte offset - Avoid NULL deref on zero length gss_token in gss_read_proxy_verf" * tag 'nfsd-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: NFSD: NFSv4 file creation neglects setting ACL NFSD: Clear TIME_DELEG in the suppattr_exclcreat bitmap NFSD: Clear SECLABEL in the suppattr_exclcreat bitmap nfsd: fix memory leak in nfsd_create_serv error paths nfsd: Mark variable __maybe_unused to avoid W=1 build break svcrdma: bound check rq_pages index in inline path svcrdma: return 0 on success from svc_rdma_copy_inline_range svcrdma: use rc_pageoff for memcpy byte offset SUNRPC: svcauth_gss: avoid NULL deref on zero length gss_token in gss_read_proxy_verf
2025-12-24Merge tag 'erofs-for-6.19-rc3-fixes' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs fix from Gao Xiang: "Junbeom reported that synchronous reads could hit unintended EIOs under memory pressure due to incorrect error propagation in z_erofs_decompress_queue(), where earlier physical clusters in the same decompression queue may be served for another readahead. This addresses the issue by decompressing each physical cluster independently as long as disk I/Os succeed, rather than being impacted by the error status of previous physical clusters in the same queue. Summary: - Fix unexpected EIOs under memory pressure caused by recent incorrect error propagation logic" * tag 'erofs-for-6.19-rc3-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: fix unexpected EIO under memory pressure
2025-12-24cifs: Fix memory and information leak in smb3_reconfigure()Zilin Guan
In smb3_reconfigure(), if smb3_sync_session_ctx_passwords() fails, the function returns immediately without freeing and erasing the newly allocated new_password and new_password2. This causes both a memory leak and a potential information leak. Fix this by calling kfree_sensitive() on both password buffers before returning in this error case. Fixes: 0f0e357902957 ("cifs: during remount, make sure passwords are in sync") Signed-off-by: Zilin Guan <zilin@seu.edu.cn> Reviewed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-23fs/kernfs: null-ptr deref in simple_xattrs_free()Will Rosenberg
There exists a null pointer dereference in simple_xattrs_free() as part of the __kernfs_new_node() routine. Within __kernfs_new_node(), err_out4 calls simple_xattr_free(), but kn->iattr may be NULL if __kernfs_setattr() was never called. As a result, the first argument to simple_xattrs_free() may be NULL + 0x38, and no NULL check is done internally, causing an incorrect pointer dereference. Add a check to ensure kn->iattr is not NULL, meaning __kernfs_setattr() has been called and kn->iattr is allocated. Note that struct kernfs_node kn is allocated with kmem_cache_zalloc, so we can assume kn->iattr will be NULL if not allocated. An alternative fix could be to not call simple_xattrs_free() at all. As was previously discussed during the initial patch, simple_xattrs_free() is not strictly needed and is included to be consistent with kernfs_free_rcu(), which also helps the function maintain correctness if changes are made in __kernfs_new_node(). Reported-by: syzbot+6aaf7f48ae034ab0ea97@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=6aaf7f48ae034ab0ea97 Fixes: 382b1e8f30f7 ("kernfs: fix memory leak of kernfs_iattrs in __kernfs_new_node") Signed-off-by: Will Rosenberg <whrosenb@asu.edu> Link: https://patch.msgid.link/20251217060107.4171558-1-whrosenb@asu.edu Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-12-21smb/server: fix minimum SMB2 PDU sizeChenXiaoSong
The minimum SMB2 PDU size should be updated to the size of `struct smb2_pdu` (that is, the size of `struct smb2_hdr` + 2). Suggested-by: David Howells <dhowells@redhat.com> Suggested-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-21smb/server: fix minimum SMB1 PDU sizeChenXiaoSong
Since the RFC1002 header has been removed from `struct smb_hdr`, the minimum SMB1 PDU size should be updated as well. Fixes: 83bfbd0bb902 ("cifs: Remove the RFC1002 header from smb_hdr") Suggested-by: David Howells <dhowells@redhat.com> Suggested-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Reviewed-by: David Howells <dhowells@redhat.com> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-21ksmbd: rename smb2_get_msg to smb_get_msgNamjae Jeon
With the removal of the RFC1002 length field from the SMB header, smb2_get_msg is now used to get the smb1 request from the request buffer. Since this function is no longer exclusive to smb2 and now supports smb1 as well, This patch rename it to smb_get_msg to better reflect its usage. Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-21ksmbd: Fix to handle removal of rfc1002 header from smb_hdrDavid Howells
The commit that removed the RFC1002 header from struct smb_hdr didn't also fix the places in ksmbd that use it in order to provide graceful rejection of SMB1 protocol requests. Fixes: 83bfbd0bb902 ("cifs: Remove the RFC1002 header from smb_hdr") Reported-by: Namjae Jeon <linkinjeon@kernel.org> Link: https://lore.kernel.org/r/CAKYAXd9Ju4MFkkH5Jxfi1mO0AWEr=R35M3vQ_Xa7Yw34JoNZ0A@mail.gmail.com/ Cc: ChenXiaoSong <chenxiaosong.chenxiaosong@linux.dev> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Namjae Jeon <linkinjeon@kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-22erofs: fix unexpected EIO under memory pressureJunbeom Yeom
erofs readahead could fail with ENOMEM under the memory pressure because it tries to alloc_page with GFP_NOWAIT | GFP_NORETRY, while GFP_KERNEL for a regular read. And if readahead fails (with non-uptodate folios), the original request will then fall back to synchronous read, and `.read_folio()` should return appropriate errnos. However, in scenarios where readahead and read operations compete, read operation could return an unintended EIO because of an incorrect error propagation. To resolve this, this patch modifies the behavior so that, when the PCL is for read(which means pcl.besteffort is true), it attempts actual decompression instead of propagating the privios error except initial EIO. - Page size: 4K - The original size of FileA: 16K - Compress-ratio per PCL: 50% (Uncompressed 8K -> Compressed 4K) [page0, page1] [page2, page3] [PCL0]---------[PCL1] - functions declaration: . pread(fd, buf, count, offset) . readahead(fd, offset, count) - Thread A tries to read the last 4K - Thread B tries to do readahead 8K from 4K - RA, besteffort == false - R, besteffort == true <process A> <process B> pread(FileA, buf, 4K, 12K) do readahead(page3) // failed with ENOMEM wait_lock(page3) if (!uptodate(page3)) goto do_read readahead(FileA, 4K, 8K) // Here create PCL-chain like below: // [null, page1] [page2, null] // [PCL0:RA]-----[PCL1:RA] ... do read(page3) // found [PCL1:RA] and add page3 into it, // and then, change PCL1 from RA to R ... // Now, PCL-chain is as below: // [null, page1] [page2, page3] // [PCL0:RA]-----[PCL1:R] // try to decompress PCL-chain... z_erofs_decompress_queue err = 0; // failed with ENOMEM, so page 1 // only for RA will not be uptodated. // it's okay. err = decompress([PCL0:RA], err) // However, ENOMEM propagated to next // PCL, even though PCL is not only // for RA but also for R. As a result, // it just failed with ENOMEM without // trying any decompression, so page2 // and page3 will not be uptodated. ** BUG HERE ** --> err = decompress([PCL1:R], err) return err as ENOMEM ... wait_lock(page3) if (!uptodate(page3)) return EIO <-- Return an unexpected EIO! ... Fixes: 2349d2fa02db ("erofs: sunset unneeded NOFAILs") Cc: stable@vger.kernel.org Reviewed-by: Jaewook Kim <jw5454.kim@samsung.com> Reviewed-by: Sungjong Seo <sj1557.seo@samsung.com> Signed-off-by: Junbeom Yeom <junbeom.yeom@samsung.com> Reviewed-by: Gao Xiang <hsiangkao@linux.alibaba.com> Signed-off-by: Gao Xiang <hsiangkao@linux.alibaba.com>
2025-12-20Merge tag 'xfs-fixes-6.19-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linuxLinus Torvalds
Pull xfs fixes from Carlos Maiolino: "This contains a few fixes for zoned devices support, an UAF and a compiler warning, and some cleaning up" * tag 'xfs-fixes-6.19-rc2' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: xfs: fix the zoned RT growfs check for zone alignment xfs: validate that zoned RT devices are zone aligned xfs: fix XFS_ERRTAG_FORCE_ZERO_RANGE for zoned file system xfs: fix a memory leak in xfs_buf_item_init() xfs: fix stupid compiler warning xfs: fix a UAF problem in xattr repair xfs: ignore discard return value
2025-12-19debugfs: Fix memleak in debugfs_change_name().Kuniyuki Iwashima
syzbot reported memleak in debugfs_change_name(). [0] When lookup_noperm_unlocked() fails, new_name is leaked. Let's fix it by reusing kfree_const() at the end of debugfs_change_name(). [0]: BUG: memory leak unreferenced object 0xffff8881110bb308 (size 8): comm "syz.0.17", pid 6090, jiffies 4294942958 hex dump (first 8 bytes): 2e 00 00 00 00 00 00 00 ........ backtrace (crc ecfc7064): kmemleak_alloc_recursive include/linux/kmemleak.h:44 [inline] slab_post_alloc_hook mm/slub.c:4953 [inline] slab_alloc_node mm/slub.c:5258 [inline] __do_kmalloc_node mm/slub.c:5651 [inline] __kmalloc_node_track_caller_noprof+0x3b2/0x670 mm/slub.c:5759 __kmemdup_nul mm/util.c:64 [inline] kstrdup+0x3c/0x80 mm/util.c:84 kstrdup_const+0x63/0x80 mm/util.c:104 kvasprintf_const+0xca/0x110 lib/kasprintf.c:48 debugfs_change_name+0xf6/0x5d0 fs/debugfs/inode.c:854 cfg80211_dev_rename+0xd8/0x110 net/wireless/core.c:149 nl80211_set_wiphy+0x102/0x1770 net/wireless/nl80211.c:3844 genl_family_rcv_msg_doit+0x11e/0x190 net/netlink/genetlink.c:1115 genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline] genl_rcv_msg+0x2fd/0x440 net/netlink/genetlink.c:1210 netlink_rcv_skb+0x93/0x1d0 net/netlink/af_netlink.c:2550 genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219 netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline] netlink_unicast+0x3a3/0x4f0 net/netlink/af_netlink.c:1344 netlink_sendmsg+0x335/0x6b0 net/netlink/af_netlink.c:1894 sock_sendmsg_nosec net/socket.c:718 [inline] __sock_sendmsg net/socket.c:733 [inline] ____sys_sendmsg+0x562/0x5a0 net/socket.c:2608 ___sys_sendmsg+0xc8/0x130 net/socket.c:2662 __sys_sendmsg+0xc7/0x140 net/socket.c:2694 Fixes: 833d2b3a072f7 ("Add start_renaming_two_dentries()") Reported-by: syzbot+3d7ca9c802c547f8550a@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/69369d82.a70a0220.38f243.009f.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251208094551.46184-1-kuniyu@google.com [ Fix minor typo in commit message. - Danilo ] Signed-off-by: Danilo Krummrich <dakr@kernel.org>
2025-12-19Merge tag 'v6.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - important fix for reconnect problem - minor cleanup * tag 'v6.19-rc1-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: update internal module version number smb: move some SMB1 definitions into common/smb1pdu.h smb: align durable reconnect v2 context to 8 byte boundary
2025-12-19Merge tag 'fsnotify_for_v6.19-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull fsnotify fixes from Jan Kara: "Two fsnotify fixes. The fix from Ahelenia makes sure we generate event when modifying inode flags, the fix from Amir disables sending of events from device inodes to their parent directory as it could concievably create a usable side channel attack in case of some devices and so far we aren't aware of anybody depending on the functionality" * tag 'fsnotify_for_v6.19-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: fs: send fsnotify_xattr()/IN_ATTRIB from vfs_fileattr_set()/chattr(1) fsnotify: do not generate ACCESS/MODIFY events on child for special files
2025-12-18NFSD: NFSv4 file creation neglects setting ACLChuck Lever
An NFSv4 client that sets an ACL with a named principal during file creation retrieves the ACL afterwards, and finds that it is only a default ACL (based on the mode bits) and not the ACL that was requested during file creation. This violates RFC 8881 section 6.4.1.3: "the ACL attribute is set as given". The issue occurs in nfsd_create_setattr(), which calls nfsd_attrs_valid() to determine whether to call nfsd_setattr(). However, nfsd_attrs_valid() checks only for iattr changes and security labels, but not POSIX ACLs. When only an ACL is present, the function returns false, nfsd_setattr() is skipped, and the POSIX ACL is never applied to the inode. Subsequently, when the client retrieves the ACL, the server finds no POSIX ACL on the inode and returns one generated from the file's mode bits rather than returning the originally-specified ACL. Reported-by: Aurélien Couderc <aurelien.couderc2002@gmail.com> Fixes: c0cbe70742f4 ("NFSD: add posix ACLs to struct nfsd_attrs") Cc: Roland Mainz <roland.mainz@nrubsig.org> Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-18NFSD: Clear TIME_DELEG in the suppattr_exclcreat bitmapChuck Lever
>From RFC 8881: 5.8.1.14. Attribute 75: suppattr_exclcreat > The bit vector that would set all REQUIRED and RECOMMENDED > attributes that are supported by the EXCLUSIVE4_1 method of file > creation via the OPEN operation. The scope of this attribute > applies to all objects with a matching fsid. There's nothing in RFC 8881 that states that suppattr_exclcreat is or is not allowed to contain bits for attributes that are clear in the reported supported_attrs bitmask. But it doesn't make sense for an NFS server to indicate that it /doesn't/ implement an attribute, but then also indicate that clients /are/ allowed to set that attribute using OPEN(create) with EXCLUSIVE4_1. The FATTR4_WORD2_TIME_DELEG attributes are also not to be allowed for OPEN(create) with EXCLUSIVE4_1. It doesn't make sense to set a delegated timestamp on a new file. Fixes: 7e13f4f8d27d ("nfsd: handle delegated timestamps in SETATTR") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-18NFSD: Clear SECLABEL in the suppattr_exclcreat bitmapChuck Lever
>From RFC 8881: 5.8.1.14. Attribute 75: suppattr_exclcreat > The bit vector that would set all REQUIRED and RECOMMENDED > attributes that are supported by the EXCLUSIVE4_1 method of file > creation via the OPEN operation. The scope of this attribute > applies to all objects with a matching fsid. There's nothing in RFC 8881 that states that suppattr_exclcreat is or is not allowed to contain bits for attributes that are clear in the reported supported_attrs bitmask. But it doesn't make sense for an NFS server to indicate that it /doesn't/ implement an attribute, but then also indicate that clients /are/ allowed to set that attribute using OPEN(create) with EXCLUSIVE4_1. Ensure that the SECURITY_LABEL and ACL bits are not set in the suppattr_exclcreat bitmask when they are also not set in the supported_attrs bitmask. Fixes: 8c18f2052e75 ("nfsd41: SUPPATTR_EXCLCREAT attribute") Cc: stable@vger.kernel.org Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-18nfsd: fix memory leak in nfsd_create_serv error pathsShardul Bankar
When nfsd_create_serv() calls percpu_ref_init() to initialize nn->nfsd_net_ref, it allocates both a percpu reference counter and a percpu_ref_data structure (64 bytes). However, if the function fails later due to svc_create_pooled() returning NULL or svc_bind() returning an error, these allocations are not cleaned up, resulting in a memory leak. The leak manifests as: - Unreferenced percpu allocation (8 bytes per CPU) - Unreferenced percpu_ref_data structure (64 bytes) Fix this by adding percpu_ref_exit() calls in both error paths to properly clean up the percpu_ref_init() allocations. This patch fixes the percpu_ref leak in nfsd_create_serv() seen as an auxiliary leak in syzbot report 099461f8558eb0a1f4f3; the prepare_creds() and vsock-related leaks in the same report remain to be addressed separately. Reported-by: syzbot+099461f8558eb0a1f4f3@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=099461f8558eb0a1f4f3 Fixes: 47e988147f40 ("nfsd: add nfsd_serv_try_get and nfsd_serv_put") Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-12-17xfs: fix the zoned RT growfs check for zone alignmentChristoph Hellwig
The grofs code for zoned RT subvolums already tries to check for zone alignment, but gets it wrong by using the old instead of the new mount structure. Fixes: 01b71e64bb87 ("xfs: support growfs on zoned file systems") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: stable@vger.kernel.org # v6.15 Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-12-17xfs: validate that zoned RT devices are zone alignedChristoph Hellwig
Garbage collection assumes all zones contain the full amount of blocks. Mkfs already ensures this happens, but make the kernel check it as well to avoid getting into trouble due to fuzzers or mkfs bugs. Fixes: 2167eaabe2fa ("xfs: define the zoned on-disk format") Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Cc: stable@vger.kernel.org # v6.15 Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-12-16cifs: update internal module version numberSteve French
to 2.58 Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-16smb: move some SMB1 definitions into common/smb1pdu.hZhangGuoDong
These definitions are only used by SMB1, so move them into the new common/smb1pdu.h. KSMBD only implements SMB_COM_NEGOTIATE, see MS-SMB2 3.3.5.2. Co-developed-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ChenXiaoSong <chenxiaosong@kylinos.cn> Signed-off-by: ZhangGuoDong <zhangguodong@kylinos.cn> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-16smb: align durable reconnect v2 context to 8 byte boundaryBharath SM
Add a 4-byte Pad to create_durable_handle_reconnect_v2 so the DH2C create context is 8 byte aligned. This avoids malformed CREATE contexts on reconnect. Recent change removed this Padding, adding it back. Fixes: 81a45de432c6 ("smb: move create_durable_handle_reconnect_v2 to common/smb2pdu.h") Signed-off-by: Bharath SM <bharathsm@microsoft.com> Reviewed-by: Paulo Alcantara (Red Hat) <pc@manguebit.org> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-12-16btrfs: fix reservation leak in some error paths when inserting inline extentFilipe Manana
If we fail to allocate a path or join a transaction, we return from __cow_file_range_inline() without freeing the reserved qgroup data, resulting in a leak. Fix this by ensuring we call btrfs_qgroup_free_data() in such cases. Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-12-16btrfs: do not free data reservation in fallback from inline due to -ENOSPCFilipe Manana
If we fail to create an inline extent due to -ENOSPC, we will attempt to go through the normal COW path, reserve an extent, create an ordered extent, etc. However we were always freeing the reserved qgroup data, which is wrong since we will use data. Fix this by freeing the reserved qgroup data in __cow_file_range_inline() only if we are not doing the fallback (ret is <= 0). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Reviewed-by: David Sterba <dsterba@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-12-16btrfs: fix use-after-free warning in btrfs_get_or_create_delayed_node()Leo Martins
Previously, btrfs_get_or_create_delayed_node() set the delayed_node's refcount before acquiring the root->delayed_nodes lock. Commit e8513c012de7 ("btrfs: implement ref_tracker for delayed_nodes") moved refcount_set inside the critical section, which means there is no longer a memory barrier between setting the refcount and setting btrfs_inode->delayed_node. Without that barrier, the stores to node->refs and btrfs_inode->delayed_node may become visible out of order. Another thread can then read btrfs_inode->delayed_node and attempt to increment a refcount that hasn't been set yet, leading to a refcounting bug and a use-after-free warning. The fix is to move refcount_set back to where it was to take advantage of the implicit memory barrier provided by lock acquisition. Because the allocations now happen outside of the lock's critical section, they can use GFP_NOFS instead of GFP_ATOMIC. Reported-by: kernel test robot <oliver.sang@intel.com> Closes: https://lore.kernel.org/oe-lkp/202511262228.6dda231e-lkp@intel.com Fixes: e8513c012de7 ("btrfs: implement ref_tracker for delayed_nodes") Tested-by: kernel test robot <oliver.sang@intel.com> Reviewed-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: Leo Martins <loemra.dev@gmail.com> Signed-off-by: David Sterba <dsterba@suse.com>
2025-12-16btrfs: always detect conflicting inodes when logging inode refsFilipe Manana
After rename exchanging (either with the rename exchange operation or regular renames in multiple non-atomic steps) two inodes and at least one of them is a directory, we can end up with a log tree that contains only of the inodes and after a power failure that can result in an attempt to delete the other inode when it should not because it was not deleted before the power failure. In some case that delete attempt fails when the target inode is a directory that contains a subvolume inside it, since the log replay code is not prepared to deal with directory entries that point to root items (only inode items). 1) We have directories "dir1" (inode A) and "dir2" (inode B) under the same parent directory; 2) We have a file (inode C) under directory "dir1" (inode A); 3) We have a subvolume inside directory "dir2" (inode B); 4) All these inodes were persisted in a past transaction and we are currently at transaction N; 5) We rename the file (inode C), so at btrfs_log_new_name() we update inode C's last_unlink_trans to N; 6) We get a rename exchange for "dir1" (inode A) and "dir2" (inode B), so after the exchange "dir1" is inode B and "dir2" is inode A. During the rename exchange we call btrfs_log_new_name() for inodes A and B, but because they are directories, we don't update their last_unlink_trans to N; 7) An fsync against the file (inode C) is done, and because its inode has a last_unlink_trans with a value of N we log its parent directory (inode A) (through btrfs_log_all_parents(), called from btrfs_log_inode_parent()). 8) So we end up with inode B not logged, which now has the old name of inode A. At copy_inode_items_to_log(), when logging inode A, we did not check if we had any conflicting inode to log because inode A has a generation lower than the current transaction (created in a past transaction); 9) After a power failure, when replaying the log tree, since we find that inode A has a new name that conflicts with the name of inode B in the fs tree, we attempt to delete inode B... this is wrong since that directory was never deleted before the power failure, and because there is a subvolume inside that directory, attempting to delete it will fail since replay_dir_deletes() and btrfs_unlink_inode() are not prepared to deal with dir items that point to roots instead of inodes. When that happens the mount fails and we get a stack trace like the following: [87.2314] BTRFS info (device dm-0): start tree-log replay [87.2318] BTRFS critical (device dm-0): failed to delete reference to subvol, root 5 inode 256 parent 259 [87.2332] ------------[ cut here ]------------ [87.2338] BTRFS: Transaction aborted (error -2) [87.2346] WARNING: CPU: 1 PID: 638968 at fs/btrfs/inode.c:4345 __btrfs_unlink_inode+0x416/0x440 [btrfs] [87.2368] Modules linked in: btrfs loop dm_thin_pool (...) [87.2470] CPU: 1 UID: 0 PID: 638968 Comm: mount Tainted: G W 6.18.0-rc7-btrfs-next-218+ #2 PREEMPT(full) [87.2489] Tainted: [W]=WARN [87.2494] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014 [87.2514] RIP: 0010:__btrfs_unlink_inode+0x416/0x440 [btrfs] [87.2538] Code: c0 89 04 24 (...) [87.2568] RSP: 0018:ffffc0e741f4b9b8 EFLAGS: 00010286 [87.2574] RAX: 0000000000000000 RBX: ffff9d3ec8a6cf60 RCX: 0000000000000000 [87.2582] RDX: 0000000000000002 RSI: ffffffff84ab45a1 RDI: 00000000ffffffff [87.2591] RBP: ffff9d3ec8a6ef20 R08: 0000000000000000 R09: ffffc0e741f4b840 [87.2599] R10: ffff9d45dc1fffa8 R11: 0000000000000003 R12: ffff9d3ee26d77e0 [87.2608] R13: ffffc0e741f4ba98 R14: ffff9d4458040800 R15: ffff9d44b6b7ca10 [87.2618] FS: 00007f7b9603a840(0000) GS:ffff9d4658982000(0000) knlGS:0000000000000000 [87.2629] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [87.2637] CR2: 00007ffc9ec33b98 CR3: 000000011273e003 CR4: 0000000000370ef0 [87.2648] Call Trace: [87.2651] <TASK> [87.2654] btrfs_unlink_inode+0x15/0x40 [btrfs] [87.2661] unlink_inode_for_log_replay+0x27/0xf0 [btrfs] [87.2669] check_item_in_log+0x1ea/0x2c0 [btrfs] [87.2676] replay_dir_deletes+0x16b/0x380 [btrfs] [87.2684] fixup_inode_link_count+0x34b/0x370 [btrfs] [87.2696] fixup_inode_link_counts+0x41/0x160 [btrfs] [87.2703] btrfs_recover_log_trees+0x1ff/0x7c0 [btrfs] [87.2711] ? __pfx_replay_one_buffer+0x10/0x10 [btrfs] [87.2719] open_ctree+0x10bb/0x15f0 [btrfs] [87.2726] btrfs_get_tree.cold+0xb/0x16c [btrfs] [87.2734] ? fscontext_read+0x15c/0x180 [87.2740] ? rw_verify_area+0x50/0x180 [87.2746] vfs_get_tree+0x25/0xd0 [87.2750] vfs_cmd_create+0x59/0xe0 [87.2755] __do_sys_fsconfig+0x4f6/0x6b0 [87.2760] do_syscall_64+0x50/0x1220 [87.2764] entry_SYSCALL_64_after_hwframe+0x76/0x7e [87.2770] RIP: 0033:0x7f7b9625f4aa [87.2775] Code: 73 01 c3 48 (...) [87.2803] RSP: 002b:00007ffc9ec35b08 EFLAGS: 00000246 ORIG_RAX: 00000000000001af [87.2817] RAX: ffffffffffffffda RBX: 0000558bfa91ac20 RCX: 00007f7b9625f4aa [87.2829] RDX: 0000000000000000 RSI: 0000000000000006 RDI: 0000000000000003 [87.2842] RBP: 0000558bfa91b120 R08: 0000000000000000 R09: 0000000000000000 [87.2854] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 [87.2864] R13: 00007f7b963f1580 R14: 00007f7b963f326c R15: 00007f7b963d8a23 [87.2877] </TASK> [87.2882] ---[ end trace 0000000000000000 ]--- [87.2891] BTRFS: error (device dm-0 state A) in __btrfs_unlink_inode:4345: errno=-2 No such entry [87.2904] BTRFS: error (device dm-0 state EAO) in do_abort_log_replay:191: errno=-2 No such entry [87.2915] BTRFS critical (device dm-0 state EAO): log tree (for root 5) leaf currently being processed (slot 7 key (258 12 257)): [87.2929] BTRFS info (device dm-0 state EAO): leaf 30736384 gen 10 total ptrs 7 free space 15712 owner 18446744073709551610 [87.2929] BTRFS info (device dm-0 state EAO): refs 3 lock_owner 0 current 638968 [87.2929] item 0 key (257 INODE_ITEM 0) itemoff 16123 itemsize 160 [87.2929] inode generation 9 transid 10 size 0 nbytes 0 [87.2929] block group 0 mode 40755 links 1 uid 0 gid 0 [87.2929] rdev 0 sequence 7 flags 0x0 [87.2929] atime 1765464494.678070921 [87.2929] ctime 1765464494.686606513 [87.2929] mtime 1765464494.686606513 [87.2929] otime 1765464494.678070921 [87.2929] item 1 key (257 INODE_REF 256) itemoff 16109 itemsize 14 [87.2929] index 4 name_len 4 [87.2929] item 2 key (257 DIR_LOG_INDEX 2) itemoff 16101 itemsize 8 [87.2929] dir log end 2 [87.2929] item 3 key (257 DIR_LOG_INDEX 3) itemoff 16093 itemsize 8 [87.2929] dir log end 18446744073709551615 [87.2930] item 4 key (257 DIR_INDEX 3) itemoff 16060 itemsize 33 [87.2930] location key (258 1 0) type 1 [87.2930] transid 10 data_len 0 name_len 3 [87.2930] item 5 key (258 INODE_ITEM 0) itemoff 15900 itemsize 160 [87.2930] inode generation 9 transid 10 size 0 nbytes 0 [87.2930] block group 0 mode 100644 links 1 uid 0 gid 0 [87.2930] rdev 0 sequence 2 flags 0x0 [87.2930] atime 1765464494.678456467 [87.2930] ctime 1765464494.686606513 [87.2930] mtime 1765464494.678456467 [87.2930] otime 1765464494.678456467 [87.2930] item 6 key (258 INODE_REF 257) itemoff 15887 itemsize 13 [87.2930] index 3 name_len 3 [87.2930] BTRFS critical (device dm-0 state EAO): log replay failed in unlink_inode_for_log_replay:1045 for root 5, stage 3, with error -2: failed to unlink inode 256 parent dir 259 name subvol root 5 [87.2963] BTRFS: error (device dm-0 state EAO) in btrfs_recover_log_trees:7743: errno=-2 No such entry [87.2981] BTRFS: error (device dm-0 state EAO) in btrfs_replay_log:2083: errno=-2 No such entry (Failed to recover log tr So fix this by changing copy_inode_items_to_log() to always detect if there are conflicting inodes for the ref/extref of the inode being logged even if the inode was created in a past transaction. A test case for fstests will follow soon. CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>