summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2025-11-22fs/resctrl: Modify struct rdt_parse_data to pass mode and CLOSIDBabu Moger
parse_cbm() requires resource group mode and CLOSID to validate the capacity bitmask (CBM). It is passed via struct rdtgroup in struct rdt_parse_data. The io_alloc feature also uses CBMs to indicate which portions of cache are allocated for I/O traffic. The CBMs are provided by user space and need to be validated the same as CBMs provided for general (CPU) cache allocation. parse_cbm() cannot be used as-is since io_alloc does not have rdtgroup context. Pass the resource group mode and CLOSID directly to parse_cbm() via struct rdt_parse_data, instead of through the rdtgroup struct, to facilitate calling parse_cbm() to verify the CBM of the io_alloc feature. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://patch.msgid.link/f8ec6ab5cf594d906a3fe75f56793d5fbd63f38f.1762995456.git.babu.moger@amd.com
2025-11-22fs/resctrl: Introduce interface to display io_alloc CBMsBabu Moger
Introduce the "io_alloc_cbm" resctrl file to display the capacity bitmasks (CBMs) that represent the portions of each cache instance allocated for I/O traffic on a cache resource that supports the "io_alloc" feature. io_alloc_cbm resides in the info directory of a cache resource, for example, /sys/fs/resctrl/info/L3/. Since the resource name is part of the path, it is not necessary to display the resource name as done in the schemata file. When CDP is enabled, io_alloc routes traffic using the highest CLOSID associated with the CDP_CODE resource and that CLOSID becomes unusable for the CDP_DATA resource. The highest CLOSID of CDP_CODE and CDP_DATA resources will be kept in sync to ensure consistent user interface. In preparation for this, access the CBMs for I/O traffic through highest CLOSID of either CDP_CODE or CDP_DATA resource. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://patch.msgid.link/55a3ff66a70e7ce8239f022e62b334e9d64af604.1762995456.git.babu.moger@amd.com
2025-11-21fs/resctrl: Add user interface to enable/disable io_alloc featureBabu Moger
AMD's SDCIAE forces all SDCI lines to be placed into the L3 cache portions identified by the highest-supported L3_MASK_n register, where n is the maximum supported CLOSID. To support this, when io_alloc resctrl feature is enabled, reserve the highest CLOSID exclusively for I/O allocation traffic making it no longer available for general CPU cache allocation. Introduce user interface to enable/disable io_alloc feature and encourage users to enable io_alloc only when running workloads that can benefit from this functionality. On enable, initialize the io_alloc CLOSID with all usable CBMs across all the domains. Since CLOSIDs are managed by resctrl fs, it is least invasive to make "io_alloc is supported by maximum supported CLOSID" part of the initial resctrl fs support for io_alloc. Take care to minimally (only in error messages) expose this use of CLOSID for io_alloc to user space so that this is not required from other architectures that may support io_alloc differently in the future. When resctrl is mounted with "-o cdp" to enable code/data prioritization, there are two L3 resources that can support I/O allocation: L3CODE and L3DATA. From resctrl fs perspective the two resources share a CLOSID and the architecture's available CLOSID are halved to support this. The architecture's underlying CLOSID used by SDCIAE when CDP is enabled is the CLOSID associated with the CDP_CODE resource, but from resctrl's perspective there is only one CLOSID for both CDP_CODE and CDP_DATA. CDP_DATA is thus not usable for general (CPU) cache allocation nor I/O allocation. Keep the CDP_CODE and CDP_DATA I/O alloc status in sync to avoid any confusion to user space. That is, enabling io_alloc on CDP_CODE does so on CDP_DATA and vice-versa, and keep the I/O allocation CBMs of CDP_CODE and CDP_DATA in sync. Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://patch.msgid.link/c7d3037795e653e22b02d8fc73ca80d9b075031c.1762995456.git.babu.moger@amd.com
2025-11-21fs/resctrl: Introduce interface to display "io_alloc" supportBabu Moger
Introduce the "io_alloc" resctrl file to the "info" area of a cache resource, for example /sys/fs/resctrl/info/L3/io_alloc. "io_alloc" indicates support for the "io_alloc" feature that allows direct insertion of data from I/O devices into the cache. Restrict exposing support for "io_alloc" to the L3 resource that is the only resource where this feature can be backed by AMD's L3 Smart Data Cache Injection Allocation Enforcement (SDCIAE). With that, the "io_alloc" file is only visible to user space if the L3 resource supports "io_alloc". Doing so makes the file visible for all cache resources though, for example also L2 cache (if it supports cache allocation). As a consequence, add capability for file to report expected "enabled" and "disabled", as well as "not supported". Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://patch.msgid.link/e8b116a8f424128b227734bb1d433c14af478d90.1762995456.git.babu.moger@amd.com
2025-11-21Merge tag 'v6.18-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds
Pull smb client fixes from Steve French: - Fix potential memory leak in mount - Add some missing read tracepoints - Fix locking issue with directory leases * tag 'v6.18-rc6-smb3-client-fixes' of git://git.samba.org/sfrench/cifs-2.6: cifs: Add the smb3_read_* tracepoints to SMB1 cifs: fix memory leak in smb3_fs_context_parse_param error path smb: client: introduce close_cached_dir_locked()
2025-11-20ocfs2: mark inode bad upon validation failure during readAhmet Eray Karadag
A VFS cache inconsistency, potentially triggered by sequences like buffered writes followed by open(O_DIRECT), can result in an invalid on-disk inode block (e.g., bad signature). OCFS2 detects this corruption when reading the inode block via ocfs2_validate_inode_block(), logs "Invalid dinode", and often switches the filesystem to read-only mode. The VFS open(O_DIRECT) operation appears to incorrectly clear the inode's I_DIRTY flag without ensuring the dirty metadata (reflecting the earlier buffered write, e.g., an updated i_size) is flushed to disk. This leaves the in-memory VFS inode object "in limbo" with an updated size (e.g., 38639 from the write) but marked clean, while its on-disk counterpart remains stale (e.g., size 0) or invalid. Currently, the function reading the inode block (ocfs2_read_inode_block_full()) fails to call make_bad_inode() upon detecting the validation error. Because the in-memory inode is not marked bad, subsequent operations (like ftruncate) proceed erroneously. They eventually reach code (e.g., ocfs2_truncate_file()) that compares the inconsistent in-memory size (38639) against the invalid/stale on-disk size (0), leading to kernel crashes via BUG_ON. Fix this by calling make_bad_inode(inode) within the error handling path of ocfs2_read_inode_block_full() immediately after a block read or validation error occurs. This ensures VFS is properly notified about the corrupt inode at the point of detection. Marking the inode bad allows VFS to correctly fail subsequent operations targeting this inode early, preventing kernel panics caused by operating on known inconsistent inode states. Link: https://lkml.kernel.org/r/20251118001833.423470-2-eraykrdg1@gmail.com Link: https://lore.kernel.org/all/20251029225748.11361-2-eraykrdg1@gmail.com/T/ Signed-off-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Signed-off-by: Ahmet Eray Karadag <eraykrdg1@gmail.com> Reported-by: syzbot+b93b65ee321c97861072@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=b93b65ee321c97861072 Reviewed-by: Heming Zhao <heming.zhao@suse.com> Co-developed-by: Albin Babu Varghese <albinbabuvarghese20@gmail.com> Acked-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: David Hunter <david.hunter.linux@gmail.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20ocfs2: replace deprecated strcpy with strscpyThorsten Blum
strcpy() has been deprecated [1] because it performs no bounds checking on the destination buffer, which can lead to buffer overflows. Replace it with the safer strscpy(), and copy directly into '->rf_signature' instead of using the start of the struct as the destination buffer. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy [1] Link: https://lkml.kernel.org/r/20251118185345.132411-3-thorsten.blum@linux.dev Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20ocfs2: replace deprecated strcpy in ocfs2_create_xattr_blockThorsten Blum
strcpy() has been deprecated [1] because it performs no bounds checking on the destination buffer, which can lead to buffer overflows. Replace it with the safer strscpy(), and copy directly into '->xb_signature' instead of using the start of the struct as the destination buffer. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy [1] Link: https://lkml.kernel.org/r/20251118185345.132411-2-thorsten.blum@linux.dev Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mark@fasheh.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Changwei Ge <gechangwei@live.cn> Cc: Jun Piao <piaojun@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20ceph: replace local base64 helpers with lib/base64Guan-Chun Wu
Remove the ceph_base64_encode() and ceph_base64_decode() functions and replace their usage with the generic base64_encode() and base64_decode() helpers from lib/base64. This eliminates the custom implementation in Ceph, reduces code duplication, and relies on the shared Base64 code in lib. The helpers preserve RFC 3501-compliant Base64 encoding without padding, so there are no functional changes. This change also improves performance: encoding is about 2.7x faster and decoding achieves 43-52x speedups compared to the previous local implementation. Link: https://lkml.kernel.org/r/20251114060240.89965-1-409411716@gms.tku.edu.tw Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw> Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Cc: Keith Busch <kbusch@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Christoph Hellwig <hch@lst.de> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: Xiubo Li <xiubli@redhat.com> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: "Theodore Y. Ts'o" <tytso@mit.edu> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: David Laight <david.laight.linux@gmail.com> Cc: Yu-Sheng Huang <home7438072@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20fscrypt: replace local base64url helpers with lib/base64Guan-Chun Wu
Replace the base64url encoding and decoding functions in fscrypt with the generic base64_encode() and base64_decode() helpers from lib/base64. This removes the custom implementation in fscrypt, reduces code duplication, and relies on the shared Base64 implementation in lib. The helpers preserve RFC 4648-compliant URL-safe Base64 encoding without padding, so there are no functional changes. This change also improves performance: encoding is about 2.7x faster and decoding achieves 43-52x speedups compared to the previous implementation. Link: https://lkml.kernel.org/r/20251114060221.89734-1-409411716@gms.tku.edu.tw Reviewed-by: Kuan-Wei Chiu <visitorckw@gmail.com> Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw> Cc: Christoph Hellwig <hch@lst.de> Cc: David Laight <david.laight.linux@gmail.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Ilya Dryomov <idryomov@gmail.com> Cc: Jaegeuk Kim <jaegeuk@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Cc: Keith Busch <kbusch@kernel.org> Cc: Sagi Grimberg <sagi@grimberg.me> Cc: "Theodore Y. Ts'o" <tytso@mit.edu> Cc: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Cc: Xiubo Li <xiubli@redhat.com> Cc: Yu-Sheng Huang <home7438072@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20fs/proc/page: remove unused KPMBITSzhang jiao
KPMBITS is never referenced in the code. Just remove it. Link: https://lkml.kernel.org/r/20251106010735.1603-1-zhangjiao2@cmss.chinamobile.com Signed-off-by: zhang jiao <zhangjiao2@cmss.chinamobile.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Liu Ye <liuye@kylinos.cn> Cc: Luiz Capitulino <luizcap@redhat.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Miaohe Lin <linmiaohe@huawei.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20ocfs2: validate cl_bpc in allocator inodes to prevent divide-by-zeroDeepanshu Kartikey
The chain allocator field cl_bpc (blocks per cluster) is read from disk and used in division operations without validation. A corrupted filesystem image with cl_bpc=0 causes a divide-by-zero crash in the kernel: divide error: 0000 [#1] PREEMPT SMP KASAN RIP: 0010:ocfs2_bg_discontig_add_extent fs/ocfs2/suballoc.c:335 [inline] RIP: 0010:ocfs2_block_group_fill+0x5bd/0xa70 fs/ocfs2/suballoc.c:386 Call Trace: ocfs2_block_group_alloc+0x7e9/0x1330 fs/ocfs2/suballoc.c:703 ocfs2_reserve_suballoc_bits+0x20a6/0x4640 fs/ocfs2/suballoc.c:834 ocfs2_reserve_new_inode+0x4f4/0xcc0 fs/ocfs2/suballoc.c:1074 ocfs2_mknod+0x83c/0x2050 fs/ocfs2/namei.c:306 This patch adds validation in ocfs2_validate_inode_block() to ensure cl_bpc matches the expected value calculated from the superblock's cluster size and block size for chain allocator inodes (identified by OCFS2_CHAIN_FL). Moving the validation to inode validation time (rather than allocation time) has several benefits: - Validates once when the inode is read, rather than on every allocation - Protects all code paths that use cl_bpc (allocation, resize, etc.) - Follows the existing pattern of inode validation in OCFS2 - Centralizes validation logic The validation catches both: - Zero values that cause divide-by-zero crashes - Non-zero but incorrect values indicating filesystem corruption or mismatched filesystem geometry With this fix, mounting a corrupted filesystem produces: OCFS2: ERROR (device loop0): ocfs2_validate_inode_block: Inode 74 has corrupted cl_bpc: ondisk=0 expected=16 instead of a kernel crash. [dmantipov@yandex.ru: combine into the series and tweak the message to fit the commonly used style] Link: https://lkml.kernel.org/r/20251030153003.1934585-2-dmantipov@yandex.ru Link: https://lore.kernel.org/ocfs2-devel/20251026132625.12348-1-kartikey406@gmail.com/T/#u [v1] Link: https://lore.kernel.org/all/20251027124131.10002-1-kartikey406@gmail.com/T/ [v2] Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: syzbot+fd8af97c7227fe605d95@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=fd8af97c7227fe605d95 Tested-by: syzbot+fd8af97c7227fe605d95@syzkaller.appspotmail.com Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Jun Piao <piaojun@huawei.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20ocfs2: add extra consistency checks for chain allocator dinodesDmitry Antipov
When validating chain allocator dinode in 'ocfs2_validate_inode_block()', add an extra checks whether a) the maximum amount of chain records in 'struct ocfs2_chain_list' matches the value calculated based on the filesystem block size, and b) the next free slot index is within the valid range. Link: https://lkml.kernel.org/r/20251030153003.1934585-1-dmantipov@yandex.ru Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> Reported-by: syzbot+77026564530dbc29b854@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=77026564530dbc29b854 Reported-by: syzbot+5054473a31f78f735416@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=5054473a31f78f735416 Suggested-by: Joseph Qi <joseph.qi@linux.alibaba.com> Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Junxiao Bi <junxiao.bi@oracle.com> Cc: Jun Piao <piaojun@huawei.com> Cc: Deepanshu Kartikey <kartikey406@gmail.com> Cc: Heming Zhao <heming.zhao@suse.com> Cc: Joel Becker <jlbec@evilplan.org> Cc: Mark Fasheh <mark@fasheh.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20mm: introduce VM_MAYBE_GUARD and make visible in /proc/$pid/smapsLorenzo Stoakes
Patch series "introduce VM_MAYBE_GUARD and make it sticky", v4. Currently, guard regions are not visible to users except through /proc/$pid/pagemap, with no explicit visibility at the VMA level. This makes the feature less useful, as it isn't entirely apparent which VMAs may have these entries present, especially when performing actions which walk through memory regions such as those performed by CRIU. This series addresses this issue by introducing the VM_MAYBE_GUARD flag which fulfils this role, updating the smaps logic to display an entry for these. The semantics of this flag are that a guard region MAY be present if set (we cannot be sure, as we can't efficiently track whether an MADV_GUARD_REMOVE finally removes all the guard regions in a VMA) - but if not set the VMA definitely does NOT have any guard regions present. It's problematic to establish this flag without further action, because that means that VMAs with guard regions in them become non-mergeable with adjacent VMAs for no especially good reason. To work around this, this series also introduces the concept of 'sticky' VMA flags - that is flags which: a. if set in one VMA and not in another still permit those VMAs to be merged (if otherwise compatible). b. When they are merged, the resultant VMA must have the flag set. The VMA logic is updated to propagate these flags correctly. Additionally, VM_MAYBE_GUARD being an explicit VMA flag allows us to solve an issue with file-backed guard regions - previously these established an anon_vma object for file-backed mappings solely to have vma_needs_copy() correctly propagate guard region mappings to child processes. We introduce a new flag alias VM_COPY_ON_FORK (which currently only specifies VM_MAYBE_GUARD) and update vma_needs_copy() to check explicitly for this flag and to copy page tables if it is present, which resolves this issue. Additionally, we add the ability for allow-listed VMA flags to be atomically writable with only mmap/VMA read locks held. The only flag we allow so far is VM_MAYBE_GUARD, which we carefully ensure does not cause any races by being allowed to do so. This allows us to maintain guard region installation as a read-locked operation and not endure the overhead of obtaining a write lock here. Finally we introduce extensive VMA userland tests to assert that the sticky VMA logic behaves correctly as well as guard region self tests to assert that smaps visibility is correctly implemented. This patch (of 9): Currently, if a user needs to determine if guard regions are present in a range, they have to scan all VMAs (or have knowledge of which ones might have guard regions). Since commit 8e2f2aeb8b48 ("fs/proc/task_mmu: add guard region bit to pagemap") and the related commit a516403787e0 ("fs/proc: extend the PAGEMAP_SCAN ioctl to report guard regions"), users can use either /proc/$pid/pagemap or the PAGEMAP_SCAN functionality to perform this operation at a virtual address level. This is not ideal, and it gives no visibility at a /proc/$pid/smaps level that guard regions exist in ranges. This patch remedies the situation by establishing a new VMA flag, VM_MAYBE_GUARD, to indicate that a VMA may contain guard regions (it is uncertain because we cannot reasonably determine whether a MADV_GUARD_REMOVE call has removed all of the guard regions in a VMA, and additionally VMAs may change across merge/split). We utilise 0x800 for this flag which makes it available to 32-bit architectures also, a flag that was previously used by VM_DENYWRITE, which was removed in commit 8d0920bde5eb ("mm: remove VM_DENYWRITE") and hasn't bee reused yet. We also update the smaps logic and documentation to identify these VMAs. Another major use of this functionality is that we can use it to identify that we ought to copy page tables on fork. We do not actually implement usage of this flag in mm/madvise.c yet as we need to allow some VMA flags to be applied atomically under mmap/VMA read lock in order to avoid the need to acquire a write lock for this purpose. Link: https://lkml.kernel.org/r/cover.1763460113.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/cf8ef821eba29b6c5b5e138fffe95d6dcabdedb9.1763460113.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Reviewed-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lance Yang <lance.yang@linux.dev> Cc: Andrei Vagin <avagin@gmail.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Dev Jain <dev.jain@arm.com> Cc: Jann Horn <jannh@google.com> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: "Masami Hiramatsu (Google)" <mhiramat@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-20lockd: don't allow locking on reexported NFSv2/3Jeff Layton
Since commit 9254c8ae9b81 ("nfsd: disallow file locking and delegations for NFSv4 reexport"), file locking when reexporting an NFS mount via NFSv4 is expressly prohibited by nfsd. Do the same in lockd: Add a new nlmsvc_file_cannot_lock() helper that will test whether file locking is allowed for a given file, and return nlm_lck_denied_nolocks if it isn't. Signed-off-by: Jeff Layton <jlayton@kernel.org> Tested-by: Olga Kornievskaia <okorniev@redhat.com> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
2025-11-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski
Cross-merge networking fixes after downstream PR (net-6.18-rc7). No conflicts, adjacent changes: tools/testing/selftests/net/af_unix/Makefile e1bb28bf13f4 ("selftest: af_unix: Add test for SO_PEEK_OFF.") 45a1cd8346ca ("selftests: af_unix: Add tests for ECONNRESET and EOF semantics") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20xfs: use zi more in xfs_zone_gc_mountChristoph Hellwig
Use the local variable instead of the extra pointer dereference when starting the GC thread. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-20xfs: fix out of bounds memory read error in symlink repairDarrick J. Wong
xfs/286 produced this report on my test fleet: ================================================================== BUG: KFENCE: out-of-bounds read in memcpy_orig+0x54/0x110 Out-of-bounds read at 0xffff88843fe9e038 (184B right of kfence-#184): memcpy_orig+0x54/0x110 xrep_symlink_salvage_inline+0xb3/0xf0 [xfs] xrep_symlink_salvage+0x100/0x110 [xfs] xrep_symlink+0x2e/0x80 [xfs] xrep_attempt+0x61/0x1f0 [xfs] xfs_scrub_metadata+0x34f/0x5c0 [xfs] xfs_ioc_scrubv_metadata+0x387/0x560 [xfs] xfs_file_ioctl+0xe23/0x10e0 [xfs] __x64_sys_ioctl+0x76/0xc0 do_syscall_64+0x4e/0x1e0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 kfence-#184: 0xffff88843fe9df80-0xffff88843fe9dfea, size=107, cache=kmalloc-128 allocated by task 3470 on cpu 1 at 263329.131592s (192823.508886s ago): xfs_init_local_fork+0x79/0xe0 [xfs] xfs_iformat_local+0xa4/0x170 [xfs] xfs_iformat_data_fork+0x148/0x180 [xfs] xfs_inode_from_disk+0x2cd/0x480 [xfs] xfs_iget+0x450/0xd60 [xfs] xfs_bulkstat_one_int+0x6b/0x510 [xfs] xfs_bulkstat_iwalk+0x1e/0x30 [xfs] xfs_iwalk_ag_recs+0xdf/0x150 [xfs] xfs_iwalk_run_callbacks+0xb9/0x190 [xfs] xfs_iwalk_ag+0x1dc/0x2f0 [xfs] xfs_iwalk_args.constprop.0+0x6a/0x120 [xfs] xfs_iwalk+0xa4/0xd0 [xfs] xfs_bulkstat+0xfa/0x170 [xfs] xfs_ioc_fsbulkstat.isra.0+0x13a/0x230 [xfs] xfs_file_ioctl+0xbf2/0x10e0 [xfs] __x64_sys_ioctl+0x76/0xc0 do_syscall_64+0x4e/0x1e0 entry_SYSCALL_64_after_hwframe+0x4b/0x53 CPU: 1 UID: 0 PID: 1300113 Comm: xfs_scrub Not tainted 6.18.0-rc4-djwx #rc4 PREEMPT(lazy) 3d744dd94e92690f00a04398d2bd8631dcef1954 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-4.module+el8.8.0+21164+ed375313 04/01/2014 ================================================================== On further analysis, I realized that the second parameter to min() is not correct. xfs_ifork::if_bytes is the size of the xfs_ifork::if_data buffer. if_bytes can be smaller than the data fork size because: (a) the forkoff code tries to keep the data area as large as possible (b) for symbolic links, if_bytes is the ondisk file size + 1 (c) forkoff is always a multiple of 8. Case in point: for a single-byte symlink target, forkoff will be 8 but the buffer will only be 2 bytes long. In other words, the logic here is wrong and we walk off the end of the incore buffer. Fix that. Cc: stable@vger.kernel.org # v6.10 Fixes: 2651923d8d8db0 ("xfs: online repair of symbolic links") Signed-off-by: Darrick J. Wong <djwong@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-11-20cifs: Add the smb3_read_* tracepoints to SMB1David Howells
Add the smb3_read_* tracepoints to SMB1's cifs_async_readv() and cifs_readv_callback(). Signed-off-by: David Howells <dhowells@redhat.com> cc: Steve French <sfrench@samba.org> cc: Paulo Alcantara <pc@manguebit.org> cc: linux-cifs@vger.kernel.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Steve French <stfrench@microsoft.com>
2025-11-20cifs: fix memory leak in smb3_fs_context_parse_param error pathShaurya Rane
Add proper cleanup of ctx->source and fc->source to the cifs_parse_mount_err error handler. This ensures that memory allocated for the source strings is correctly freed on all error paths, matching the cleanup already performed in the success path by smb3_cleanup_fs_context_contents(). Pointers are also set to NULL after freeing to prevent potential double-free issues. This change fixes a memory leak originally detected by syzbot. The leak occurred when processing Opt_source mount options if an error happened after ctx->source and fc->source were successfully allocated but before the function completed. The specific leak sequence was: 1. ctx->source = smb3_fs_context_fullpath(ctx, '/') allocates memory 2. fc->source = kstrdup(ctx->source, GFP_KERNEL) allocates more memory 3. A subsequent error jumps to cifs_parse_mount_err 4. The old error handler freed passwords but not the source strings, causing the memory to leak. This issue was not addressed by commit e8c73eb7db0a ("cifs: client: fix memory leak in smb3_fs_context_parse_param"), which only fixed leaks from repeated fsconfig() calls but not this error path. Patch updated with minor change suggested by kernel test robot Reported-by: syzbot+87be6809ed9bf6d718e3@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=87be6809ed9bf6d718e3 Fixes: 24e0a1eff9e2 ("cifs: switch to new mount api") Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Shaurya Rane <ssrane_b23@ee.vjti.ac.in> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-11-20smb: client: introduce close_cached_dir_locked()Henrique Carvalho
Replace close_cached_dir() calls under cfid_list_lock with a new close_cached_dir_locked() variant that uses kref_put() instead of kref_put_lock() to avoid recursive locking when dropping references. While the existing code works if the refcount >= 2 invariant holds, this area has proven error-prone. Make deadlocks impossible and WARN on invariant violations. Cc: stable@vger.kernel.org Reviewed-by: David Howells <dhowells@redhat.com> Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-11-19hfs: introduce KUnit tests for HFS string operationsViacheslav Dubeyko
This patch implements the initial Kunit based set of unit tests for HFS string operations. It checks functionality of hfs_strcmp(), hfs_hash_dentry(), and hfs_compare_dentry() methods. ./tools/testing/kunit/kunit.py run --kunitconfig ./fs/hfs/.kunitconfig [16:04:50] Configuring KUnit Kernel ... Regenerating .config ... Populating config with: $ make ARCH=um O=.kunit olddefconfig [16:04:51] Building KUnit Kernel ... Populating config with: $ make ARCH=um O=.kunit olddefconfig Building with: $ make all compile_commands.json scripts_gdb ARCH=um O=.kunit --jobs=22 [16:04:59] Starting KUnit Kernel (1/1)... [16:04:59] ============================================================ Running tests with: $ .kunit/linux kunit.enable=1 mem=1G console=tty kunit_shutdown=halt [16:04:59] ================= hfs_string (3 subtests) ================== [16:04:59] [PASSED] hfs_strcmp_test [16:04:59] [PASSED] hfs_hash_dentry_test [16:04:59] [PASSED] hfs_compare_dentry_test [16:04:59] =================== [PASSED] hfs_string ==================== [16:04:59] ============================================================ [16:04:59] Testing complete. Ran 3 tests: passed: 3 [16:04:59] Elapsed time: 9.087s total, 1.310s configuring, 7.611s building, 0.125s running v2 Fix linker error. v3 Chen Linxuan suggested to use EXPORT_SYMBOL_IF_KUNIT. Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> cc: Yangtao Li <frank.li@vivo.com> cc: linux-fsdevel@vger.kernel.org cc: Chen Linxuan <me@black-desk.cn> Reviewed-by: Chen Linxuan <me@black-desk.cn> Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com> Link: https://lore.kernel.org/r/20250912225022.1083313-1-slava@dubeyko.com Signed-off-by: Viacheslav Dubeyko <slava@dubeyko.com>
2025-11-19ovl: remove struct ovl_cu_creds and associated functionsChristian Brauner
Now that we have this all ported to a cred guard remove the struct and the associated helpers. Link: https://patch.msgid.link/20251114-work-ovl-cred-guard-copyup-v1-5-ea3fb15cf427@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_copy_up_tmpfile() to cred guardChristian Brauner
Remove the complicated struct ovl_cu_creds dance and use our new copy up cred guard. Link: https://patch.msgid.link/20251114-work-ovl-cred-guard-copyup-v1-4-ea3fb15cf427@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: mark *_cu_creds() as unused temporarilyChristian Brauner
They will become unused in the next patch and we'll drop them after the conversion is finished together with the struct. This keeps the changes small and reviewable. Link: https://patch.msgid.link/20251114-work-ovl-cred-guard-copyup-v1-3-ea3fb15cf427@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_copy_up_workdir() to cred guardChristian Brauner
Remove the complicated struct ovl_cu_creds dance and use our new copy up cred guard. Link: https://patch.msgid.link/20251114-work-ovl-cred-guard-copyup-v1-2-ea3fb15cf427@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: add copy up credential guardChristian Brauner
Add a credential guard for copy up. This will allows us to waste struct struct ovl_cu_creds and simplify the code. Link: https://patch.msgid.link/20251114-work-ovl-cred-guard-copyup-v1-1-ea3fb15cf427@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: drop ovl_setup_cred_for_create()Christian Brauner
It is now unused and can be removed. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-prepare-v2-6-bd1c97a36d7b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_create_or_link() to new ovl_override_creator_creds cleanup guardChristian Brauner
This clearly indicates the double-credential override and makes the code a lot easier to grasp with one glance. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-prepare-v2-5-bd1c97a36d7b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: mark ovl_setup_cred_for_create() as unused temporarilyChristian Brauner
The function will become unused in the next patch. We'll remove it in later patches to keep the diff legible. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-prepare-v2-4-bd1c97a36d7b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: reflow ovl_create_or_link()Christian Brauner
Reflow the creation routine in preparation of porting it to a guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-prepare-v2-3-bd1c97a36d7b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_create_tmpfile() to new ovl_override_creator_creds cleanup guardChristian Brauner
This clearly indicates the double-credential override and makes the code a lot easier to grasp with one glance. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-prepare-v2-2-bd1c97a36d7b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: add ovl_override_creator_creds cred guardChristian Brauner
The current code to override credentials for creation operations is pretty difficult to understand. We effectively override the credentials twice: (1) override with the mounter's credentials (2) copy the mounts credentials and override the fs{g,u}id with the inode {u,g}id And then we elide the revert because it would be an idempotent revert. That elision doesn't buy us anything anymore though because I've made it all work without any reference counting anyway. All it does is mix the two credential overrides together. We can use a cleanup guard to clarify the creation codepaths and make them easier to understand. This just introduces the cleanup guard keeping the patch reviewable. We'll convert the caller in follow-up patches and then drop the duplicated code. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-prepare-v2-1-bd1c97a36d7b@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: remove ovl_revert_creds()Christian Brauner
The wrapper isn't needed anymore. Overlayfs completely relies on its cleanup guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-42-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_fill_super() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-41-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: refactor ovl_fill_super()Christian Brauner
Split the core into a separate helper in preparation of converting the caller to the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-40-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_lower_positive() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-39-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_lookup() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-38-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: refactor ovl_lookup()Christian Brauner
Split the core into a separate helper in preparation of converting the caller to the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-37-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_copyfile() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-36-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_rename() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-35-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: refactor ovl_rename()Christian Brauner
Extract the code that runs under overridden credentials into a separate ovl_rename_upper() helper function and the code that runs before/after to ovl_rename_start/end(). Error handling is simplified. The helpers returns errors directly instead of using goto labels. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-34-b31603935724@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: introduce struct ovl_renamedataChristian Brauner
Add a struct ovl_renamedata to group rename-related state that was previously stored in local variables. Embedd struct renamedata directly aligning with the vfs. Signed-off-by: Amir Goldstein <amir73il@gmail.com> Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-33-b31603935724@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_listxattr() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-32-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_xattr_get() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-31-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_xattr_set() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-30-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_nlink_end() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-29-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_nlink_start() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-28-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_check_empty_dir() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-27-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-19ovl: port ovl_dir_llseek() to cred guardChristian Brauner
Use the scoped ovl cred guard. Link: https://patch.msgid.link/20251117-work-ovl-cred-guard-v4-26-b31603935724@kernel.org Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>