summaryrefslogtreecommitdiff
path: root/fs
AgeCommit message (Collapse)Author
2025-03-13jbd2: Correct stale comment of release_buffer_pageKemeng Shi
Update stale lock info in comment of release_buffer_page. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250123155014.2097920-7-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13jbd2: correct stale function name in commentKemeng Shi
Rename stale journal_clear_revoked_flag to jbd2_clear_buffer_revoked_flags. Rename stale journal_switch_revoke to jbd2_journal_switch_revoke_table. Rename stale __journal_file_buffer to __jbd2_journal_file_buffer. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250123155014.2097920-6-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13jbd2: remove stale comment of update_t_max_waitKemeng Shi
Commit 2d44292058828 ("jbd2: remove CONFIG_JBD2_DEBUG to update t_max_wait") removed jbd2_journal_enable_debug, just remove stale comment about jbd2_journal_enable_debug. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250123155014.2097920-5-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13jbd2: remove unused return value of do_readaheadKemeng Shi
Remove unused return value of do_readahead. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250123155014.2097920-4-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13jbd2: remove unused return value of jbd2_journal_cancel_revokeKemeng Shi
Remove unused return value of jbd2_journal_cancel_revoke. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250123155014.2097920-3-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: show 'shutdown' hint when ext4 is forced to shutdownBaokun Li
Now, if dmesg is cleared, we have no way of knowing if the file system has been shutdown. Moreover, ext4 allows directory reads even after the file system has been shutdown, so when reading a file returns -EIO, we cannot determine whether this is a hardware issue or if the file system has been shutdown. Therefore, when ext4 file system is shutdown, we're adding a 'shutdown' hint to commands like mount so users can easily check the file system's status. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250122114130.229709-8-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: show 'emergency_ro' when EXT4_FLAGS_EMERGENCY_RO is setBaokun Li
After commit d3476f3dad4a ("ext4: don't set SB_RDONLY after filesystem errors") in v6.12-rc1, the 'errors=remount-ro' mode no longer sets SB_RDONLY on errors, which results in us seeing the filesystem is still in rw state after errors. Therefore, after setting EXT4_FLAGS_EMERGENCY_RO, display the emergency_ro option so that users can query whether the current file system has become emergency read-only due to errors through commands such as 'mount' or 'cat /proc/fs/ext4/sdx/options'. Fixes: d3476f3dad4a ("ext4: don't set SB_RDONLY after filesystem errors") Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250122114130.229709-7-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: correct behavior under errors=remount-ro modeBaokun Li
And after commit 95257987a638 ("ext4: drop EXT4_MF_FS_ABORTED flag") in v6.6-rc1, the EXT4_FLAGS_SHUTDOWN bit is set in ext4_handle_error() under errors=remount-ro mode. This causes the read to fail even when the error is triggered in errors=remount-ro mode. To correct the behavior under errors=remount-ro, EXT4_FLAGS_SHUTDOWN is replaced by the newly introduced EXT4_FLAGS_EMERGENCY_RO. This new flag only prevents writes, matching the previous behavior with SB_RDONLY. Fixes: 95257987a638 ("ext4: drop EXT4_MF_FS_ABORTED flag") Closes: https://lore.kernel.org/all/22d652f6-cb3c-43f5-b2fe-0a4bb6516a04@huawei.com/ Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250122114130.229709-6-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: add more ext4_emergency_state() checks around sb_rdonly()Baokun Li
Some functions check sb_rdonly() to make sure the file system isn't modified after it's read-only. Since we also don't want the file system modified if it's in an emergency state (shutdown or emergency_ro), we're adding additional ext4_emergency_state() checks where sb_rdonly() is checked. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250122114130.229709-5-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: add ext4_emergency_state() helper functionBaokun Li
Since both SHUTDOWN and EMERGENCY_RO are emergency states of the ext4 file system, and they are checked in similar locations, we have added a helper function, ext4_emergency_state(), to determine whether the current file system is in one of these two emergency states. Then, replace calls to ext4_forced_shutdown() with ext4_emergency_state() in those functions that could potentially trigger write operations. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250122114130.229709-4-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: add EXT4_FLAGS_EMERGENCY_RO bitBaokun Li
EXT4_FLAGS_EMERGENCY_RO Indicates that the current file system has become read-only due to some error. Compared to SB_RDONLY, setting it does not require a lock because we won't clear it, which avoids over-coupling with vfs freeze. Also, add a helper function ext4_emergency_ro() to check if the bit is set. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250122114130.229709-3-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: convert EXT4_FLAGS_* defines to enumBaokun Li
Do away with the defines and use an enum as it's cleaner. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250122114130.229709-2-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: pack holes in ext4_inode_infoBaokun Li
When CONFIG_DEBUG_SPINLOCK is not enabled (general case), there are four 4 bytes holes and one 2 bytes hole in struct ext4_inode_info. Move the members to pack the four 4 bytes holes. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250122110533.4116662-10-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: remove unused member 'i_unwritten' from 'ext4_inode_info'Baokun Li
After commit 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure"), no one cares about the value of i_unwritten, so there is no need to maintain this variable, remove it, and clean up the associated logic. Suggested-by: Zhang Yi <yi.zhang@huawei.com> Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250122110533.4116662-9-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13jbd2: drop JBD2_ABORT_ON_SYNCDATA_ERRBaokun Li
Since ext4's data_err=abort mode doesn't depend on JBD2_ABORT_ON_SYNCDATA_ERR anymore, and nobody else uses it, we can drop it and only warn in jbd2 as it used to be long ago. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250122110533.4116662-7-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: abort journal on data writeback failure if in data_err=abort modeBaokun Li
The data_err=abort was initially introduced to address users' worries about data corruption spreading unnoticed. With direct writes, we can rely on return values to confirm successful writes to disk. But with buffered writes, a successful return only means the data has been written to memory. Users have no way of knowing if the data has actually written it to disk unless they use fsync (which impacts performance and can sometimes miss errors). The current data_err=abort implementation relies on the ordered data list, but past changes have inadvertently altered its behavior. For example, if an extent is unwritten, we do not add the inode to the ordered data list. Therefore, jbd2 will not wait for the data write-back of that inode to complete and check for errors in the inode mapping. Moreover, the checks performed by jbd2 can also miss errors. Now, all buffered writes eventually call ext4_end_bio(), where I/O errors are checked. Therefore, we can check for the data_err=abort mode at this point and abort the journal in a kworker (due to the interrupt context). Therefore, when data_err=abort is enabled, the journal is aborted in ext4_end_io_end() when an I/O error is detected in ext4_end_bio() to make users who are concerned about the contents of the file happy. Suggested-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/c7ab26f3-85ad-4b31-b132-0afb0e07bf79@huawei.com Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250122110533.4116662-6-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: extract ext4_has_journal_option() from __ext4_fill_super()Baokun Li
Extract the ext4_has_journal_option() helper function to reduce code duplication. No functional changes. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250122110533.4116662-5-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: reject the 'data_err=abort' option in nojournal modeBaokun Li
data_err=abort aborts the journal on I/O errors. However, this option is meaningless if journal is disabled, so it is rejected in nojournal mode to reduce unnecessary checks. Also, this option is ignored upon remount. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250122110533.4116662-4-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: do not convert the unwritten extents if data writeback failsBaokun Li
When dioread_nolock is turned on (the default), it will convert unwritten extents to written at ext4_end_io_end(), even if the data writeback fails. It leads to the possibility that stale data may be exposed when the physical block corresponding to the file data is read-only (i.e., writes return -EIO, but reads are normal). Therefore a new ext4_io_end->flags EXT4_IO_END_FAILED is added, which indicates that some bio write-back failed in the current ext4_io_end. When this flag is set, the unwritten to written conversion is no longer performed. Users can read the data normally until the caches are dropped, after that, the failed extents can only be read to all 0. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250122110533.4116662-3-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: replace opencoded ext4_end_io_end() in ext4_put_io_end()Baokun Li
This reduces duplicate code and ensures that a “potential data loss” warning is available if the unwritten conversion fails. Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20250122110533.4116662-2-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: fix potential null dereference in ext4 kunit testCharles Han
kunit_kzalloc() may return a NULL pointer, dereferencing it without NULL check may lead to NULL dereference. Add a NULL check for grp. Fixes: ac96b56a2fbd ("ext4: Add unit test for mb_mark_used") Fixes: b7098e1fa7bc ("ext4: Add unit test for mb_free_blocks") Signed-off-by: Charles Han <hanchunchao@inspur.com> Reviewed-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://patch.msgid.link/20250110092421.35619-1-hanchunchao@inspur.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: Refactor out ext4_try_to_write_inline_data()Julian Sun
Refactor ext4_try_to_write_inline_data() to simplify its implementation by directly invoking ext4_generic_write_inline_data(). Signed-off-by: Julian Sun <sunjunchao2870@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250107045730.1837808-1-sunjunchao2870@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: Replace ext4_da_write_inline_data_begin() with ↵Julian Sun
ext4_generic_write_inline_data(). Replace the call to ext4_da_write_inline_data_begin() with ext4_generic_write_inline_data(), and delete the ext4_da_write_inline_data_begin(). Signed-off-by: Julian Sun <sunjunchao2870@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250107045710.1837756-1-sunjunchao2870@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: Introduce a new helper function ext4_generic_write_inline_data()Julian Sun
A new function, ext4_generic_write_inline_data(), is introduced to provide a generic implementation of the common logic found in ext4_da_write_inline_data_begin() and ext4_try_to_write_inline_data(). This function will be utilized in the subsequent two patches. Signed-off-by: Julian Sun <sunjunchao2870@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250107045549.1837589-1-sunjunchao2870@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: Don't set EXT4_STATE_MAY_INLINE_DATA for ea inodesJulian Sun
Setting the EXT4_STATE_MAY_INLINE_DATA flag for ea inodes is meaningless because ea inodes do not use functions like ext4_write_begin(). Signed-off-by: Julian Sun <sunjunchao2870@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250107044702.1836852-3-sunjunchao2870@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13ext4: Remove a redundant return statementJulian Sun
Remove a redundant return statements in the ext4_es_remove_extent() function. Signed-off-by: Julian Sun <sunjunchao2870@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250107044702.1836852-2-sunjunchao2870@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2025-03-13smb: client: Fix match_session bug preventing session reuseHenrique Carvalho
Fix a bug in match_session() that can causes the session to not be reused in some cases. Reproduction steps: mount.cifs //server/share /mnt/a -o credentials=creds mount.cifs //server/share /mnt/b -o credentials=creds,sec=ntlmssp cat /proc/fs/cifs/DebugData | grep SessionId | wc -l mount.cifs //server/share /mnt/b -o credentials=creds,sec=ntlmssp mount.cifs //server/share /mnt/a -o credentials=creds cat /proc/fs/cifs/DebugData | grep SessionId | wc -l Cc: stable@vger.kernel.org Reviewed-by: Enzo Matsumiya <ematsumiya@suse.de> Signed-off-by: Henrique Carvalho <henrique.carvalho@suse.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-03-13cifs: Fix integer overflow while processing closetimeo mount optionMurad Masimov
User-provided mount parameter closetimeo of type u32 is intended to have an upper limit, but before it is validated, the value is converted from seconds to jiffies which can lead to an integer overflow. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 5efdd9122eff ("smb3: allow deferred close timeout to be configurable") Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-03-13cifs: Fix integer overflow while processing actimeo mount optionMurad Masimov
User-provided mount parameter actimeo of type u32 is intended to have an upper limit, but before it is validated, the value is converted from seconds to jiffies which can lead to an integer overflow. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 6d20e8406f09 ("cifs: add attribute cache timeout (actimeo) tunable") Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-03-13cifs: Fix integer overflow while processing acdirmax mount optionMurad Masimov
User-provided mount parameter acdirmax of type u32 is intended to have an upper limit, but before it is validated, the value is converted from seconds to jiffies which can lead to an integer overflow. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 4c9f948142a5 ("cifs: Add new mount parameter "acdirmax" to allow caching directory metadata") Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-03-13cifs: Fix integer overflow while processing acregmax mount optionMurad Masimov
User-provided mount parameter acregmax of type u32 is intended to have an upper limit, but before it is validated, the value is converted from seconds to jiffies which can lead to an integer overflow. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 5780464614f6 ("cifs: Add new parameter "acregmax" for distinct file and directory metadata timeout") Signed-off-by: Murad Masimov <m.masimov@mt-integration.ru> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-03-13smb: client: fix regression with guest optionPaulo Alcantara
When mounting a CIFS share with 'guest' mount option, mount.cifs(8) will set empty password= and password2= options. Currently we only handle empty strings from user= and password= options, so the mount will fail with cifs: Bad value for 'password2' Fix this by handling empty string from password2= option as well. Link: https://bbs.archlinux.org/viewtopic.php?id=303927 Reported-by: Adam Williamson <awilliam@redhat.com> Closes: https://lore.kernel.org/r/83c00b5fea81c07f6897a5dd3ef50fd3b290f56c.camel@redhat.com Fixes: 35f834265e0d ("smb3: fix broken reconnect when password changing on the server by allowing password rotation") Cc: stable@vger.kernel.org Signed-off-by: Paulo Alcantara (Red Hat) <pc@manguebit.com> Signed-off-by: Steve French <stfrench@microsoft.com>
2025-03-13posix-timers: Dont iterate /proc/$PID/timers with sighand:: Siglock heldThomas Gleixner
The readout of /proc/$PID/timers holds sighand::siglock with interrupts disabled. That is required to protect against concurrent modifications of the task::signal::posix_timers list because the list is not RCU safe. With the conversion of the timer storage to a RCU protected hlist, this is not longer required. The only requirement is to protect the returned entry against a concurrent free, which is trivial as the timers are RCU protected. Removing the trylock of sighand::siglock is benign because the life time of task_struct::signal is bound to the life time of the task_struct itself. There are two scenarios where this matters: 1) The process is life and not about to be checkpointed 2) The process is stopped via ptrace for checkpointing #1 is a racy snapshot of the armed timers and nothing can rely on it. It's not more than debug information and it has been that way before because sighand lock is dropped when the buffer is full and the restart of the iteration might find a completely different set of timers. The task and therefore task::signal cannot be freed as timers_start() acquired a reference count via get_pid_task(). #2 the process is stopped for checkpointing so nothing can delete or create timers at this point. Neither can the process exit during the traversal. If CRIU fails to observe an exit in progress prior to the dissimination of the timers, then there are more severe problems to solve in the CRIU mechanics as they can't rely on posix timers being enabled in the first place. Therefore replace the lock acquisition with rcu_read_lock() and switch the timer storage traversal over to seq_hlist_*_rcu(). Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Link: https://lore.kernel.org/all/20250308155624.465175807@linutronix.de
2025-03-13fs: use debug-only asserts around fd allocation and installMateusz Guzik
This also restores the check which got removed in 52732bb9abc9ee5b ("fs/file.c: remove sanity_check and add likely/unlikely in alloc_fd()") for performance reasons -- they no longer apply with a debug-only variant. Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250312161941.1261615-1-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-13bcachefs: fix tiny leak in bch2_dev_add()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-12f2fs: fix to avoid running out of free segmentsChao Yu
If checkpoint is disabled, GC can not reclaim any segments, we need to detect such condition and bail out from fallocate() of a pinfile, rather than letting allocator running out of free segment, which may cause f2fs to be shutdown. reproducer: mkfs.f2fs -f /dev/vda 16777216 mount -o checkpoint=disable:10% /dev/vda /mnt/f2fs for ((i=0;i<4096;i++)) do { dd if=/dev/zero of=/mnt/f2fs/$i bs=1M count=1; } done sync for ((i=0;i<4096;i+=2)) do { rm /mnt/f2fs/$i; } done sync touch /mnt/f2fs/pinfile f2fs_io pinfile set /mnt/f2fs/pinfile f2fs_io fallocate 0 0 4201644032 /mnt/f2fs/pinfile cat /sys/kernel/debug/f2fs/status output: - Free: 0 (0) Fixes: f5a53edcf01e ("f2fs: support aligned pinned file") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-12gfs2: Fix a NULL vs IS_ERR() bug in gfs2_find_jhead()Dan Carpenter
The filemap_grab_folio() function doesn't return NULL, it returns error pointers. Fix the check to match. Fixes: 40829760096d ("gfs2: Convert gfs2_find_jhead() to use a folio") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2025-03-12xfs: Remove duplicate xfs_rtbitmap.h headerJiapeng Chong
./fs/xfs/libxfs/xfs_sb.c: xfs_rtbitmap.h is included more than once. Fixes: 2167eaabe2fa ("xfs: define the zoned on-disk format") Reported-by: Abaci Robot <abaci@linux.alibaba.com> Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=19446 Signed-off-by: Jiapeng Chong <jiapeng.chong@linux.alibaba.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-03-12fs: dodge an atomic in putname if ref == 1Mateusz Guzik
While the structure is refcounted, the only consumer incrementing it is audit and even then the atomic operation is only needed when it interacts with io_uring. If putname spots a count of 1, there is no legitimate way for anyone to bump it. If audit is disabled, the count is guaranteed to be 1, which consistently elides the atomic for all path lookups. If audit is enabled, it still manages to elide the last decrement. Note the patch does not do anything to prevent audit from suffering atomics. See [1] and [2] for a different approach. Benchmarked on Sapphire Rapids issuing access() (ops/s): before: 5106246 after: 5269678 (+3%) Link 1: https://lore.kernel.org/linux-fsdevel/20250307161155.760949-1-mjguzik@gmail.com/ Link 2: https://lore.kernel.org/linux-fsdevel/20250307164216.GI2023217@ZenIV/ Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://lore.kernel.org/r/20250311181804.1165758-1-mjguzik@gmail.com Reviewed-by: Jens Axboe <axboe@kernel.dk> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-11f2fs: Remove f2fs_write_node_page()Matthew Wilcox (Oracle)
Mappings which implement writepages should not implement writepage as it can only harm writeback patterns. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-11f2fs: Remove f2fs_write_meta_page()Matthew Wilcox (Oracle)
Mappings which implement writepages should not implement writepage as it can only harm writeback patterns. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-11f2fs: Remove f2fs_write_data_page()Matthew Wilcox (Oracle)
Mappings which implement writepages should not implement writepage as it can only harm writeback patterns. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-11f2fs: Remove check for ->writepageMatthew Wilcox (Oracle)
We're almost able to remove a_ops->writepage. This check is unnecessary as we'll never call into __f2fs_write_data_pages() for character devices. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2025-03-11jfs: add index corruption check to DT_GETPAGE()Roman Smirnov
If the file system is corrupted, the header.stblindex variable may become greater than 127. Because of this, an array access out of bounds may occur: ------------[ cut here ]------------ UBSAN: array-index-out-of-bounds in fs/jfs/jfs_dtree.c:3096:10 index 237 is out of range for type 'struct dtslot[128]' CPU: 0 UID: 0 PID: 5822 Comm: syz-executor740 Not tainted 6.13.0-rc4-syzkaller-00110-g4099a71718b0 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 ubsan_epilogue lib/ubsan.c:231 [inline] __ubsan_handle_out_of_bounds+0x121/0x150 lib/ubsan.c:429 dtReadFirst+0x622/0xc50 fs/jfs/jfs_dtree.c:3096 dtReadNext fs/jfs/jfs_dtree.c:3147 [inline] jfs_readdir+0x9aa/0x3c50 fs/jfs/jfs_dtree.c:2862 wrap_directory_iterator+0x91/0xd0 fs/readdir.c:65 iterate_dir+0x571/0x800 fs/readdir.c:108 __do_sys_getdents64 fs/readdir.c:403 [inline] __se_sys_getdents64+0x1e2/0x4b0 fs/readdir.c:389 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> ---[ end trace ]--- Add a stblindex check for corruption. Reported-by: syzbot <syzbot+9120834fc227768625ba@syzkaller.appspotmail.com> Closes: https://syzkaller.appspot.com/bug?extid=9120834fc227768625ba Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Roman Smirnov <r.smirnov@omp.ru> Signed-off-by: Dave Kleikamp <dave.kleikamp@oracle.com>
2025-03-11bcachefs: Make sure trans is unlocked when submitting read IOKent Overstreet
We were still using the trans after the unlock, leading to this bug in the retry path: 00255 ------------[ cut here ]------------ 00255 kernel BUG at fs/bcachefs/btree_iter.c:3348! 00255 Internal error: Oops - BUG: 00000000f2000800 [#1] SMP 00255 bcachefs (0ca38fe8-0a26-41f9-9b5d-6a27796c7803): /fiotest offset 86048768: no device to read from: 00255 u64s 8 type extent 4098:168192:U32_MAX len 128 ver 0: durability: 0 crc: c_size 128 size 128 offset 0 nonce 0 csum crc32c 0:8040a368 compress none ec: idx 83 block 1 ptr: 0:302:128 gen 0 00255 bcachefs (0ca38fe8-0a26-41f9-9b5d-6a27796c7803): /fiotest offset 85983232: no device to read from: 00255 u64s 8 type extent 4098:168064:U32_MAX len 128 ver 0: durability: 0 crc: c_size 128 size 128 offset 0 nonce 0 csum crc32c 0:43311336 compress none ec: idx 83 block 1 ptr: 0:302:0 gen 0 00255 Modules linked in: 00255 CPU: 5 UID: 0 PID: 304 Comm: kworker/u70:2 Not tainted 6.14.0-rc6-ktest-g526aae23d67d #16040 00255 Hardware name: linux,dummy-virt (DT) 00255 Workqueue: events_unbound bch2_rbio_retry 00255 pstate: 60001005 (nZCv daif -PAN -UAO -TCO -DIT +SSBS BTYPE=--) 00255 pc : __bch2_trans_get+0x100/0x378 00255 lr : __bch2_trans_get+0xa0/0x378 00255 sp : ffffff80c865b760 00255 x29: ffffff80c865b760 x28: 0000000000000000 x27: ffffff80d76ed880 00255 x26: 0000000000000018 x25: 0000000000000000 x24: ffffff80f4ec3760 00255 x23: ffffff80f4010140 x22: 0000000000000056 x21: ffffff80f4ec0000 00255 x20: ffffff80f4ec3788 x19: ffffff80d75f8000 x18: 00000000ffffffff 00255 x17: 2065707974203820 x16: 7334367520200a3a x15: 0000000000000008 00255 x14: 0000000000000001 x13: 0000000000000100 x12: 0000000000000006 00255 x11: ffffffc080b47a40 x10: 0000000000000000 x9 : ffffffc08038dea8 00255 x8 : ffffff80d75fc018 x7 : 0000000000000000 x6 : 0000000000003788 00255 x5 : 0000000000003760 x4 : ffffff80c922de80 x3 : ffffff80f18f0000 00255 x2 : ffffff80c922de80 x1 : 0000000000000130 x0 : 0000000000000006 00255 Call trace: 00255 __bch2_trans_get+0x100/0x378 (P) 00255 bch2_read_io_err+0x98/0x260 00255 bch2_read_endio+0xb8/0x2d0 00255 __bch2_read_extent+0xce8/0xfe0 00255 __bch2_read+0x2a8/0x978 00255 bch2_rbio_retry+0x188/0x318 00255 process_one_work+0x154/0x390 00255 worker_thread+0x20c/0x3b8 00255 kthread+0xf0/0x1b0 00255 ret_from_fork+0x10/0x20 00255 Code: 6b01001f 54ffff01 79408460 3617fec0 (d4210000) 00255 ---[ end trace 0000000000000000 ]--- 00255 Kernel panic - not syncing: Oops - BUG: Fatal exception 00255 SMP: stopping secondary CPUs 00255 Kernel Offset: disabled 00255 CPU features: 0x000,00000070,00000010,8240500b 00255 Memory Limit: none 00255 ---[ end Kernel panic - not syncing: Oops - BUG: Fatal exception ]--- Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-11bcachefs: Initialize from_inode members for bch_io_optsRoxana Nicolescu
When there is no inode source, all "from_inode" members in the structure bhc_io_opts should be set false. Fixes: 7a7c43a0c1ecf ("bcachefs: Add bch_io_opts fields for indicating whether the opts came from the inode") Reported-by: syzbot+c17ad4b4367b72a853cb@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c17ad4b4367b72a853cb Signed-off-by: Roxana Nicolescu <nicolescu.roxana@protonmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-11bcachefs: Fix b->written overflowAlan Huang
When bset past end of btree node, we should not add sectors to b->written, which will overflow b->written. Reported-by: syzbot+3cb3d9e8c3f197754825@syzkaller.appspotmail.com Tested-by: syzbot+3cb3d9e8c3f197754825@syzkaller.appspotmail.com Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-03-11vboxsf: Add __nonstring annotations for unterminated stringsKees Cook
When a character array without a terminating NUL character has a static initializer, GCC 15's -Wunterminated-string-initialization will only warn if the array lacks the "nonstring" attribute[1]. Mark the arrays with __nonstring to and correctly identify the char array as "not a C string" and thereby eliminate the warning. This effectively reverts the change in 4e7487245abc ("vboxsf: fix building with GCC 15"), to add the annotation that has other uses (i.e. warning if the string is ever used with C string APIs). Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117178 [1] Cc: Hans de Goede <hdegoede@redhat.com> Cc: Brahmajit Das <brahmajit.xyz@gmail.com> Cc: Christian Brauner <brauner@kernel.org> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Kees Cook <kees@kernel.org> Link: https://lore.kernel.org/r/20250310222530.work.374-kees@kernel.org Reviewed-by: Hans de Goede <hdegoede@redhat.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-03-11xfs: trigger zone GC when out of available rt blocksHans Holmberg
We periodically check the available rt blocks when filling up zones and start GC if needed, but we may run completely out in between filling zones, so start GC(unless already running) if we can't reserve writable space. This should only happen as a corner case in setups with very few backing zones. Fixes: 080d01c41d44 ("xfs: implement zoned garbage collection") Signed-off-by: Hans Holmberg <hans.holmberg@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Carlos Maiolino <cem@kernel.org>
2025-03-11Revert "f2fs: rebuild nat_bits during umount"Chao Yu
This reverts commit 94c821fb286b545d37549ff30a0c341e066f0d6c. It reports that there is potential corruption in node footer, the most suspious feature is nat_bits, let's revert recovery related code. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>