git.armlinux.org.uk/linux.git - Linus' kernel tree

Age	Commit message (Collapse)	Author
2024-08-26	ext4: fix access to uninitialised lock in fc replay path	Luis Henriques (SUSE)
	The following kernel trace can be triggered with fstest generic/629 when executed against a filesystem with fast-commit feature enabled: INFO: trying to register non-static key. The code is fine but needs lockdep annotation, or maybe you didn't initialize this object before use? turning off the locking correctness validator. CPU: 0 PID: 866 Comm: mount Not tainted 6.10.0+ #11 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.2-3-gd478f380-prebuilt.qemu.org 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x66/0x90 register_lock_class+0x759/0x7d0 __lock_acquire+0x85/0x2630 ? __find_get_block+0xb4/0x380 lock_acquire+0xd1/0x2d0 ? __ext4_journal_get_write_access+0xd5/0x160 _raw_spin_lock+0x33/0x40 ? __ext4_journal_get_write_access+0xd5/0x160 __ext4_journal_get_write_access+0xd5/0x160 ext4_reserve_inode_write+0x61/0xb0 __ext4_mark_inode_dirty+0x79/0x270 ? ext4_ext_replay_set_iblocks+0x2f8/0x450 ext4_ext_replay_set_iblocks+0x330/0x450 ext4_fc_replay+0x14c8/0x1540 ? jread+0x88/0x2e0 ? rcu_is_watching+0x11/0x40 do_one_pass+0x447/0xd00 jbd2_journal_recover+0x139/0x1b0 jbd2_journal_load+0x96/0x390 ext4_load_and_init_journal+0x253/0xd40 ext4_fill_super+0x2cc6/0x3180 ... In the replay path there's an attempt to lock sbi->s_bdev_wb_lock in function ext4_check_bdev_write_error(). Unfortunately, at this point this spinlock has not been initialized yet. Moving it's initialization to an earlier point in __ext4_fill_super() fixes this splat. Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> Link: https://patch.msgid.link/20240718094356.7863-1-luis.henriques@linux.dev Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2024-08-26	ext4: fix fast commit inode enqueueing during a full journal commit	Luis Henriques (SUSE)
	When a full journal commit is on-going, any fast commit has to be enqueued into a different queue: FC_Q_STAGING instead of FC_Q_MAIN. This enqueueing is done only once, i.e. if an inode is already queued in a previous fast commit entry it won't be enqueued again. However, if a full commit starts _after_ the inode is enqueued into FC_Q_MAIN, the next fast commit needs to be done into FC_Q_STAGING. And this is not being done in function ext4_fc_track_template(). This patch fixes the issue by re-enqueuing an inode into the STAGING queue during the fast commit clean-up callback when doing a full commit. However, to prevent a race with a fast-commit, the clean-up callback has to be called with the journal locked. This bug was found using fstest generic/047. This test creates several 32k bytes files, sync'ing each of them after it's creation, and then shutting down the filesystem. Some data may be loss in this operation; for example a file may have it's size truncated to zero. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240717172220.14201-1-luis.henriques@linux.dev Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2024-08-26	ext4: fix timer use-after-free on failed mount	Xiaxi Shen
	Syzbot has found an ODEBUG bug in ext4_fill_super The del_timer_sync function cancels the s_err_report timer, which reminds about filesystem errors daily. We should guarantee the timer is no longer active before kfree(sbi). When filesystem mounting fails, the flow goes to failed_mount3, where an error occurs when ext4_stop_mmpd is called, causing a read I/O failure. This triggers the ext4_handle_error function that ultimately re-arms the timer, leaving the s_err_report timer active before kfree(sbi) is called. Fix the issue by canceling the s_err_report timer after calling ext4_stop_mmpd. Signed-off-by: Xiaxi Shen <shenxiaxi26@gmail.com> Reported-and-tested-by: syzbot+59e0101c430934bc9a36@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=59e0101c430934bc9a36 Link: https://patch.msgid.link/20240715043336.98097-1-shenxiaxi26@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2024-08-26	ext4: use seq_putc() in two functions	Markus Elfring
	Single characters (line breaks) should be put into a sequence. Thus use the corresponding function “seq_putc”. This issue was transformed by using the Coccinelle software. Signed-off-by: Markus Elfring <elfring@users.sourceforge.net> Link: https://patch.msgid.link/076974ab-4da3-4176-89dc-0514e020c276@web.de Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-08-26	ext4: no need to continue when the number of entries is 1	Edward Adam Davis
	Fixes: ac27a0ec112a ("[PATCH] ext4: initial copy of files from ext3") Reported-by: syzbot+ae688d469e36fb5138d0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ae688d469e36fb5138d0 Signed-off-by: Edward Adam Davis <eadavis@qq.com> Reported-and-tested-by: syzbot+ae688d469e36fb5138d0@syzkaller.appspotmail.com Link: https://patch.msgid.link/tencent_BE7AEE6C7C2D216CB8949CE8E6EE7ECC2C0A@qq.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2024-08-26	ext4: correct encrypted dentry name hash when not casefolded	yao.ly
	EXT4_DIRENT_HASH and EXT4_DIRENT_MINOR_HASH will access struct ext4_dir_entry_hash followed ext4_dir_entry. But there is no ext4_dir_entry_hash followed when inode is encrypted and not casefolded Signed-off-by: yao.ly <yao.ly@linux.alibaba.com> Link: https://patch.msgid.link/1719816219-128287-1-git-send-email-yao.ly@linux.alibaba.com Signed-off-by: Theodore Ts'o <tytso@mit.edu> Cc: stable@kernel.org
2024-08-20	ext4: correct comment of h_checksum	Kemeng Shi
	Checksum of xattr block is always crc32c(uuid+blknum+xattrblock), see ext4_xattr_block_csum_set for detail. Remove incorrect comment that "id = inum if refcount=1". Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://patch.msgid.link/20240606125508.1459893-4-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-08-20	ext4: correct comment of ext4_xattr_block_cache_insert	Kemeng Shi
	There is no return value from ext4_xattr_block_cache_insert, just correct it's comment about return value. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://patch.msgid.link/20240606125508.1459893-3-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-08-20	ext4: correct comment of ext4_xattr_cmp	Kemeng Shi
	The ext4_xattr_cmp never returns negative error number. Correct possible return value in ext4_xattr_cmp's comment. Signed-off-by: Kemeng Shi <shikemeng@huaweicloud.com> Link: https://patch.msgid.link/20240606125508.1459893-2-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-08-20	ext4: fix macro definition error of EXT4_DIRENT_HASH and EXT4_DIRENT_MINOR_HASH	carrion bent
	The macro parameter 'entry' of EXT4_DIRENT_HASH and EXT4_DIRENT_MINOR_HASH was not used, but rather the variable 'de' was directly used, which may be a local variable inside a function that calls the macros. Fortunately, all callers have passed in 'de' so far, so this bug didn't have an effect. Signed-off-by: carrion bent <carrionbent@linux.alibaba.com> Link: https://patch.msgid.link/1717652596-58760-1-git-send-email-carrionbent@linux.alibaba.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-08-20	ext4: filesystems without casefold feature cannot be mounted with siphash	Lizhi Xu
	When mounting the ext4 filesystem, if the default hash version is set to DX_HASH_SIPHASH but the casefold feature is not set, exit the mounting. Reported-by: syzbot+340581ba9dceb7e06fb3@syzkaller.appspotmail.com Signed-off-by: Lizhi Xu <lizhi.xu@windriver.com> Link: https://patch.msgid.link/20240605012335.44086-1-lizhi.xu@windriver.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-08-20	ext4: adjust the layout of the ext4_inode_info structure to save memory	Junchao Sun
	Using pahole, we can see that there are some padding holes in the current ext4_inode_info structure. Adjusting the layout of ext4_inode_info can reduce these holes, resulting in the size of the structure decreasing from 2424 bytes to 2408 bytes. Signed-off-by: Junchao Sun <sunjunchao2870@gmail.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240603131524.324224-1-sunjunchao2870@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-08-12	introduce fd_file(), convert all accessors to it.	Al Viro
	For any changes of struct fd representation we need to turn existing accesses to fields into calls of wrappers. Accesses to struct fd::flags are very few (3 in linux/file.h, 1 in net/socket.c, 3 in fs/overlayfs/file.c and 3 more in explicit initializers). Those can be dealt with in the commit converting to new layout; accesses to struct fd::file are too many for that. This commit converts (almost) all of f.file to fd_file(f). It's not entirely mechanical ('file' is used as a member name more than just in struct fd) and it does not even attempt to distinguish the uses in pointer context from those in boolean context; the latter will be eventually turned into a separate helper (fd_empty()). NOTE: mass conversion to fd_empty(), tempting as it might be, is a bad idea; better do that piecewise in commit that convert from fdget...() to CLASS(...). [conflicts in fs/fhandle.c, kernel/bpf/syscall.c, mm/memcontrol.c caught by git; fs/stat.c one got caught by git grep] [fs/xattr.c conflict] Reviewed-by: Christian Brauner <brauner@kernel.org> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2024-08-07	buffer: Convert __block_write_begin() to take a folio	Matthew Wilcox (Oracle)
	Almost all callers have a folio now, so change __block_write_begin() to take a folio and remove a call to compound_head(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-07	fs: Convert aops->write_begin to take a folio	Matthew Wilcox (Oracle)
	Convert all callers from working on a page to working on one page of a folio (support for working on an entire folio can come later). Removes a lot of folio->page->folio conversions. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-07	fs: Convert aops->write_end to take a folio	Matthew Wilcox (Oracle)
	Most callers have a folio, and most implementations operate on a folio, so remove the conversion from folio->page->folio to fit through this interface. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-07	buffer: Convert block_write_end() to take a folio	Matthew Wilcox (Oracle)
	All callers now have a folio, so pass it in instead of converting from a folio to a page and back to a folio again. Saves a call to compound_head(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-07-18	Merge tag 'ext4_for_linus-6.11-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "Many cleanups and bug fixes in ext4, especially for the fast commit feature. Also some performance improvements; in particular, improving IOPS and throughput on fast devices running Async Direct I/O by up to 20% by optimizing jbd2_transaction_committed()" * tag 'ext4_for_linus-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (40 commits) ext4: make sure the first directory block is not a hole ext4: check dot and dotdot of dx_root before making dir indexed ext4: sanity check for NULL pointer after ext4_force_shutdown jbd2: increase maximum transaction size jbd2: drop pointless shrinker batch initialization jbd2: avoid infinite transaction commit loop jbd2: precompute number of transaction descriptor blocks jbd2: make jbd2_journal_get_max_txn_bufs() internal jbd2: avoid mount failed when commit block is partial submitted ext4: avoid writing unitialized memory to disk in EA inodes ext4: don't track ranges in fast_commit if inode has inlined data ext4: fix possible tid_t sequence overflows ext4: use ext4_update_inode_fsync_trans() helper in inode creation ext4: add missing MODULE_DESCRIPTION() jbd2: add missing MODULE_DESCRIPTION() ext4: use memtostr_pad() for s_volume_name jbd2: speed up jbd2_transaction_committed() ext4: make ext4_da_map_blocks() buffer_head unaware ext4: make ext4_insert_delayed_block() insert multi-blocks ext4: factor out a helper to check the cluster allocation state ...
2024-07-15	Merge tag 'vfs-6.11.mount.api' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs mount API updates from Christian Brauner: - Add a generic helper to parse uid and gid mount options. Currently we open-code the same logic in various filesystems which is error prone, especially since the verification of uid and gid mount options is a sensitive operation in the face of idmappings. Add a generic helper and convert all filesystems over to it. Make sure that filesystems that are mountable in unprivileged containers verify that the specified uid and gid can be represented in the owning namespace of the filesystem. - Convert hostfs to the new mount api. * tag 'vfs-6.11.mount.api' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: fuse: Convert to new uid/gid option parsing helpers fuse: verify {g,u}id mount options correctly fat: Convert to new uid/gid option parsing helpers fat: Convert to new mount api fat: move debug into fat_mount_options vboxsf: Convert to new uid/gid option parsing helpers tracefs: Convert to new uid/gid option parsing helpers smb: client: Convert to new uid/gid option parsing helpers tmpfs: Convert to new uid/gid option parsing helpers ntfs3: Convert to new uid/gid option parsing helpers isofs: Convert to new uid/gid option parsing helpers hugetlbfs: Convert to new uid/gid option parsing helpers ext4: Convert to new uid/gid option parsing helpers exfat: Convert to new uid/gid option parsing helpers efivarfs: Convert to new uid/gid option parsing helpers debugfs: Convert to new uid/gid option parsing helpers autofs: Convert to new uid/gid option parsing helpers fs_parse: add uid & gid option option parsing helpers hostfs: Add const qualifier to host_root in hostfs_fill_super() hostfs: convert hostfs to use the new mount API
2024-07-10	ext4: make sure the first directory block is not a hole	Baokun Li
	The syzbot constructs a directory that has no dirblock but is non-inline, i.e. the first directory block is a hole. And no errors are reported when creating files in this directory in the following flow. ext4_mknod ... ext4_add_entry // Read block 0 ext4_read_dirblock(dir, block, DIRENT) bh = ext4_bread(NULL, inode, block, 0) if (!bh && (type == INDEX \|\| type == DIRENT_HTREE)) // The first directory block is a hole // But type == DIRENT, so no error is reported. After that, we get a directory block without '.' and '..' but with a valid dentry. This may cause some code that relies on dot or dotdot (such as make_indexed_dir()) to crash. Therefore when ext4_read_dirblock() finds that the first directory block is a hole report that the filesystem is corrupted and return an error to avoid loading corrupted data from disk causing something bad. Reported-by: syzbot+ae688d469e36fb5138d0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ae688d469e36fb5138d0 Fixes: 4e19d6b65fb4 ("ext4: allow directory holes") Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240702132349.2600605-3-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-10	ext4: check dot and dotdot of dx_root before making dir indexed	Baokun Li
	Syzbot reports a issue as follows: ============================================ BUG: unable to handle page fault for address: ffffed11022e24fe PGD 23ffee067 P4D 23ffee067 PUD 0 Oops: Oops: 0000 [#1] PREEMPT SMP KASAN PTI CPU: 0 PID: 5079 Comm: syz-executor306 Not tainted 6.10.0-rc5-g55027e689933 #0 Call Trace: <TASK> make_indexed_dir+0xdaf/0x13c0 fs/ext4/namei.c:2341 ext4_add_entry+0x222a/0x25d0 fs/ext4/namei.c:2451 ext4_rename fs/ext4/namei.c:3936 [inline] ext4_rename2+0x26e5/0x4370 fs/ext4/namei.c:4214 [...] ============================================ The immediate cause of this problem is that there is only one valid dentry for the block to be split during do_split, so split==0 results in out of bounds accesses to the map triggering the issue. do_split unsigned split dx_make_map count = 1 split = count/2 = 0; continued = hash2 == map[split - 1].hash; ---> map[4294967295] The maximum length of a filename is 255 and the minimum block size is 1024, so it is always guaranteed that the number of entries is greater than or equal to 2 when do_split() is called. But syzbot's crafted image has no dot and dotdot in dir, and the dentry distribution in dirblock is as follows: bus dentry1 hole dentry2 free \|xx--\|xx-------------\|...............\|xx-------------\|...............\| 0 12 (8+248)=256 268 256 524 (8+256)=264 788 236 1024 So when renaming dentry1 increases its name_len length by 1, neither hole nor free is sufficient to hold the new dentry, and make_indexed_dir() is called. In make_indexed_dir() it is assumed that the first two entries of the dirblock must be dot and dotdot, so bus and dentry1 are left in dx_root because they are treated as dot and dotdot, and only dentry2 is moved to the new leaf block. That's why count is equal to 1. Therefore add the ext4_check_dx_root() helper function to add more sanity checks to dot and dotdot before starting the conversion to avoid the above issue. Reported-by: syzbot+ae688d469e36fb5138d0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=ae688d469e36fb5138d0 Fixes: ac27a0ec112a ("[PATCH] ext4: initial copy of files from ext3") Cc: stable@kernel.org Signed-off-by: Baokun Li <libaokun1@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240702132349.2600605-2-libaokun@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-08	ext4: sanity check for NULL pointer after ext4_force_shutdown	Wojciech Gładysz
	Test case: 2 threads write short inline data to a file. In ext4_page_mkwrite the resulting inline data is converted. Handling ext4_grp_locked_error with description "block bitmap and bg descriptor inconsistent: X vs Y free clusters" calls ext4_force_shutdown. The conversion clears EXT4_STATE_MAY_INLINE_DATA but fails for ext4_destroy_inline_data_nolock and ext4_mark_iloc_dirty due to ext4_forced_shutdown. The restoration of inline data fails for the same reason not setting EXT4_STATE_MAY_INLINE_DATA. Without the flag set a regular process path in ext4_da_write_end follows trying to dereference page folio private pointer that has not been set. The fix calls early return with -EIO error shall the pointer to private be NULL. Sample crash report: Unable to handle kernel paging request at virtual address dfff800000000004 KASAN: null-ptr-deref in range [0x0000000000000020-0x0000000000000027] Mem abort info: ESR = 0x0000000096000005 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x05: level 1 translation fault Data abort info: ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [dfff800000000004] address between user and kernel address ranges Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP Modules linked in: CPU: 1 PID: 20274 Comm: syz-executor185 Not tainted 6.9.0-rc7-syzkaller-gfda5695d692c #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024 pstate: 80400005 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : __block_commit_write+0x64/0x2b0 fs/buffer.c:2167 lr : __block_commit_write+0x3c/0x2b0 fs/buffer.c:2160 sp : ffff8000a1957600 x29: ffff8000a1957610 x28: dfff800000000000 x27: ffff0000e30e34b0 x26: 0000000000000000 x25: dfff800000000000 x24: dfff800000000000 x23: fffffdffc397c9e0 x22: 0000000000000020 x21: 0000000000000020 x20: 0000000000000040 x19: fffffdffc397c9c0 x18: 1fffe000367bd196 x17: ffff80008eead000 x16: ffff80008ae89e3c x15: 00000000200000c0 x14: 1fffe0001cbe4e04 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000001 x10: 0000000000ff0100 x9 : 0000000000000000 x8 : 0000000000000004 x7 : 0000000000000000 x6 : 0000000000000000 x5 : fffffdffc397c9c0 x4 : 0000000000000020 x3 : 0000000000000020 x2 : 0000000000000040 x1 : 0000000000000020 x0 : fffffdffc397c9c0 Call trace: __block_commit_write+0x64/0x2b0 fs/buffer.c:2167 block_write_end+0xb4/0x104 fs/buffer.c:2253 ext4_da_do_write_end fs/ext4/inode.c:2955 [inline] ext4_da_write_end+0x2c4/0xa40 fs/ext4/inode.c:3028 generic_perform_write+0x394/0x588 mm/filemap.c:3985 ext4_buffered_write_iter+0x2c0/0x4ec fs/ext4/file.c:299 ext4_file_write_iter+0x188/0x1780 call_write_iter include/linux/fs.h:2110 [inline] new_sync_write fs/read_write.c:497 [inline] vfs_write+0x968/0xc3c fs/read_write.c:590 ksys_write+0x15c/0x26c fs/read_write.c:643 __do_sys_write fs/read_write.c:655 [inline] __se_sys_write fs/read_write.c:652 [inline] __arm64_sys_write+0x7c/0x90 fs/read_write.c:652 __invoke_syscall arch/arm64/kernel/syscall.c:34 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:48 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:133 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:152 el0_svc+0x54/0x168 arch/arm64/kernel/entry-common.c:712 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:730 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:598 Code: 97f85911 f94002da 91008356 d343fec8 (38796908) ---[ end trace 0000000000000000 ]--- ---------------- Code disassembly (best guess): 0: 97f85911 bl 0xffffffffffe16444 4: f94002da ldr x26, [x22] 8: 91008356 add x22, x26, #0x20 c: d343fec8 lsr x8, x22, #3 * 10: 38796908 ldrb w8, [x8, x25] <-- trapping instruction Reported-by: syzbot+18df508cf00a0598d9a6@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=18df508cf00a0598d9a6 Link: https://lore.kernel.org/all/000000000000f19a1406109eb5c5@google.com/T/ Signed-off-by: Wojciech Gładysz <wojciech.gladysz@infogain.com> Link: https://patch.msgid.link/20240703070112.10235-1-wojciech.gladysz@infogain.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-08	ext4: avoid writing unitialized memory to disk in EA inodes	Jan Kara
	If the extended attribute size is not a multiple of block size, the last block in the EA inode will have uninitialized tail which will get written to disk. We will never expose the data to userspace but still this is not a good practice so just zero out the tail of the block as it isn't going to cause a noticeable performance overhead. Fixes: e50e5129f384 ("ext4: xattr-in-inode support") Reported-by: syzbot+9c1fe13fcb51574b249b@syzkaller.appspotmail.com Reported-by: Hugh Dickins <hughd@google.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240613150234.25176-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-08	ext4: don't track ranges in fast_commit if inode has inlined data	Luis Henriques (SUSE)
	When fast-commit needs to track ranges, it has to handle inodes that have inlined data in a different way because ext4_fc_write_inode_data(), in the actual commit path, will attempt to map the required blocks for the range. However, inodes that have inlined data will have it's data stored in inode->i_block and, eventually, in the extended attribute space. Unfortunately, because fast commit doesn't currently support extended attributes, the solution is to mark this commit as ineligible. Link: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1039883 Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> Tested-by: Ben Hutchings <benh@debian.org> Fixes: 9725958bb75c ("ext4: fast commit may miss tracking unwritten range during ftruncate") Link: https://patch.msgid.link/20240618144312.17786-1-luis.henriques@linux.dev Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-08	ext4: fix possible tid_t sequence overflows	Luis Henriques (SUSE)
	In the fast commit code there are a few places where tid_t variables are being compared without taking into account the fact that these sequence numbers may wrap. Fix this issue by using the helper functions tid_gt() and tid_geq(). Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Harshad Shirwadkar <harshadshirwadkar@gmail.com> Link: https://patch.msgid.link/20240529092030.9557-3-luis.henriques@linux.dev Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-05	ext4: use ext4_update_inode_fsync_trans() helper in inode creation	Luis Henriques (SUSE)
	Call helper function ext4_update_inode_fsync_trans() instead of open coding it in __ext4_new_inode(). This helper checks both that the handle is valid and that it hasn't been aborted due to some fatal error in the journalling layer, using is_handle_aborted(). Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> Link: https://patch.msgid.link/20240527161447.21434-1-luis.henriques@linux.dev Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-05	ext4: add missing MODULE_DESCRIPTION()	Jeff Johnson
	Fix the 'make W=1' warning: WARNING: modpost: missing MODULE_DESCRIPTION() in fs/ext4/ext4-inode-test.o Signed-off-by: Jeff Johnson <quic_jjohnson@quicinc.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240527-md-fs-ext4-v1-1-07aad5936bb1@quicinc.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-05	ext4: use memtostr_pad() for s_volume_name	Kees Cook
	As with the other strings in struct ext4_super_block, s_volume_name is not NUL terminated. The other strings were marked in commit 072ebb3bffe6 ("ext4: add nonstring annotations to ext4.h"). Using strscpy() isn't the right replacement for strncpy(); it should use memtostr_pad() instead. Reported-by: syzbot+50835f73143cc2905b9e@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/00000000000019f4c00619192c05@google.com/ Fixes: 744a56389f73 ("ext4: replace deprecated strncpy with alternatives") Signed-off-by: Kees Cook <keescook@chromium.org> Link: https://patch.msgid.link/20240523225408.work.904-kees@kernel.org Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-07-02	ext4: Convert to new uid/gid option parsing helpers	Eric Sandeen
	Convert to new uid/gid option parsing helpers Signed-off-by: Eric Sandeen <sandeen@redhat.com> Link: https://lore.kernel.org/r/a84be40d-5110-4dac-83b1-0ea8e043f0fd@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-27	ext4: make ext4_da_map_blocks() buffer_head unaware	Zhang Yi
	After calling the ext4_da_map_blocks(), a delalloc extent state could be identified through the EXT4_MAP_DELAYED flag in map. So factor out buffer_head related handles in ext4_da_map_blocks(), make this function buffer_head unaware and becomes a common helper, and also update the stale function commtents, preparing for the iomap da write path in the future. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240517124005.347221-11-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: make ext4_insert_delayed_block() insert multi-blocks	Zhang Yi
	Rename ext4_insert_delayed_block() to ext4_insert_delayed_blocks(), pass length parameter to make it insert multiple delalloc blocks at a time. For non-bigalloc case, just reserve len blocks and insert delalloc extent. For bigalloc case, we can ensure that the clusters in the middle of a extent must be unallocated, we only need to check whether the start and end clusters are delayed/allocated. We should subtract the space for the start and/or end block(s) if they are allocated. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240517124005.347221-10-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: factor out a helper to check the cluster allocation state	Zhang Yi
	Factor out a common helper ext4_clu_alloc_state(), check whether the cluster containing a delalloc block to be added has been allocated or has delalloc reservation, no logic changes. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240517124005.347221-9-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: make ext4_da_reserve_space() reserve multi-clusters	Zhang Yi
	Add 'nr_resv' parameter to ext4_da_reserve_space(), which indicates the number of clusters wants to reserve, make it reserve multiple clusters at a time. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240517124005.347221-8-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: make ext4_es_insert_delayed_block() insert multi-blocks	Zhang Yi
	Rename ext4_es_insert_delayed_block() to ext4_es_insert_delayed_extent() and pass length parameter to make it insert multiple delalloc blocks at a time. For the case of bigalloc, split the allocated parameter to lclu_allocated and end_allocated. lclu_allocated indicates the allocation state of the cluster which is containing the lblk, end_allocated indicates the allocation state of the extent end, clusters in the middle of delay allocated extent must be unallocated. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240517124005.347221-7-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: drop iblock parameter	Zhang Yi
	The start block of the delalloc extent to be inserted is equal to map->m_lblk, just drop the duplicate iblock input parameter. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/20240517124005.347221-6-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: trim delalloc extent	Zhang Yi
	In ext4_da_map_blocks(), we could find four kind of extents in the extent status tree: hole, unwritten, written and delayed extent. Now we only trim the map len if we found an unwritten extent or a written extent. This is okay now since map->m_len is always set to one and we always insert one delayed block at a time. But this will become isn't okay for other two cases if ext4_insert_delayed_block() and ext4_da_map_blocks() support inserting multiple map->len blocks later. 1. If we found a hole in the extent status tree which es->es_len is shorter than the length we want to write, we should trim the map->m_len to prevent adding extra delay more blocks than we expected. For example, assume we write data [A, C) to a file that contains a hole extent [A, B) and a written extent [B, D) in the cache. A B C D before da write: ...hhhhhh\|wwwwww.... Then we will get extent [A, B), we should trim map->m_len to B-A before inserting new delalloc blocks, if not, the range [B, C) will be duplicated. 2. If we found a delayed extent in the extent status tree which es->es_len is shorter than the length we want to write, we should trim the map->m_len to es->es_len and return directly since the front part of this map has been delayed, we can't insert the delalloc extent that contains the latter part in this round, we should return the delayed length and the caller should increase the position and call ext4_da_map_blocks() again. For example, assume we write data [A, C) to a file that contains a delayed extent [A, B) in the cache. A B C before da write: ...dddddd\|hhh.... Then we will get delayed extent [A, B), we should also trim map->m_len to B-A and return, if not, we will incorrectly assume that the write is complete and won't insert [B, C). So we need to always trim the map->m_len if the found es->es_len in the extent status tree is shorter than the map->m_len, prearing for inserting a extent with multiple delalloc blocks. This patch only does a pre-fix, the handle is crude and ext4_da_map_blocks() deserve a cleanup, we will do that later. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240517124005.347221-5-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: warn if delalloc counters are not zero on inactive	Zhang Yi
	The per-inode i_reserved_data_blocks count the reserved delalloc blocks in a regular file, it should be zero when destroying the file. The per-fs s_dirtyclusters_counter count all reserved delalloc blocks in a filesystem, it also should be zero when umounting the filesystem. Now we have only an error message if the i_reserved_data_blocks is not zero, which is unable to be simply captured, so add WARN_ON_ONCE to make it more visable. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240517124005.347221-4-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: check the extent status again before inserting delalloc block	Zhang Yi
	ext4_da_map_blocks looks up for any extent entry in the extent status tree (w/o i_data_sem) and then the looks up for any ondisk extent mapping (with i_data_sem in read mode). If it finds a hole in the extent status tree or if it couldn't find any entry at all, it then takes the i_data_sem in write mode to add a da entry into the extent status tree. This can actually race with page mkwrite & fallocate path. Note that this is ok between 1. ext4 buffered-write path v/s ext4_page_mkwrite(), because of the folio lock 2. ext4 buffered write path v/s ext4 fallocate because of the inode lock. But this can race between ext4_page_mkwrite() & ext4 fallocate path ext4_page_mkwrite() ext4_fallocate() block_page_mkwrite() ext4_da_map_blocks() //find hole in extent status tree ext4_alloc_file_blocks() ext4_map_blocks() //allocate block and unwritten extent ext4_insert_delayed_block() ext4_da_reserve_space() //reserve one more block ext4_es_insert_delayed_block() //drop unwritten extent and add delayed extent by mistake Then, the delalloc extent is wrong until writeback and the extra reserved block can't be released any more and it triggers below warning: EXT4-fs (pmem2): Inode 13 (00000000bbbd4d23): i_reserved_data_blocks(1) not cleared! Fix the problem by looking up extent status tree again while the i_data_sem is held in write mode. If it still can't find any entry, then we insert a new da entry into the extent status tree. Cc: stable@vger.kernel.org Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240517124005.347221-3-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: factor out a common helper to query extent map	Zhang Yi
	Factor out a new common helper ext4_map_query_blocks() from the ext4_da_map_blocks(), it query and return the extent map status on the inode's extent path, no logic changes. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Ritesh Harjani (IBM) <ritesh.list@gmail.com> Link: https://patch.msgid.link/20240517124005.347221-2-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: fix infinite loop when replaying fast_commit	Luis Henriques (SUSE)
	When doing fast_commit replay an infinite loop may occur due to an uninitialized extent_status struct. ext4_ext_determine_insert_hole() does not detect the replay and calls ext4_es_find_extent_range(), which will return immediately without initializing the 'es' variable. Because 'es' contains garbage, an integer overflow may happen causing an infinite loop in this function, easily reproducible using fstest generic/039. This commit fixes this issue by unconditionally initializing the structure in function ext4_es_find_extent_range(). Thanks to Zhang Yi, for figuring out the real problem! Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> Reviewed-by: Zhang Yi <yi.zhang@huawei.com> Link: https://patch.msgid.link/20240515082857.32730-1-luis.henriques@linux.dev Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: fix uninitialized variable in ext4_inlinedir_to_tree	Xiaxi Shen
	Syzbot has found an uninit-value bug in ext4_inlinedir_to_tree This error happens because ext4_inlinedir_to_tree does not handle the case when ext4fs_dirhash returns an error This can be avoided by checking the return value of ext4fs_dirhash and propagating the error, similar to how it's done with ext4_htree_store_dirent Signed-off-by: Xiaxi Shen <shenxiaxi26@gmail.com> Reported-and-tested-by: syzbot+eaba5abe296837a640c0@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=eaba5abe296837a640c0 Link: https://patch.msgid.link/20240501033017.220000-1-shenxiaxi26@gmail.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-27	ext4: block_validity: Remove unnecessary ‘NULL’ values from new_node	Li zeming
	new_node is assigned first, so it does not need to initialize the assignment. Signed-off-by: Li zeming <zeming@nfschina.com> Reviewed-by: Andreas Dilger <adilger@dilger.ca> Link: https://patch.msgid.link/20240402022300.25858-1-zeming@nfschina.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-06-07	ext4: Move CONFIG_UNICODE defguards into the code flow	Gabriel Krisman Bertazi
	Instead of a bunch of ifdefs, make the unicode built checks part of the code flow where possible, as requested by Torvalds. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> [eugen.hristev@collabora.com: port to 6.10-rc1] Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com> Link: https://lore.kernel.org/r/20240606073353.47130-7-eugen.hristev@collabora.com Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-07	ext4: Reuse generic_ci_match for ci comparisons	Gabriel Krisman Bertazi
	Instead of reimplementing ext4_match_ci, use the new libfs helper. It also adds a comment explaining why fname->cf_name.name must be checked prior to the encryption hash optimization, because that tripped me before. Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com> Link: https://lore.kernel.org/r/20240606073353.47130-5-eugen.hristev@collabora.com Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-07	ext4: Simplify the handling of cached casefolded names	Gabriel Krisman Bertazi
	Keeping it as qstr avoids the unnecessary conversion in ext4_match Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com> [eugen.hristev@collabora.com: port to 6.10-rc1] Signed-off-by: Eugen Hristev <eugen.hristev@collabora.com> Link: https://lore.kernel.org/r/20240606073353.47130-2-eugen.hristev@collabora.com Reviewed-by: Eric Biggers <ebiggers@google.com> Reviewed-by: Gabriel Krisman Bertazi <krisman@suse.de> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-05-21	Merge tag 'pull-bd_inode-1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull bdev bd_inode updates from Al Viro: "Replacement of bdev->bd_inode with sane(r) set of primitives by me and Yu Kuai" * tag 'pull-bd_inode-1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: RIP ->bd_inode dasd_format(): killing the last remaining user of ->bd_inode nilfs_attach_log_writer(): use ->bd_mapping->host instead of ->bd_inode block/bdev.c: use the knowledge of inode/bdev coallocation gfs2: more obvious initializations of mapping->host fs/buffer.c: massage the remaining users of ->bd_inode to ->bd_mapping blk_ioctl_{discard,zeroout}(): we only want ->bd_inode->i_mapping here... grow_dev_folio(): we only want ->bd_inode->i_mapping there use ->bd_mapping instead of ->bd_inode->i_mapping block_device: add a pointer to struct address_space (page cache of bdev) missing helpers: bdev_unhash(), bdev_drop() block: move two helpers into bdev.c block2mtd: prevent direct access of bd_inode dm-vdo: use bdev_nr_bytes(bdev) instead of i_size_read(bdev->bd_inode) blkdev_write_iter(): saner way to get inode and bdev bcachefs: remove dead function bdev_sectors() ext4: remove block_device_ejected() erofs_buf: store address_space instead of inode erofs: switch erofs_bread() to passing offset instead of block number
2024-05-21	Merge tag 'pull-set_blocksize' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull vfs blocksize updates from Al Viro: "This gets rid of bogus set_blocksize() uses, switches it over to be based on a 'struct file ' and verifies that the caller has the device opened exclusively" tag 'pull-set_blocksize' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: make set_blocksize() fail unless block device is opened exclusive set_blocksize(): switch to passing struct file * btrfs_get_bdev_and_sb(): call set_blocksize() only for exclusive opens swsusp: don't bother with setting block size zram: don't bother with reopening - just use O_EXCL for open swapon(2): open swap with O_EXCL swapon(2)/swapoff(2): don't bother with block size pktcdvd: sort set_blocksize() calls out bcache_register(): don't bother with set_blocksize()
2024-05-18	Merge tag 'ext4_for_linus-6.10-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: - more folio conversion patches - add support for FS_IOC_GETFSSYSFSPATH - mballoc cleaups and add more kunit tests - sysfs cleanups and bug fixes - miscellaneous bug fixes and cleanups * tag 'ext4_for_linus-6.10-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (40 commits) ext4: fix error pointer dereference in ext4_mb_load_buddy_gfp() jbd2: add prefix 'jbd2' for 'shrink_type' jbd2: use shrink_type type instead of bool type for __jbd2_journal_clean_checkpoint_list() ext4: fix uninitialized ratelimit_state->lock access in __ext4_fill_super() ext4: remove calls to to set/clear the folio error flag ext4: propagate errors from ext4_sb_bread() in ext4_xattr_block_cache_find() ext4: fix mb_cache_entry's e_refcnt leak in ext4_xattr_block_cache_find() jbd2: remove redundant assignement to variable err ext4: remove the redundant folio_wait_stable() ext4: fix potential unnitialized variable ext4: convert ac_buddy_page to ac_buddy_folio ext4: convert ac_bitmap_page to ac_bitmap_folio ext4: convert ext4_mb_init_cache() to take a folio ext4: convert bd_buddy_page to bd_buddy_folio ext4: convert bd_bitmap_page to bd_bitmap_folio ext4: open coding repeated check in next_linear_group ext4: use correct criteria name instead stale integer number in comment ext4: call ext4_mb_mark_free_simple to free continuous bits in found chunk ext4: add test_mb_mark_used_cost to estimate cost of mb_mark_used ext4: keep "prefetch_grp" and "nr" consistent ...
2024-05-17	ext4: fix error pointer dereference in ext4_mb_load_buddy_gfp()	Dan Carpenter
	This code calls folio_put() on an error pointer which will lead to a crash. Check for both error pointers and NULL pointers before calling folio_put(). Fixes: 5eea586b47f0 ("ext4: convert bd_buddy_page to bd_buddy_folio") Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Link: https://lore.kernel.org/r/eaafa1d9-a61c-4af4-9f97-d3ad72c60200@moroto.mountain Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2024-05-13	Merge tag 'vfs-6.10.misc' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull misc vfs updates from Christian Brauner: "This contains the usual miscellaneous features, cleanups, and fixes for vfs and individual fses. Features: - Free up FMODE_* bits. I've freed up bits 6, 7, 8, and 24. That means we now have six free FMODE_* bits in total (but bit #6 already got used for FMODE_WRITE_RESTRICTED) - Add FOP_HUGE_PAGES flag (follow-up to FMODE_* cleanup) - Add fd_raw cleanup class so we can make use of automatic cleanup provided by CLASS(fd_raw, f)(fd) for O_PATH fds as well - Optimize seq_puts() - Simplify __seq_puts() - Add new anon_inode_getfile_fmode() api to allow specifying f_mode instead of open-coding it in multiple places - Annotate struct file_handle with __counted_by() and use struct_size() - Warn in get_file() whether f_count resurrection from zero is attempted (epoll/drm discussion) - Folio-sophize aio - Export the subvolume id in statx() for both btrfs and bcachefs - Relax linkat(AT_EMPTY_PATH) requirements - Add F_DUPFD_QUERY fcntl() allowing to compare two file descriptors for dup() equality replacing kcmp() Cleanups: - Compile out swapfile inode checks when swap isn't enabled - Use (1 << n) notation for FMODE_ bitshifts for clarity - Remove redundant variable assignment in fs/direct-io - Cleanup uses of strncpy in orangefs - Speed up and cleanup writeback - Move fsparam_string_empty() helper into header since it's currently open-coded in multiple places - Add kernel-doc comments to proc_create_net_data_write() - Don't needlessly read dentry->d_flags twice Fixes: - Fix out-of-range warning in nilfs2 - Fix ecryptfs overflow due to wrong encryption packet size calculation - Fix overly long line in xfs file_operations (follow-up to FMODE_* cleanup) - Don't raise FOP_BUFFER_{R,W}ASYNC for directories in xfs (follow-up to FMODE_* cleanup) - Don't call xfs_file_open from xfs_dir_open (follow-up to FMODE_* cleanup) - Fix stable offset api to prevent endless loops - Fix afs file server rotations - Prevent xattr node from overflowing the eraseblock in jffs2 - Move fdinfo PTRACE_MODE_READ procfs check into the .permission() operation instead of .open() operation since this caused userspace regressions" * tag 'vfs-6.10.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (39 commits) afs: Fix fileserver rotation getting stuck selftests: add F_DUPDFD_QUERY selftests fcntl: add F_DUPFD_QUERY fcntl() file: add fd_raw cleanup class fs: WARN when f_count resurrection is attempted seq_file: Simplify __seq_puts() seq_file: Optimize seq_puts() proc: Move fdinfo PTRACE_MODE_READ check into the inode .permission operation fs: Create anon_inode_getfile_fmode() xfs: don't call xfs_file_open from xfs_dir_open xfs: drop fop_flags for directories xfs: fix overly long line in the file_operations shmem: Fix shmem_rename2() libfs: Add simple_offset_rename() API libfs: Fix simple_offset_rename_exchange() jffs2: prevent xattr node from overflowing the eraseblock vfs, swap: compile out IS_SWAPFILE() on swapless configs vfs: relax linkat() AT_EMPTY_PATH - aka flink() - requirements fs/direct-io: remove redundant assignment to variable retval fs/dcache: Re-use value stored to dentry->d_flags instead of re-reading ...