linux-arm.git - Russell King's ARM Linux kernel tree

Age	Commit message (Collapse)	Author
2025-10-20	Coccinelle-based conversion to use ->i_state accessors	Mateusz Guzik
	All places were patched by coccinelle with the default expecting that ->i_lock is held, afterwards entries got fixed up by hand to use unlocked variants as needed. The script: @@ expression inode, flags; @@ - inode->i_state & flags + inode_state_read(inode) & flags @@ expression inode, flags; @@ - inode->i_state &= ~flags + inode_state_clear(inode, flags) @@ expression inode, flag1, flag2; @@ - inode->i_state &= ~flag1 & ~flag2 + inode_state_clear(inode, flag1 \| flag2) @@ expression inode, flags; @@ - inode->i_state \|= flags + inode_state_set(inode, flags) @@ expression inode, flags; @@ - inode->i_state = flags + inode_state_assign(inode, flags) @@ expression inode, flags; @@ - flags = inode->i_state + flags = inode_state_read(inode) @@ expression inode, flags; @@ - READ_ONCE(inode->i_state) & flags + inode_state_read(inode) & flags Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-09-22	fs: udf: fix OOB read in lengthAllocDescs handling	Larshin Sergey
	When parsing Allocation Extent Descriptor, lengthAllocDescs comes from on-disk data and must be validated against the block size. Crafted or corrupted images may set lengthAllocDescs so that the total descriptor length (sizeof(allocExtDesc) + lengthAllocDescs) exceeds the buffer, leading udf_update_tag() to call crc_itu_t() on out-of-bounds memory and trigger a KASAN use-after-free read. BUG: KASAN: use-after-free in crc_itu_t+0x1d5/0x2b0 lib/crc-itu-t.c:60 Read of size 1 at addr ffff888041e7d000 by task syz-executor317/5309 CPU: 0 UID: 0 PID: 5309 Comm: syz-executor317 Not tainted 6.12.0-rc4-syzkaller-00261-g850925a8133c #0 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 crc_itu_t+0x1d5/0x2b0 lib/crc-itu-t.c:60 udf_update_tag+0x70/0x6a0 fs/udf/misc.c:261 udf_write_aext+0x4d8/0x7b0 fs/udf/inode.c:2179 extent_trunc+0x2f7/0x4a0 fs/udf/truncate.c:46 udf_truncate_tail_extent+0x527/0x7e0 fs/udf/truncate.c:106 udf_release_file+0xc1/0x120 fs/udf/file.c:185 __fput+0x23f/0x880 fs/file_table.c:431 task_work_run+0x24f/0x310 kernel/task_work.c:239 exit_task_work include/linux/task_work.h:43 [inline] do_exit+0xa2f/0x28e0 kernel/exit.c:939 do_group_exit+0x207/0x2c0 kernel/exit.c:1088 __do_sys_exit_group kernel/exit.c:1099 [inline] __se_sys_exit_group kernel/exit.c:1097 [inline] __x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1097 x64_sys_call+0x2634/0x2640 arch/x86/include/generated/asm/syscalls_64.h:232 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x77/0x7f </TASK> Validate the computed total length against epos->bh->b_size. Found by Linux Verification Center (linuxtesting.org) with Syzkaller. Reported-by: syzbot+8743fca924afed42f93e@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8743fca924afed42f93e Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Larshin Sergey <Sergey.Larshin@kaspersky.com> Link: https://patch.msgid.link/20250922131358.745579-1-Sergey.Larshin@kaspersky.com Signed-off-by: Jan Kara <jack@suse.cz>
2025-07-28	Merge tag 'fs_for_v6.17-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull udf and ext2 updates from Jan Kara: "A few udf and ext2 fixes and cleanups" * tag 'fs_for_v6.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Verify partition map count udf: stop using write_cache_pages ext2: Handle fiemap on empty files to prevent EINVAL
2025-07-16	fs: change write_begin/write_end interface to take struct kiocb *	Taotao Chen
	Change the address_space_operations callbacks write_begin() and write_end() to take struct kiocb * as the first argument instead of struct file *. Update all affected function prototypes, implementations, call sites, and related documentation across VFS, filesystems, and block layer. Part of a series refactoring address_space_operations write_begin and write_end callbacks to use struct kiocb for passing write context and flags. Signed-off-by: Taotao Chen <chentaotao@didiglobal.com> Link: https://lore.kernel.org/20250716093559.217344-4-chentaotao@didiglobal.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-07-11	udf: Verify partition map count	Jan Kara
	Verify that number of partition maps isn't insanely high which can lead to large allocation in udf_sb_alloc_partition_maps(). All partition maps have to fit in the LVD which is in a single block. Reported-by: syzbot+478f2c1a6f0f447a46bb@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz>
2025-07-11	udf: stop using write_cache_pages	Christoph Hellwig
	Stop using the obsolete write_cache_pages and use writeback_iter directly. Use the chance to refactor the inacb writeback code to not have a separate writeback helper. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20250711081036.564232-1-hch@lst.de
2025-05-07	udf: Make sure i_lenExtents is uptodate on inode eviction	Jan Kara
	UDF maintains total length of all extents in i_lenExtents. Generally we keep extent lengths (and thus i_lenExtents) block aligned because it makes the file appending logic simpler. However the standard mandates that the inode size must match the length of all extents and thus we trim the last extent when closing the file. To catch possible bugs we also verify that i_lenExtents matches i_size when evicting inode from memory. Commit b405c1e58b73 ("udf: refactor udf_next_aext() to handle error") however broke the code updating i_lenExtents and thus udf_evict_inode() ended up spewing lots of errors about incorrectly sized extents although the extents were actually sized properly. Fix the updating of i_lenExtents to silence the errors. Fixes: b405c1e58b73 ("udf: refactor udf_next_aext() to handle error") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2025-04-01	Merge tag 'mm-stable-2025-03-30-16-52' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - The series "Enable strict percpu address space checks" from Uros Bizjak uses x86 named address space qualifiers to provide compile-time checking of percpu area accesses. This has caused a small amount of fallout - two or three issues were reported. In all cases the calling code was found to be incorrect. - The series "Some cleanup for memcg" from Chen Ridong implements some relatively monir cleanups for the memcontrol code. - The series "mm: fixes for device-exclusive entries (hmm)" from David Hildenbrand fixes a boatload of issues which David found then using device-exclusive PTE entries when THP is enabled. More work is needed, but this makes thins better - our own HMM selftests now succeed. - The series "mm: zswap: remove z3fold and zbud" from Yosry Ahmed remove the z3fold and zbud implementations. They have been deprecated for half a year and nobody has complained. - The series "mm: further simplify VMA merge operation" from Lorenzo Stoakes implements numerous simplifications in this area. No runtime effects are anticipated. - The series "mm/madvise: remove redundant mmap_lock operations from process_madvise()" from SeongJae Park rationalizes the locking in the madvise() implementation. Performance gains of 20-25% were observed in one MADV_DONTNEED microbenchmark. - The series "Tiny cleanup and improvements about SWAP code" from Baoquan He contains a number of touchups to issues which Baoquan noticed when working on the swap code. - The series "mm: kmemleak: Usability improvements" from Catalin Marinas implements a couple of improvements to the kmemleak user-visible output. - The series "mm/damon/paddr: fix large folios access and schemes handling" from Usama Arif provides a couple of fixes for DAMON's handling of large folios. - The series "mm/damon/core: fix wrong and/or useless damos_walk() behaviors" from SeongJae Park fixes a few issues with the accuracy of kdamond's walking of DAMON regions. - The series "expose mapping wrprotect, fix fb_defio use" from Lorenzo Stoakes changes the interaction between framebuffer deferred-io and core MM. No functional changes are anticipated - this is preparatory work for the future removal of page structure fields. - The series "mm/damon: add support for hugepage_size DAMOS filter" from Usama Arif adds a DAMOS filter which permits the filtering by huge page sizes. - The series "mm: permit guard regions for file-backed/shmem mappings" from Lorenzo Stoakes extends the guard region feature from its present "anon mappings only" state. The feature now covers shmem and file-backed mappings. - The series "mm: batched unmap lazyfree large folios during reclamation" from Barry Song cleans up and speeds up the unmapping for pte-mapped large folios. - The series "reimplement per-vma lock as a refcount" from Suren Baghdasaryan puts the vm_lock back into the vma. Our reasons for pulling it out were largely bogus and that change made the code more messy. This patchset provides small (0-10%) improvements on one microbenchmark. - The series "Docs/mm/damon: misc DAMOS filters documentation fixes and improves" from SeongJae Park does some maintenance work on the DAMON docs. - The series "hugetlb/CMA improvements for large systems" from Frank van der Linden addresses a pile of issues which have been observed when using CMA on large machines. - The series "mm/damon: introduce DAMOS filter type for unmapped pages" from SeongJae Park enables users of DMAON/DAMOS to filter my the page's mapped/unmapped status. - The series "zsmalloc/zram: there be preemption" from Sergey Senozhatsky teaches zram to run its compression and decompression operations preemptibly. - The series "selftests/mm: Some cleanups from trying to run them" from Brendan Jackman fixes a pile of unrelated issues which Brendan encountered while runnimg our selftests. - The series "fs/proc/task_mmu: add guard region bit to pagemap" from Lorenzo Stoakes permits userspace to use /proc/pid/pagemap to determine whether a particular page is a guard page. - The series "mm, swap: remove swap slot cache" from Kairui Song removes the swap slot cache from the allocation path - it simply wasn't being effective. - The series "mm: cleanups for device-exclusive entries (hmm)" from David Hildenbrand implements a number of unrelated cleanups in this code. - The series "mm: Rework generic PTDUMP configs" from Anshuman Khandual implements a number of preparatoty cleanups to the GENERIC_PTDUMP Kconfig logic. - The series "mm/damon: auto-tune aggregation interval" from SeongJae Park implements a feedback-driven automatic tuning feature for DAMON's aggregation interval tuning. - The series "Fix lazy mmu mode" from Ryan Roberts fixes some issues in powerpc, sparc and x86 lazy MMU implementations. Ryan did this in preparation for implementing lazy mmu mode for arm64 to optimize vmalloc. - The series "mm/page_alloc: Some clarifications for migratetype fallback" from Brendan Jackman reworks some commentary to make the code easier to follow. - The series "page_counter cleanup and size reduction" from Shakeel Butt cleans up the page_counter code and fixes a size increase which we accidentally added late last year. - The series "Add a command line option that enables control of how many threads should be used to allocate huge pages" from Thomas Prescher does that. It allows the careful operator to significantly reduce boot time by tuning the parallalization of huge page initialization. - The series "Fix calculations in trace_balance_dirty_pages() for cgwb" from Tang Yizhou fixes the tracing output from the dirty page balancing code. - The series "mm/damon: make allow filters after reject filters useful and intuitive" from SeongJae Park improves the handling of allow and reject filters. Behaviour is made more consistent and the documention is updated accordingly. - The series "Switch zswap to object read/write APIs" from Yosry Ahmed updates zswap to the new object read/write APIs and thus permits the removal of some legacy code from zpool and zsmalloc. - The series "Some trivial cleanups for shmem" from Baolin Wang does as it claims. - The series "fs/dax: Fix ZONE_DEVICE page reference counts" from Alistair Popple regularizes the weird ZONE_DEVICE page refcount handling in DAX, permittig the removal of a number of special-case checks. - The series "refactor mremap and fix bug" from Lorenzo Stoakes is a preparatoty refactoring and cleanup of the mremap() code. - The series "mm: MM owner tracking for large folios (!hugetlb) + CONFIG_NO_PAGE_MAPCOUNT" from David Hildenbrand reworks the manner in which we determine whether a large folio is known to be mapped exclusively into a single MM. - The series "mm/damon: add sysfs dirs for managing DAMOS filters based on handling layers" from SeongJae Park adds a couple of new sysfs directories to ease the management of DAMON/DAMOS filters. - The series "arch, mm: reduce code duplication in mem_init()" from Mike Rapoport consolidates many per-arch implementations of mem_init() into code generic code, where that is practical. - The series "mm/damon/sysfs: commit parameters online via damon_call()" from SeongJae Park continues the cleaning up of sysfs access to DAMON internal data. - The series "mm: page_ext: Introduce new iteration API" from Luiz Capitulino reworks the page_ext initialization to fix a boot-time crash which was observed with an unusual combination of compile and cmdline options. - The series "Buddy allocator like (or non-uniform) folio split" from Zi Yan reworks the code to split a folio into smaller folios. The main benefit is lessened memory consumption: fewer post-split folios are generated. - The series "Minimize xa_node allocation during xarry split" from Zi Yan reduces the number of xarray xa_nodes which are generated during an xarray split. - The series "drivers/base/memory: Two cleanups" from Gavin Shan performs some maintenance work on the drivers/base/memory code. - The series "Add tracepoints for lowmem reserves, watermarks and totalreserve_pages" from Martin Liu adds some more tracepoints to the page allocator code. - The series "mm/madvise: cleanup requests validations and classifications" from SeongJae Park cleans up some warts which SeongJae observed during his earlier madvise work. - The series "mm/hwpoison: Fix regressions in memory failure handling" from Shuai Xue addresses two quite serious regressions which Shuai has observed in the memory-failure implementation. - The series "mm: reliable huge page allocator" from Johannes Weiner makes huge page allocations cheaper and more reliable by reducing fragmentation. - The series "Minor memcg cleanups & prep for memdescs" from Matthew Wilcox is preparatory work for the future implementation of memdescs. - The series "track memory used by balloon drivers" from Nico Pache introduces a way to track memory used by our various balloon drivers. - The series "mm/damon: introduce DAMOS filter type for active pages" from Nhat Pham permits users to filter for active/inactive pages, separately for file and anon pages. - The series "Adding Proactive Memory Reclaim Statistics" from Hao Jia separates the proactive reclaim statistics from the direct reclaim statistics. - The series "mm/vmscan: don't try to reclaim hwpoison folio" from Jinjiang Tu fixes our handling of hwpoisoned pages within the reclaim code. * tag 'mm-stable-2025-03-30-16-52' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (431 commits) mm/page_alloc: remove unnecessary __maybe_unused in order_to_pindex() x86/mm: restore early initialization of high_memory for 32-bits mm/vmscan: don't try to reclaim hwpoison folio mm/hwpoison: introduce folio_contain_hwpoisoned_page() helper cgroup: docs: add pswpin and pswpout items in cgroup v2 doc mm: vmscan: split proactive reclaim statistics from direct reclaim statistics selftests/mm: speed up split_huge_page_test selftests/mm: uffd-unit-tests support for hugepages > 2M docs/mm/damon/design: document active DAMOS filter type mm/damon: implement a new DAMOS filter type for active pages fs/dax: don't disassociate zero page entries MM documentation: add "Unaccepted" meminfo entry selftests/mm: add commentary about 9pfs bugs fork: use __vmalloc_node() for stack allocation docs/mm: Physical Memory: Populate the "Zones" section xen: balloon: update the NR_BALLOON_PAGES state hv_balloon: update the NR_BALLOON_PAGES state balloon_compaction: update the NR_BALLOON_PAGES state meminfo: add a per node counter for balloon drivers mm: remove references to folio in __memcg_kmem_uncharge_page() ...
2025-03-31	Merge tag 'fs_for_v6.15-rc1' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs Pull ext2, udf, and isofs updates from Jan Kara: - conversion of ext2 to the new mount API - small folio conversion work for ext2 - a fix of an unexpected return value in udf in inode_getblk() - a fix of handling of corrupted directory in isofs * tag 'fs_for_v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs: udf: Fix inode_getblk() return value ext2: Make ext2_params_spec static ext2: create ext2_msg_fc for use during parsing ext2: convert to the new mount API ext2: Remove reference to bh->b_page isofs: fix KMSAN uninit-value bug in do_isofs_readdir()
2025-03-16	fs: convert block_commit_write() to take a folio	Matthew Wilcox (Oracle)
	All callers now have a folio, so pass it in instead of converting folio->page->folio. Link: https://lkml.kernel.org/r/20250217192009.437916-1-willy@infradead.org Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-03-13	udf: Fix inode_getblk() return value	Jan Kara
	Smatch noticed that inode_getblk() can return 1 on successful mapping of a block instead of expected 0 after commit b405c1e58b73 ("udf: refactor udf_next_aext() to handle error"). This could confuse some of the callers and lead to strange failures (although the one reported by Smatch in udf_mkdir() is impossible to trigger in practice). Fix the return value of inode_getblk(). Link: https://lore.kernel.org/all/cb514af7-bbe0-435b-934f-dd1d7a16d2cd@stanley.mountain Reported-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com> Fixes: b405c1e58b73 ("udf: refactor udf_next_aext() to handle error") CC: stable@vger.kernel.org Signed-off-by: Jan Kara <jack@suse.cz>
2025-02-27	Change inode_operations.mkdir to return struct dentry *	NeilBrown
	Some filesystems, such as NFS, cifs, ceph, and fuse, do not have complete control of sequencing on the actual filesystem (e.g. on a different server) and may find that the inode created for a mkdir request already exists in the icache and dcache by the time the mkdir request returns. For example, if the filesystem is mounted twice the directory could be visible on the other mount before it is on the original mount, and a pair of name_to_handle_at(), open_by_handle_at() calls could instantiate the directory inode with an IS_ROOT() dentry before the first mkdir returns. This means that the dentry passed to ->mkdir() may not be the one that is associated with the inode after the ->mkdir() completes. Some callers need to interact with the inode after the ->mkdir completes and they currently need to perform a lookup in the (rare) case that the dentry is no longer hashed. This lookup-after-mkdir requires that the directory remains locked to avoid races. Planned future patches to lock the dentry rather than the directory will mean that this lookup cannot be performed atomically with the mkdir. To remove this barrier, this patch changes ->mkdir to return the resulting dentry if it is different from the one passed in. Possible returns are: NULL - the directory was created and no other dentry was used ERR_PTR() - an error occurred non-NULL - this other dentry was spliced in This patch only changes file-systems to return "ERR_PTR(err)" instead of "err" or equivalent transformations. Subsequent patches will make further changes to some file-systems to return a correct dentry. Not all filesystems reliably result in a positive hashed dentry: - NFS, cifs, hostfs will sometimes need to perform a lookup of the name to get inode information. Races could result in this returning something different. Note that this lookup is non-atomic which is what we are trying to avoid. Placing the lookup in filesystem code means it only happens when the filesystem has no other option. - kernfs and tracefs leave the dentry negative and the ->revalidate operation ensures that lookup will be called to correctly populate the dentry. This could be fixed but I don't think it is important to any of the users of vfs_mkdir() which look at the dentry. The recommendation to use d_drop();d_splice_alias() is ugly but fits with current practice. A planned future patch will change this. Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: NeilBrown <neilb@suse.de> Link: https://lore.kernel.org/r/20250227013949.536172-2-neilb@suse.de Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-11-26	udf: Verify inode link counts before performing rename	Jan Kara
	During rename, we are updating link counts of various inodes either when rename deletes target or when moving directory across directories. Verify involved link counts are sane so that we don't trip warnings in VFS. Reported-by: syzbot+3ff7365dc04a6bcafa66@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz>
2024-11-26	udf: Skip parent dir link count update if corrupted	Jan Kara
	If the parent directory link count is too low (likely directory inode corruption), just skip updating its link count as if it goes to 0 too early it can cause unexpected issues. Signed-off-by: Jan Kara <jack@suse.cz>
2024-10-02	udf: fix uninit-value use in udf_get_fileshortad	Gianfranco Trad
	Check for overflow when computing alen in udf_current_aext to mitigate later uninit-value use in udf_get_fileshortad KMSAN bug[1]. After applying the patch reproducer did not trigger any issue[2]. [1] https://syzkaller.appspot.com/bug?extid=8901c4560b7ab5c2f9df [2] https://syzkaller.appspot.com/x/log.txt?x=10242227980000 Reported-by: syzbot+8901c4560b7ab5c2f9df@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=8901c4560b7ab5c2f9df Tested-by: syzbot+8901c4560b7ab5c2f9df@syzkaller.appspotmail.com Suggested-by: Jan Kara <jack@suse.com> Signed-off-by: Gianfranco Trad <gianf.trad@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20240925074613.8475-3-gianf.trad@gmail.com
2024-10-02	udf: refactor inode_bmap() to handle error	Zhao Mengmeng
	Refactor inode_bmap() to handle error since udf_next_aext() can return error now. On situations like ftruncate, udf_extend_file() can now detect errors and bail out early without resorting to checking for particular offsets and assuming internal behavior of these functions. Reported-by: syzbot+7a4842f0b1801230a989@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=7a4842f0b1801230a989 Tested-by: syzbot+7a4842f0b1801230a989@syzkaller.appspotmail.com Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn> Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241001115425.266556-4-zhaomzhao@126.com
2024-10-02	udf: refactor udf_next_aext() to handle error	Zhao Mengmeng
	Since udf_current_aext() has error handling, udf_next_aext() should have error handling too. Besides, when too many indirect extents found in one inode, return -EFSCORRUPTED; when reading block failed, return -EIO. Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn> Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241001115425.266556-3-zhaomzhao@126.com
2024-10-02	udf: refactor udf_current_aext() to handle error	Zhao Mengmeng
	As Jan suggested in links below, refactor udf_current_aext() to differentiate between error, hit EOF and success, it now takes pointer to etype to store the extent type, return 1 when getting etype success, return 0 when hitting EOF and return -errno when err. Link: https://lore.kernel.org/all/20240912111235.6nr3wuqvktecy3vh@quack3/ Signed-off-by: Zhao Mengmeng <zhaomengmeng@kylinos.cn> Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://patch.msgid.link/20241001115425.266556-2-zhaomzhao@126.com
2024-09-16	Merge tag 'vfs-6.12.file' of ↵	Linus Torvalds
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull vfs file updates from Christian Brauner: "This is the work to cleanup and shrink struct file significantly. Right now, (focusing on x86) struct file is 232 bytes. After this series struct file will be 184 bytes aka 3 cacheline and a spare 8 bytes for future extensions at the end of the struct. With struct file being as ubiquitous as it is this should make a difference for file heavy workloads and allow further optimizations in the future. - struct fown_struct was embedded into struct file letting it take up 32 bytes in total when really it shouldn't even be embedded in struct file in the first place. Instead, actual users of struct fown_struct now allocate the struct on demand. This frees up 24 bytes. - Move struct file_ra_state into the union containg the cleanup hooks and move f_iocb_flags out of the union. This closes a 4 byte hole we created earlier and brings struct file to 192 bytes. Which means struct file is 3 cachelines and we managed to shrink it by 40 bytes. - Reorder struct file so that nothing crosses a cacheline. I suspect that in the future we will end up reordering some members to mitigate false sharing issues or just because someone does actually provide really good perf data. - Shrinking struct file to 192 bytes is only part of the work. Files use a slab that is SLAB_TYPESAFE_BY_RCU and when a kmem cache is created with SLAB_TYPESAFE_BY_RCU the free pointer must be located outside of the object because the cache doesn't know what part of the memory can safely be overwritten as it may be needed to prevent object recycling. That has the consequence that SLAB_TYPESAFE_BY_RCU may end up adding a new cacheline. So this also contains work to add a new kmem_cache_create_rcu() function that allows the caller to specify an offset where the freelist pointer is supposed to be placed. Thus avoiding the implicit addition of a fourth cacheline. - And finally this removes the f_version member in struct file. The f_version member isn't particularly well-defined. It is mainly used as a cookie to detect concurrent seeks when iterating directories. But it is also abused by some subsystems for completely unrelated things. It is mostly a directory and filesystem specific thing that doesn't really need to live in struct file and with its wonky semantics it really lacks a specific function. For pipes, f_version is (ab)used to defer poll notifications until a write has happened. And struct pipe_inode_info is used by multiple struct files in their ->private_data so there's no chance of pushing that down into file->private_data without introducing another pointer indirection. But pipes don't rely on f_pos_lock so this adds a union into struct file encompassing f_pos_lock and a pipe specific f_pipe member that pipes can use. This union of course can be extended to other file types and is similar to what we do in struct inode already" * tag 'vfs-6.12.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (26 commits) fs: remove f_version pipe: use f_pipe fs: add f_pipe ubifs: store cookie in private data ufs: store cookie in private data udf: store cookie in private data proc: store cookie in private data ocfs2: store cookie in private data input: remove f_version abuse ext4: store cookie in private data ext2: store cookie in private data affs: store cookie in private data fs: add generic_llseek_cookie() fs: use must_set_pos() fs: add must_set_pos() fs: add vfs_setpos_cookie() s390: remove unused f_version ceph: remove unused f_version adi: remove unused f_version mm: Removed @freeptr_offset to prevent doc warning ...
2024-09-12	udf: store cookie in private data	Christian Brauner
	Store the cookie to detect concurrent seeks on directories in file->private_data. Link: https://lore.kernel.org/r/20240830-vfs-file-f_version-v1-15-6d3e4816aa7b@kernel.org Reviewed-by: Jan Kara <jack@suse.cz> Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-07	buffer: Convert __block_write_begin() to take a folio	Matthew Wilcox (Oracle)
	Almost all callers have a folio now, so change __block_write_begin() to take a folio and remove a call to compound_head(). Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-07	fs: Convert aops->write_begin to take a folio	Matthew Wilcox (Oracle)
	Convert all callers from working on a page to working on one page of a folio (support for working on an entire folio can come later). Removes a lot of folio->page->folio conversions. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-08-07	fs: Convert aops->write_end to take a folio	Matthew Wilcox (Oracle)
	Most callers have a folio, and most implementations operate on a folio, so remove the conversion from folio->page->folio to fit through this interface. Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2024-06-26	udf: prevent integer overflow in udf_bitmap_free_blocks()	Roman Smirnov
	An overflow may occur if the function is called with the last block and an offset greater than zero. It is necessary to add a check to avoid this. Found by Linux Verification Center (linuxtesting.org) with Svace. [JK: Make test cover also unalloc table freeing] Link: https://patch.msgid.link/20240620072413.7448-1-r.smirnov@omp.ru Suggested-by: Jan Kara <jack@suse.com> Signed-off-by: Roman Smirnov <r.smirnov@omp.ru> Signed-off-by: Jan Kara <jack@suse.cz>
2024-06-26	udf: Avoid excessive partition lengths	Jan Kara
	Avoid mounting filesystems where the partition would overflow the 32-bits used for block number. Also refuse to mount filesystems where the partition length is so large we cannot safely index bits in a block bitmap. Link: https://patch.msgid.link/20240620130403.14731-1-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz>
2024-06-26	udf: Drop load_block_bitmap() wrapper	Jan Kara
	The wrapper is completely pointless as all the checks are already done in __load_block_bitmap(). Just drop it and rename __load_block_bitmap(). Link: https://patch.msgid.link/20240617154201.29512-3-jack@suse.cz Signed-off-by: Jan Kara <jack@suse.cz>
2024-06-26	udf: Avoid using corrupted block bitmap buffer	Jan Kara
	When the filesystem block bitmap is corrupted, we detect the corruption while loading the bitmap and fail the allocation with error. However the next allocation from the same bitmap will notice the bitmap buffer is already loaded and tries to allocate from the bitmap with mixed results (depending on the exact nature of the bitmap corruption). Fix the problem by using BH_verified bit to indicate whether the bitmap is valid or not. Reported-by: syzbot+5f682cd029581f9edfd1@syzkaller.appspotmail.com CC: stable@vger.kernel.org Link: https://patch.msgid.link/20240617154201.29512-2-jack@suse.cz Fixes: 1e0d4adf17e7 ("udf: Check consistency of Space Bitmap Descriptor") Signed-off-by: Jan Kara <jack@suse.cz>
2024-06-20	udf: Fix bogus checksum computation in udf_rename()	Jan Kara
	Syzbot reports uninitialized memory access in udf_rename() when updating checksum of '..' directory entry of a moved directory. This is indeed true as we pass on-stack diriter.fi to the udf_update_tag() and because that has only struct fileIdentDesc included in it and not the impUse or name fields, the checksumming function is going to checksum random stack contents beyond the end of the structure. This is actually harmless because the following udf_fiiter_write_fi() will recompute the checksum from on-disk buffers where everything is properly included. So all that is needed is just removing the bogus calculation. Fixes: e9109a92d2a9 ("udf: Convert udf_rename() to new directory iteration code") Link: https://lore.kernel.org/all/000000000000cf405f060d8f75a9@google.com/T/ Link: https://patch.msgid.link/20240617154201.29512-1-jack@suse.cz Reported-by: syzbot+d31185aa54170f7fc1f5@syzkaller.appspotmail.com Signed-off-by: Jan Kara <jack@suse.cz>
2024-06-05	udf: Fix lock ordering in udf_evict_inode()	Jan Kara
	udf_evict_inode() calls udf_setsize() to truncate deleted inode. However inode deletion through udf_evict_inode() can happen from inode reclaim context and udf_setsize() grabs mapping->invalidate_lock which isn't generally safe to acquire from fs reclaim context since we allocate pages under mapping->invalidate_lock for example in a page fault path. This is however not a real deadlock possibility as by the time udf_evict_inode() is called, nobody can be accessing the inode, even less work with its page cache. So this is just a lockdep triggering false positive. Fix the problem by moving mapping->invalidate_lock locking outsize of udf_setsize() into udf_setattr() as grabbing mapping->invalidate_lock from udf_evict_inode() is pointless. Reported-by: syzbot+0333a6f4b88bcd68a62f@syzkaller.appspotmail.com Fixes: b9a861fd527a ("udf: Protect truncate and file type conversion with invalidate_lock") Signed-off-by: Jan Kara <jack@suse.cz>
2024-06-05	udf: Drop pointless IS_IMMUTABLE and IS_APPEND check	Jan Kara
	udf_setsize() checks for IS_IMMUTABLE and IS_APPEND flags. This is however pointless as UDF does not have capability to store these flags and never allows to set them. Furthermore this is the only place in UDF code that was actually checking these flags. Remove the pointless check. Signed-off-by: Jan Kara <jack@suse.cz>
2024-04-23	udf: Use a folio in udf_write_end()	Matthew Wilcox (Oracle)
	Convert the page to a folio and use the folio APIs. Replaces three calls to compound_head() with one. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240417150416.752929-8-willy@infradead.org>
2024-04-23	udf: Convert udf_page_mkwrite() to use a folio	Matthew Wilcox (Oracle)
	Convert the vm_fault page to a folio, then use it throughout. Replaces five calls to compound_head() with one. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240417150416.752929-7-willy@infradead.org>
2024-04-23	udf: Convert udf_symlink_getattr() to use a folio	Matthew Wilcox (Oracle)
	We're getting this from the page cache, so it's definitely a folio. Saves a call to compound_head() hidden in put_page(). Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240417150416.752929-6-willy@infradead.org>
2024-04-23	udf: Convert udf_adinicb_readpage() to udf_adinicb_read_folio()	Matthew Wilcox (Oracle)
	Now that all three callers have a folio, convert this function to take a folio, and use the folio APIs. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240417150416.752929-5-willy@infradead.org>
2024-04-23	udf: Convert udf_expand_file_adinicb() to use a folio	Matthew Wilcox (Oracle)
	Use the folio APIs throughout this function. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Fixes: 1eeceaec794e ("udf: Convert udf_expand_file_adinicb() to avoid kmap_atomic()") Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240417150416.752929-4-willy@infradead.org>
2024-04-23	udf: Convert udf_write_begin() to use a folio	Matthew Wilcox (Oracle)
	Use the folio APIs throughout instead of the deprecated page APIs. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240417150416.752929-3-willy@infradead.org>
2024-04-23	udf: Convert udf_symlink_filler() to use a folio	Matthew Wilcox (Oracle)
	Remove the conversion to struct page and use folio APIs throughout. Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240417150416.752929-2-willy@infradead.org>
2024-04-10	udf: udftime: prevent overflow in udf_disk_stamp_to_time()	Roman Smirnov
	An overflow can occur in a situation where src.centiseconds takes the value of 255. This situation is unlikely, but there is no validation check anywere in the code. Found by Linux Verification Center (linuxtesting.org) with Svace. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Roman Smirnov <r.smirnov@omp.ru> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240327132755.13945-1-r.smirnov@omp.ru>
2024-04-02	udf: replace deprecated strncpy/strcpy with strscpy	Justin Stitt
	strncpy() is deprecated for use on NUL-terminated destination strings [1] and as such we should prefer more robust and less ambiguous string interfaces. Also replace an instance of strcpy() which is also deprecated. s_volume_ident is a NUL-terminated string which is evident from its usage in udf_debug: \| udf_debug("volIdent[] = '%s'\n", UDF_SB(sb)->s_volume_ident); s_volume_ident should also be NUL-padded as it is copied out to userspace: \| if (copy_to_user((char __user *)arg, \| UDF_SB(inode->i_sb)->s_volume_ident, 32)) \| return -EFAULT; Considering the above, a suitable replacement is `strscpy_pad` [2] due to the fact that it guarantees both NUL-termination and NUL-padding on the destination buffer. To simplify the code, let's use the new 2-argument version of strscpy_pad() introduced in Commit e6584c3964f2f ("string: Allow 2-argument strscpy()"). Also zero-allocate @outstr so we can safely use a non-@ret length argument. This is just in case udf_dstrCS0toChar() doesn't include the NUL-byte in its return length, we won't truncate @outstr or write garbage bytes either. Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strncpy-on-nul-terminated-strings [1] Link: https://manpages.debian.org/testing/linux-manual-4.8/strscpy.9.en.html [2] Link: https://github.com/KSPP/linux/issues/90 Cc: linux-hardening@vger.kernel.org Signed-off-by: Justin Stitt <justinstitt@google.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240401-strncpy-fs-udf-super-c-v1-1-80cddab7a281@google.com>
2024-03-25	udf: Remove second semicolon	Colin Ian King
	There is a statement with two semicolons. Remove the second one, it is redundant. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20240315091949.2430585-1-colin.i.king@gmail.com>
2024-03-05	udf: remove SLAB_MEM_SPREAD flag usage	Chengming Zhou
	The SLAB_MEM_SPREAD flag is already a no-op after removal of SLAB allocator and in [1] it was fully deprecated. Remove its usage so we can delete it from slab. No functional change. Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com> Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/all/20240223-slab-cleanup-flags-v2-1-02f1753e8303@suse.cz/ Message-Id: <20240224135229.830356-1-chengming.zhou@linux.dev>
2024-02-21	udf: convert to new mount API	Eric Sandeen
	Convert the UDF filesystem to the new mount API. UDF is slightly unique in that it always preserves prior mount options across a remount, so that's handled by udf_init_options(). Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <fcb3faf3-e2df-450c-b37a-11000c274585@redhat.com>
2024-02-21	udf: convert novrs to an option flag	Eric Sandeen
	There's no reason to treat novers specially, convert it to a flag in uopt->flags like other flag options. Signed-off-by: Eric Sandeen <sandeen@redhat.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <b5d53492-b99a-4b3c-82c0-581df9a9e384@redhat.com>
2024-02-05	udf: Avoid invalid LVID used on mount	Jan Kara
	udf_load_logicalvolint() loads logical volume integrity descriptors. Since there can be multiple blocks with LVIDs, we verify the contents of only the last (prevailing) LVID found. However if we fail to load the last LVID (either due to IO error or because it's checksum fails to match), we never perform the verification of validity of the LVID we are going to use. If such LVID contains invalid data, we can hit out-of-bounds access or similar issues. Fix the problem by verifying each LVID we are potentially going to accept. Reported-by: Robert Morris <rtm@csail.mit.edu> Signed-off-by: Jan Kara <jack@suse.cz>
2024-01-23	udf: Remove GFP_NOFS allocation in udf_expand_file_adinicb()	Jan Kara
	udf_expand_file_adinicb() is called under inode->i_rwsem and mapping->invalidate_lock. i_rwsem is safe wrt fs reclaim, invalidate_lock on this inode is safe as well (we hold inode reference so reclaim will not touch it, furthermore even lockdep should not complain as invalidate_lock is acquired from udf_evict_inode() only when truncating inode which should not happen from fs reclaim). Signed-off-by: Jan Kara <jack@suse.cz>
2024-01-23	udf: Avoid GFP_NOFS allocation in udf_load_pvoldesc()	Jan Kara
	udf_load_pvoldesc() is called only during mount when it is safe to enter fs reclaim (we hold only s_umount semaphore). Change GFP_NOFS to GFP_KERNEL allocation. Signed-off-by: Jan Kara <jack@suse.cz>
2024-01-23	udf: Avoid GFP_NOFS allocation in udf_symlink()	Jan Kara
	The GFP_NOFS allocation in udf_symlink() is called only under inode->i_rwsem and UDF_I(inode)->i_data_sem. The first is safe wrt reclaim, the second should be as well but allocating unde this lock is actually unnecessary. Move the allocation from under i_data_sem and change it to GFP_KERNEL. Signed-off-by: Jan Kara <jack@suse.cz>
2024-01-23	udf: Remove GFP_NOFS from dir iteration code	Jan Kara
	Directory iteration code was using GFP_NOFS allocations in two places. However the code is called only under inode->i_rwsem which is generally safe wrt reclaim. So we can do the allocations with GFP_KERNEL instead. Signed-off-by: Jan Kara <jack@suse.cz>
2024-01-11	Merge tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs	Linus Torvalds
	Pull misc filesystem updates from Al Viro: "Misc cleanups (the part that hadn't been picked by individual fs trees)" * tag 'pull-misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: apparmorfs: don't duplicate kfree_link() orangefs: saner arguments passing in readdir guts ocfs2_find_match(): there's no such thing as NULL or negative ->d_parent reiserfs_add_entry(): get rid of pointless namelen checks __ocfs2_add_entry(), ocfs2_prepare_dir_for_insert(): namelen checks ext4_add_entry(): ->d_name.len is never 0 befs: d_obtain_alias(ERR_PTR(...)) will do the right thing affs: d_obtain_alias(ERR_PTR(...)) will do the right thing /proc/sys: use d_splice_alias() calling conventions to simplify failure exits hostfs: use d_splice_alias() calling conventions to simplify failure exits udf_fiiter_add_entry(): check for zero ->d_name.len is bogus... udf: d_obtain_alias(ERR_PTR(...)) will do the right thing... udf: d_splice_alias() will do the right thing on ERR_PTR() inode nfsd: kill stale comment about simple_fill_super() requirements bfs_add_entry(): get rid of pointless ->d_name.len checks nilfs2: d_obtain_alias(ERR_PTR(...)) will do the right thing... zonefs: d_splice_alias() will do the right thing on ERR_PTR() inode
2023-12-21	udf_fiiter_add_entry(): check for zero ->d_name.len is bogus...	Al Viro
	Acked-by: Jan Kara <jack@suse.cz> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>