path: root/fs
Age | Commit message | Author
2023-04-14 | ext4: Commit transaction before writing back pages in data=journal mode | Jan Kara
When journalling data we currently just walk over pages, journal those that are marked for delayed dirtying (only pinned pages dirtied behind our back these days) and checkpoint other dirty pages. Because some pages may be part of the running transaction, the result is that after filemap_write_and_wait() we are not guaranteed pages are stable on disk. Thus places that want to flush current pagecache content need to jump through hoops to make sure journalled data is not lost. This is manageable in cases completely controlled by ext4 (such as extent shifting operations or inode eviction) but it gets ugly for stuff like fsverity. Furthermore it is rather error prone as people often do not realize journalled data needs special handling. So change ext4_writepages() to commit the transaction with the inode's data before going through the writeback loop in WB_SYNC_ALL mode. As a result filemap_write_and_wait() is now really getting pages to stable storage and makes pagecache pages safe to reclaim. Consequently we can remove the special handling of journalled data from several places in follow up patches. Note that this will make fsync(2) for journalled data more expensive as we will end up not only committing the transaction we need but also checkpointing the data (which we may have previously skipped if the data was part of the running transaction). If we really cared, we would need to introduce a special VFS function for writing out & invalidating page cache for a range, use the ->launder_page callback to perform checkpointing, and use it from all the places that need this functionality. But at this point I'm not convinced the complexity is worth it. Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230329154950.19720-5-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-14 | ext4: Clear dirty bit from pages without data to write | Jan Kara
With journalled data it can happen that checkpointing code will write out page contents without clearing the page dirty bit. The logic in ext4_page_nomap_can_writeout() then results in us never calling mpage_submit_page() and thus never clearing the dirty bit. Drop the optimization with ext4_page_nomap_can_writeout() and always call mpage_submit_page(). ext4_bio_write_page() knows when to redirty the page, and the additional clearing & setting of the page dirty bit for ordered mode writeout is not expensive enough to be worth jumping through hoops to avoid. Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230329154950.19720-4-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-14 | ext4: Keep pages with journalled data dirty | Jan Kara
Currently we clear page dirty bit when we checkpoint some buffers from a page with journalled data or when we perform delayed dirtying of a page in ext4_writepages(). In a quest to simplify handling of journalled data we want to keep page dirty as long as it has either buffers to checkpoint or journalled dirty data. So make sure to keep page dirty in ext4_writepages() if it still has journalled data attached to it. Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230329154950.19720-3-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-14 | ext4: Mark pages with journalled data dirty | Jan Kara
Currently pages with journalled data written by write(2) or modified by block zeroing during truncate(2) are not marked as dirty. They are dirtied only once the transaction commits. This however makes writeback code think the inode has no pages to write and so ext4_writepages() is not called to make pages with journalled data persistent. Mark pages with journalled data dirty (similarly as it happens for writes through mmap) so that writeback code knows about them and ext4_writepages() can do what it needs to do to the inode. Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230329154950.19720-2-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-14 | jdb2: Don't refuse invalidation of already invalidated buffers | Jan Kara
When invalidating buffers under the partial tail page, jbd2_journal_invalidate_folio() returns -EBUSY if the buffer is part of the committing transaction as we cannot safely modify buffer state. However if the buffer is already invalidated (due to previous invalidation attempts from ext4_wait_for_tail_page_commit()), there's nothing to do and there's no point in returning -EBUSY. This fixes occasional warnings from ext4_journalled_invalidate_folio() triggered by generic/051 fstest when blocksize < pagesize. Fixes: 53e872681fed ("ext4: fix deadlock in journal_unmap_buffer()") Signed-off-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230329154950.19720-1-jack@suse.cz Signed-off-by: Theodore Ts'o <tytso@mit.edu>
2023-04-14 | quota: mark PRINT_QUOTA_WARNING as BROKEN | Yangtao Li
PRINT_QUOTA_WARNING has been deprecated since commit 8e8934695dfd ("quota: send messages via netlink") merged in 2007. Users should rather use notification over a netlink socket if they are interested in explicit notification in addition to the plain EDQUOT error. Since printing to console from deep inside filesystem code is problematic, mark the feature as BROKEN now and see who complains. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230413163833.43913-1-frank.li@vivo.com>
2023-04-13 | f2fs: support iopoll method | Wu Bo
Wire up the iopoll method to the common implementation, as f2fs uses the common dio infrastructure since commit a1e09b03e6f5 ("f2fs: use iomap for direct I/O"). Signed-off-by: Wu Bo <bo.wu@vivo.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-04-13 | f2fs: fix to check return value of inc_valid_block_count() | Chao Yu
In __replace_atomic_write_block(), we failed to check the return value of inc_valid_block_count(). In an extreme test case where the f2fs image runs out of space, this may cause an inconsistency between the SIT table and the total valid block count. Cc: Daeho Jeong <daehojeong@google.com> Fixes: 3db1de0e582c ("f2fs: change the current atomic write way") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-04-13 | f2fs: fix to check return value of f2fs_do_truncate_blocks() | Chao Yu
Otherwise, if truncation on cow_inode fails, the remaining data may pollute the current transaction of the atomic write. Cc: Daeho Jeong <daehojeong@google.com> Fixes: a46bebd502fe ("f2fs: synchronize atomic write aborts") Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-04-13 | f2fs: fix passing relative address when discard zones | Daeho Jeong
We should not pass relative address in a zone to __f2fs_issue_discard_zone(). Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-04-13 | Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net | Jakub Kicinski
Conflicts: tools/testing/selftests/net/config 62199e3f1658 ("selftests: net: Add VXLAN MDB test") 3a0385be133e ("selftests: add the missing CONFIG_IP_SCTP in net config") Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-04-14 | Merge tag 'fix-asciici-bugs-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: fix ascii-ci problems, then kill it [v2] Last week, I was fiddling around with the metadump name obfuscation code while writing a debugger command to generate directories full of names that all have the same hash name. I had a few questions about how well all that worked with ascii-ci mode, and discovered a nasty discrepancy between the kernel and glibc's implementations of the tolower() function. I discovered that I could create a directory that is large enough to require separate leaf index blocks. The hashes stored in the dabtree use the ascii-ci specific hash function, which uses a library function to convert the name to lowercase before hashing. If the kernel and C library's versions of tolower do not behave exactly identically, xfs_ascii_ci_hashname will not produce the same results for the same inputs. xfs_repair will deem the leaf information corrupt and rebuild the directory. After that, lookups in the kernel will fail because the hash index doesn't work. The kernel's tolower function will convert extended ascii uppercase letters (e.g. A-with-umlaut) to extended ascii lowercase letters (e.g. a-with-umlaut), whereas glibc's will only do that if you force LANG to ascii. Tiny embedded libc implementations just plain won't do it at all, and the result is a mess. Stabilize the behavior of the hash function by encoding the name transformation function in libxfs, add it to the selftest, and fix all the userspace tools, none of which handle this transformation correctly. The v1 series generated a /lot/ of discussion, in which several things became very clear: (1) Linus is not enamored of case folding of any kind; (2) Dave and Christoph don't seem to agree on whether the feature is supposed to work for 7-bit ascii or latin1; (3) it trashes UTF8 encoded names if those happen to show up; and (4) I don't want to maintain this mess any longer than I have to. Kill it in 2030. 
v2: rename the functions to make it clear we're moving away from the letters t, o, l, o, w, e, and r; and deprecate the whole feature once we've fixed the bugs and added tests. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
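The idea of encoding the name transformation in the filesystem code, rather than calling the C library's tolower(), can be sketched in user-space C as follows. This is a minimal illustration only: ascii_ci_fold() and ascii_ci_hash() are hypothetical names, the fold is assumed to cover only 7-bit ASCII (the message above notes that the intended range is itself disputed), and the rotate-and-xor hash is a stand-in, not the real dabtree hash.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: fold only 7-bit ASCII 'A'..'Z' to lowercase.
 * Every other byte (latin1 or UTF-8 bytes >= 0x80 included) is returned
 * unchanged, so the result does not depend on the locale or on which
 * C library is in use.
 */
static unsigned char ascii_ci_fold(unsigned char c)
{
	if (c >= 'A' && c <= 'Z')
		return (unsigned char)(c - 'A' + 'a');
	return c;
}

/* Illustrative rotate-left-by-7 and xor hash over the folded bytes. */
static uint32_t ascii_ci_hash(const unsigned char *name, size_t len)
{
	uint32_t hash = 0;
	size_t i;

	assert(name != NULL || len == 0);
	for (i = 0; i < len; i++)
		hash = ascii_ci_fold(name[i]) ^ ((hash << 7) | (hash >> 25));
	return hash;
}
```

Because the fold is a fixed table-free function compiled into both sides, a kernel and a userspace tool built from the same source would compute identical hashes for identical names, which is the property the merge message says was missing.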
2023-04-14 | Merge tag 'scrub-strengthen-rmap-checking-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: strengthen rmapbt scrubbing [v24.5] This series strengthens space allocation record cross referencing by using AG block bitmaps to compute the difference between space used according to the rmap records and the primary metadata, and reports cross-referencing errors for any discrepancies. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'repair-bitmap-rework-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: rework online fsck incore bitmap [v24.5] In this series, we make some changes to the incore bitmap code: First, we shorten the prefix to 'xbitmap'. Then, we rework some utility functions for later use by online repair and clarify how the walk functions are supposed to be used. Finally, we use all these new pieces to convert the incore bitmap to use an interval tree instead of linked lists. This lifts the limitation that callers had to be careful not to set a range that was already set; and gets us ready for the btree rebuilder functions needing to be able to set bits in a bitmap and generate maximal contiguous extents for the set ranges. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-fix-xattr-memory-mgmt-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: clean up memory management in xattr scrub [v24.5] Currently, the extended attribute scrubber uses a single VLA to store all the context information needed in various parts of the scrubber code. This includes xattr leaf block space usage bitmaps, and the value buffer used to check the correctness of remote xattr value block headers. We try to minimize the insanity through the use of helper functions, but this is a memory management nightmare. Clean this up by making the bitmap and value pointers explicit members of struct xchk_xattr_buf. Second, strengthen the xattr checking by teaching it to look for overlapping data structures in the shortform attr data. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-detect-mergeable-records-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: detect mergeable and overlapping btree records [v24.5] While I was doing differential fuzz analysis between xfs_scrub and xfs_repair, I noticed that xfs_repair was only partially effective at detecting btree records that can be merged, and xfs_scrub totally didn't notice at all. For every interval btree type except for the bmbt, there should never exist two adjacent records with adjacent keyspaces because the blockcount field is always large enough to span the entire keyspace of the domain. This is because the free space, rmap, and refcount btrees have a blockcount field large enough to store the maximum AG length, and there can never be an allocation larger than an AG. The bmbt is a different story due to its ondisk encoding where the blockcount is only 21 bits wide. Because AGs can span up to 2^31 blocks and the RT volume can span up to 2^52 blocks, a preallocation of 2^22 blocks will be expressed as two records of 2^21 length. We don't opportunistically combine records when doing bmbt operations, which is why the fsck tools have never complained about this scenario. Offline repair is partially effective at detecting mergeable records because I taught it to do that for the rmap and refcount btrees. This series enhances the free space, rmap, and refcount scrubbers to detect mergeable records. For the bmbt, it will flag the file as being eligible for an optimization to shrink the size of the data structure. The last patch in this set also enhances the rmap scrubber to detect records that overlap incorrectly. This check is done automatically for non-overlapping btree types, but we have to do it separately for the rmapbt because there are constraints on which allocation types are allowed to overlap. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
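The mergeable-record check described above can be sketched in user-space C. This is a simplified illustration under stated assumptions: struct extent_rec, recs_mergeable() and the max_len parameter are hypothetical and do not reflect the ondisk record formats or the scrub code. Two neighbouring records are treated as mergeable when the first ends exactly where the second starts and the combined length still fits in the length field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical in-memory record: a run of blocks [start, start + len). */
struct extent_rec {
	uint64_t start;
	uint64_t len;
};

/*
 * Return true if record b directly follows record a and the two could be
 * stored as one record whose length does not exceed max_len (the largest
 * value the length field can hold, an assumption of this sketch).
 */
static bool recs_mergeable(const struct extent_rec *a,
			   const struct extent_rec *b, uint64_t max_len)
{
	assert(a->len > 0 && b->len > 0);

	if (a->start + a->len != b->start)
		return false;	/* gap or overlap: not adjacent */

	/* Written to avoid overflow when adding the two lengths. */
	return a->len <= max_len && b->len <= max_len - a->len;
}
```

With a large max_len, any pair of adjacent records is reported, which matches the expectation stated above for the free space, rmap, and refcount btrees. A small max_len models the bmbt case, where adjacent records can be legitimate because a single record cannot describe the whole range.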
2023-04-14 | Merge tag 'scrub-merge-bmap-records-6.4_2023-04-12' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: merge bmap records for faster scrubs [v24.5] I started looking into performance problems with the data fork scrubber in generic/333, and noticed a few things that needed improving. First, due to design reasons, it's possible for file forks btrees to contain multiple contiguous mappings to the same physical space. Instead of checking each ondisk mapping individually, it's much faster to combine them when possible and check the combined mapping because that's fewer trips through the rmap btree, and we can drop this check-around behavior that it does when an rmapbt lookup produces a record that starts before or ends after a particular bmbt mapping. Second, I noticed that the bmbt scrubber decides to walk every reverse mapping in the filesystem if the file fork is in btree format. This is very costly, and only necessary if the inode repair code had to zap a fork to convince iget to work. Constraining the full-rmap scan to this one case means we can skip it for normal files, which drives the runtime of this test from 8 hours down to 45 minutes (observed with realtime reflink and rebuild-all mode.) Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-iget-fixes-6.4_2023-04-12' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: fix iget/irele usage in online fsck [v24.5] This patchset fixes a handful of problems relating to how we get and release incore inodes in the online scrub code. The first patch fixes how we handle DONTCACHE -- our reasons for setting (or clearing it) depend entirely on the runtime environment at irele time. Hence we can refactor iget and irele to use our own wrappers that set that context appropriately. The second patch fixes a race between the iget call in the inode core scrubber and other writer threads that are allocating or freeing inodes in the same AG by changing the behavior of xchk_iget (and the inode core scrub setup function) to return either an incore inode or the AGI buffer so that we can be sure that the inode cannot disappear on us. The final patch elides MMAPLOCK from scrub paths when possible. It did not fit anywhere else. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-parent-fixes-6.4_2023-04-12' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: fix bugs in parent pointer checking [v24.5] Jan Kara pointed out that the VFS doesn't take i_rwsem of a child subdirectory that is being moved from one parent to another. Upon deeper analysis, I realized that this was the source of a very hard to trigger false corruption report in the parent pointer checking code. Now that we've refactored how directory walks work in scrub, we can also get rid of all the unnecessary and broken locking to make parent pointer scrubbing work properly. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-dir-iget-fixes-6.4_2023-04-12' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: fix iget usage in directory scrub [v24.5] In this series, we fix some problems with how the directory scrubber grabs child inodes. First, we want to reduce EDEADLOCK returns by replacing fixed-iteration loops with interruptible trylock loops. Second, we add UNTRUSTED to the child iget call so that we can detect a dirent that points to an unallocated inode. Third, we fix a bug where we weren't checking the inode pointed to by dotdot entries at all. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-detect-rmapbt-gaps-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: detect incorrect gaps in rmap btree [v24.5] Following in the theme of the last two patchsets, this one strengthens the rmap btree record checking so that scrub can count the number of space records that map to a given owner and that do not map to a given owner. This enables us to determine exclusive ownership of space that can't be shared. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-detect-inobt-gaps-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: detect incorrect gaps in inode btree [v24.5] This series continues the corrections for a couple of problems I found in the inode btree scrubber. The first problem is that we don't directly check the inobt records have a direct correspondence with the finobt records, and vice versa. The second problem occurs on filesystems with sparse inode chunks -- the cross-referencing we do detects sparseness, but it doesn't actually check the consistency between the inobt hole records and the rmap data. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-detect-refcount-gaps-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: detect incorrect gaps in refcount btree [v24.5] The next few patchsets address a deficiency in scrub that I found while QAing the refcount btree scrubber. If there's a gap between refcount records, we need to cross-reference that gap with the reverse mappings to ensure that there are no overlapping records in the rmap btree. If we find any, then the refcount btree is not consistent. This is not a property that is specific to the refcount btree; they all need to have this sort of keyspace scanning logic to detect inconsistencies. To do this accurately, we need to be able to scan the keyspace of a btree (which we already do) to be able to tell the caller if the keyspace is empty, sparse, or fully covered by records. The first few patches add the keyspace scanner to the generic btree code, along with the ability to mask off parts of btree keys because when we scan the rmapbt, we only care about space usage, not the owners. The final patch closes the scanning gap in the refcountbt scanner. v23.1: create helpers for the key extraction and comparison functions, improve documentation, and eliminate the ->mask_key indirect calls Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-btree-key-enhancements-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: enhance btree key scrubbing [v24.5] This series fixes the scrub btree block checker to ensure that the keys in the parent block accurately represent the block, and check the ordering of all interior key records. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'rmap-btree-fix-key-handling-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: fix rmap btree key flag handling [v24.5] This series fixes numerous flag handling bugs in the rmapbt key code. The most serious transgression is that key comparisons completely strip out all flag bits from rm_offset, including the ones that participate in record lookups. The second problem is that for years we've been letting the unwritten flag (which is an attribute of a specific record and not part of the record key) escape from leaf records into key records. The solution to the second problem is to filter attribute flags when creating keys from records, and the solution to the first problem is to preserve *only* the flags used for key lookups. The ATTR and BMBT flags are a part of the lookup key, and the UNWRITTEN flag is a record attribute. This has worked for years without generating user complaints because ATTR and BMBT extents cannot be shared, so key comparisons succeed solely on rm_startblock. Only file data fork extents can be shared, and those records never set any of the three flag bits, so comparisons that dig into rm_owner and rm_offset work just fine. A filesystem written with an unpatched kernel and mounted on a patched kernel will work correctly because the ATTR/BMBT flags have been conveyed into keys correctly all along, and we still ignore the UNWRITTEN flag in any key record. This was what doomed my previous attempt to correct this problem in 2019. A filesystem written with a patched kernel and mounted on an unpatched kernel will also work correctly because unpatched kernels ignore all flags. With this patchset applied, the scrub code gains the ability to detect rmap btrees with incorrectly set attr and bmbt flags in the key records. After three years of testing, I haven't encountered any problems. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
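The distinction drawn above between key flags and record attributes can be sketched in user-space C. Everything here is illustrative: the DEMO_* bit positions and both function names are assumptions of this sketch, not the kernel's definitions. The sketch strips the attribute-only flag when deriving a key from a record and keeps the lookup flags in key comparisons.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical flag bits packed into the top of a 64-bit offset field. */
#define DEMO_ATTR_FORK	(1ULL << 63)	/* part of the lookup key */
#define DEMO_BMBT_BLOCK	(1ULL << 62)	/* part of the lookup key */
#define DEMO_UNWRITTEN	(1ULL << 61)	/* record attribute only */

/* Build the key form of an offset: drop the attribute-only flag. */
static uint64_t demo_key_offset_from_rec(uint64_t rec_offset)
{
	return rec_offset & ~DEMO_UNWRITTEN;
}

/*
 * Compare two offsets as keys. The lookup flags take part in the
 * comparison; the unwritten flag is ignored on both sides.
 * Returns -1, 0 or 1.
 */
static int demo_cmp_key_offsets(uint64_t a, uint64_t b)
{
	a = demo_key_offset_from_rec(a);
	b = demo_key_offset_from_rec(b);
	return (a > b) - (a < b);
}

/* Usage example: the same offset with and without each kind of flag. */
static void demo_key_flags_example(void)
{
	assert(demo_cmp_key_offsets(DEMO_UNWRITTEN | 42, 42) == 0);
	assert(demo_cmp_key_offsets(DEMO_ATTR_FORK | 42, 42) != 0);
}
```

Ignoring the attribute flag on both sides of the comparison is what lets keys written with and without that flag compare equal, which mirrors the compatibility argument made in the message.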
2023-04-14 | Merge tag 'btree-hoist-scrub-checks-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: hoist scrub record checks into libxfs [v24.5] There are a few things about btree records that scrub checked but the libxfs _get_rec functions didn't. Move these bits into libxfs so that everyone can benefit. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'btree-complain-bad-records-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: standardize btree record checking code [v24.5] While I was cleaning things up for 6.1, I noticed that the btree _query_range and _query_all functions don't perform the same checking that the _get_rec functions perform. In fact, they don't perform /any/ sanity checking, which means that callers aren't warned about impossible records. Therefore, hoist the record validation and complaint logging code into separate functions, and call them from any place where we convert an ondisk record into an incore record. For online scrub, we can replace checking code with a call to the record checking functions in libxfs, thereby reducing the size of the codebase. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-drain-intents-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: drain deferred work items when scrubbing [v24.5] The design doc for XFS online fsck contains a long discussion of the eventual consistency models in use for XFS metadata. In that chapter, we note that it is possible for scrub to collide with a chain of deferred space metadata updates, and propose a lightweight solution: the use of a pending-intents counter so that scrub can wait for the system to drain all chains. This patchset implements that scrub drain. The first patch implements the basic mechanism, and the subsequent patches reduce the runtime overhead by converting the implementation to use sloppy counters and introducing jump labels to avoid walking into scrub hooks when it isn't running. This last paradigm repeats elsewhere in this megaseries. v23.1: make intent items take an active ref to the perag structure and document why we bump and drop the intent counts when we do Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'scrub-fix-legalese-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs_scrub: fix licensing and copyright notices [v24.5] Fix various attribution problems in the xfs_scrub source code, such as the author's contact information, out of date SPDX tags, and a rough estimate of when the feature was under heavy development. The most egregious parts are the files that are missing license information completely. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'pass-perag-refs-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: pass perag references around when possible [v24.5] Avoid the cost of perag radix tree lookups by passing around active perag references when possible. v24.2: rework some of the naming and whatnot so there's less opencoding Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-14 | Merge tag 'intents-perag-refs-6.4_2023-04-11' of ↵ | Dave Chinner
git://git.kernel.org/pub/scm/linux/kernel/git/djwong/xfs-linux into guilt/xfs-for-next xfs: make intent items take a perag reference [v24.5] Now that we've cleaned up some code warts in the deferred work item processing code, let's make intent items take an active perag reference from their creation until they are finally freed by the defer ops machinery. This change facilitates the scrub drain in the next patchset and will make it easier for the future AG removal code to detect a busy AG in need of quiescing. Signed-off-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Dave Chinner <david@fromorbit.com>
2023-04-13 | unicode: remove MODULE_LICENSE in non-modules | Nick Alcock
Since commit 8b41fc4454e ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are used to identify modules. As a consequence, uses of the macro in non-modules will cause modprobe to misidentify their containing object file as a module when it is not (false positives), and modprobe might succeed rather than failing with a suitable error message. So remove it in the files in this commit, none of which can be built as modules. Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Suggested-by: Luis Chamberlain <mcgrof@kernel.org> Acked-by: Gabriel Krisman Bertazi <krisman@suse.de> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: linux-modules@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com> Cc: Gabriel Krisman Bertazi <krisman@collabora.com> Cc: linux-fsdevel@vger.kernel.org Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13 | NFSv4.2: remove MODULE_LICENSE in non-modules | Nick Alcock
Since commit 8b41fc4454e ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are used to identify modules. As a consequence, uses of the macro in non-modules will cause modprobe to misidentify their containing object file as a module when it is not (false positives), and modprobe might succeed rather than failing with a suitable error message. So remove it in the files in this commit, none of which can be built as modules. Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Suggested-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: linux-modules@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com> Cc: Trond Myklebust <trond.myklebust@hammerspace.com> Cc: Anna Schumaker <anna@kernel.org> Cc: Chuck Lever <chuck.lever@oracle.com> Cc: Jeff Layton <jlayton@kernel.org> Cc: linux-nfs@vger.kernel.org Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13 | binfmt_elf: remove MODULE_LICENSE in non-modules | Nick Alcock
Since commit 8b41fc4454e ("kbuild: create modules.builtin without Makefile.modbuiltin or tristate.conf"), MODULE_LICENSE declarations are used to identify modules. As a consequence, uses of the macro in non-modules will cause modprobe to misidentify their containing object file as a module when it is not (false positives), and modprobe might succeed rather than failing with a suitable error message. So remove it in the files in this commit, none of which can be built as modules. Signed-off-by: Nick Alcock <nick.alcock@oracle.com> Suggested-by: Luis Chamberlain <mcgrof@kernel.org> Cc: Luis Chamberlain <mcgrof@kernel.org> Cc: linux-modules@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: Hitomi Hasegawa <hasegawa-hitomi@fujitsu.com> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Christian Brauner <brauner@kernel.org> Cc: linux-fsdevel@vger.kernel.org Cc: linux-mm@kvack.org Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13 | fs: fix sysctls.c built | Kefeng Wang
'obj-$(CONFIG_SYSCTL) += sysctls.o' must be moved after "obj-y :=", or it won't be built because the later assignment overwrites it. Note that there is nothing that is going to break by linking sysctl.o later; we were just being way too cautious, and patches have been updated to reflect these considerations and sent for stable as well, with the whole "base" stuff needing to be linked prior to child sysctl tables that use that directory. All of the kernel sysctl APIs always share the same directory, and races against using it should end up re-using the same single created directory. And so something we can do eventually is do away with all the base stuff. For now it's fine, it's not creating an issue. It is just a bit pedantic and careful. Fixes: ab171b952c6e ("fs: move namespace sysctls and declare fs base directory") Cc: stable@vger.kernel.org # v5.17 Cc: Christian Brauner <brauner@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com> [mcgrof: enhanced commit log for stable criteria and clarify base stuff ] Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13 | ksmbd: avoid out of bounds access in decode_preauth_ctxt() | David Disseldorp
Confirm that the accessed pneg_ctxt->HashAlgorithms address sits within the SMB request boundary; deassemble_neg_contexts() only checks that the eight byte smb2_neg_context header + (client controlled) DataLength are within the packet boundary, which is insufficient. Checking for sizeof(struct smb2_preauth_neg_context) is overkill given that the type currently assumes SMB311_SALT_SIZE bytes of trailing Salt. Signed-off-by: David Disseldorp <ddiss@suse.de> Acked-by: Namjae Jeon <linkinjeon@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Steve French <stfrench@microsoft.com>
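The gap described above can be modelled in user space. The following is a minimal sketch, not the ksmbd code: the struct layouts, field names, and both helper functions are hypothetical simplifications made up for illustration. It contrasts a check that only covers the eight-byte header plus the client-supplied DataLength with one that also requires the first HashAlgorithms entry to lie inside the remaining bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the 8-byte negotiate context header. */
struct neg_ctx_hdr {
	uint16_t context_type;
	uint16_t data_length;	/* client controlled */
	uint32_t reserved;
};

/* Hypothetical, simplified stand-in for a preauth context (salt omitted). */
struct preauth_ctx {
	struct neg_ctx_hdr hdr;
	uint16_t hash_algorithm_count;
	uint16_t salt_length;
	uint16_t hash_algorithms;	/* first (and here only) entry */
};

static_assert(sizeof(struct neg_ctx_hdr) == 8, "header modelled as 8 bytes");

/* Header-only check: header + DataLength must fit in the remaining bytes. */
static bool header_and_data_fit(size_t remaining, uint16_t data_length)
{
	return remaining >= sizeof(struct neg_ctx_hdr) + (size_t)data_length;
}

/*
 * Stricter check: the bytes up to and including the first hash_algorithms
 * entry must fit, regardless of what DataLength claims.
 */
static bool hash_algorithms_in_bounds(size_t remaining)
{
	const size_t needed = offsetof(struct preauth_ctx, hash_algorithms) +
			      sizeof(uint16_t);

	return remaining >= needed;
}
```

With only 8 bytes remaining and a DataLength of 0, `header_and_data_fit()` accepts the context even though reading `hash_algorithms` would run past the end; `hash_algorithms_in_bounds()` rejects it. The stricter check deliberately stops at the accessed field rather than requiring the full struct size, mirroring the commit's remark that checking the whole struct would be overkill.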
2023-04-13ntfs: simplify one-level sysctl registration for ntfs_sysctlsLuis Chamberlain
There is no need to declare an extra table just to create a directory; this can easily be done with a prefix path with register_sysctl(). Simplify this registration. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13coda: simplify one-level sysctl registration for coda_tableLuis Chamberlain
There is no need to declare an extra table just to create a directory; this can easily be done with a prefix path with register_sysctl(). Simplify this registration. Acked-by: Jan Harkes <jaharkes@cs.cmu.edu> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13fs/cachefiles: simplify one-level sysctl registration for cachefiles_sysctlsLuis Chamberlain
There is no need to declare an extra table just to create a directory; this can easily be done with a prefix path with register_sysctl(). Simplify this registration. Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13xfs: simplify two-level sysctl registration for xfs_tableLuis Chamberlain
There is no need to declare two tables just to create directories; this can easily be done with a prefix path with register_sysctl(). Simplify this registration. Reviewed-by: Dave Chinner <dchinner@redhat.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13nfs: simplify two-level sysctl registration for nfs_cb_sysctlsLuis Chamberlain
There is no need to declare two tables just to create directories; this can easily be done with a prefix path with register_sysctl(). Simplify this registration. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13nfs: simplify two-level sysctl registration for nfs4_cb_sysctlsLuis Chamberlain
There is no need to declare two tables just to create directories; this can easily be done with a prefix path with register_sysctl(). Simplify this registration. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13lockd: simplify two-level sysctl registration for nlm_sysctlsLuis Chamberlain
There is no need to declare two tables just to create directories; this can easily be done with a prefix path with register_sysctl(). Simplify this registration. Reviewed-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13proc_sysctl: enhance documentationLuis Chamberlain
Expand documentation to clarify: o that paths don't need to exist for the new API callers o clarify that we *require* callers to keep the memory of the table around during the lifetime of the sysctls o annotate routines we are trying to deprecate and later remove Cc: stable@vger.kernel.org # v5.17 Cc: Christian Brauner <brauner@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13sysctl: clarify register_sysctl_init() base directory orderLuis Chamberlain
The relatively new docs I added, which hinted that the base directories needed to be created first, are wrong; remove that incorrect comment. This has been hinted at before by Eric twice already [0] [1], I had just not verified it until now. Now that I've verified it, update the docs to relax the context described. [0] https://lkml.kernel.org/r/875ys0azt8.fsf@email.froward.int.ebiederm.org [1] https://lkml.kernel.org/r/87ftbiud6s.fsf@x220.int.ebiederm.org Cc: stable@vger.kernel.org # v5.17 Cc: Christian Brauner <brauner@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Suggested-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13proc_sysctl: move helper which creates required subdirectoriesLuis Chamberlain
Move the code which creates the subdirectories for a ctl table into a helper routine to make it easier to review. Document the goal. This creates no functional changes. Reviewed-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13proc_sysctl: update docs for __register_sysctl_table()Luis Chamberlain
Update the docs for __register_sysctl_table() to make it clear no child entries can be passed. When the child is true these are non-leaf entries on the ctl table and sysctl treats these as directories. The point of __register_sysctl_table() is to deal only with directories not part of the ctl table where they may reside, to be simple and avoid recursion. While at it, hint towards using long on extra1 and extra2 later. Cc: stable@vger.kernel.org # v5.17 Cc: Christian Brauner <brauner@kernel.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
2023-04-13quota: update Kconfig commentYangtao Li
f2fs has supported quota since commit 0abd675e97e6 ("f2fs: support plain user/group quota"); let's document it. Signed-off-by: Yangtao Li <frank.li@vivo.com> Signed-off-by: Jan Kara <jack@suse.cz> Message-Id: <20230413151412.30059-1-frank.li@vivo.com>
2023-04-12f2fs: fix potential corruption when moving a directoryJaegeuk Kim
F2FS has the same issue in ext4_rename causing crash revealed by xfstests/generic/707. See also commit 0813299c586b ("ext4: Fix possible corruption when moving a directory") CC: stable@vger.kernel.org Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
2023-04-12f2fs: add radix_tree_preload_end in error caseYohan Joung
To prevent an excessive increase in the preemption count, add radix_tree_preload_end in retry. Signed-off-by: Yohan Joung <yohan.joung@sk.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>