summaryrefslogtreecommitdiff
path: root/fs/bcachefs
AgeCommit message (Collapse)Author
2025-05-30bcachefs: Fix infinite loop in journal_entry_btree_keys_to_text()Kent Overstreet
Fix an infinite loop when bkey_i->k.u64s is 0. This only happens in userspace, where 'bcachefs list_journal' can print the entire contents of the journal, and non-dirty entries aren't validated. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-30bcachefs: Journal read error message improvementsKent Overstreet
- Don't print a checksum error when we first read a journal entry: we print a checksum error later if we'll be using the journal entry. - Continuing with the theme of of improving error messages and grouping errors into a single log message per error, print a single 'checksum error' message per journal entry, and use bch2_journal_ptr_to_text() to print out where on the device it was. - Factor out checksum error messages and checking for missing journal entries into helpers, bch2_journal_read() has gotten obnoxiously big. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27Merge tag 'timers-cleanups-2025-05-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer cleanups from Thomas Gleixner: "Another set of timer API cleanups: - Convert init_timer*(), try_to_del_timer_sync() and destroy_timer_on_stack() over to the canonical timer_*() namespace convention. There is another large conversion pending, which has not been included because it would have caused a gazillion of merge conflicts in next. The conversion scripts will be run towards the end of the merge window and a pull request sent once all conflict dependencies have been merged" * tag 'timers-cleanups-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: treewide, timers: Rename destroy_timer_on_stack() as timer_destroy_on_stack() treewide, timers: Rename try_to_del_timer_sync() as timer_delete_sync_try() timers: Rename init_timers() as timers_init() timers: Rename NEXT_TIMER_MAX_DELTA as TIMER_NEXT_MAX_DELTA timers: Rename __init_timer_on_stack() as __timer_init_on_stack() timers: Rename __init_timer() as __timer_init() timers: Rename init_timer_on_stack_key() as timer_init_key_on_stack() timers: Rename init_timer_key() as timer_init_key()
2025-05-27bcachefs: Don't rewind to run a recovery pass we already ranKent Overstreet
Fix a small regression from the "run recovery passes" rewrite, which enabled async recovery passes. This fixes getting stuck in a loop in recovery. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27bcachefs: Move unicode message to after the startup messageKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27bcachefs: Fix missing commit in check_direntsKent Overstreet
Other repair code seems to be doing commits themselves, but check_key_has_snapshot() does not. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27bcachefs: Fix lost rebalance wakeupsKent Overstreet
Fix a missing wakeup in 'bcachefs set-file-option' -> xattr option update -> inode_write this was missing because the wakeup needs to happen after transaction commit. Also, add a 'kick' counter, to make sure we don't miss a wakeup that occured right after we finished checking the rebalance_work btree. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27bcachefs: bch2_kthread_io_clock_wait_once()Kent Overstreet
Add a version of bch2_kthread_io_clock_wait() that only schedules once - behaving more like schedule_timeout(). This will be used for fixing rebalance wakeups. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-27bcachefs: Ensure we print output of run_recovery_pass if it errorsKent Overstreet
Also, don't error out in bucket_ref_update_err(): we don't want to return -BCH_ERR_cannot_rewind_recovery if it's not an insert, if it's an overwrite we continue. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-26Merge tag 'v6.16-p1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Fix memcpy_sglist to handle partially overlapping SG lists - Use memcpy_sglist to replace null skcipher - Rename CRYPTO_TESTS to CRYPTO_BENCHMARK - Flip CRYPTO_MANAGER_DISABLE_TEST into CRYPTO_SELFTESTS - Hide CRYPTO_MANAGER - Add delayed freeing of driver crypto_alg structures Compression: - Allocate large buffers on first use instead of initialisation in scomp - Drop destination linearisation buffer in scomp - Move scomp stream allocation into acomp - Add acomp scatter-gather walker - Remove request chaining - Add optional async request allocation Hashing: - Remove request chaining - Add optional async request allocation - Move partial block handling into API - Add ahash support to hmac - Fix shash documentation to disallow usage in hard IRQs Algorithms: - Remove unnecessary SIMD fallback code on x86 and arm/arm64 - Drop avx10_256 xts(aes)/ctr(aes) on x86 - Improve avx-512 optimisations for xts(aes) - Move chacha arch implementations into lib/crypto - Move poly1305 into lib/crypto and drop unused Crypto API algorithm - Disable powerpc/poly1305 as it has no SIMD fallback - Move sha256 arch implementations into lib/crypto - Convert deflate to acomp - Set block size correctly in cbcmac Drivers: - Do not use sg_dma_len before mapping in sun8i-ss - Fix warm-reboot failure by making shutdown do more work in qat - Add locking in zynqmp-sha - Remove cavium/zip - Add support for PCI device 0x17D8 to ccp - Add qat_6xxx support in qat - Add support for RK3576 in rockchip-rng - Add support for i.MX8QM in caam Others: - Fix irq_fpu_usable/kernel_fpu_begin inconsistency during CPU bring-up - Add new SEV/SNP platform shutdown API in ccp" * tag 'v6.16-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (382 commits) x86/fpu: Fix irq_fpu_usable() to return false during CPU onlining crypto: qat - add missing header inclusion crypto: api - Redo lookup on EEXIST Revert "crypto: testmgr - Add hash export format testing" crypto: marvell/cesa - Do not chain submitted requests crypto: powerpc/poly1305 - add depends on BROKEN for now Revert "crypto: powerpc/poly1305 - Add SIMD fallback" crypto: ccp - Add missing tee info reg for teev2 crypto: ccp - Add missing bootloader info reg for pspv5 crypto: sun8i-ce - move fallback ahash_request to the end of the struct crypto: octeontx2 - Use dynamic allocated memory region for lmtst crypto: octeontx2 - Initialize cptlfs device info once crypto: xts - Only add ecb if it is not already there crypto: lrw - Only add ecb if it is not already there crypto: testmgr - Add hash export format testing crypto: testmgr - Use ahash for generic tfm crypto: hmac - Add ahash support crypto: testmgr - Ignore EEXIST on shash allocation crypto: algapi - Add driver template support to crypto_inst_setname crypto: shash - Set reqsize in shash_alg ...
2025-05-25bcachefs: Fix missing BTREE_UPDATE_internal_snapshot_nodeKent Overstreet
Repair code will do updates on older snapshot versions, so needs the correct annotation. Reported-by: syzbot+42581416dba62b364750@syzkaller.appspotmail.com Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-25bcachefs: fix REFLINK_P_MAY_UPDATE_OPTIONSKent Overstreet
If we're doing a reflink copy of existing reflinked data, we may only set REFLINK_P_MAY_UPDATE_OPTIONS if it was set on the reflink pointer we're copying from. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Don't mount bs > ps without TRANSPARENT_HUGEPAGEKent Overstreet
Large folios aren't supported without TRANSPARENT_HUGEPAGE Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Fix btree_iter_next_node() for new locking assertsKent Overstreet
We can't unlock a should_be_locked path unless we're in a transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Ensure we don't use a blacklisted journal seqKent Overstreet
Different versions differ on the size of the blacklist range; it is theoretically possible that we could end up with blacklisted journal sequence numbers newer than the newest seq we find in the journal, and pick a new start seq that's blacklisted. Explicitly check for this in bch2_fs_journal_start(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Small check_fix_ptr fixesKent Overstreet
We don't want to change the bucket gen, on gen mismatch: it's possible to have multiple btree nodes with different gens in the same bucket that we want to keep, if we have to recover from btree node scan. It's also not necessary to set g->gen_valid; add a comment to that effect. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Fix opts.recovery_pass_lastKent Overstreet
This was lost in the giant recovery pass rework - but it's used heavily by bcachefs subcommand utilities. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Fix allocate -> self healing pathKent Overstreet
When we go to allocate and find taht a bucket in the freespace btree is actually allocated, we're supposed to return nonzero to tell the allocator to skip it. This fixes an emergency read only due to a bucket/ptr gen mismatch - we also don't return the correct bucket gen when this happens. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Fix endianness in casefold check/repairKent Overstreet
Fixes: 010c89468134 ("bcachefs: Check for casefolded dirents in non casefolded dirs") Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Path must be locked if trans->locked && should_be_lockedKent Overstreet
If path->should_be_locked is true, that means user code (of the btree API) has seen, in this transaction, something guarded by the node this path has locked, and we have to keep it locked until the end of the transaction. Assert that we're not violating this; should_be_locked should also be cleared only in _very_ special situations. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Simplify bch2_path_put()Kent Overstreet
Simplify the "do we need to keep this locked?" checks. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Plumb btree_trans for more locking assertsKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Clear trans->locked before unlockKent Overstreet
We're adding new should_be_locked assertions: it's going to be illegal to unlock a should_be_locked path when trans->locked is true. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Clear should_be_locked before unlock in key_cache_drop()Kent Overstreet
We're adding new should_be_locked assertions, also add a comment explaining why clearing should_be_locked is safe here. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: bch2_path_get() reuses paths if upgrade_fails & !should_be_lockedKent Overstreet
Small additional optimization over the previous patch, bringing us closer to the original behaviour, except when we need to clone to avoid a transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Give out new path if upgrade failsKent Overstreet
Avoid transaction restarts due to failure to upgrade - we can traverse a new iterator without a transaction restart. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Fix btree_path_get_locks when not doing trans restartKent Overstreet
btree_path_get_locks, on failure, shouldn't unlock if we're not issuing a transaction restart: we might drop locks we're not supposed to (if path->should_be_locked is set). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: btree_node_locked_type_nowrite()Kent Overstreet
Small helper to improve locking assertions. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-23bcachefs: Kill bch2_path_put_nokeep()Kent Overstreet
bch2_path_put_nokeep() was intended for paths we wouldn't need to preserve for a transaction restart - it always frees them right away when the ref hits 0. But since paths are shared, freeing unconditionally is a bug, the path might have been used elsewhere and have should_be_locked set, i.e. we need to keep it locked until the end of the transaction. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22bcachefs: bch2_journal_write_checksum()Kent Overstreet
We need to delay checksumming the journal write; we don't know the blocksize until after we allocate the write. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22bcachefs: Reduce stack usage in data_update_index_update()Kent Overstreet
Separate tracepoint message generation and other slowpath code into non-inline functions, and use bch2_trans_log_str() instead of using a printbuf for our journal message. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22bcachefs: bch2_trans_log_str()Kent Overstreet
The data update path doesn't need a printbuf for its log message - this will help reduce stack usage. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-22bcachefs: Kill bkey_buf usage in data_update_index_update()Kent Overstreet
Reduce stack usage - bkey_buf has a 96 byte buffer on the stack, but the btree_trans bump allocator works just fine here. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Drop empty accounting updatesKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Improve trace_trans_restart_upgradeKent Overstreet
- Convert to a 'fs_str' tracepoint that just emits as a string: this lets us build up the tracepoint with a printbuf, using our pretty printers, and they're much easier to manage - Include locks_held, before and after - Include the btree node pointer we failed on (error pointer, null, or real node) Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: fix bch2_inum_snapshot_to_path()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: fix duplicate printkKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: BCH_INODE_has_case_insensitiveKent Overstreet
Add a flag for tracking whether a directory has case-insensitive descendents - so that overlayfs can disallow mounting, even though the filesystem supports case insensitivity. This is a new on disk format version, with a (cheap) upgrade to ensure the flag is correctly set on existing inodes. Create, rename and fssetxattr are all plumbed to ensure the new flag is set, and we've got new fsck code that hooks into check_inode(0. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: bch2_inode_find_by_inum_snapshot()Kent Overstreet
Move a fsck.c helper into inode.c, eliminate some duplicate and organize the inode lookup helpers. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: bch2_inum_snapshot_to_path()Kent Overstreet
Add a better helper for printing out paths of inodes when we don't know the subvolume, for fsck. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: bch2_rename_trans() only runs rename-to-dir code if neededKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: subvol_inum_eq()Kent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Don't set bi_casefold on non directoriesKent Overstreet
bi_casefold only makes sense for directories, and since it's one of the variable length fields setting it unnecessarily wastes space. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Remove duplicate call to bch2_trans_begin()Alan Huang
There is one in for_each_btree_key_max(). Signed-off-by: Alan Huang <mmpgouride@gmail.com> Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Call bch2_bkey_set_needs_rebalance() earlier in write pathKent Overstreet
There's no reason to be running this inside our transaction; it forces us to copy the key we're updating to a temporary, which we'd like to skip. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Simplify bch2_extent_atomic_end()Kent Overstreet
It used to be that we had a fixed maximum number of btree paths to work with - 64. That's no longer the case, so bch2_extent_atomic_end() doesn't have to be as strict. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Coalesce accounting in trans commitKent Overstreet
Accounting has gotten quite heavy, and there's lots of redundancy in accounting updates within a transaction, as we often add/delete multiple extents that touch the same accountign counters. This will reduce the amount of data that we journal, and reduce pressure downstream on the btree write buffer. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Split out accounting in transaction commitKent Overstreet
There can be a lot of rendundancy in accounting updates within a single btree transaction. Split out accounting updates so that they can be deduped, in the next commit. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: btree_trans_subbufKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
2025-05-21bcachefs: Make accounting mismatch errors more readableKent Overstreet
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>