Age | Commit message | Author |
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Now handled in one place.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
check_inode_hash_info_matches_root()
Clang 18 and newer warns (or errors with CONFIG_WERROR=y):
fs/bcachefs/str_hash.c:164:2: error: label followed by a declaration is a C23 extension [-Werror,-Wc23-extensions]
164 | struct bch_inode_unpacked inode;
| ^
In Clang 17 and prior, this is an unconditional hard error:
fs/bcachefs/str_hash.c:164:2: error: expected expression
164 | struct bch_inode_unpacked inode;
| ^
fs/bcachefs/str_hash.c:165:30: error: use of undeclared identifier 'inode'
165 | ret = bch2_inode_unpack(k, &inode);
| ^
fs/bcachefs/str_hash.c:169:55: error: use of undeclared identifier 'inode'
169 | struct bch_hash_info hash2 = bch2_hash_info_init(c, &inode);
| ^
fs/bcachefs/str_hash.c:171:40: error: use of undeclared identifier 'inode'
171 | ret = repair_inode_hash_info(trans, &inode);
| ^
Add an empty statement between the label and the declaration to fix the
warning/error without disturbing the code too much.
Fixes: 2519d3b0d656 ("bcachefs: bch2_str_hash_check_key() now checks inode hash info")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202412092339.QB7hffGC-lkp@intel.com/
Signed-off-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
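The fix described in this message can be sketched in isolation. This is a minimal illustration, not the actual str_hash.c code: check_value() and its variables are hypothetical names. It shows a label followed by an empty statement, so that the declaration no longer sits directly after the label and the code is accepted by compilers that implement pre-C23 rules.

```c
/*
 * Hypothetical stand-in for a function whose error path starts with a
 * declaration; not the code from fs/bcachefs/str_hash.c.
 */
static int check_value(int v)
{
	int ret = 0;

	if (v < 0)
		goto repair;
	return ret;

repair:
	;	/* empty statement: the label is now followed by a statement */
	int fixed = -v;	/* declaration no longer directly follows the label */

	ret = fixed;
	return ret;
}
```

Without the lone `;`, older compilers reject the declaration after `repair:` because, before C23, a label had to be followed by a statement.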
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_snapshot_equiv() is going away; convert users that just wanted to
know if the snapshot exists to something better
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Versions of the same inode in different snapshots must have the same
hash info; this is critical for lookups to work correctly.
We're going to be running the str_hash checks online, at readdir or
xattr list time, so we now need str_hash_check_key() to check for inode
hash seed mismatches, since it won't be run right after check_inodes().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Bkey validation checks that inodes are well-formed and unpack
successfully, so an unpack error should always indicate memory
corruption or some other kind of hardware bug - but these are still
errors we can recover from.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Since we added per-inode counters there are now far too many counters to
show in one shot - if we want this in the future, it'll have to be in
debugfs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We can't hold mark_lock while calling fsck_err() - that's a deadlock,
mark_lock is meant to be a leaf node lock.
It's also unnecessary for gc_bucket() and bucket_gen(); rcu suffices
since the bucket_gens array describes its size, and we can't race with
device removal or resize during gc/fsck since that takes state lock.
Reported-by: syzbot+38641fcbda1aaffefdd4@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This fixes a deadlock during journal replay when btree node read errors
kick off a ton of rewrites: we don't want them competing with journal
replay.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
nonempty transition
For each bucket we track when the bucket became nonempty and when it
became empty again: if we can ensure that there will be no journal
flushes in the range [nonempty, empty) (possibly because they occurred at
the same journal sequence number), then it's safe to reuse the bucket
without waiting for a journal commit.
This is a major performance optimization for erasure coding, where
writes are initially replicated, but the extra replicas are quickly
dropped: if those buckets are reused and overwritten without issuing a
cache flush to the underlying device, then they only cost bus bandwidth.
But there's a tricky corner case when there's multiple empty -> nonempty
-> empty transitions in quick succession, i.e. when data is getting
overwritten immediately as it's being written.
If this happens and the previous empty transition hasn't been flushed,
we need to continue tracking the previous nonempty transition - not
start a new one.
Fixing this means we now need to track both the nonempty and empty
transitions in bch_alloc_v4.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
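The reuse rule described in this message can be sketched roughly as below. This is only an illustration under stated assumptions: struct bucket_seqs, flush_in_range(), bucket_reusable_without_commit() and mark_nonempty() are hypothetical names, and the "not yet flushed" test (empty > last_flushed) is an assumed simplification, not the bch_alloc_v4 logic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-bucket tracking of both transitions. */
struct bucket_seqs {
	uint64_t nonempty;	/* journal seq when the bucket became nonempty */
	uint64_t empty;		/* journal seq when it became empty again */
};

/* True if any flushed journal seq falls in the half-open range [start, end). */
static bool flush_in_range(const uint64_t *flushes, size_t nr,
			   uint64_t start, uint64_t end)
{
	for (size_t i = 0; i < nr; i++)
		if (flushes[i] >= start && flushes[i] < end)
			return true;
	return false;
}

/* The bucket may be reused immediately if no flush hit [nonempty, empty). */
static bool bucket_reusable_without_commit(const struct bucket_seqs *b,
					    const uint64_t *flushes, size_t nr)
{
	return !flush_in_range(flushes, nr, b->nonempty, b->empty);
}

/*
 * Corner case: if the previous empty transition has not been flushed yet
 * (assumed here to mean empty > last_flushed), keep tracking the previous
 * nonempty transition instead of starting a new one.
 */
static void mark_nonempty(struct bucket_seqs *b, uint64_t seq,
			  uint64_t last_flushed)
{
	if (b->empty > last_flushed)
		return;
	b->nonempty = seq;
}
```

When nonempty and empty are the same sequence number the range is empty, which matches the "occurred at the same journal sequence number" case above.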
|
|
Harder to screw up if we're explicit about the range, and more correct
as journal reservations can be outstanding on multiple journal entries
simultaneously.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This lets us print the exact location in the journal if it was found in
the journal, or correctly print if it was found in the superblock.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix an O(n^2) issue when we find many overlapping (overwritten) btree
nodes - especially when one node overwrites many smaller nodes.
This was discovered to be an issue with the bcachefs
merge_torture_flakey test - if we had a large btree that was then
emptied, the number of difficult overwrites can be unbounded.
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
size_t is the correct type for a count of objects that can fit in
memory: this also means heaps now have the same memory layout as darrays
(fs/bcachefs/darray.h), and darrays can be used as heaps.
Cc: Kuan-Wei Chiu <visitorckw@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Coly Li <colyli@suse.de>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
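As a rough illustration of the layout point above: if the element count and capacity are both size_t and are followed by the data pointer, the same struct can be treated as a dynamic array or as a heap. The struct int_array and heap_push() below are hypothetical and are not the definitions from fs/bcachefs/darray.h or the kernel's min_heap.h.

```c
#include <stddef.h>

/* Hypothetical container: count and capacity as size_t, then the data. */
struct int_array {
	size_t nr, size;
	int *data;
};

/* Min-heap insert; returns 0 on success, -1 if the array is full. */
static int heap_push(struct int_array *h, int v)
{
	if (h->nr == h->size)
		return -1;

	size_t i = h->nr++;

	h->data[i] = v;

	/* Sift the new element up until its parent is not larger. */
	while (i) {
		size_t parent = (i - 1) / 2;

		if (h->data[parent] <= h->data[i])
			break;

		int tmp = h->data[parent];

		h->data[parent] = h->data[i];
		h->data[i] = tmp;
		i = parent;
	}
	return 0;
}
```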
|
|
Check open buckets and buckets waiting for journal commit before doing
other expensive lookups.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
tested repairing from a bug uncovered by the merge_torture_flakey test
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
update succeeds
Originally, btree splits always succeeded once we got to the point of
recursing to the btree_insert_node() call.
But that changed when we switched to not taking intent locks all the way
up to the root, and that introduced a bug, because
bch2_btree_interior_update_will_free_node() cancels pending writes and
reparents a node that's going to be made visible on disk by another
btree update to the current btree update.
This was discovered in recent backpointers work, because
bch2_btree_interior_update_will_free_node() also clears the
will_make_reachable flag, causing backpointer target lookup to
spuriously think it had found a dangling backpointer (when the
backpointer just hadn't been created yet by
btree_update_nodes_written()).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We should always signal to rewind if the requested pass hasn't been run,
even if called multiple times.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This lets us use darray macros on dev_alloc_list (and it will become a
darray eventually, when we increase the maximum number of devices).
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
When allocating a journal write fails, then retries after doing
discards, we were failing to count already allocated replicas.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
kill another standard error code use
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a tracepoint for inserting new accounting entries: we're seeing odd
spinning behaviour in accounting read.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Don't spin.
Fixes: de95cc201a97 ("bcachefs: Kill bch2_get_next_backpointer()")
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The validate late path was iterating over accounting entries in
eytzinger order, which is unnecessarily tricky when we may have to
remove entries.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
we wish to use the logged ops btree for other items that aren't strictly
logged ops: cursors for inode allocation
There's no reason to create another cached btree for inode allocator
cursors - so reserve different parts of the keyspace for different
purposes.
Older versions will ignore or delete the cursors.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Introduce a typedef to handle the difference between unsigned
long/struct urcu_gp_poll_state.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
When tracing is disabled, there is no point in asking the user about
enabling extra btree_path tracepoints in bcachefs.
Fixes: 32ed4a620c5405be ("bcachefs: Btree path tracepoints")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The "journal space available" calculations didn't take into account
mismatched bucket sizes; we need to take the minimum space available out
of our devices.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
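A minimal sketch of the idea, assuming each device reports a count of free journal buckets and its own bucket size: struct dev_journal_space and journal_space_min() are hypothetical names, and the real calculation in the journal code involves more state than this.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-device journal state. */
struct dev_journal_space {
	uint64_t free_buckets;	/* journal buckets currently free */
	uint64_t bucket_size;	/* size of one bucket, in sectors */
};

/* Space available is bounded by the device with the least free space. */
static uint64_t journal_space_min(const struct dev_journal_space *devs,
				  size_t nr)
{
	uint64_t min = UINT64_MAX;

	for (size_t i = 0; i < nr; i++) {
		uint64_t space = devs[i].free_buckets * devs[i].bucket_size;

		if (space < min)
			min = space;
	}
	return nr ? min : 0;
}
```

With one device holding 10 free buckets of 512 sectors and another holding 4 free buckets of 2048 sectors, the device with more free buckets is the limiting one (5120 versus 8192 sectors), which is why comparing bucket counts alone is not enough.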
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a method to flush btree node rewrites at the end of recovery, to
ensure that corrected errors are persisted.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Ensure that "invalid bkey" repair gets persisted, so that it doesn't
repeatedly spam the logs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a function for walking backpointers to find a path from a given
inode number, and convert various error messages to use it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The function bch2_bucket_alloc_trans() lacked a description for the
nowait parameter in its documentation comment block. This patch adds the
missing description to ensure all parameters are properly documented.
Reported-by: Abaci Robot <abaci@linux.alibaba.com>
Closes: https://bugzilla.openanolis.cn/show_bug.cgi?id=12179
Signed-off-by: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|