Age | Commit message (Collapse) | Author |
|
We were using our device pointer after we'd released our ref to it.
Unlikely to be a race that's practical to hit, since actually removing a
member device is a whole process besides just taking it offline, but -
needs to be fixed.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If a device is ro or failed, we might not have anywhere to move a
replica.
Check for this early, before doing the read and attempting to write.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This implements a new extent field bitflags that apply to the whole
extent. There's been a couple things we've wanted this for in the past,
but the immediate need is extent poisoning, to solve a rebalance issue.
Unknown extent fields can't be parsed (we won't known their size, so we
can't advance to the next field), so this is an incompat feature, and
using it prevents the filesystem from being mounted by old versions.
This also adds the BCH_EXTENT_poisoned flag; this indicates that the
data is known to be bad (i.e. there was a checksum error, and we had to
write a new checksum) and reads will return errors.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
For future usage, we'll want a dedicated error code for better
debugging.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Use error type alloc_v3_unpack_error in bch2_alloc_v3_validate().
Fixes: b65db750e2bb ("bcachefs: Enumerate fsck errors")
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
These features are set on format and incompat upgarde.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a debug knob to manually trigger the btree updates worker.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This patch implements support for case-insensitive file name lookups
in bcachefs.
The implementation uses the same UTF-8 lowering and normalization that
ext4 and f2fs is using.
More information is provided in Documentation/bcachefs/casefolding.rst
Compatibility notes:
This uses the new versioning scheme for incompatible features where an
incompatible feature is tied to a version number: the superblock says
"we may use incompat features up to x" and "incompat features up to x
are in use", disallowing mounting by previous versions.
Additionally, and old style incompat feature bit is used, so that
kernels without utf8 casefolding support know if casefolding
specifically is in use and they're allowed to mount.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Splits out the code that allocates the dirent and initializes the name
to make things easier to implement casefolding in a future commit.
Cc: André Almeida <andrealmeid@igalia.com>
Cc: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Cc: Hongbo Li <lihongbo22@huawei.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a persistent LRU for stripes, ordered by "number of empty blocks",
i.e. order in which we wish to reuse them.
This will replace the in-memory stripes heap, so we can kill off reading
stripes into memory at startup.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Stripes now have backpointers.
This is needed for proper scrub - stripe checksums need to be verified,
separately from extents within the stripe, since a block may not be full
of live extents but it's still needed for reconstruct.
And this will be needed for (efficient) evacuate/repair paths.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Now that we've got cached backpointers and aren't leaving around stale
pointers on bucket invalidation, we no longer need the periodic (rare)
gc_gens - which recalculates each bucket's oldest gen to avoid wraparound.
We can't delete that code because we've got to support existing
filesystems that will still have stale pointers, but this gets rid of
another scalability limit.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If we don't leave stale pointers around, we won't have to deal with
bucket gen wraparound.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Cached pointers now have backpointers.
This means that we'll be able to kill cached pointers in the
bucket_invalidate path, when invalidating/reusing buckets containing
cached data, instead of leaving them around to be cleaned up by gc_gens
garbago collection - which requires a full metadata scan.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Transactional triggers need to run in a defined ordering, which is not
quite the same as btree ID integer comparison.
Previously this was handled in a hacky way in
bch2_trans_commit_run_triggers(), since it was only the alloc btree that
needed special handling, but upcoming stripe btree changes are going to
require more ordering changes - so, define that ordering.
Next patch will change the transaction commit path to use it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Introduce per-entry locks, like with struct bucket - the stripes heap is
going away.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It's now easier to add new LRU types.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Pass in the backpointer explicitly, instead of assuming 'referring_k' is
an alloc key and calculating it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
FRAGMENTATION_START was incorrect, there's currently only one
fragmentation LRU (at the end of the reserved bits for LRU type), and
we're getting ready to add a stripe fragmentation lru - so give it a
better name.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Minor cleanup, no reason for the caller to have to this.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
A user has been seeing the "error verifying existing checksum while
rewriting existing data (memory corruption?)" error.
This generally indicates a hardware issue (and that may be the case
here), but it might also indicate a bug, in which case we need more
information to look for patterns.
Reported-by: Roland Vet <vet.roland@protonmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This option only applies filesystem wide.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The eytzinger code was previously relying on the following wrap-around
properties and their "eytzinger0" equivalents:
eytzinger1_prev(0, size) == eytzinger1_last(size)
eytzinger1_next(0, size) == eytzinger1_first(size)
However, these properties are no longer relied upon and no longer
necessary, so remove the corresponding asserts and forbid the use of
eytzinger1_prev(0, size) and eytzinger1_next(0, size).
This allows to further simplify the code in eytzinger1_next() and
eytzinger1_prev(): where the left shifting happens, eytzinger1_next() is
trying to move i to the lowest child on the left, which is equivalent to
doubling i until the next doubling would cause it to be greater than
size. This is implemented by shifting i to the left so that the most
significant bits align and then shifting i to the right by one if the
result is greater than size.
Likewise, eytzinger1_prev() is trying to move to the lowest child on the
right; the same applies here.
The 1-offset in (size - 1) in eytzinger1_next() isn't needed at all, but
the equivalent offset in eytzinger1_prev() is surprisingly needed to
preserve the 'eytzinger1_prev(0, size) == eytzinger1_last(size)'
property. However, since we no longer support that property, we can get
rid of these offsets as well. This saves one addition in each function
and makes the code less confusing.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
In this second step, transform the eytzinger indexes i, j, and k in
eytzinger1_sort_r() from 0-based to 1-based. This step looks a bit
messy, but the resulting code is slightly better.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
In this first step, convert the eytzinger sort functions to use 1-based
primitives.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Several of the algorithms on eytzinger trees are implemented in terms of
the eytzinger0 primitives. However, those algorithms can just as easily
be expressed in terms of the eytzinger1 primitives, and that leads to
better and easier to understand code. Start by converting
eytzinger0_find().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Function eytzinger0_find() isn't currently covered, so add a self test.
We can rely on eytzinger0_find_le() here because it is being
tested independently.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add an eytzinger0_find_ge() self test similar to eytzinger0_find_gt().
Note that this test requires eytzinger0_find_ge() to return the first
matching element in the array in case of duplicates. To prevent
bisection errors, we only add this test after strenghening the original
implementation (see the previous commit).
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Implement eytzinger0_find_ge() directly instead of implementing it in
terms of eytzinger0_find_le() and adjusting the result.
This turns eytzinger0_find_ge() into a minimum search, so when there are
duplicate elements, the result of eytzinger0_find_ge() will now always
point at the first matching element.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Instead of implementing eytzinger0_find_gt() in terms of
eytzinger0_find_le() and adjusting the result, implement it directly.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add an eytzinger0_find_gt() self test similar to eytzinger0_find_le().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Replace the over-complicated implementation of eytzinger0_find_le() by
an equivalent, simpler version.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
eytzinger0_find_le() is also easy to concert to 1-based eytzinger (but
see the next commit).
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Rename eytzinger0_find_test_val() to eytzinger0_find_test_le() and add a
new eytzinger0_find_test_val() wrapper that calls it.
We have already established that the array is sorted in eytzinger order,
so we can use the eytzinger iterator functions and check the boundary
conditions to verify the result of eytzinger0_find_le().
Only scan the entire array if we get an incorrect result. When we need
to scan, use eytzinger0_for_each_prev() so that we'll stop at the
highest matching element in the array in case there are duplicates;
going through the array linearly wouldn't give us that.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add an eytzinger0_for_each_prev() macro for iterating through an
eytzinger array in reverse.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
In eytzinger0_find_test(), remember the smallest element seen so far
instead of comparing adjacent array elements.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
In eytzinger[01]_test(), make sure that eytzinger[01]_for_each()
iterates over all array elements.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Fix an obvious typo in cmp_u16().
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
pr_info() format strings need to be newline terminated.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
The iterator variable of eytzinger0_for_each() loops has been changed to
be locally scoped at some point, so remove variables defined outside the
loop that are now unused. In addition and for clarity, use a different
variable inside those loops where an outside variable would be shadowed.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
When EYTZINGER_DEBUG is defined, <linux/bug.h> needs to be included.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Use an eytzinger0_for_each() loop here.
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Necessary for adding backpointers for cached pointers.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We have other metadata IO types covered, this was missing.
Note: this includes the time until completion, i.e. including parent
pointer update.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
impossible
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|