Age | Commit message (Collapse) | Author |
|
Single device filesystems are now identified by the block device name,
not the UUID - and single device filesystems with the same UUID can be
mounted simultaneously, without any special options.
This allocates a new bit in the superblock, BCH_SB_MULTI_DEVICE, which
indicates whether a filesystem has ever been multi device.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
On single device filesystems, c->name contains the block device name,
not the UUID.
Initialize this earlier, so that single device mode can use it for
initializing sysfs/debugfs.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Alan Huang <mmpgouride@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
It's now a wrapper around bch2_journal_halt_locked().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a helper that lets us change bch_member.data_allowed at runtime.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Make the output slightly clearer, and include a counter for "nodes we
couldn't free because we would have gone under our reserve".
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Factor out a helper so we're not duplicating checks after locking the
btree node.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Small cleanup, just always increment the counters.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Kill 'opts.very_degraded', and make 'opts.degraded' a persistent option,
stored in the superblock.
It's now an enum, with available choices ask/yes/very/no.
"ask" mode will be handled by the mount helper, for prompting the user
(on a machine used interactively) for whether to do a degraded mount.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Needed for userspcae.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This patch uses printbuf_indent_add_nextline() to set a consistent
indentation level for error messages of invalid compression.
In my previous patch [1], the newline is added by using '\n' in
the argument of prt_str(). This patch replaces prt_str() with
prt_printf() to make indentation level work correctly.
[1] Link: https://lore.kernel.org/20250406152659.205997-2-integral@archlinuxcn.org
Signed-off-by: Integral <integral@archlinuxcn.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
When an invalid compression type or level is passed as an argument
to `--compression`, two error messages are squashed into one line:
> bcachefs format --compression=lzo bcachefs-comp.img
invalid option: invalid compression typecompression: parse error
> bcachefs format --compression=lz4:16 bcachefs-comp.img
invalid option: invalid compression levelcompression: parse error
To resolve this issue, add a newline character at the end of the
first error message to separate them into two lines.
Signed-off-by: Integral <integral@archlinuxcn.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Currently, when passing a negative integer as argument, the error
message is "too big" due to casting to an unsigned integer:
> bcachefs format --block_size=-1 bcachefs.img
invalid option: block_size: too big (max 65536)
When negative value in argument detected, return early before
calling bch2_opt_validate().
A new error code `BCH_ERR_option_negative` is added.
Signed-off-by: Integral <integral@archlinuxcn.org>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Defer memory allocations only needed in RW mode until we actually go RW.
This is part of improved support for RO images.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
_init_early() is for initialization that cannot fail, and often must
happen for teardown partway through initialization to work.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Hygeine, and fix build in userspace.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a better helper for check_snapshot_exists().
create_snapids() can't be changed to use this, unfortunately, because
the transaction that creates new snapshot will also be inserting other
keys (e.g. root inode) that reference that snapshot ID, and they expect
the snapshot table to already be updated.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
More stack usage improvements: instead of creating a new alloc_request
(currently on the stack), save/restore just the fields we need to reuse.
This is a bit tricky, because we're doing a normal alloc_foreground.c
allocation, which calls into ec.c to get a stripe, which then does more
normal allocations - some of the fields get reused, and used
differently.
So we have to save and restore them - but the stack usage improvements
will be well worth it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Add a struct for common state for satisfying an on disk allocation,
instead of passing the same long list of items to every function.
This will help with stack usage, performance, and perhaps enable some
code cleanups.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
We're occasionally seeing the WARN_ON() for bump allocator usage
exceeding BTREE_TRANS_MEM_MAX; add some tracing so we can see what's
going on.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
This was achieved before by zero-ing out the source buffer and then
copying the bytes into the destination buffer. This can also be done with
memcpy_and_pad which will zero out only the destination buffer if its
size is bigger than the size of the source buffer. This is already used in
the same way in journal_transaction_name().
Moreover, zero-ing the source buffer was done twice, first in
__bch2_fs_log_msg() and then in bch2_trans_log_msg(). And this method
may also require allocating some extra memory for the source buffer.
In conclusion, using memcpy_and_pad is better even tough the result is
the same because it brings uniformity with what's already used in
journal_transaction_name, it avoids code duplication and reallocating
extra memory.
Signed-off-by: Roxana Nicolescu <nicolescu.roxana@protonmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Strncpy is now deprecated.
The buffer destination is not required to be NULL-terminated, but we also
want to zero out the rest of the buffer as it is already done in other
places.
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Roxana Nicolescu <nicolescu.roxana@protonmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Let's not move poisoned extents unnecessarily, since we can't guard
against introducing more bitrot.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Now, if an extent is poisoned we can move it even if there was a
checksum error. We'll have to give it a new checksum, but the poison bit
means that userspace will still see the appropriate error when they try
to read it.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Copygc needs to be able to move extents that have bitrotted. We don't
want to delete them - in the future we'll have an API for "read me the
data even if there's checksum errors", and in general we don't want to
delete anything unless the user asks us to.
That will require writing it with a new checksum, which means we can't
forget that there was a checksum error so we return the correct error to
userspace.
Rebalance also wants to skip bad extents; we can now use the poison flag
for that.
This is currently disabled by default, as we want read fua support so
that we can distinguish between transient and permanent errors from the
device. It may be enabled with the module parameter:
poison_extents_on_checksum_error
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
If the extent we're reading from changes, due to be being overwritten or
moved (possibly partially) - we need to reset bch_io_failures so that we
don't accidentally mark a new extent as poisoned prematurely.
This means we have to separately track (in the retry path) the extent we
previously read from.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Check for mismatches between casefold dirents and casefold directories.
A mismatch will cause lookups to fail, as we'll be doing the lookup with
the casefolded name, which won't match the non-casefolded dirent, and
vice versa.
Reported-by: Christopher Snowhill <chris@kode54.net>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
bch2_dirent_create_snapshot(), used in fsck, neglected to create a
casefolded dirent.
Just move this into dirent_create_key().
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Changing the casefold option requires extra checks/work - factor out a
helper from bch2_fileattr_set() for the xattr code to use.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
|
|
Since commit 013a07052a1a ("nilfs2: convert metadata aops from writepage
to writepages"), nilfs_mdt_write_folio can't be called from reclaim
context any more. Remove the code keyed of the wbc->for_reclaim flag,
which is now only set for writing out swap or shmem pages inside the swap
code, but never passed to file systems.
Link: https://lkml.kernel.org/r/20250508054938.15894-7-hch@lst.de
Link: https://lkml.kernel.org/r/20250516123417.6779-1-konishi.ryusuke@gmail.com
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|