diff options
author | Linus Torvalds <torvalds@linux-foundation.org> | 2022-10-06 17:36:48 -0700 |
---|---|---|
committer | Linus Torvalds <torvalds@linux-foundation.org> | 2022-10-06 17:36:48 -0700 |
commit | 76e45035348c247a70ed50eb29a9906657e4444f (patch) | |
tree | e4101b34b1a3ddfea00be656586c22f704b33a2d /fs/btrfs/compression.c | |
parent | 4c0ed7d8d6e3dc013c4599a837de84794baa5b62 (diff) | |
parent | cbddcc4fa3443fe8cfb2ff8e210deb1f6a0eea38 (diff) |
Merge tag 'for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs updates from David Sterba:
"There's a bunch of performance improvements, most notably the FIEMAP
speedup, the new block group tree to speed up mount on large
filesystems, more io_uring integration, some sysfs exports and the
usual fixes and core updates.
Summary:
Performance:
- outstanding FIEMAP speed improvement
- algorithmic change how extents are enumerated leads to orders of
magnitude speed boost (uncached and cached)
- extent sharing check speedup (2.2x uncached, 3x cached)
- add more cancellation points, allowing to interrupt seeking in
files with large number of extents
- more efficient hole and data seeking (4x uncached, 1.3x cached)
- sample results:
256M, 32K extents: 4s -> 29ms (~150x)
512M, 64K extents: 30s -> 59ms (~550x)
1G, 128K extents: 225s -> 120ms (~1800x)
- improved inode logging, especially for directories (on dbench
workload throughput +25%, max latency -21%)
- improved buffered IO, remove redundant extent state tracking,
lowering memory consumption and avoiding rb tree traversal
- add sysfs tunable to let qgroup temporarily skip exact accounting
when deleting snapshot, leading to a speedup but requiring a rescan
after that, will be used by snapper
- support io_uring and buffered writes, until now it was just for
direct IO, with the no-wait semantics implemented in the buffered
write path it now works and leads to speed improvement in IOPS
(2x), throughput (2.2x), latency (depends, 2x to 150x)
- small performance improvements when dropping and searching for
extent maps as well as when flushing delalloc in COW mode
(throughput +5MB/s)
User visible changes:
- new incompatible feature block-group-tree adding a dedicated tree
for tracking block groups, this allows a much faster load during
mount and avoids seeking unlike when it's scattered in the extent
tree items
- this reduces mount time for many-terabyte sized filesystems
- conversion tool will be provided so existing filesystem can also
be updated in place
- to reduce test matrix and feature combinations requires no-holes
and free-space-tree (mkfs defaults since 5.15)
- improved reporting of super block corruption detected by scrub
- scrub also tries to repair super block and does not wait until next
commit
- discard stats and tunables are exported in sysfs
(/sys/fs/btrfs/FSID/discard)
- qgroup status is exported in sysfs
(/sys/sys/fs/btrfs/FSID/qgroups/)
- verify that super block was not modified when thawing filesystem
Fixes:
- FIEMAP fixes
- fix extent sharing status, does not depend on the cached status
where merged
- flush delalloc so compressed extents are reported correctly
- fix alignment of VMA for memory mapped files on THP
- send: fix failures when processing inodes with no links (orphan
files and directories)
- fix race between quota enable and quota rescan ioctl
- handle more corner cases for read-only compat feature verification
- fix missed extent on fsync after dropping extent maps
Core:
- lockdep annotations to validate various transactions states and
state transitions
- preliminary support for fs-verity in send
- more effective memory use in scrub for subpage where sector is
smaller than page
- block group caching progress logic has been removed, load is now
synchronous
- simplify end IO callbacks and bio handling, use chained bios
instead of own tracking
- add no-wait semantics to several functions (tree search, nocow,
flushing, buffered write
- cleanups and refactoring
MM changes:
- export balance_dirty_pages_ratelimited_flags"
* tag 'for-6.1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (177 commits)
btrfs: set generation before calling btrfs_clean_tree_block in btrfs_init_new_buffer
btrfs: drop extent map range more efficiently
btrfs: avoid pointless extent map tree search when flushing delalloc
btrfs: remove unnecessary next extent map search
btrfs: remove unnecessary NULL pointer checks when searching extent maps
btrfs: assert tree is locked when clearing extent map from logging
btrfs: remove unnecessary extent map initializations
btrfs: remove the refcount warning/check at free_extent_map()
btrfs: add helper to replace extent map range with a new extent map
btrfs: move open coded extent map tree deletion out of inode eviction
btrfs: use cond_resched_rwlock_write() during inode eviction
btrfs: use extent_map_end() at btrfs_drop_extent_map_range()
btrfs: move btrfs_drop_extent_cache() to extent_map.c
btrfs: fix missed extent on fsync after dropping extent maps
btrfs: remove stale prototype of btrfs_write_inode
btrfs: enable nowait async buffered writes
btrfs: assert nowait mode is not used for some btree search functions
btrfs: make btrfs_buffered_write nowait compatible
btrfs: plumb NOWAIT through the write path
btrfs: make lock_and_cleanup_extent_if_need nowait compatible
...
Diffstat (limited to 'fs/btrfs/compression.c')
-rw-r--r-- | fs/btrfs/compression.c | 54 |
1 files changed, 22 insertions, 32 deletions
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c index e84d22c5c6a8..54caa00a2245 100644 --- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -152,9 +152,7 @@ static void finish_compressed_bio_read(struct compressed_bio *cb) } /* Do io completion on the original bio */ - if (cb->status != BLK_STS_OK) - cb->orig_bio->bi_status = cb->status; - bio_endio(cb->orig_bio); + btrfs_bio_end_io(btrfs_bio(cb->orig_bio), cb->status); /* Finally free the cb struct */ kfree(cb->compressed_pages); @@ -166,16 +164,15 @@ static void finish_compressed_bio_read(struct compressed_bio *cb) * before decompressing it into the original bio and freeing the uncompressed * pages. */ -static void end_compressed_bio_read(struct bio *bio) +static void end_compressed_bio_read(struct btrfs_bio *bbio) { - struct compressed_bio *cb = bio->bi_private; + struct compressed_bio *cb = bbio->private; struct inode *inode = cb->inode; struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_inode *bi = BTRFS_I(inode); bool csum = !(bi->flags & BTRFS_INODE_NODATASUM) && !test_bit(BTRFS_FS_STATE_NO_CSUMS, &fs_info->fs_state); - blk_status_t status = bio->bi_status; - struct btrfs_bio *bbio = btrfs_bio(bio); + blk_status_t status = bbio->bio.bi_status; struct bvec_iter iter; struct bio_vec bv; u32 offset; @@ -186,9 +183,8 @@ static void end_compressed_bio_read(struct bio *bio) if (!status && (!csum || !btrfs_check_data_csum(inode, bbio, offset, bv.bv_page, bv.bv_offset))) { - clean_io_failure(fs_info, &bi->io_failure_tree, - &bi->io_tree, start, bv.bv_page, - btrfs_ino(bi), bv.bv_offset); + btrfs_clean_io_failure(bi, start, bv.bv_page, + bv.bv_offset); } else { int ret; @@ -209,7 +205,7 @@ static void end_compressed_bio_read(struct bio *bio) if (refcount_dec_and_test(&cb->pending_ios)) finish_compressed_bio_read(cb); btrfs_bio_free_csum(bbio); - bio_put(bio); + bio_put(&bbio->bio); } /* @@ -301,20 +297,20 @@ static void btrfs_finish_compressed_write_work(struct work_struct *work) * This also calls the writeback end hooks for the file pages so that metadata * and checksums can be updated in the file. */ -static void end_compressed_bio_write(struct bio *bio) +static void end_compressed_bio_write(struct btrfs_bio *bbio) { - struct compressed_bio *cb = bio->bi_private; + struct compressed_bio *cb = bbio->private; - if (bio->bi_status) - cb->status = bio->bi_status; + if (bbio->bio.bi_status) + cb->status = bbio->bio.bi_status; if (refcount_dec_and_test(&cb->pending_ios)) { struct btrfs_fs_info *fs_info = btrfs_sb(cb->inode->i_sb); - btrfs_record_physical_zoned(cb->inode, cb->start, bio); + btrfs_record_physical_zoned(cb->inode, cb->start, &bbio->bio); queue_work(fs_info->compressed_write_workers, &cb->write_end_work); } - bio_put(bio); + bio_put(&bbio->bio); } /* @@ -335,7 +331,8 @@ static void end_compressed_bio_write(struct bio *bio) static struct bio *alloc_compressed_bio(struct compressed_bio *cb, u64 disk_bytenr, - blk_opf_t opf, bio_end_io_t endio_func, + blk_opf_t opf, + btrfs_bio_end_io_t endio_func, u64 *next_stripe_start) { struct btrfs_fs_info *fs_info = btrfs_sb(cb->inode->i_sb); @@ -344,12 +341,8 @@ static struct bio *alloc_compressed_bio(struct compressed_bio *cb, u64 disk_byte struct bio *bio; int ret; - bio = btrfs_bio_alloc(BIO_MAX_VECS); - + bio = btrfs_bio_alloc(BIO_MAX_VECS, opf, endio_func, cb); bio->bi_iter.bi_sector = disk_bytenr >> SECTOR_SHIFT; - bio->bi_opf = opf; - bio->bi_private = cb; - bio->bi_end_io = endio_func; em = btrfs_get_chunk_map(fs_info, disk_bytenr, fs_info->sectorsize); if (IS_ERR(em)) { @@ -478,8 +471,7 @@ blk_status_t btrfs_submit_compressed_write(struct btrfs_inode *inode, u64 start, if (!skip_sum) { ret = btrfs_csum_one_bio(inode, bio, start, true); if (ret) { - bio->bi_status = ret; - bio_endio(bio); + btrfs_bio_end_io(btrfs_bio(bio), ret); break; } } @@ -596,7 +588,7 @@ static noinline int add_ra_bio_pages(struct inode *inode, } page_end = (pg_index << PAGE_SHIFT) + PAGE_SIZE - 1; - lock_extent(tree, cur, page_end); + lock_extent(tree, cur, page_end, NULL); read_lock(&em_tree->lock); em = lookup_extent_mapping(em_tree, cur, page_end + 1 - cur); read_unlock(&em_tree->lock); @@ -610,7 +602,7 @@ static noinline int add_ra_bio_pages(struct inode *inode, (cur + fs_info->sectorsize > extent_map_end(em)) || (em->block_start >> 9) != cb->orig_bio->bi_iter.bi_sector) { free_extent_map(em); - unlock_extent(tree, cur, page_end); + unlock_extent(tree, cur, page_end, NULL); unlock_page(page); put_page(page); break; @@ -630,7 +622,7 @@ static noinline int add_ra_bio_pages(struct inode *inode, add_size = min(em->start + em->len, page_end + 1) - cur; ret = bio_add_page(cb->orig_bio, page, add_size, offset_in_page(cur)); if (ret != add_size) { - unlock_extent(tree, cur, page_end); + unlock_extent(tree, cur, page_end, NULL); unlock_page(page); put_page(page); break; @@ -799,8 +791,7 @@ void btrfs_submit_compressed_read(struct inode *inode, struct bio *bio, ret = btrfs_lookup_bio_sums(inode, comp_bio, NULL); if (ret) { - comp_bio->bi_status = ret; - bio_endio(comp_bio); + btrfs_bio_end_io(btrfs_bio(comp_bio), ret); break; } @@ -826,8 +817,7 @@ fail: kfree(cb); out: free_extent_map(em); - bio->bi_status = ret; - bio_endio(bio); + btrfs_bio_end_io(btrfs_bio(bio), ret); return; } |