diff options
author | Christoph Hellwig <hch@lst.de> | 2025-02-13 05:50:08 +0100 |
---|---|---|
committer | Christoph Hellwig <hch@lst.de> | 2025-03-03 08:17:07 -0700 |
commit | 058dd70c65ab736ab979df085b060c05a6cb3bd9 (patch) | |
tree | 1530707ab8021f6e317a0cf60858a278f9fdbea6 /fs/xfs/xfs_reflink.c | |
parent | 080d01c41d44f0993f2c235a6bfdb681f0a66be6 (diff) |
xfs: implement buffered writes to zoned RT devices
Implement buffered writes including page faults and block zeroing for
zoned RT devices. Buffered writes to zoned RT devices are split into
three phases:
1) a reservation for the worst case data block usage is taken before
acquiring the iolock. When there are enough free blocks but not
enough available one, garbage collection is kicked off to free the
space before continuing with the write. If there isn't enough
freeable space, the block reservation is reduced and a short write
will happen as expected by normal Linux write semantics.
2) with the iolock held, the generic iomap buffered write code is
called, which through the iomap_begin operation usually just inserts
delalloc extents for the range in a single iteration. Only for
overwrites of existing data that are not block aligned, or zeroing
operations the existing extent mapping is read to fill out the srcmap
and to figure out if zeroing is required.
3) the ->map_blocks callback to the generic iomap writeback code
calls into the zoned space allocator to actually allocate on-disk
space for the range before kicking of the writeback.
Note that because all writes are out of place, truncate or hole punches
that are not aligned to block size boundaries need to allocate space.
For block zeroing from truncate, ->setattr is called with the iolock
(aka i_rwsem) already held, so a hacky deviation from the above
scheme is needed. In this case the space reservations is called with
the iolock held, but is required not to block and can dip into the
reserved block pool. This can lead to -ENOSPC when truncating a
file, which is unfortunate. But fixing the calling conventions in
the VFS is probably much easier with code requiring it already in
mainline.
Similarly because all writes are out place, the zoned allocator can't
support unwritten extents and thus the FALLOC_FL_ALLOCATE_RANGE range
mode of fallocate. Other fallocate modes that would reserved space
but don't need to to provide proper semantics do work but do not
reserve space.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>
Diffstat (limited to 'fs/xfs/xfs_reflink.c')
-rw-r--r-- | fs/xfs/xfs_reflink.c | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index b977930c4ebc..cc3b4df88110 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1532,7 +1532,7 @@ xfs_reflink_zero_posteof( return 0; trace_xfs_zero_eof(ip, isize, pos - isize); - return xfs_zero_range(ip, isize, pos - isize, NULL); + return xfs_zero_range(ip, isize, pos - isize, NULL, NULL); } /* |