summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorFilipe Manana <fdmanana@suse.com>2024-12-10 11:41:33 +0000
committerDavid Sterba <dsterba@suse.com>2025-01-13 14:53:16 +0100
commit2747c555958448eb32fc953b28798c1b55df4e1d (patch)
tree1bcb539fb816ff0f2a7e8887e88829935056e176
parent9e0d43ea4e69aee29ab58f9cd8c90d2c807c862a (diff)
btrfs: add function comment for check_committed_ref()
There are some not immediately obvious details about the operation of check_committed_ref(), namely that when it returns 0 it must return with the path having a locked leaf from the extent tree that contains the extent's extent item, so that we can later check for delayed refs when calling check_delayed_ref() in a way that doesn't race with a task running delayed references. For similar reasons, it must also return with a locked leaf when the extent item is not found, and that leaf is where the extent item should be located, because we may have delayed references that are going to create the extent item. Also document that the function can return false positives in order to not be too slow, and that the most important is to not return false negatives. So add a function comment to check_committed_ref(). Reviewed-by: Qu Wenruo <wqu@suse.com> Signed-off-by: Filipe Manana <fdmanana@suse.com> Signed-off-by: David Sterba <dsterba@suse.com>
-rw-r--r--fs/btrfs/extent-tree.c42
1 files changed, 42 insertions, 0 deletions
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index af3893ad784b..b76df4d9d1ee 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2295,6 +2295,48 @@ static noinline int check_delayed_ref(struct btrfs_inode *inode,
return ret;
}
+/*
+ * Check if there are references for a data extent other than the one belonging
+ * to the given inode and offset.
+ *
+ * @inode: The only inode we expect to find associated with the data extent.
+ * @path: A path to use for searching the extent tree.
+ * @offset: The only offset we expect to find associated with the data extent.
+ * @bytenr: The logical address of the data extent.
+ *
+ * When the extent does not have any other references other than the one we
+ * expect to find, we always return a value of 0 with the path having a locked
+ * leaf that contains the extent's extent item - this is necessary to ensure
+ * we don't race with a task running delayed references, and our caller must
+ * have such a path when calling check_delayed_ref() - it must lock a delayed
+ * ref head while holding the leaf locked. In case the extent item is not found
+ * in the extent tree, we return -ENOENT with the path having the leaf (locked)
+ * where the extent item should be, in order to prevent races with another task
+ * running delayed references, so that we don't miss any reference when calling
+ * check_delayed_ref().
+ *
+ * Note: this may return false positives, and this is because we want to be
+ * quick here as we're called in write paths (when flushing delalloc and
+ * in the direct IO write path). For example we can have an extent with
+ * a single reference but that reference is not inlined, or we may have
+ * many references in the extent tree but we also have delayed references
+ * that cancel all the reference except the one for our inode and offset,
+ * but it would be expensive to do such checks and complex due to all
+ * locking to avoid races between the checks and flushing delayed refs,
+ * plus non-inline references may be located on leaves other than the one
+ * that contains the extent item in the extent tree. The important thing
+ * here is to not return false negatives and that the false positives are
+ * not very common.
+ *
+ * Returns: 0 if there are no cross references and with the path having a locked
+ * leaf from the extent tree that contains the extent's extent item.
+ *
+ * 1 if there are cross references (false positives can happen).
+ *
+ * < 0 in case of an error. In case of -ENOENT the leaf in the extent
+ * tree where the extent item should be located at is read locked and
+ * accessible in the given path.
+ */
static noinline int check_committed_ref(struct btrfs_inode *inode,
struct btrfs_path *path,
u64 offset, u64 bytenr)