xfs: AIL doesn't need manual pushing

We have a mechanism that checks the amount of log space remaining available every time we make a transaction reservation. If the amount of space is below a threshold (25% free) we push on the AIL to tell it to do more work. To do this, we end up calculating the LSN that the AIL needs to push to on every reservation and updating the push target for the AIL with that new target LSN. This is silly and expensive. The AIL is perfectly capable of calculating the push target itself, and it will always be running when the AIL contains objects. What the target does is determine if the AIL needs to do any work before it goes back to sleep. If we haven't run out of reservation space or memory (or some other push all trigger), it will simply go back to sleep for a while if there is more than 25% of the journal space free without doing anything. If there are items in the AIL at a lower LSN than the target, it will try to push up to the target or to the point of getting stuck before going back to sleep and trying again soon after.` Hence we can modify the AIL to calculate it's own 25% push target before it starts a push using the same reserve grant head based calculation as is currently used, and remove all the places where we ask the AIL to push to a new 25% free target. We can also drop the minimum free space size of 256BBs from the calculation because the 25% of a minimum sized log is *always going to be larger than 256BBs. This does still require a manual push in certain circumstances. These circumstances arise when the AIL is not full, but the reservation grants consume the entire of the free space in the log. In this case, we still need to push on the AIL to free up space, so when we hit this condition (i.e. reservation going to sleep to wait on log space) we do a single push to tell the AIL it should empty itself. This will keep the AIL moving as new reservations come in and want more space, rather than keep queuing them and having to push the AIL repeatedly. The reason for using the "push all" when grant space runs out is that we can run out of grant space when there is more than 25% of the log free. Small logs are notorious for this, and we have a hack in the log callback code (xlog_state_set_callback()) where we push the AIL because the *head* moved) to ensure that we kick the AIL when we consume space in it because that can push us over the "less than 25% available" available that starts tail pushing back up again. Hence when we run out of grant space and are going to sleep, we have to consider that the grant space may be consuming almost all the log space and there is almost nothing in the AIL. In this situation, the AIL pins the tail and moving the tail forwards is the only way the grant space will come available, so we have to force the AIL to push everything to guarantee grant space will eventually be returned. Hence triggering a "push all" just before sleeping removes all the nasty corner cases we have in other parts of the code that work around the "we didn't ask the AIL to push enough to free grant space" condition that leads to log space hangs... Signed-off-by: Dave Chinner <dchinner@redhat.com> Reviewed-by: Darrick J. Wong <djwong@kernel.org> Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>
author: Dave Chinner <dchinner@redhat.com> 2024-06-20 09:21:20 +0200
committer: Chandan Babu R <chandanbabu@kernel.org> 2024-07-04 12:46:46 +0530
commit: 9adf40249e6cfd7231c2973bb305f6c20902bfd9 (patch)
tree: 9c62a4072c6a2abee35841abdd47b21954718de5 /fs/xfs/xfs_log.c
parent: 613e2fdbbc7b4bd0cd324e3a025b3061eb8c947d (diff)
1 files changed, 5 insertions, 130 deletions
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c
index 416c15494983..235fcf6dc4ee 100644
--- a/fs/xfs/xfs_log.c
+++ b/fs/xfs/xfs_log.c
@@ -30,10 +30,6 @@ xlog_alloc_log(
 	struct xfs_buftarg	*log_target,
 	xfs_daddr_t		blk_offset,
 	int			num_bblks);
-STATIC int
-xlog_space_left(
-	struct xlog		*log,
-	atomic64_t		*head);
 STATIC void
 xlog_dealloc_log(
 	struct xlog		*log);
@@ -51,10 +47,6 @@ xlog_state_get_iclog_space(
 	struct xlog_ticket	*ticket,
 	int			*logoffsetp);
 STATIC void
-xlog_grant_push_ail(
-	struct xlog		*log,
-	int			need_bytes);
-STATIC void
 xlog_sync(
 	struct xlog		*log,
 	struct xlog_in_core	*iclog,
@@ -242,42 +234,15 @@ xlog_grant_head_wake(
 {
 	struct xlog_ticket	*tic;
 	int			need_bytes;
-	bool			woken_task = false;
 
 	list_for_each_entry(tic, &head->waiters, t_queue) {
-
-		/*
-		 * There is a chance that the size of the CIL checkpoints in
-		 * progress at the last AIL push target calculation resulted in
-		 * limiting the target to the log head (l_last_sync_lsn) at the
-		 * time. This may not reflect where the log head is now as the
-		 * CIL checkpoints may have completed.
-		 *
-		 * Hence when we are woken here, it may be that the head of the
-		 * log that has moved rather than the tail. As the tail didn't
-		 * move, there still won't be space available for the
-		 * reservation we require.  However, if the AIL has already
-		 * pushed to the target defined by the old log head location, we
-		 * will hang here waiting for something else to update the AIL
-		 * push target.
-		 *
-		 * Therefore, if there isn't space to wake the first waiter on
-		 * the grant head, we need to push the AIL again to ensure the
-		 * target reflects both the current log tail and log head
-		 * position before we wait for the tail to move again.
-		 */
-
 		need_bytes = xlog_ticket_reservation(log, head, tic);
-		if (*free_bytes < need_bytes) {
-			if (!woken_task)
-				xlog_grant_push_ail(log, need_bytes);
+		if (*free_bytes < need_bytes)
 			return false;
-		}
 
 		*free_bytes -= need_bytes;
 		trace_xfs_log_grant_wake_up(log, tic);
 		wake_up_process(tic->t_task);
-		woken_task = true;
 	}
 
 	return true;
@@ -296,13 +261,15 @@ xlog_grant_head_wait(
 	do {
 		if (xlog_is_shutdown(log))
 			goto shutdown;
-		xlog_grant_push_ail(log, need_bytes);
 
 		__set_current_state(TASK_UNINTERRUPTIBLE);
 		spin_unlock(&head->lock);
 
 		XFS_STATS_INC(log->l_mp, xs_sleep_logspace);
 
+		/* Push on the AIL to free up all the log space. */
+		xfs_ail_push_all(log->l_ailp);
+
 		trace_xfs_log_grant_sleep(log, tic);
 		schedule();
 		trace_xfs_log_grant_wake(log, tic);
@@ -418,9 +385,6 @@ xfs_log_regrant(
 	 * of rolling transactions in the log easily.
 	 */
 	tic->t_tid++;
-
-	xlog_grant_push_ail(log, tic->t_unit_res);
-
 	tic->t_curr_res = tic->t_unit_res;
 	if (tic->t_cnt > 0)
 		return 0;
@@ -477,12 +441,7 @@ xfs_log_reserve(
 	ASSERT(*ticp == NULL);
 	tic = xlog_ticket_alloc(log, unit_bytes, cnt, permanent);
 	*ticp = tic;
-
-	xlog_grant_push_ail(log, tic->t_cnt ? tic->t_unit_res * tic->t_cnt
-					    : tic->t_unit_res);
-
 	trace_xfs_log_reserve(log, tic);
-
 	error = xlog_grant_head_check(log, &log->l_reserve_head, tic,
 				      &need_bytes);
 	if (error)
@@ -1330,7 +1289,7 @@ xlog_assign_tail_lsn(
  * shortcut invalidity asserts in this case so that we don't trigger them
  * falsely.
  */
-STATIC int
+int
 xlog_space_left(
 	struct xlog	*log,
 	atomic64_t	*head)
@@ -1668,89 +1627,6 @@ out:
 }	/* xlog_alloc_log */
 
 /*
- * Compute the LSN that we'd need to push the log tail towards in order to have
- * (a) enough on-disk log space to log the number of bytes specified, (b) at
- * least 25% of the log space free, and (c) at least 256 blocks free.  If the
- * log free space already meets all three thresholds, this function returns
- * NULLCOMMITLSN.
- */
-xfs_lsn_t
-xlog_grant_push_threshold(
-	struct xlog	*log,
-	int		need_bytes)
-{
-	xfs_lsn_t	threshold_lsn = 0;
-	xfs_lsn_t	last_sync_lsn;
-	int		free_blocks;
-	int		free_bytes;
-	int		threshold_block;
-	int		threshold_cycle;
-	int		free_threshold;
-
-	ASSERT(BTOBB(need_bytes) < log->l_logBBsize);
-
-	free_bytes = xlog_space_left(log, &log->l_reserve_head.grant);
-	free_blocks = BTOBBT(free_bytes);
-
-	/*
-	 * Set the threshold for the minimum number of free blocks in the
-	 * log to the maximum of what the caller needs, one quarter of the
-	 * log, and 256 blocks.
-	 */
-	free_threshold = BTOBB(need_bytes);
-	free_threshold = max(free_threshold, (log->l_logBBsize >> 2));
-	free_threshold = max(free_threshold, 256);
-	if (free_blocks >= free_threshold)
-		return NULLCOMMITLSN;
-
-	xlog_crack_atomic_lsn(&log->l_tail_lsn, &threshold_cycle,
-						&threshold_block);
-	threshold_block += free_threshold;
-	if (threshold_block >= log->l_logBBsize) {
-		threshold_block -= log->l_logBBsize;
-		threshold_cycle += 1;
-	}
-	threshold_lsn = xlog_assign_lsn(threshold_cycle,
-					threshold_block);
-	/*
-	 * Don't pass in an lsn greater than the lsn of the last
-	 * log record known to be on disk. Use a snapshot of the last sync lsn
-	 * so that it doesn't change between the compare and the set.
-	 */
-	last_sync_lsn = atomic64_read(&log->l_last_sync_lsn);
-	if (XFS_LSN_CMP(threshold_lsn, last_sync_lsn) > 0)
-		threshold_lsn = last_sync_lsn;
-
-	return threshold_lsn;
-}
-
-/*
- * Push the tail of the log if we need to do so to maintain the free log space
- * thresholds set out by xlog_grant_push_threshold.  We may need to adopt a
- * policy which pushes on an lsn which is further along in the log once we
- * reach the high water mark.  In this manner, we would be creating a low water
- * mark.
- */
-STATIC void
-xlog_grant_push_ail(
-	struct xlog	*log,
-	int		need_bytes)
-{
-	xfs_lsn_t	threshold_lsn;
-
-	threshold_lsn = xlog_grant_push_threshold(log, need_bytes);
-	if (threshold_lsn == NULLCOMMITLSN || xlog_is_shutdown(log))
-		return;
-
-	/*
-	 * Get the transaction layer to kick the dirty buffers out to
-	 * disk asynchronously. No point in trying to do this if
-	 * the filesystem is shutting down.
-	 */
-	xfs_ail_push(log->l_ailp, threshold_lsn);
-}
-
-/*
  * Stamp cycle number in every block
  */
 STATIC void
@@ -2712,7 +2588,6 @@ xlog_state_set_callback(
 		return;
 
 	atomic64_set(&log->l_last_sync_lsn, header_lsn);
-	xlog_grant_push_ail(log, 0);
 }
 
 /*
author	Dave Chinner <dchinner@redhat.com>	2024-06-20 09:21:20 +0200
committer	Chandan Babu R <chandanbabu@kernel.org>	2024-07-04 12:46:46 +0530
commit	9adf40249e6cfd7231c2973bb305f6c20902bfd9 (patch)
tree	9c62a4072c6a2abee35841abdd47b21954718de5 /fs/xfs/xfs_log.c
parent	613e2fdbbc7b4bd0cd324e3a025b3061eb8c947d (diff)