author | Darrick J. Wong <djwong@kernel.org> | 2024-12-02 10:57:38 -0800 |
---|---|---|
committer | Darrick J. Wong <djwong@kernel.org> | 2024-12-12 17:45:11 -0800 |
commit | acc8f8628c3737108f36e5637f4d5daeaf96d90e (patch) | |
tree | 4688312b99df3b19f0bdc1db2ded61a4e51c5297 /fs/xfs/xfs_qm.c | |
parent | ec88b41b932d5731291dcc0d0d63ea13ab8e07d5 (diff) |
xfs: attach dquot buffer to dquot log item buffer
Ever since 6.12-rc1, I've observed a pile of warnings from the kernel
when running fstests with quotas enabled:
WARNING: CPU: 1 PID: 458580 at mm/page_alloc.c:4221 __alloc_pages_noprof+0xc9c/0xf18
CPU: 1 UID: 0 PID: 458580 Comm: xfsaild/sda3 Tainted: G W 6.12.0-rc6-djwa #rc6 6ee3e0e531f6457e2d26aa008a3b65ff184b377c
<snip>
Call trace:
__alloc_pages_noprof+0xc9c/0xf18
alloc_pages_mpol_noprof+0x94/0x240
alloc_pages_noprof+0x68/0xf8
new_slab+0x3e0/0x568
___slab_alloc+0x5a0/0xb88
__slab_alloc.constprop.0+0x7c/0xf8
__kmalloc_noprof+0x404/0x4d0
xfs_buf_get_map+0x594/0xde0 [xfs 384cb02810558b4c490343c164e9407332118f88]
xfs_buf_read_map+0x64/0x2e0 [xfs 384cb02810558b4c490343c164e9407332118f88]
xfs_trans_read_buf_map+0x1dc/0x518 [xfs 384cb02810558b4c490343c164e9407332118f88]
xfs_qm_dqflush+0xac/0x468 [xfs 384cb02810558b4c490343c164e9407332118f88]
xfs_qm_dquot_logitem_push+0xe4/0x148 [xfs 384cb02810558b4c490343c164e9407332118f88]
xfsaild+0x3f4/0xde8 [xfs 384cb02810558b4c490343c164e9407332118f88]
kthread+0x110/0x128
ret_from_fork+0x10/0x20
---[ end trace 0000000000000000 ]---
This corresponds to the line:
WARN_ON_ONCE(current->flags & PF_MEMALLOC);
within the NOFAIL checks. What's happening here is that the XFS AIL is
trying to write a disk quota update back into the filesystem, but for
that it needs to read the ondisk buffer for the dquot. The buffer is
not in memory anymore, probably because it was evicted. Regardless, the
buffer cache tries to allocate a new buffer, but those allocations are
NOFAIL. The AIL thread has marked itself PF_MEMALLOC (aka noreclaim)
since commit 43ff2122e6492b ("xfs: on-stack delayed write buffer lists")
presumably because reclaim can push on XFS to push on the AIL.
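The condition that trips the warning can be sketched as a small userspace model. This is only an illustration of the logic described above, not kernel code: the `MODEL_*` constants, `struct model_task`, and `model_alloc_warns()` are hypothetical names, and the flag values are placeholders rather than the kernel's real bit values.

```c
#include <stdbool.h>

/* Illustrative flag bits for this model only; not the kernel's values. */
#define MODEL_PF_MEMALLOC	(1u << 0)	/* task has marked itself noreclaim */
#define MODEL_GFP_NOFAIL	(1u << 1)	/* caller demands the allocation never fail */

/* Stand-in for the per-task state that "current->flags" refers to. */
struct model_task {
	unsigned int flags;
};

/*
 * Returns true when the modeled allocator would emit the warning:
 * a NOFAIL allocation issued by a task that has set PF_MEMALLOC.
 */
static bool model_alloc_warns(const struct model_task *task, unsigned int gfp)
{
	if (!(gfp & MODEL_GFP_NOFAIL))
		return false;	/* the check only lives in the NOFAIL path */
	return (task->flags & MODEL_PF_MEMALLOC) != 0;
}
```

In this model the AIL thread corresponds to a task with `MODEL_PF_MEMALLOC` set, and the buffer cache allocation corresponds to a request carrying `MODEL_GFP_NOFAIL`, which is the only combination that warns.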
An easy way to fix this probably would have been to drop the NOFAIL flag
from the xfs_buf allocation and open code a retry loop, but then there's
still the problem that for bs>ps filesystems, the buffer itself could
require up to 64k worth of pages.
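For comparison, the rejected alternative (a fallible allocation wrapped in an open-coded retry loop) might look roughly like the following userspace sketch. The names `model_alloc_fn`, `model_alloc_retry()`, and `model_flaky_alloc()` are hypothetical and exist only for this illustration; the real xfs_buf allocation path is not shown here.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical fallible allocator hook: returns NULL on failure. */
typedef void *(*model_alloc_fn)(size_t size, void *ctx);

/*
 * Retry a fallible allocation until it succeeds, counting attempts.
 * A real caller would back off or wait for memory between attempts.
 */
static void *model_alloc_retry(model_alloc_fn alloc, size_t size, void *ctx,
			       unsigned int *attempts)
{
	void *p;

	*attempts = 0;
	do {
		(*attempts)++;
		p = alloc(size, ctx);
	} while (!p);
	return p;
}

/* Demo allocator: fails while *ctx is nonzero, then falls through to malloc. */
static void *model_flaky_alloc(size_t size, void *ctx)
{
	unsigned int *failures_left = ctx;

	if (*failures_left > 0) {
		(*failures_left)--;
		return NULL;
	}
	return malloc(size);
}
```

A loop like this would avoid the NOFAIL flag, but it still has to obtain every page of the buffer from within the AIL thread, which is the bs>ps concern raised above.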
Inode items had similar behavior (multi-page cluster buffers that we
don't want to allocate in the AIL) which we solved by making transaction
precommit attach the inode cluster buffers to the dirty log item. Let's
solve the dquot problem in the same way.
So: Make a real precommit handler to read the dquot buffer and attach it
to the log item; pass it to dqflush in the push method; and have the
iodone function detach the buffer once we've flushed everything. Add a
state flag to the log item to track when a thread has entered the
precommit -> push mechanism, so that the detaching can be skipped if it
turns out that the dquot is very busy (we don't hold the dquot lock
between log item commit and AIL push).
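The precommit -> push -> iodone lifecycle described above can be modeled in a few lines of userspace C. This is a simplified sketch under stated assumptions, not the patch itself: `struct model_buf`, `struct model_logitem`, the `dirty` flag, and the three functions are hypothetical stand-ins, and locking is omitted entirely.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for a cached buffer; only the reference count is modeled. */
struct model_buf {
	int refcount;
};

/* Stand-in for the dquot log item. */
struct model_logitem {
	struct model_buf *buf;	/* buffer attached at precommit time */
	bool dirty;		/* a precommit has happened since the last push */
};

/* Precommit: allowed to read/allocate, so attach the buffer here. */
static void model_precommit(struct model_logitem *li, struct model_buf *bp)
{
	if (!li->buf) {
		bp->refcount++;		/* the log item now holds a reference */
		li->buf = bp;
	}
	li->dirty = true;
}

/* Push: hand back the already-attached buffer; no lookup or allocation. */
static struct model_buf *model_push(struct model_logitem *li)
{
	li->dirty = false;		/* contents are being flushed to the buffer */
	return li->buf;
}

/* I/O completion: detach the buffer unless the item was re-dirtied. */
static void model_iodone(struct model_logitem *li)
{
	if (li->dirty)
		return;			/* busy dquot: keep the buffer attached */
	if (li->buf) {
		li->buf->refcount--;
		li->buf = NULL;
	}
}
```

The point of the flag in this model is the busy case: if a new transaction commits between push and I/O completion, the completion handler leaves the buffer attached so that the next push still finds it without allocating.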
Reading and attaching the dquot buffer in the precommit hook is inspired
by the work done for inode cluster buffers some time ago.
Cc: <stable@vger.kernel.org> # v6.12
Fixes: 903edea6c53f09 ("mm: warn about illegal __GFP_NOFAIL usage in a more appropriate location and manner")
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Diffstat (limited to 'fs/xfs/xfs_qm.c')
-rw-r--r-- | fs/xfs/xfs_qm.c | 9 |
1 files changed, 6 insertions, 3 deletions
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index d9ac50a33c57..7d07d4b5c339 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -148,7 +148,7 @@ xfs_qm_dqpurge(
 		 * We don't care about getting disk errors here. We need
 		 * to purge this dquot anyway, so we go ahead regardless.
 		 */
-		error = xfs_dquot_read_buf(NULL, dqp, &bp);
+		error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
 		if (error == -EAGAIN) {
 			xfs_dqfunlock(dqp);
 			dqp->q_flags &= ~XFS_DQFLAG_FREEING;
@@ -168,6 +168,7 @@ xfs_qm_dqpurge(
 		}
 		xfs_dqflock(dqp);
 	}
+	xfs_dquot_detach_buf(dqp);
 
 out_funlock:
 	ASSERT(atomic_read(&dqp->q_pincount) == 0);
@@ -505,7 +506,7 @@ xfs_qm_dquot_isolate(
 		/* we have to drop the LRU lock to flush the dquot */
 		spin_unlock(&lru->lock);
 
-		error = xfs_dquot_read_buf(NULL, dqp, &bp);
+		error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
 		if (error) {
 			xfs_dqfunlock(dqp);
 			goto out_unlock_dirty;
@@ -523,6 +524,8 @@ xfs_qm_dquot_isolate(
 			xfs_buf_relse(bp);
 			goto out_unlock_dirty;
 		}
+
+		xfs_dquot_detach_buf(dqp);
 		xfs_dqfunlock(dqp);
 
 		/*
@@ -1510,7 +1513,7 @@ xfs_qm_flush_one(
 		goto out_unlock;
 	}
 
-	error = xfs_dquot_read_buf(NULL, dqp, &bp);
+	error = xfs_dquot_read_buf(NULL, dqp, XBF_TRYLOCK, &bp);
 	if (error)
 		goto out_unlock;
 