diff options
author | Nikhil Dhama <nikhil.dhama@amd.com> | 2025-04-07 16:22:19 +0530 |
---|---|---|
committer | Andrew Morton <akpm@linux-foundation.org> | 2025-05-27 19:38:27 -0700 |
commit | c544a952ba61b1a025455098033c17e0573ab085 (patch) | |
tree | e48933db072864be98f8b11983e75f96d22d21e1 /rust/helpers/drm.c | |
parent | 05275594a311cd7427a64b8bcc6097645c0344af (diff) |
mm: pcp: increase pcp->free_count threshold to trigger free_high
In old pcp design, pcp->free_factor gets incremented in nr_pcp_free()
which is invoked by free_pcppages_bulk(). So, it used to increase
free_factor by 1 only when we try to reduce the size of pcp list and
free_high used to trigger only for order > 0 and order < costly_order
and pcp->free_factor > 0.
For iperf3 I noticed that with older design in kernel v6.6, pcp list
was drained mostly when pcp->count > high (more often when count goes
above 530). and most of the time pcp->free_factor was 0, triggering
very few high order flushes.
But this is changed in the current design, introduced in commit
6ccdcb6d3a74 ("mm, pcp: reduce detecting time of consecutive high order
page freeing"), where pcp->free_factor is changed to pcp->free_count to
keep track of the number of pages freed contiguously. In this design,
pcp->free_count is incremented on every deallocation, irrespective of
whether pcp list was reduced or not. And logic to trigger free_high is
if pcp->free_count goes above batch (which is 63) and there are two
contiguous page free without any allocation.
With this design, for iperf3, pcp list is getting flushed more
frequently because free_high heuristics is triggered more often now. I
observed that high order pcp list is drained as soon as both count and
free_count goes above 63.
Due to this more aggressive high order flushing, applications doing
contiguous high order allocation will require to go to global list more
frequently.
On a 2-node AMD machine with 384 vCPUs on each node, connected via
Mellonox connectX-7, I am seeing a ~30% performance reduction if we
scale number of iperf3 client/server pairs from 32 to 64.
Though this new design reduced the time to detect high order flushes,
but for application which are allocating high order pages more
frequently it may be flushing the high order list pre-maturely. This
motivates towards tuning on how late or early we should flush high
order lists.
So, in this patch, we increased the pcp->free_count threshold to
trigger free_high from "batch" to "batch + pcp->high_min / 2" as
suggested by Ying [1], In the original pcp->free_factor solution,
free_high is triggered for contiguous freeing with size ranging from
"batch" to "pcp->high + batch". So, the average value is "batch +
pcp->high / 2". While in the pcp->free_count solution, free_high will
be triggered for contiguous freeing with size "batch". So, to restore
the original behavior, we can use the threshold "batch + pcp->high_min
/ 2"
This new threshold keeps high order pages in pcp list for a longer
duration which can help the application doing high order allocations
frequently.
With this patch performace to Iperf3 is restored and score for other
benchmarks on the same machine are as follows:
iperf3 lmbench3 netperf kbuild
(AF_UNIX) (SCTP_STREAM_MANY)
------- --------- ----------------- ------
v6.6 vanilla (base) 100 100 100 100
v6.12 vanilla 69 113 98.5 98.8
v6.12 + this patch 100 110.3 100.2 99.3
netperf-tcp:
6.12 6.12
vanilla this_patch
Hmean 64 732.14 ( 0.00%) 730.45 ( -0.23%)
Hmean 128 1417.46 ( 0.00%) 1419.44 ( 0.14%)
Hmean 256 2679.67 ( 0.00%) 2676.45 ( -0.12%)
Hmean 1024 8328.52 ( 0.00%) 8339.34 ( 0.13%)
Hmean 2048 12716.98 ( 0.00%) 12743.68 ( 0.21%)
Hmean 3312 15787.79 ( 0.00%) 15887.25 ( 0.63%)
Hmean 4096 17311.91 ( 0.00%) 17332.68 ( 0.12%)
Hmean 8192 20310.73 ( 0.00%) 20465.09 ( 0.76%)
Link: https://lore.kernel.org/all/875xjmuiup.fsf@DESKTOP-5N7EMDA/ [1]
Link: https://lkml.kernel.org/r/20250407105219.55351-1-nikhil.dhama@amd.com
Fixes: 6ccdcb6d3a74 ("mm, pcp: reduce detecting time of consecutive high order page freeing")
Signed-off-by: Nikhil Dhama <nikhil.dhama@amd.com>
Suggested-by: Huang Ying <ying.huang@linux.alibaba.com>
Reviewed-by: Huang Ying <ying.huang@linux.alibaba.com>
Cc: Raghavendra K T <raghavendra.kt@amd.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Bharata B Rao <bharata@amd.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Diffstat (limited to 'rust/helpers/drm.c')
0 files changed, 0 insertions, 0 deletions