diff options
| author | Frederic Weisbecker <frederic@kernel.org> | 2025-03-18 14:56:18 +0100 | 
|---|---|---|
| committer | Joel Fernandes <joelagnelf@nvidia.com> | 2025-04-08 14:55:54 -0400 | 
| commit | 4d949edbc4026eeb81a8f931586cfd4f1de5957e (patch) | |
| tree | 5e23e6f54c62c9ce0e6ef450dd6df6f4c99f5142 /tools/perf/scripts/python/arm-cs-trace-disasm.py | |
| parent | 4aa6e94cf90c0d7414580bd6a936600aef8b1e43 (diff) | |
rcu: Comment on the extraneous delta test on rcu_seq_done_exact()
The numbers used in rcu_seq_done_exact() lack some explanation behind
their magic. Especially after the commit:
    85aad7cc4178 ("rcu: Fix get_state_synchronize_rcu_full() GP-start detection")
which reported a subtle issue where a new GP sequence snapshot was taken
on the root node state while a grace period had already been started and
reflected on the global state sequence but not yet on the root node
sequence, making a polling user waiting on a wrong already started grace
period that would ignore freshly online CPUs.
The fix involved taking the snaphot on the global state sequence and
waiting on the root node sequence. And since a grace period is first
started on the global state and only afterward reflected on the root
node, a snapshot taken on the global state sequence might be two full
grace periods ahead of the root node as in the following example:
rnp->gp_seq = rcu_state.gp_seq = 0
    CPU 0                                           CPU 1
    -----                                           -----
    // rcu_state.gp_seq = 1
    rcu_seq_start(&rcu_state.gp_seq)
                                                    // snap = 8
                                                    snap = rcu_seq_snap(&rcu_state.gp_seq)
                                                    // Two full GP differences
                                                    rcu_seq_done_exact(&rnp->gp_seq, snap)
    // rnp->gp_seq = 1
    WRITE_ONCE(rnp->gp_seq, rcu_state.gp_seq);
Add a comment about those expectations and to clarify the magic within
the relevant function.
Note that the issue arises mainly with the use of rcu_seq_done_exact()
which has a much tigher guardband (of 2 GPs) to ensure the false-negative
window of the API during wraparound is limited to just 2 GPs.
rcu_seq_done() does not have such strict requirements, however its large
false-negative window of ULONG_MAX/2 is not ideal for the polling API.
However, this also means care is needed to ensure the guardband is as
large as needed to avoid the example scenario describe above which a
warning added in an earlier patch does.
[ Comment wordsmithing by Joel ]
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
Diffstat (limited to 'tools/perf/scripts/python/arm-cs-trace-disasm.py')
0 files changed, 0 insertions, 0 deletions
