path: root/include/linux/srcu.h
Age  Commit message  Author
2025-02-05  srcu: Add srcu_down_read_fast() and srcu_up_read_fast()  Paul E. McKenney
A pair of matching srcu_read_lock_fast() and srcu_read_unlock_fast() invocations must take place within the same context, for example, within the same task. Otherwise, lockdep complains, as is the right thing to do for most use cases. However, there are use cases involving tracing (for example, uretprobes) in which an SRCU reader needs to begin in one task and end in a timer handler, which might interrupt some other task. This commit therefore supplies the semaphore-like srcu_down_read_fast() and srcu_up_read_fast() functions, which act like srcu_read_lock_fast() and srcu_read_unlock_fast(), but permitting srcu_up_read_fast() to be invoked in a different context than was the matching srcu_down_read_fast(). Neither srcu_down_read_fast() nor srcu_up_read_fast() may be invoked from an NMI handler. Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-05  srcu: Document that srcu_{read_lock,down_read}() can share srcu_struct  Paul E. McKenney
This commit adds a sentence to the srcu_down_read() function's kernel-doc header noting that it is permissible to use srcu_down_read() and srcu_read_lock() on the same srcu_struct, even concurrently. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-05  srcu: Fix srcu_read_unlock_{lite,nmisafe}() kernel-doc  Paul E. McKenney
The kernel-doc headers of srcu_read_unlock_lite() and srcu_read_unlock_nmisafe() both say that their idx parameters must come from srcu_read_lock(). This would be bad, because a given srcu_struct structure may be used only with one flavor of SRCU reader. This commit therefore updates the srcu_read_unlock_lite() kernel-doc header to say that its idx parameter must be obtained from srcu_read_lock_lite() and the srcu_read_unlock_nmisafe() kernel-doc header to say that its idx parameter must be obtained from srcu_read_lock_nmisafe(). Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-05  srcu: Add SRCU-fast readers  Paul E. McKenney
This commit adds srcu_read_{,un}lock_fast(), which is similar to srcu_read_{,un}lock_lite(), but avoids the array-indexing and pointer-following overhead. On a microbenchmark featuring tight loops around empty readers, this results in about a 20% speedup compared to RCU Tasks Trace on my x86 laptop. Please note that SRCU-fast has drawbacks compared to RCU Tasks Trace, including:
o Lack of CPU stall warnings.
o SRCU-fast readers permitted only where rcu_is_watching().
o A pointer-sized return value from srcu_read_lock_fast() must be passed to the corresponding srcu_read_unlock_fast().
o In the absence of readers, a synchronize_srcu() having _fast() readers will incur the latency of at least two normal RCU grace periods.
o RCU Tasks Trace priority boosting could be easily added. Boosting SRCU readers is more difficult.
SRCU-fast also has a drawback compared to SRCU-lite, namely that the return value from srcu_read_lock_fast() is a 64-bit pointer and that from srcu_read_lock_lite() is only a 32-bit int. [ paulmck: Apply feedback from Akira Yokosawa. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
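The difference between an int index and a pointer-sized return value can be pictured with a small userspace model. This is only a sketch under stated assumptions: every model_* name is hypothetical, there is a single set of counters rather than per-CPU data, and none of this is the kernel's implementation.

```c
/* Hypothetical single-CPU model; these names are not the kernel's. */
struct model_ctr {
	unsigned long locks;
	unsigned long unlocks;
};

struct model_srcu {
	int idx;			/* which counter pair new readers use */
	struct model_ctr ctr[2];
};

/* "Lite"-style: hand back a small int index. */
int model_lock_lite(struct model_srcu *sp)
{
	int idx = sp->idx;

	sp->ctr[idx].locks++;
	return idx;
}

/* The unlock side must find the structure and index into it again. */
void model_unlock_lite(struct model_srcu *sp, int idx)
{
	sp->ctr[idx].unlocks++;
}

/* "Fast"-style: hand back a pointer to the counters themselves. */
struct model_ctr *model_lock_fast(struct model_srcu *sp)
{
	struct model_ctr *cp = &sp->ctr[sp->idx];

	cp->locks++;
	return cp;
}

/* The unlock side needs only the pointer: no indexing at all. */
void model_unlock_fast(struct model_ctr *cp)
{
	cp->unlocks++;
}
```

The price is the one the commit message names: the caller has to carry a pointer-sized token between lock and unlock instead of an int.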
2025-02-05  srcu: Move SRCU Tree/Tiny definitions from srcu.h  Paul E. McKenney
There are a couple of definitions under "#ifdef CONFIG_TINY_SRCU" in include/linux/srcu.h. There is no point in them being there, so this commit moves them to include/linux/srcutiny.h and include/linux/srcutree.h, thus eliminating that #ifdef. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-05  srcu: Add SRCU_READ_FLAVOR_SLOWGP to flag need for synchronize_rcu()  Paul E. McKenney
This commit switches from a direct test of SRCU_READ_FLAVOR_LITE to a new SRCU_READ_FLAVOR_SLOWGP macro to check for substituting synchronize_rcu() for smp_mb() in SRCU grace periods. Right now, SRCU_READ_FLAVOR_SLOWGP is exactly SRCU_READ_FLAVOR_LITE, but the addition of the _fast() flavor of SRCU will change that. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-05  srcu: Rename srcu_check_read_flavor_lite() to srcu_check_read_flavor_force()  Paul E. McKenney
This commit renames the srcu_check_read_flavor_lite() function to srcu_check_read_flavor_force() and adds a read_flavor argument in order to support an srcu_read_lock_fast() variant that is to avoid array indexing in both the lock and unlock primitives. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
2025-02-05  srcu: Define SRCU_READ_FLAVOR_ALL in terms of symbols  Paul E. McKenney
This commit defines SRCU_READ_FLAVOR_ALL in terms of the SRCU_READ_FLAVOR_* definitions instead of a hexadecimal constant. Suggested-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
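A minimal sketch of the idea, assuming one bit per reader flavor. The numeric values, the composition of the SLOWGP mask after the _fast() flavor is added, and both model_* helpers are illustrative assumptions rather than text copied from srcu.h.

```c
/* Flag values are illustrative assumptions, not copied from srcu.h. */
#define SRCU_READ_FLAVOR_NORMAL	0x1	/* srcu_read_lock() */
#define SRCU_READ_FLAVOR_NMI	0x2	/* srcu_read_lock_nmisafe() */
#define SRCU_READ_FLAVOR_LITE	0x4	/* srcu_read_lock_lite() */
#define SRCU_READ_FLAVOR_FAST	0x8	/* srcu_read_lock_fast() */

/* Built from the symbols, so a new flavor cannot leave a stale hex constant. */
#define SRCU_READ_FLAVOR_ALL	(SRCU_READ_FLAVOR_NORMAL | SRCU_READ_FLAVOR_NMI | \
				 SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST)

/* Assumed: flavors whose grace periods use synchronize_rcu(), not smp_mb(). */
#define SRCU_READ_FLAVOR_SLOWGP	(SRCU_READ_FLAVOR_LITE | SRCU_READ_FLAVOR_FAST)

/* Hypothetical helper: true if exactly one known flavor bit is set. */
int model_flavor_is_valid(int flavor)
{
	return flavor && !(flavor & (flavor - 1)) &&
	       !(flavor & ~SRCU_READ_FLAVOR_ALL);
}

/* Hypothetical helper: does this srcu_struct need the slow grace-period path? */
int model_needs_slow_gp(int recorded_flavors)
{
	return !!(recorded_flavors & SRCU_READ_FLAVOR_SLOWGP);
}
```

Defining the masks symbolically means that adding a flavor is a one-line change, and the validity check follows automatically.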
2024-12-14  Merge branches 'fixes.2024.12.14a', 'rcutorture.2024.12.14a', 'srcu.2024.12.14a' and 'torture-test.2024.12.14a' into rcu-merge.2024.12.14a  Uladzislau Rezki (Sony)
fixes.2024.12.14a: RCU fixes
rcutorture.2024.12.14a: Torture-test updates
srcu.2024.12.14a: SRCU updates
torture-test.2024.12.14a: Adding an extra test, fixes
2024-12-14  srcu: Guarantee non-negative return value from srcu_read_lock()  Paul E. McKenney
For almost 20 years, the int return value from srcu_read_lock() has been always either zero or one. This commit therefore documents the fact that it will be non-negative, and does the same for the underlying __srcu_read_lock(). [ paulmck: Apply Andrii Nakryiko feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
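One thing a documented non-negative return value permits is the use of a negative number as an "no reader held" sentinel in caller-side state. The sketch below is a userspace model under that assumption: model_srcu_read_lock() is a hypothetical stand-in for srcu_read_lock(), and MODEL_NO_READER and struct model_ctx are invented for illustration.

```c
/* Hypothetical stand-in for srcu_read_lock(); it returns only 0 or 1. */
int model_srcu_read_lock(int current_idx)
{
	return current_idx & 0x1;
}

#define MODEL_NO_READER (-1)	/* hypothetical sentinel, safe because idx >= 0 */

struct model_ctx {
	int srcu_idx;	/* MODEL_NO_READER, or the index to pass to unlock */
};

void model_ctx_enter(struct model_ctx *ctx, int current_idx)
{
	ctx->srcu_idx = model_srcu_read_lock(current_idx);
}

/* Returns 1 if an unlock was due, 0 if the context held no reader. */
int model_ctx_exit(struct model_ctx *ctx)
{
	if (ctx->srcu_idx < 0)
		return 0;
	/* Real code would call srcu_read_unlock(sp, ctx->srcu_idx) here. */
	ctx->srcu_idx = MODEL_NO_READER;
	return 1;
}
```

Without the guarantee, a caller would need a separate boolean next to the index to record whether a reader is outstanding.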
2024-12-14  rcutorture: Use symbols for SRCU reader flavors  Paul E. McKenney
This commit converts rcutorture.c values for the reader_flavor module parameter from hexadecimal to the SRCU_READ_FLAVOR_* C-preprocessor macros. The actual modprobe or kernel-boot-parameter values for reader_flavor must still be entered in hexadecimal. Link: https://lore.kernel.org/all/c48c9dca-fe07-4833-acaa-28c827e5a79e@amd.com/ Suggested-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
2024-11-15  srcu: Unconditionally record srcu_read_lock_lite() in ->srcu_reader_flavor  Paul E. McKenney
Currently, srcu_read_lock_lite() uses the SRCU_READ_FLAVOR_LITE bit in ->srcu_reader_flavor to communicate to the grace-period processing in srcu_readers_active_idx_check() that the smp_mb() must be replaced by a synchronize_rcu(). Unfortunately, ->srcu_reader_flavor is not updated unless the kernel is built with CONFIG_PROVE_RCU=y. Therefore in all kernels built with CONFIG_PROVE_RCU=n, srcu_readers_active_idx_check() incorrectly uses smp_mb() instead of synchronize_rcu() for srcu_struct structures whose readers use srcu_read_lock_lite(). This commit therefore causes Tree SRCU srcu_read_lock_lite() to unconditionally update ->srcu_reader_flavor so that srcu_readers_active_idx_check() can make the correct choice. Reported-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Closes: https://lore.kernel.org/all/d07e8f4a-d5ff-4c8e-8e61-50db285c57e9@amd.com/ Fixes: c0f08d6b5a61 ("srcu: Add srcu_read_lock_lite() and srcu_read_unlock_lite()") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12  srcu: Remove smp_mb() from srcu_read_unlock_lite()  Paul E. McKenney
The srcu_read_unlock_lite() function invokes __srcu_read_unlock() instead of __srcu_read_unlock_lite(), which means that it is doing an unnecessary smp_mb(). This is harmless other than the performance degradation. This commit therefore switches to __srcu_read_unlock_lite(). Reported-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Closes: https://lore.kernel.org/all/d07e8f4a-d5ff-4c8e-8e61-50db285c57e9@amd.com/ Fixes: c0f08d6b5a61 ("srcu: Add srcu_read_lock_lite() and srcu_read_unlock_lite()") Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Frederic Weisbecker <frederic@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12  srcu: Improve srcu_read_lock_lite() kernel-doc comment  Paul E. McKenney
Where RCU is watching is where it is OK to invoke rcu_read_lock(). Reported-by: Andrii Nakryiko <andrii@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Andrii Nakryiko <andrii@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12  srcu: Add srcu_read_lock_lite() and srcu_read_unlock_lite()  Paul E. McKenney
This patch adds srcu_read_lock_lite() and srcu_read_unlock_lite(), which dispense with the read-side smp_mb() but also are restricted to code regions that RCU is watching. If a given srcu_struct structure uses srcu_read_lock_lite() and srcu_read_unlock_lite(), it is not permitted to use any other SRCU read-side marker, before, during, or after. Another price of light-weight readers is heavier weight grace periods. Such readers mean that SRCU grace periods on srcu_struct structures used by light-weight readers will incur at least two calls to synchronize_rcu(). In addition, normal SRCU grace periods for light-weight-reader srcu_struct structures never auto-expedite. Note that expedited SRCU grace periods for light-weight-reader srcu_struct structures still invoke synchronize_rcu(), not synchronize_srcu_expedited(). Something about wishing to keep the IPIs down to a dull roar. The srcu_read_lock_lite() and srcu_read_unlock_lite() functions may not (repeat, *not*) be used from NMI handlers, but if this is needed, an additional flavor of SRCU reader can be added by some future commit. [ paulmck: Apply Alexei Starovoitov expediting feedback. ] [ paulmck: Apply kernel test robot feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: kernel test robot <oliver.sang@intel.com> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12  srcu: Create CPP macros for normal and NMI-safe SRCU readers  Paul E. McKenney
This commit creates SRCU_READ_FLAVOR_NORMAL and SRCU_READ_FLAVOR_NMI C-preprocessor macros for srcu_read_lock() and srcu_read_lock_nmisafe(), respectively. These replace the old true/false values that were previously passed to srcu_check_read_flavor(). In addition, the srcu_check_read_flavor() function itself requires a bit of rework to handle bitmasks instead of true/false values. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
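The move from true/false to bitmasks can be illustrated with a small consistency check. This is a hedged userspace sketch: the MODEL_* values, struct model_srcu_data, and model_check_read_flavor() are hypothetical and only loosely follow what the commit message describes.

```c
#define MODEL_FLAVOR_NORMAL	0x1	/* assumed value */
#define MODEL_FLAVOR_NMI	0x2	/* assumed value */

struct model_srcu_data {
	int reader_flavor;	/* 0 until the first reader records its flavor */
};

/*
 * Record the flavor on first use; afterwards report any mismatch.
 * Returns 0 if consistent, -1 if a different flavor was already recorded.
 */
int model_check_read_flavor(struct model_srcu_data *sdp, int read_flavor)
{
	if (!sdp->reader_flavor) {
		sdp->reader_flavor = read_flavor;
		return 0;
	}
	return sdp->reader_flavor == read_flavor ? 0 : -1;
}
```

A boolean can tell only two flavors apart, whereas a bit per flavor leaves room for further flavors without changing the shape of the check.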
2024-11-12  srcu: Improve srcu_read_lock{,_nmisafe}() comments  Paul E. McKenney
This commit adds some additional usage constraints to the kernel-doc headers of srcu_read_lock() and srcu_read_lock_nmisafe(). Suggested-by: Andrii Nakryiko <andrii.nakryiko@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-11-12  srcu: Renaming in preparation for additional reader flavor  Paul E. McKenney
Currently, there are only two flavors of readers, normal and NMI-safe. A number of fields, functions, and types reflect this restriction. This renaming-only commit prepares for the addition of light-weight (as in memory-barrier-free) readers. OK, OK, there is also a drive-by white-space fixup! Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: <bpf@vger.kernel.org> Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2024-07-20  Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm  Linus Torvalds
Pull kvm updates from Paolo Bonzini:
"ARM:
- Initial infrastructure for shadow stage-2 MMUs, as part of nested virtualization enablement
- Support for userspace changes to the guest CTR_EL0 value, enabling (in part) migration of VMs between heterogeneous hardware
- Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of the protocol
- FPSIMD/SVE support for nested, including merged trap configuration and exception routing
- New command-line parameter to control the WFx trap behavior under KVM
- Introduce kCFI hardening in the EL2 hypervisor
- Fixes + cleanups for handling presence/absence of FEAT_TCRX
- Miscellaneous fixes + documentation updates
LoongArch:
- Add paravirt steal time support
- Add support for KVM_DIRTY_LOG_INITIALLY_SET
- Add perf kvm-stat support for loongarch
RISC-V:
- Redirect AMO load/store access fault traps to guest
- perf kvm stat support
- Use guest files for IMSIC virtualization, when available
s390:
- Assortment of tiny fixes which are not time critical
x86:
- Fixes for Xen emulation
- Add a global struct to consolidate tracking of host values, e.g. EFER
- Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC bus frequency, because TDX
- Print the name of the APICv/AVIC inhibits in the relevant tracepoint
- Clean up KVM's handling of vendor specific emulation to consistently act on "compatible with Intel/AMD", versus checking for a specific vendor
- Drop MTRR virtualization, and instead always honor guest PAT on CPUs that support self-snoop
- Update to the newfangled Intel CPU FMS infrastructure
- Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads '0' and writes from userspace are ignored
- Misc cleanups
x86 - MMU:
- Small cleanups, renames and refactoring extracted from the upcoming Intel TDX support
- Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't hold leaf SPTEs
- Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager page splitting, to avoid stalling vCPUs when splitting huge pages
- Bug the VM instead of simply warning if KVM tries to split a SPTE that is non-present or not-huge. KVM is guaranteed to end up in a broken state because the callers fully expect a valid SPTE, it's all but dangerous to let more MMU changes happen afterwards
x86 - AMD:
- Make per-CPU save_area allocations NUMA-aware
- Force sev_es_host_save_area() to be inlined to avoid calling into an instrumentable function from noinstr code
- Base support for running SEV-SNP guests. API-wise, this includes a new KVM_X86_SNP_VM type, encrypting/measuring the initial image into guest memory, and finalizing it before launching it. Internally, there are some gmem/mmu hooks needed to prepare gmem-allocated pages before mapping them into guest private memory ranges
  This includes basic support for attestation guest requests, enough to say that KVM supports the GHCB 2.0 specification
  There is no support yet for loading into the firmware those signing keys to be used for attestation requests, and therefore no need yet for the host to provide certificate data for those keys. To support fetching certificate data from userspace, a new KVM exit type will be needed to handle fetching the certificate from userspace. An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS exit type to handle this was introduced in v1 of this patchset, but is still being discussed by the community, so for now this patchset only implements a stub version of SNP Extended Guest Requests that does not provide certificate data
x86 - Intel:
- Remove an unnecessary EPT TLB flush when enabling hardware
- Fix a series of bugs that cause KVM to fail to detect nested pending posted interrupts as valid wake events for a vCPU executing HLT in L2 (with HLT-exiting disabled by L1)
- KVM: x86: Suppress MMIO that is triggered during task switch emulation
  Explicitly suppress userspace emulated MMIO exits that are triggered when emulating a task switch as KVM doesn't support userspace MMIO during complex (multi-step) emulation
  Silently ignoring the exit request can result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace for some other reason prior to purging mmio_needed
  See commit 0dc902267cb3 ("KVM: x86: Suppress pending MMIO write exits if emulator detects exception") for more details on KVM's limitations with respect to emulated MMIO during complex emulator flows
Generic:
- Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE, because the special casing needed by these pages is not due to just unmovability (and in fact they are only unmovable because the CPU cannot access them)
- New ioctl to populate the KVM page tables in advance, which is useful to mitigate KVM page faults during guest boot or after live migration. The code will also be used by TDX, but (probably) not through the ioctl
- Enable halt poll shrinking by default, as Intel found it to be a clear win
- Setup empty IRQ routing when creating a VM to avoid having to synchronize SRCU when creating a split IRQCHIP on x86
- Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag that arch code can use for hooking both sched_in() and sched_out()
- Take the vCPU @id as an "unsigned long" instead of "u32" to avoid truncating a bogus value from userspace, e.g. to help userspace detect bugs
- Mark a vCPU as preempted if and only if it's scheduled out while in the KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest memory when retrieving guest state during live migration blackout
Selftests:
- Remove dead code in the memslot modification stress test
- Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs
- Print the guest pseudo-RNG seed only when it changes, to avoid spamming the log for tests that create lots of VMs
- Make the PMU counters test less flaky when counting LLC cache misses by doing CLFLUSH{OPT} in every loop iteration"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
  crypto: ccp: Add the SNP_VLEK_LOAD command
  KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
  KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
  KVM: x86: Replace static_call_cond() with static_call()
  KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  x86/sev: Move sev_guest.h into common SEV header
  KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
  KVM: x86: Suppress MMIO that is triggered during task switch emulation
  KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
  KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
  KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
  KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
  KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
  KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
  KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
  KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  KVM: Document KVM_PRE_FAULT_MEMORY ioctl
  mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
  perf kvm: Add kvm-stat for loongarch64
  LoongArch: KVM: Add PV steal time support in guest side
  ...
2024-06-18  srcu: Fill out polled grace-period APIs  Paul E. McKenney
This commit adds the get_completed_synchronize_srcu() and the same_state_synchronize_srcu() functions. The first returns a cookie that is always interpreted as corresponding to an expired grace period. The second does an equality comparison of a pair of cookies. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Kent Overstreet <kent.overstreet@linux.dev>
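The roles of the two functions can be sketched with a deliberately simplified cookie model. Everything here is an assumption made for illustration: the kernel encodes grace-period state differently and handles counter wrap, while this model uses plain counters and hypothetical model_* names.

```c
#include <stdbool.h>

struct model_gp {
	unsigned long started;		/* number of grace periods started */
	unsigned long completed;	/* number of grace periods completed */
};

/* Cookie naming the grace period that must end before old readers are done. */
unsigned long model_get_state(const struct model_gp *gp)
{
	return gp->started + 1;
}

/* Cookie that every poll treats as already expired. */
unsigned long model_get_completed(void)
{
	return 0;
}

/* Has the grace period named by the cookie completed? */
bool model_poll_state(const struct model_gp *gp, unsigned long cookie)
{
	return gp->completed >= cookie;
}

/* Plain equality comparison of two cookies. */
bool model_same_state(unsigned long a, unsigned long b)
{
	return a == b;
}

/* Run one full grace period to completion. */
void model_run_gp(struct model_gp *gp)
{
	gp->started++;
	gp->completed = gp->started;
}
```

An always-expired cookie is handy for initializing a field that will later hold a real cookie, and the equality test lets a caller skip redundant work when two objects are waiting on the same grace period.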
2024-06-18  srcu: Add NUM_ACTIVE_SRCU_POLL_OLDSTATE  Paul E. McKenney
This commit adds NUM_ACTIVE_SRCU_POLL_OLDSTATE, which gives the maximum number of distinct return values from get_state_synchronize_srcu() that can, at a given point in time, correspond to not-completed SRCU grace periods. Reported-by: Kent Overstreet <kent.overstreet@linux.dev> Closes: https://lore.kernel.org/all/irycqy4sinjdgm2hkyix2bffunpcmuwgeufsx6nlljvqme3wiu@ify3zdnrmzph/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2024-06-07  srcu: Add an API for a memory barrier after SRCU read lock  Yan Zhao
To avoid redundant memory barriers, add smp_mb__after_srcu_read_lock() to pair with smp_mb__after_srcu_read_unlock() for use in paths that need to emit a memory barrier, but already do srcu_read_lock(), which includes a full memory barrier. Provide an API, e.g. as opposed to having callers document the behavior via a comment, as the full memory barrier provided by srcu_read_lock() is an implementation detail that shouldn't bleed into random subsystems. KVM will use smp_mb__after_srcu_read_lock() in its VM-Exit path to ensure a memory barrier is emitted, which is necessary to ensure correctness of mixed memory types on CPUs that support self-snoop. Cc: Paolo Bonzini <pbonzini@redhat.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Kevin Tian <kevin.tian@intel.com> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com> [sean: massage changelog] Tested-by: Xiangfei Ma <xiangfeix.ma@intel.com> Tested-by: Yongwei Ma <yongwei.ma@intel.com> Reviewed-by: Paul E. McKenney <paulmck@kernel.org> Link: https://lore.kernel.org/r/20240309010929.1403984-4-seanjc@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2023-12-13  srcu: Use try-lock lockdep annotation for NMI-safe access.  Sebastian Andrzej Siewior
It is claimed that srcu_read_lock_nmisafe() is NMI-safe. However, it triggers a lockdep warning if used from NMI because lockdep expects a deadlock since nothing disables NMIs while the lock is acquired. This is because commit f0f44752f5f61 ("rcu: Annotate SRCU's update-side lockdep dependencies") annotates synchronize_srcu() as a write lock usage. This helps to detect deadlocks such as srcu_read_lock(); synchronize_srcu(); srcu_read_unlock(); The side effect is that the lock srcu_struct now has a USED usage in normal contexts, so it conflicts with a USED_READ usage in NMI. But this shouldn't cause a real deadlock because the write lock usage from synchronize_srcu() is a fake one and only used for read/write deadlock detection. Use a try-lock annotation for srcu_read_lock_nmisafe() to avoid lockdep complaints if used from NMI. Fixes: f0f44752f5f6 ("rcu: Annotate SRCU's update-side lockdep dependencies") Link: https://lore.kernel.org/r/20230927160231.XRCDDSK4@linutronix.de Reviewed-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Neeraj Upadhyay (AMD) <neeraj.iitr10@gmail.com>
2023-07-04  Merge tag 'core_guards_for_6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue  Linus Torvalds
Pull scope-based resource management infrastructure from Peter Zijlstra: "These are the first few patches in the Scope-based Resource Management series that introduce the infrastructure but not any conversions as of yet. Adding the infrastructure now allows multiple people to start using them. Of note is that Sparse will need some work since it doesn't yet understand this attribute and might have decl-after-stmt issues"
* tag 'core_guards_for_6.5_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue:
  kbuild: Drop -Wdeclaration-after-statement
  locking: Introduce __cleanup() based infrastructure
  apparmor: Free up __cleanup() name
  dmaengine: ioat: Free up __cleanup() name
2023-06-26  locking: Introduce __cleanup() based infrastructure  Peter Zijlstra
Use __attribute__((__cleanup__(func))) to build: - simple auto-release pointers using __free() - 'classes' with constructor and destructor semantics for scope-based resource management. - lock guards based on the above classes. Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20230612093537.614161713%40infradead.org
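The underlying compiler feature can be tried in userspace with GCC or Clang. The sketch below builds a tiny lock guard directly on __attribute__((__cleanup__())); the MODEL_GUARD macro and all model_* names are hypothetical and are not the kernel's guard() or __free() helpers.

```c
/* Hypothetical counter standing in for a lock or an SRCU domain. */
struct model_lock {
	int depth;
};

void model_acquire(struct model_lock *l) { l->depth++; }
void model_release(struct model_lock *l) { l->depth--; }

/* Cleanup handlers receive a pointer to the annotated variable. */
void model_guard_exit(struct model_lock **lp)
{
	if (*lp)
		model_release(*lp);
}

/* Constructor half of the guard: acquire and hand back the lock pointer. */
struct model_lock *model_guard_enter(struct model_lock *l)
{
	model_acquire(l);
	return l;
}

#define MODEL_GUARD(name, lock) \
	struct model_lock *name __attribute__((__cleanup__(model_guard_exit))) = \
		model_guard_enter(lock)

/* Every return path drops the lock without an explicit release call. */
int model_peek_depth(struct model_lock *l, int bail_early)
{
	MODEL_GUARD(g, l);

	(void)g;		/* the guard variable is used only for its scope */
	if (bail_early)
		return -1;
	return l->depth;	/* read while the guard is still live */
}
```

The return expression is evaluated before the cleanup handler runs, so the function observes the held state and the caller observes the released state.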
2023-05-09  srcu: Remove extraneous parentheses from srcu_read_lock() etc.  Paul E. McKenney
This commit removes extraneous parentheses from srcu_read_lock(), srcu_read_lock_nmisafe(), srcu_read_unlock(), and srcu_read_unlock_nmisafe(). Looks like someone was once a macro. Cc: Christoph Hellwig <hch@lst.de> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Tested-by: "Zhang, Qiang1" <qiang1.zhang@intel.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2023-03-27  rcu: Annotate SRCU's update-side lockdep dependencies  Boqun Feng
Although all flavors of RCU readers are annotated correctly with lockdep as recursive read locks, they do not set the lock_acquire 'check' parameter. This means that RCU read locks are not added to the lockdep dependency graph, which in turn means that lockdep cannot detect RCU-based deadlocks. This is not a problem for RCU flavors having atomic read-side critical sections because context-based annotations can catch these deadlocks, see for example the RCU_LOCKDEP_WARN() statement in synchronize_rcu(). But context-based annotations are not helpful for sleepable RCU, especially given that it is perfectly legal to do synchronize_srcu(&srcu1) within an srcu_read_lock(&srcu2). However, we can detect SRCU-based deadlocks by: (1) Making srcu_read_lock() a 'check'ed recursive read lock and (2) Making synchronize_srcu() an empty write lock critical section. Even better, with the newly introduced lock_sync(), we can avoid false positives about irq-unsafe/safe. This commit therefore makes it so. Note that NMI-safe SRCU read side critical sections are currently not annotated, but might be annotated in the future. Signed-off-by: Boqun Feng <boqun.feng@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> [ boqun: Add comments for annotation per Waiman's suggestion ] [ boqun: Fix comment warning reported by Stephen Rothwell ] Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
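The simplest deadlock that this annotation targets is a task calling synchronize_srcu() on a domain for which it still holds a reader. The userspace sketch below models only that self-deadlock and the legal cross-domain case; lockdep itself works on a full dependency graph, and all model_* names and the domain limit are hypothetical.

```c
#define MODEL_NR_DOMAINS 4	/* hypothetical limit for this sketch */

struct model_task {
	int readers_held[MODEL_NR_DOMAINS];	/* nesting depth per srcu_struct */
};

void model_task_read_lock(struct model_task *t, int domain)
{
	t->readers_held[domain]++;
}

void model_task_read_unlock(struct model_task *t, int domain)
{
	t->readers_held[domain]--;
}

/*
 * synchronize_srcu() waits for all readers of its own domain, so calling
 * it while holding a reader on that same domain can never finish.
 * Returns 1 for that self-deadlock, 0 otherwise.
 */
int model_synchronize_would_deadlock(const struct model_task *t, int domain)
{
	return t->readers_held[domain] > 0;
}
```

Waiting on one domain while reading under another is reported as safe here, which matches the commit message's point that such nesting is legal.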
2023-01-03  rcu: Add srcu_down_read() and srcu_up_read()  Paul E. McKenney
A pair of matching srcu_read_lock() and srcu_read_unlock() invocations must take place within the same context, for example, within the same task. Otherwise, lockdep complains, as is the right thing to do for most use cases. However, there are use cases involving asynchronous I/O where the SRCU reader needs to begin on one task and end on another. This commit therefore supplies the semaphore-like srcu_down_read() and srcu_up_read(), which act like srcu_read_lock() and srcu_read_unlock(), but permitting srcu_up_read() to be invoked in a different context than was the matching srcu_down_read(). Neither srcu_down_read() nor srcu_up_read() may be invoked from an NMI handler. Reported-by: Jan Kara <jack@suse.cz> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Tested-by: Amir Goldstein <amir73il@gmail.com>
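The contrast between the two pairs can be modelled in userspace as identical counting with and without an ownership check. This is a sketch under stated assumptions: the lockdep_owner field is a crude stand-in for lockdep's context tracking, the index returned by the real functions is left out, and every model_* name is hypothetical.

```c
#define MODEL_NO_OWNER (-1)

struct model_domain {
	unsigned long nlocks;
	unsigned long nunlocks;
	int lockdep_owner;	/* context that "lockdep" believes holds the reader */
};

/* lock/unlock pair: ownership is recorded and checked. */
void model_read_lock(struct model_domain *d, int ctx)
{
	d->nlocks++;
	d->lockdep_owner = ctx;
}

/* Returns -1 (a "lockdep complaint") when released from another context. */
int model_read_unlock(struct model_domain *d, int ctx)
{
	int ret = d->lockdep_owner == ctx ? 0 : -1;

	d->nunlocks++;
	d->lockdep_owner = MODEL_NO_OWNER;
	return ret;
}

/* down/up pair: same counting, but no ownership is recorded or checked. */
void model_down_read(struct model_domain *d)
{
	d->nlocks++;
}

void model_up_read(struct model_domain *d)
{
	d->nunlocks++;
}

/* A grace period may end only when every reader has finished. */
int model_readers_done(const struct model_domain *d)
{
	return d->nlocks == d->nunlocks;
}
```

Both pairs keep a grace period waiting in the same way; only the ownership check differs, which is why the down/up form suits readers that begin in one context and end in another.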
2022-11-29  rcu: Make SRCU mandatory  Paul E. McKenney
Kernels configured with CONFIG_PRINTK=n and CONFIG_SRCU=n get build failures. This causes trouble for deep embedded systems. But given that there are more than 25 instances of "select SRCU" in the kernel, it is hard to believe that there are many kernels running in production without SRCU. This commit therefore makes SRCU mandatory. The SRCU Kconfig option remains for backwards compatibility, and will be removed when it is no longer used. [ paulmck: Update per kernel test robot feedback. ] Reported-by: John Ogness <john.ogness@linutronix.de> Reported-by: Petr Mladek <pmladek@suse.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: <linux-arch@vger.kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reviewed-by: John Ogness <john.ogness@linutronix.de>
2022-10-21  srcu: Debug NMI safety even on archs that don't require it  Frederic Weisbecker
Currently the NMI safety debugging is only performed on architectures that don't support NMI-safe this_cpu_inc(). Reorder the code so that other architectures like x86 also detect bad uses. [ paulmck: Apply kernel test robot, Stephen Rothwell, and Zqiang feedback. ] Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2022-10-20  srcu: Check for consistent per-CPU per-srcu_struct NMI safety  Paul E. McKenney
This commit adds runtime checks to verify that a given srcu_struct uses consistent NMI-safe (or not) read-side primitives on a per-CPU basis. Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com>
2022-10-20srcu: Create an srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe()Paul E. McKenney
On strict load-store architectures, the use of this_cpu_inc() by srcu_read_lock() and srcu_read_unlock() is not NMI-safe in TREE SRCU. To see this, suppose that an NMI arrives in the middle of srcu_read_lock(), just after it has read ->srcu_lock_count, but before it has written the incremented value back to memory. If that NMI handler also does srcu_read_lock() and srcu_read_unlock() on that same srcu_struct structure, then upon return from that NMI handler, the interrupted srcu_read_lock() will overwrite the NMI handler's update to ->srcu_lock_count, but leave unchanged the NMI handler's update by srcu_read_unlock() to ->srcu_unlock_count. This can result in a too-short SRCU grace period, which can in turn result in arbitrary memory corruption. If the NMI handler instead interrupts the srcu_read_unlock(), this can result in eternal SRCU grace periods, which is not much better. This commit therefore creates a pair of new srcu_read_lock_nmisafe() and srcu_read_unlock_nmisafe() functions, which allow SRCU readers in both NMI handlers and in process and IRQ context. It is bad practice to mix the existing and the new _nmisafe() primitives on the same srcu_struct structure. Use one set or the other, not both. Just to underline that "bad practice" point, using srcu_read_lock() at process level and srcu_read_lock_nmisafe() in your NMI handler will not, repeat NOT, work. If you do not immediately understand why this is the case, please review the earlier paragraphs in this commit log. [ paulmck: Apply kernel test robot feedback. ] [ paulmck: Apply feedback from Randy Dunlap. ] [ paulmck: Apply feedback from John Ogness. ] [ paulmck: Apply feedback from Frederic Weisbecker. ] Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/ Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Acked-by: Randy Dunlap <rdunlap@infradead.org> # build-tested Reviewed-by: Frederic Weisbecker <frederic@kernel.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: John Ogness <john.ogness@linutronix.de> Cc: Petr Mladek <pmladek@suse.com>
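The lost-update interleaving described in the commit above can be replayed deterministically in userspace. The following is only a sketch: struct toy_counts and both functions are hypothetical stand-ins for one CPU's ->srcu_lock_count and ->srcu_unlock_count, not kernel code, and the NMI is modeled as an ordinary function call placed between the load and the store of the interrupted increment.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for one CPU's lock and unlock counters. */
struct toy_counts {
	unsigned long lock_count;
	unsigned long unlock_count;
};

/* A complete reader that runs inside the simulated NMI handler. */
static void toy_nmi_reader(struct toy_counts *c)
{
	c->lock_count++;	/* models srcu_read_lock() */
	c->unlock_count++;	/* models srcu_read_unlock() */
}

/*
 * Replay the interleaving: a non-atomic increment is split into its
 * load and store halves, and the NMI reader runs in between.
 * Returns true if the counters balance even though the interrupted
 * reader has locked and not yet unlocked.
 */
static bool toy_lost_update(struct toy_counts *c)
{
	unsigned long tmp;

	tmp = c->lock_count;		/* load half of the increment */
	toy_nmi_reader(c);		/* the NMI arrives here */
	c->lock_count = tmp + 1;	/* store half overwrites the NMI's update */

	return c->lock_count == c->unlock_count;
}
```

Starting from zeroed counters, toy_lost_update() leaves both counts at one while one reader is still inside its critical section, which is the condition under which a grace period could end too early.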
2021-11-30rcu: Replace ________p1 and _________p1 with __UNIQUE_ID(rcu)Chun-Hung Tseng
This commit replaces both ________p1 and _________p1 with __UNIQUE_ID(rcu), and also adjusts the callers of the affected macros. __UNIQUE_ID(rcu) will generate unique variable names during compilation, which eliminates the need for ________p1 and _________p1 (both having 4 occurrences prior to the code change). This also avoids the variable name shadowing issue, or at least makes those wishing to cause shadowing problems work much harder to do so. The same idea is used for the min/max macros (commit 589a978 and commit e9092d0). Signed-off-by: Jim Huang <jserv@ccns.ncku.edu.tw> Signed-off-by: Chun-Hung Tseng <henrybear327@gmail.com> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
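The unique-name technique from the commit above can be sketched in userspace as follows. All TOY_ names are hypothetical analogues rather than the kernel's macros, and the sketch assumes a compiler with the GNU C extensions __COUNTER__, __typeof__, and statement expressions (GCC or Clang).

```c
/* Two-level paste so that __COUNTER__ is expanded before pasting. */
#define TOY_PASTE_(a, b) a##b
#define TOY_PASTE(a, b) TOY_PASTE_(a, b)

/* Hypothetical analogue of __UNIQUE_ID(): a fresh identifier per use. */
#define TOY_UNIQUE_ID(prefix) \
	TOY_PASTE(TOY_PASTE(toy_uid_, prefix), __COUNTER__)

/*
 * The outer macro generates the name once and hands it to the inner
 * macro, which uses it for its temporary. With a fixed temporary name,
 * a caller variable of the same name would be shadowed inside the macro.
 */
#define TOY_DEREF_(p, local) ({ __typeof__(*(p)) *local = (p); *local; })
#define TOY_DEREF(p) TOY_DEREF_(p, TOY_UNIQUE_ID(rcu))

/* Caller variables deliberately named like old-style macro temporaries. */
static int toy_sum(int *_p1, int *__p1)
{
	return TOY_DEREF(_p1) + TOY_DEREF(__p1);
}
```

Passing the generated name in as a macro argument is what ensures that the declaration and the later use of the temporary refer to the same identifier, since each expansion of __COUNTER__ yields a new value.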
2021-05-10srcu: Initialize SRCU after timersFrederic Weisbecker
Once srcu_init() is called, the SRCU core will make use of delayed workqueues, which rely on timers. However init_timers() is called several steps after rcu_init(). This means that a call_srcu() after rcu_init() but before init_timers() would find itself within a dangerously uninitialized timer core. This commit therefore creates a separate call to srcu_init() after init_timers() completes, which ensures that we stay in early SRCU mode until timers are safe(r). Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Cc: Uladzislau Rezki <urezki@gmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: Neeraj Upadhyay <neeraju@codeaurora.org> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Joel Fernandes <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
2021-01-04srcu: Provide polling interfaces for Tiny SRCU grace periodsPaul E. McKenney
There is a need for a polling interface for SRCU grace periods, so this commit supplies get_state_synchronize_srcu(), start_poll_synchronize_srcu(), and poll_state_synchronize_srcu() for this purpose. The first can be used if future grace periods are inevitable (perhaps due to a later call_srcu() invocation), the second if future grace periods might not otherwise happen, and the third to check if a grace period has elapsed since the corresponding call to either of the first two. As with get_state_synchronize_rcu() and cond_synchronize_rcu(), the return value from either get_state_synchronize_srcu() or start_poll_synchronize_srcu() must be passed in to a later call to poll_state_synchronize_srcu(). Link: https://lore.kernel.org/rcu/20201112201547.GF3365678@moria.home.lan/ Reported-by: Kent Overstreet <kent.overstreet@gmail.com> [ paulmck: Add EXPORT_SYMBOL_GPL() per kernel test robot feedback. ] [ paulmck: Apply feedback from Neeraj Upadhyay. ] Link: https://lore.kernel.org/lkml/20201117004017.GA7444@paulmck-ThinkPad-P72/ Reviewed-by: Neeraj Upadhyay <neeraju@codeaurora.org> Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
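The division of labor among the three polling functions described above can be illustrated with a toy model. This is a sketch under loud assumptions: struct toy_srcu and every toy_ function are hypothetical, grace periods are reduced to a simple counter that never wraps, and the real cookies are derived from SRCU's own internal state rather than from anything shown here.

```c
#include <stdbool.h>

/* Toy model only, not the kernel's data structure. */
struct toy_srcu {
	unsigned long completed;	/* grace periods that have finished */
	unsigned long requested;	/* highest grace period asked for */
};

/* Cookie for "a full grace period after now", without requesting one. */
static unsigned long toy_get_state(const struct toy_srcu *ssp)
{
	return ssp->completed + 1;
}

/* Same cookie, but also ensure that the grace period will happen. */
static unsigned long toy_start_poll(struct toy_srcu *ssp)
{
	unsigned long cookie = ssp->completed + 1;

	if (ssp->requested < cookie)
		ssp->requested = cookie;
	return cookie;
}

/* Has the grace period named by the cookie elapsed? */
static bool toy_poll_state(const struct toy_srcu *ssp, unsigned long cookie)
{
	return ssp->completed >= cookie;
}

/* Stand-in for the grace-period machinery: run one requested period. */
static void toy_advance(struct toy_srcu *ssp)
{
	if (ssp->completed < ssp->requested)
		ssp->completed++;
}
```

In this model a cookie from toy_get_state() alone never becomes ready, because nothing requested a grace period. Once toy_start_poll() (or, in the real API, something like a later call_srcu()) requests one and it completes, polling either cookie succeeds.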
2019-03-26srcu: Remove cleanup_srcu_struct_quiesced()Paul E. McKenney
The cleanup_srcu_struct_quiesced() function was added because NVME used WQ_MEM_RECLAIM workqueues and SRCU did not, which meant that NVME workqueues waiting on SRCU workqueues could result in deadlocks during low-memory conditions. However, SRCU now also has WQ_MEM_RECLAIM workqueues, so there is no longer a potential for deadlock. Furthermore, it turns out to be extremely hard to use cleanup_srcu_struct_quiesced() correctly due to the fact that SRCU callback invocation accesses the srcu_struct structure's per-CPU data area just after callbacks are invoked. Therefore, the usual practice of using srcu_barrier() to wait for callbacks to be invoked before invoking cleanup_srcu_struct_quiesced() fails because SRCU's callback-invocation workqueue handler might be delayed, which can result in cleanup_srcu_struct_quiesced() being invoked (and thus freeing the per-CPU data) before the SRCU's callback-invocation workqueue handler is finished using that per-CPU data. Nor is this a theoretical problem: KASAN emitted use-after-free warnings because of this problem on actual runs. In short, NVME can now safely invoke cleanup_srcu_struct(), which avoids the use-after-free scenario. And cleanup_srcu_struct_quiesced() is quite difficult to use safely. This commit therefore removes cleanup_srcu_struct_quiesced(), switching its sole user back to cleanup_srcu_struct(). This effectively reverts the following pair of commits: f7194ac32ca2 ("srcu: Add cleanup_srcu_struct_quiesced()") 4317228ad9b8 ("nvme: Avoid flush dependency in delete controller flow") Reported-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Tested-by: Bart Van Assche <bvanassche@acm.org>
2019-02-09Merge branches 'doc.2019.01.26a', 'fixes.2019.01.26a', 'sil.2019.01.26a', 'spdx.2019.02.09a', 'srcu.2019.01.26a' and 'torture.2019.01.26a' into HEADPaul E. McKenney
doc.2019.01.26a: Documentation updates. fixes.2019.01.26a: Miscellaneous fixes. sil.2019.01.26a: Removal of a few more spin_is_locked() instances. spdx.2019.02.09a: Add SPDX identifiers to RCU files. srcu.2019.01.26a: SRCU updates. torture.2019.01.26a: Torture-test updates.
2019-02-09linux/srcu: Convert to SPDX license identifierPaul E. McKenney
Replace the license boilerplate with an SPDX license identifier. While in the area, update an email address. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com> [ paulmck: Update .h SPDX format per Joe Perches. ] Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
2019-01-25srcu: Check for invalid idx argument in srcu_read_unlock()Paul E. McKenney
The current SRCU implementation has an idx argument of zero or one, and never anything else. This commit therefore adds a WARN_ON_ONCE() to complain if this restriction is violated. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
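The restriction stated above amounts to a one-bit range check. The helper below is a hypothetical userspace sketch of such a check, not the code the commit added; in the kernel, the result would feed a WARN_ON_ONCE() rather than be returned to the caller.

```c
#include <stdbool.h>

/* True if idx is a value the lock side could have returned (0 or 1). */
static bool toy_srcu_idx_valid(int idx)
{
	/* Any bit other than bit 0 being set means idx is out of range. */
	return (idx & ~0x1) == 0;
}
```

Masking off bit 0 also rejects negative values, so a stale or uninitialized idx is likely to be caught as well.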
2018-11-27srcu: Use "ssp" instead of "sp" for srcu_struct pointerPaul E. McKenney
In RCU, the distinction between "rsp", "rnp", and "rdp" has served well for a great many years, but in SRCU, "sp" vs. "sdp" has proven confusing. This commit therefore renames SRCU's "sp" pointers to "ssp", so that there is "ssp" for srcu_struct pointer, "snp" for srcu_node pointer, and "sdp" for srcu_data pointer. Signed-off-by: Paul E. McKenney <paulmck@linux.ibm.com>
2018-11-08srcu: Fix kernel-doc missing notationRandy Dunlap
Fix kernel-doc warnings for missing parameter descriptions: ../include/linux/srcu.h:175: warning: Function parameter or member 'p' not described in 'srcu_dereference_notrace' ../include/linux/srcu.h:175: warning: Function parameter or member 'sp' not described in 'srcu_dereference_notrace' Fixes: 0b764a6e4e19d ("srcu: Add notrace variant of srcu_dereference") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Lai Jiangshan <jiangshanlai@gmail.com> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Josh Triplett <josh@joshtriplett.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2018-07-26srcu: Add notrace variant of srcu_dereferenceJoel Fernandes (Google)
In the last patch in this series, we are making lockdep register hooks onto the irq_{disable,enable} tracepoints. These tracepoints use the _rcuidle tracepoint variant. In this series we switch the _rcuidle tracepoint callers to use SRCU instead of sched-RCU. In order to dereference the pointer to the probe functions, we could call srcu_dereference; however, this API will call back into lockdep to check if the lock is held *before* the lockdep probe hooks have a chance to run and annotate the IRQ enabled/disabled state. For this reason we need a notrace variant of srcu_dereference since otherwise we get lockdep splats. This patch adds the needed srcu_dereference_notrace variant. Link: http://lkml.kernel.org/r/20180628182149.226164-3-joel@joelfernandes.org Reviewed-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-07-26srcu: Add notrace variants of srcu_read_{lock,unlock}Paul McKenney
This is needed for a future tracepoint patch that uses srcu, and to make sure it doesn't call into lockdep. Tracepoint code already calls notrace variants for rcu_read_lock_sched, so this patch does the same for srcu, which will be used in a later patch. This keeps it consistent with rcu-sched. [Joel: Added commit message] Link: http://lkml.kernel.org/r/20180628182149.226164-2-joel@joelfernandes.org Reviewed-by: Steven Rostedt (VMware) <rostedt@goodmis.org> Signed-off-by: Paul McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2018-05-15srcu: Add cleanup_srcu_struct_quiesced()Paul E. McKenney
The current cleanup_srcu_struct() flushes work, which prevents it from being invoked from some workqueue contexts, as well as from atomic (non-blocking) contexts. This patch therefore introduces a cleanup_srcu_struct_quiesced(), which can be invoked only after all activity on the specified srcu_struct has completed. This restriction allows cleanup_srcu_struct_quiesced() to be invoked from workqueue contexts as well as from atomic contexts. Suggested-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Tested-by: Nitzan Carmi <nitzanc@mellanox.com> Tested-by: Nicholas Piggin <npiggin@gmail.com>
2018-01-18lockdep: Convert some users to constMatthew Wilcox
These users of lockdep_is_held() either wanted lockdep_is_held to take a const pointer, or would benefit from providing a const pointer. Signed-off-by: Matthew Wilcox <mawilcox@microsoft.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: "David S. Miller" <davem@davemloft.net> Link: https://lkml.kernel.org/r/20180117151414.23686-4-willy@infradead.org
2017-10-19doc: Fix various RCU docbook comment-header problemsPaul E. McKenney
Because many of RCU's files have not been included into docbook, a number of errors have accumulated. This commit fixes them. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-06-08srcu: Remove Classic SRCUPaul E. McKenney
Classic SRCU was only ever intended to be a fallback in case of issues with Tree/Tiny SRCU, and the latter two are doing quite well in testing. This commit therefore removes Classic SRCU. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08srcu: Shrink srcu.h by moving docbook and private functionPaul E. McKenney
The call_srcu() docbook entry is currently in include/linux/srcu.h, which causes needless processing for each include point. This commit therefore moves this entry to kernel/rcu/srcutree.c, which the compiler reads only once. In addition, the srcu_batches_completed() function is used only within RCU and its torture-test suites. This commit therefore also moves this function's declaration from include/linux/srcutiny.h, include/linux/srcutree.h, and include/linux/srcuclassic.h to kernel/rcu/rcu.h. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-06-08srcu: Make SRCU be once again optionalPaul E. McKenney
Commit d160a727c40e ("srcu: Make SRCU be built by default") was made in response to build errors, which were caused by code that included srcu.h despite !SRCU. However, srcutiny.o is almost 2K of code, which is not insignificant for those attempting to run the Linux kernel on IoT devices. This commit therefore makes SRCU be once again optional, and adjusts srcu.h to allow error-free inclusion in !SRCU kernel builds. Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Acked-by: Nicolas Pitre <nico@linaro.org>
2017-06-08srcu: Allow use of Classic SRCU from both process and interrupt contextPaolo Bonzini
Linu Cherian reported a WARN in cleanup_srcu_struct() when shutting down a guest running iperf on a VFIO assigned device. This happens because irqfd_wakeup() calls srcu_read_lock(&kvm->irq_srcu) in interrupt context, while a worker thread does the same inside kvm_set_irq(). If the interrupt happens while the worker thread is executing __srcu_read_lock(), updates to the Classic SRCU ->lock_count[] field or the Tree SRCU ->srcu_lock_count[] field can be lost. The docs say you are not supposed to call srcu_read_lock() and srcu_read_unlock() from irq context, but KVM interrupt injection happens from (host) interrupt context and it would be nice if SRCU supported the use case. KVM is using SRCU here not really for the "sleepable" part, but rather due to its IPI-free fast detection of grace periods. It is therefore not desirable to switch back to RCU, which would effectively revert commit 719d93cd5f5c ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING", 2014-01-16). However, the docs are overly conservative. You can have an SRCU instance that only has users in irq context, and you can mix process and irq context as long as process context users disable interrupts. In addition, __srcu_read_unlock() actually uses this_cpu_dec() on both Tree SRCU and Classic SRCU. For those two implementations, only srcu_read_lock() is unsafe. When Classic SRCU's __srcu_read_unlock() was changed to use this_cpu_dec(), in commit 5a41344a3d83 ("srcu: Simplify __srcu_read_unlock() via this_cpu_dec()", 2012-11-29), __srcu_read_lock() did two increments. Therefore it kept __this_cpu_inc(), with preempt_disable/enable in the caller. Tree SRCU however only does one increment, so on most architectures it is more efficient for __srcu_read_lock() to use this_cpu_inc(), and any performance differences appear to be down in the noise.
Cc: stable@vger.kernel.org Fixes: 719d93cd5f5c ("kvm/irqchip: Speed up KVM_SET_GSI_ROUTING") Reported-by: Linu Cherian <linuc.decode@gmail.com> Suggested-by: Linu Cherian <linuc.decode@gmail.com> Cc: kvm@vger.kernel.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>