summaryrefslogtreecommitdiff
path: root/arch/powerpc/perf
AgeCommit message (Collapse)Author
2025-05-26Merge tag 'perf-core-2025-05-25' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf events updates from Ingo Molnar: "Core & generic-arch updates: - Add support for dynamic constraints and propagate it to the Intel driver (Kan Liang) - Fix & enhance driver-specific throttling support (Kan Liang) - Record sample last_period before updating on the x86 and PowerPC platforms (Mark Barnett) - Make perf_pmu_unregister() usable (Peter Zijlstra) - Unify perf_event_free_task() / perf_event_exit_task_context() (Peter Zijlstra) - Simplify perf_event_release_kernel() and perf_event_free_task() (Peter Zijlstra) - Allocate non-contiguous AUX pages by default (Yabin Cui) Uprobes updates: - Add support to emulate NOP instructions (Jiri Olsa) - selftests/bpf: Add 5-byte NOP uprobe trigger benchmark (Jiri Olsa) x86 Intel PMU enhancements: - Support Intel Auto Counter Reload [ACR] (Kan Liang) - Add PMU support for Clearwater Forest (Dapeng Mi) - Arch-PEBS preparatory changes: (Dapeng Mi) - Parse CPUID archPerfmonExt leaves for non-hybrid CPUs - Decouple BTS initialization from PEBS initialization - Introduce pairs of PEBS static calls x86 AMD PMU enhancements: - Use hrtimer for handling overflows in the AMD uncore driver (Sandipan Das) - Prevent UMC counters from saturating (Sandipan Das) Fixes and cleanups: - Fix put_ctx() ordering (Frederic Weisbecker) - Fix irq work dereferencing garbage (Frederic Weisbecker) - Misc fixes and cleanups (Changbin Du, Frederic Weisbecker, Ian Rogers, Ingo Molnar, Kan Liang, Peter Zijlstra, Qing Wang, Sandipan Das, Thorsten Blum)" * tag 'perf-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits) perf/headers: Clean up <linux/perf_event.h> a bit perf/uapi: Clean up <uapi/linux/perf_event.h> a bit perf/uapi: Fix PERF_RECORD_SAMPLE comments in <uapi/linux/perf_event.h> mips/perf: Remove driver-specific throttle support xtensa/perf: Remove driver-specific throttle support sparc/perf: Remove driver-specific throttle support loongarch/perf: Remove driver-specific throttle support csky/perf: Remove driver-specific throttle support arc/perf: Remove driver-specific throttle support alpha/perf: Remove driver-specific throttle support perf/apple_m1: Remove driver-specific throttle support perf/arm: Remove driver-specific throttle support s390/perf: Remove driver-specific throttle support powerpc/perf: Remove driver-specific throttle support perf/x86/zhaoxin: Remove driver-specific throttle support perf/x86/amd: Remove driver-specific throttle support perf/x86/intel: Remove driver-specific throttle support perf: Only dump the throttle log for the leader perf: Fix the throttle logic for a group perf/core: Add the is_event_in_freq_mode() helper to simplify the code ...
2025-05-21powerpc/perf: Remove driver-specific throttle supportKan Liang
The throttle support has been added in the generic code. Remove the driver-specific throttle support. Besides the throttle, perf_event_overflow may return true because of event_limit. It already does an inatomic event disable. The pmu->stop is not required either. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250520181644.2673067-7-kan.liang@linux.intel.com
2025-04-16powerpc/kvm-hv-pmu: Add perf-events for Hostwide countersVaibhav Jain
Update 'kvm-hv-pmu.c' to add five new perf-events mapped to the five Hostwide counters. Since these newly introduced perf events are at system wide scope and can be read from any L1-Lpar CPU, 'kvmppc_pmu' scope and capabilities are updated appropriately. Also introduce two new helpers. First is kvmppc_update_l0_stats() that uses the infrastructure introduced in previous patches to issues the H_GUEST_GET_STATE hcall L0-PowerVM to fetch guest-state-buffer holding the latest values of these counters which is then parsed and 'l0_stats' variable updated. Second helper is kvmppc_pmu_event_update() which is called from 'kvmppv_pmu' callbacks and uses kvmppc_update_l0_stats() to update 'l0_stats' and the update the 'struct perf_event's event-counter. Some minor updates to kvmppc_pmu_{add, del, read}() to remove some debug scaffolding code. Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Reviewed-by: Athira Rajeev <atrajeev@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250416162740.93143-7-vaibhav@linux.ibm.com
2025-04-16powerpc/kvm-hv-pmu: Implement GSB message-ops for hostwide countersVaibhav Jain
Implement and setup necessary structures to send a prepolulated Guest-State-Buffer(GSB) requesting hostwide counters to L0-PowerVM and have the returned GSB holding the values of these counters parsed. This is done via existing GSB implementation and with the newly added support of Hostwide elements in GSB. The request to L0-PowerVM to return Hostwide counters is done using a pre-allocated GSB named 'gsb_l0_stats'. To be able to populate this GSB with the needed Guest-State-Elements (GSIDs) a instance of 'struct kvmppc_gs_msg' named 'gsm_l0_stats' is introduced. The 'gsm_l0_stats' is tied to an instance of 'struct kvmppc_gs_msg_ops' named 'gsb_ops_l0_stats' which holds various callbacks to be compute the size ( hostwide_get_size() ), populate the GSB ( hostwide_fill_info() ) and refresh ( hostwide_refresh_info() ) the contents of 'l0_stats' that holds the Hostwide counters returned from L0-PowerVM. To protect these structures from simultaneous access a spinlock 'lock_l0_stats' has been introduced. The allocation and initialization of the above structures is done in newly introduced kvmppc_init_hostwide() and similarly the cleanup is performed in newly introduced kvmppc_cleanup_hostwide(). Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250416162740.93143-6-vaibhav@linux.ibm.com
2025-04-16kvm powerpc/book3s-apiv2: Introduce kvm-hv specific PMUVaibhav Jain
Introduce a new PMU named 'kvm-hv' inside a new module named 'kvm-hv-pmu' to report Book3s kvm-hv specific performance counters. This will expose KVM-HV specific performance attributes to user-space via kernel's PMU infrastructure and would enableusers to monitor active kvm-hv based guests. The patch creates necessary scaffolding to for the new PMU callbacks and introduces the new kernel module name 'kvm-hv-pmu' which is built with CONFIG_KVM_BOOK3S_HV_PMU. The patch doesn't introduce any perf-events yet, which will be introduced in later patches Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250416162740.93143-5-vaibhav@linux.ibm.com
2025-04-09perf/arch: Record sample last_period before updating on the x86 and PowerPC ↵Mark Barnett
platforms This change alters the PowerPC and x86 driver implementations to record the last sample period before the event is updated for the next period. A common pattern in PMU driver implementations is to have a "*_event_set_period" function which takes care of updating the various period-related fields in a perf_event structure. In most cases, the drivers choose to call this function after initializing a sample data structure with perf_sample_data_init. The x86 and PowerPC drivers deviate from this, choosing to update the period before initializing the sample data. When using an event with an alternate sample period, this causes an incorrect period to be written to the sample data that gets reported to userspace. Signed-off-by: Mark Barnett <mark.barnett@arm.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20250408171530.140858-2-mark.barnett@arm.com
2025-04-01Merge tag 'driver-core-6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updatesk from Greg KH: "Here is the big set of driver core updates for 6.15-rc1. Lots of stuff happened this development cycle, including: - kernfs scaling changes to make it even faster thanks to rcu - bin_attribute constify work in many subsystems - faux bus minor tweaks for the rust bindings - rust binding updates for driver core, pci, and platform busses, making more functionaliy available to rust drivers. These are all due to people actually trying to use the bindings that were in 6.14. - make Rafael and Danilo full co-maintainers of the driver core codebase - other minor fixes and updates" * tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (52 commits) rust: platform: require Send for Driver trait implementers rust: pci: require Send for Driver trait implementers rust: platform: impl Send + Sync for platform::Device rust: pci: impl Send + Sync for pci::Device rust: platform: fix unrestricted &mut platform::Device rust: pci: fix unrestricted &mut pci::Device rust: device: implement device context marker rust: pci: use to_result() in enable_device_mem() MAINTAINERS: driver core: mark Rafael and Danilo as co-maintainers rust/kernel/faux: mark Registration methods inline driver core: faux: only create the device if probe() succeeds rust/faux: Add missing parent argument to Registration::new() rust/faux: Drop #[repr(transparent)] from faux::Registration rust: io: fix devres test with new io accessor functions rust: io: rename `io::Io` accessors kernfs: Move dput() outside of the RCU section. efi: rci2: mark bin_attribute as __ro_after_init rapidio: constify 'struct bin_attribute' firmware: qemu_fw_cfg: constify 'struct bin_attribute' powerpc/perf/hv-24x7: Constify 'struct bin_attribute' ...
2025-03-27Merge tag 'powerpc-6.15-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Madhavan Srinivasan: - Remove support for IBM Cell Blades - SMP support for microwatt platform - Support for inline static calls on PPC32 - Enable pmu selftests for power11 platform - Enable hardware trace macro (HTM) hcall support - Support for limited address mode capability - Changes to RMA size from 512 MB to 768 MB to handle fadump - Misc fixes and cleanups Thanks to Abhishek Dubey, Amit Machhiwal, Andreas Schwab, Arnd Bergmann, Athira Rajeev, Avnish Chouhan, Christophe Leroy, Disha Goel, Donet Tom, Gaurav Batra, Gautam Menghani, Hari Bathini, Kajol Jain, Kees Cook, Mahesh Salgaonkar, Michael Ellerman, Paul Mackerras, Ritesh Harjani (IBM), Sathvika Vasireddy, Segher Boessenkool, Sourabh Jain, Vaibhav Jain, and Venkat Rao Bagalkote. * tag 'powerpc-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (61 commits) powerpc/kexec: fix physical address calculation in clear_utlb_entry() crypto: powerpc: Mark ghashp8-ppc.o as an OBJECT_FILES_NON_STANDARD powerpc: Fix 'intra_function_call not a direct call' warning powerpc/perf: Fix ref-counting on the PMU 'vpa_pmu' KVM: PPC: Enable CAP_SPAPR_TCE_VFIO on pSeries KVM guests powerpc/prom_init: Fixup missing #size-cells on PowerBook6,7 powerpc/microwatt: Add SMP support powerpc: Define config option for processors with broadcast TLBIE powerpc/microwatt: Define an idle power-save function powerpc/microwatt: Device-tree updates powerpc/microwatt: Select COMMON_CLK in order to get the clock framework net: toshiba: Remove reference to PPC_IBM_CELL_BLADE net: spider_net: Remove powerpc Cell driver cpufreq: ppc_cbe: Remove powerpc Cell driver genirq: Remove IRQ_EDGE_EOI_HANDLER docs: Remove reference to removed CBE_CPUFREQ_SPU_GOVERNOR powerpc: Remove UDBG_RTAS_CONSOLE powerpc/io: Use standard barrier macros in io.c powerpc/io: Rename _insw_ns() etc. powerpc/io: Use generic raw accessors ...
2025-03-17perf: Supply task information to sched_task()Kan Liang
To save/restore LBR call stack data in system-wide mode, the task_struct information is required. Extend the parameters of sched_task() to supply task_struct information. When schedule in, the LBR call stack data for new task will be restored. When schedule out, the LBR call stack data for old task will be saved. Only need to pass the required task_struct information. Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lore.kernel.org/r/20250314172700.438923-4-kan.liang@linux.intel.com
2025-03-07powerpc/perf: Fix ref-counting on the PMU 'vpa_pmu'Vaibhav Jain
Commit 176cda0619b6 ("powerpc/perf: Add perf interface to expose vpa counters") introduced 'vpa_pmu' to expose Book3s-HV nested APIv2 provided L1<->L2 context switch latency counters to L1 user-space via perf-events. However the newly introduced PMU named 'vpa_pmu' doesn't assign ownership of the PMU to the module 'vpa_pmu'. Consequently the module 'vpa_pmu' can be unloaded while one of the perf-events are still active, which can lead to kernel oops and panic of the form below on a Pseries-LPAR: BUG: Kernel NULL pointer dereference on read at 0x00000058 <snip> NIP [c000000000506cb8] event_sched_out+0x40/0x258 LR [c00000000050e8a4] __perf_remove_from_context+0x7c/0x2b0 Call Trace: [c00000025fc3fc30] [c00000025f8457a8] 0xc00000025f8457a8 (unreliable) [c00000025fc3fc80] [fffffffffffffee0] 0xfffffffffffffee0 [c00000025fc3fcd0] [c000000000501e70] event_function+0xa8/0x120 <snip> Kernel panic - not syncing: Aiee, killing interrupt handler! Fix this by adding the module ownership to 'vpa_pmu' so that the module 'vpa_pmu' is ref-counted and prevented from being unloaded when perf-events are initialized. Fixes: 176cda0619b6 ("powerpc/perf: Add perf interface to expose vpa counters") Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250204153527.125491-1-vaibhav@linux.ibm.com
2025-02-21powerpc/perf/hv-24x7: Constify 'struct bin_attribute'Thomas Weißschuh
The sysfs core now allows instances of 'struct bin_attribute' to be moved into read-only memory. Make use of that to protect them against accidental or malicious modifications. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Link: https://lore.kernel.org/r/20241216-sysfs-const-bin_attr-powerpc-v1-5-bbed8906f476@weissschuh.net Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-11arch/powerpc/perf: Update get_mem_data_src function to use saved values of ↵Athira Rajeev
sier and mmcra regs During performance monitor interrupt handling, the regs are setup using perf_read_regs function. Here some of the pt_regs fields is overloaded. Samples Instruction Event Register (SIER) is loaded into pt_regs, overloading regs->dar. And regs->dsisr to store MMCRA (Monitor Mode Control Register A) so that we only need to read these once on each interrupt. Update the isa207_get_mem_data_src function to use regs->dar instead of reading from SPRN_SIER again. Also use regs->dsisr to read the mmcra value Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250121131621.39054-2-atrajeev@linux.vnet.ibm.com
2025-02-11arch/powerpc/perf: Check the instruction type before creating sample with ↵Athira Rajeev
perf_mem_data_src perf mem report aborts as below sometimes (during some corner case) in powerpc: # ./perf mem report 1>out *** stack smashing detected ***: terminated Aborted (core dumped) The backtrace is as below: __pthread_kill_implementation () raise () abort () __libc_message __fortify_fail __stack_chk_fail hist_entry.lvl_snprintf __sort__hpp_entry __hist_entry__snprintf hists.fprintf cmd_report cmd_mem Snippet of code which triggers the issue from tools/perf/util/sort.c static int hist_entry__lvl_snprintf(struct hist_entry *he, char *bf, size_t size, unsigned int width) { char out[64]; perf_mem__lvl_scnprintf(out, sizeof(out), he->mem_info); return repsep_snprintf(bf, size, "%-*s", width, out); } The value of "out" is filled from perf_mem_data_src value. Debugging this further showed that for some corner cases, the value of "data_src" was pointing to wrong value. This resulted in bigger size of string and causing stack check fail. The perf mem data source values are captured in the sample via isa207_get_mem_data_src function. The initial check is to fetch the type of sampled instruction. If the type of instruction is not valid (not a load/store instruction), the function returns. Since 'commit e16fd7f2cb1a ("perf: Use sample_flags for data_src")', data_src field is not initialized by the perf_sample_data_init() function. If the PMU driver doesn't set the data_src value to zero if type is not valid, this will result in uninitailised value for data_src. The uninitailised value of data_src resulted in stack check fail followed by abort for "perf mem report". When requesting for data source information in the sample, the instruction type is expected to be load or store instruction. In ISA v3.0, due to hardware limitation, there are corner cases where the instruction type other than load or store is observed. In ISA v3.0 and before values "0" and "7" are considered reserved. In ISA v3.1, value "7" has been used to indicate "larx/stcx". Drop the sample if instruction type has reserved values for this field with a ISA version check. Initialize data_src to zero in isa207_get_mem_data_src if the instruction type is not load/store. Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://patch.msgid.link/20250121131621.39054-1-atrajeev@linux.vnet.ibm.com
2024-11-23Merge tag 'powerpc-6.13-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Rework kfence support for the HPT MMU to work on systems with >= 16TB of RAM. - Remove the powerpc "maple" platform, used by the "Yellow Dog Powerstation". - Add support for DYNAMIC_FTRACE_WITH_CALL_OPS, DYNAMIC_FTRACE_WITH_DIRECT_CALLS & BPF Trampolines. - Add support for running KVM nested guests on Power11. - Other small features, cleanups and fixes. Thanks to Amit Machhiwal, Arnd Bergmann, Christophe Leroy, Costa Shulyupin, David Hunter, David Wang, Disha Goel, Gautam Menghani, Geert Uytterhoeven, Hari Bathini, Julia Lawall, Kajol Jain, Keith Packard, Lukas Bulwahn, Madhavan Srinivasan, Markus Elfring, Michal Suchanek, Ming Lei, Mukesh Kumar Chaurasiya, Nathan Chancellor, Naveen N Rao, Nicholas Piggin, Nysal Jan K.A, Paulo Miguel Almeida, Pavithra Prakash, Ritesh Harjani (IBM), Rob Herring (Arm), Sachin P Bappalige, Shen Lichuan, Simon Horman, Sourabh Jain, Thomas Weißschuh, Thorsten Blum, Thorsten Leemhuis, Venkat Rao Bagalkote, Zhang Zekun, and zhang jiao. * tag 'powerpc-6.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (89 commits) EDAC/powerpc: Remove PPC_MAPLE drivers powerpc/perf: Add per-task/process monitoring to vpa_pmu driver powerpc/kvm: Add vpa latency counters to kvm_vcpu_arch docs: ABI: sysfs-bus-event_source-devices-vpa-pmu: Document sysfs event format entries for vpa_pmu powerpc/perf: Add perf interface to expose vpa counters MAINTAINERS: powerpc: Mark Maddy as "M" powerpc/Makefile: Allow overriding CPP powerpc-km82xx.c: replace of_node_put() with __free ps3: Correct some typos in comments powerpc/kexec: Fix return of uninitialized variable macintosh: Use common error handling code in via_pmu_led_init() powerpc/powermac: Use of_property_match_string() in pmac_has_backlight_type() powerpc: remove dead config options for MPC85xx platform support powerpc/xive: Use cpumask_intersects() selftests/powerpc: Remove the path after initialization. powerpc/xmon: symbol lookup length fixed powerpc/ep8248e: Use %pa to format resource_size_t powerpc/ps3: Reorganize kerneldoc parameter names KVM: PPC: Book3S HV: Fix kmv -> kvm typo powerpc/sstep: make emulate_vsx_load and emulate_vsx_store static ...
2024-11-23Merge tag 'mm-stable-2024-11-18-19-27' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: - The series "zram: optimal post-processing target selection" from Sergey Senozhatsky improves zram's post-processing selection algorithm. This leads to improved memory savings. - Wei Yang has gone to town on the mapletree code, contributing several series which clean up the implementation: - "refine mas_mab_cp()" - "Reduce the space to be cleared for maple_big_node" - "maple_tree: simplify mas_push_node()" - "Following cleanup after introduce mas_wr_store_type()" - "refine storing null" - The series "selftests/mm: hugetlb_fault_after_madv improvements" from David Hildenbrand fixes this selftest for s390. - The series "introduce pte_offset_map_{ro|rw}_nolock()" from Qi Zheng implements some rationaizations and cleanups in the page mapping code. - The series "mm: optimize shadow entries removal" from Shakeel Butt optimizes the file truncation code by speeding up the handling of shadow entries. - The series "Remove PageKsm()" from Matthew Wilcox completes the migration of this flag over to being a folio-based flag. - The series "Unify hugetlb into arch_get_unmapped_area functions" from Oscar Salvador implements a bunch of consolidations and cleanups in the hugetlb code. - The series "Do not shatter hugezeropage on wp-fault" from Dev Jain takes away the wp-fault time practice of turning a huge zero page into small pages. Instead we replace the whole thing with a THP. More consistent cleaner and potentiall saves a large number of pagefaults. - The series "percpu: Add a test case and fix for clang" from Andy Shevchenko enhances and fixes the kernel's built in percpu test code. - The series "mm/mremap: Remove extra vma tree walk" from Liam Howlett optimizes mremap() by avoiding doing things which we didn't need to do. - The series "Improve the tmpfs large folio read performance" from Baolin Wang teaches tmpfs to copy data into userspace at the folio size rather than as individual pages. A 20% speedup was observed. - The series "mm/damon/vaddr: Fix issue in damon_va_evenly_split_region()" fro Zheng Yejian fixes DAMON splitting. - The series "memcg-v1: fully deprecate charge moving" from Shakeel Butt removes the long-deprecated memcgv2 charge moving feature. - The series "fix error handling in mmap_region() and refactor" from Lorenzo Stoakes cleanup up some of the mmap() error handling and addresses some potential performance issues. - The series "x86/module: use large ROX pages for text allocations" from Mike Rapoport teaches x86 to use large pages for read-only-execute module text. - The series "page allocation tag compression" from Suren Baghdasaryan is followon maintenance work for the new page allocation profiling feature. - The series "page->index removals in mm" from Matthew Wilcox remove most references to page->index in mm/. A slow march towards shrinking struct page. - The series "damon/{self,kunit}tests: minor fixups for DAMON debugfs interface tests" from Andrew Paniakin performs maintenance work for DAMON's self testing code. - The series "mm: zswap swap-out of large folios" from Kanchana Sridhar improves zswap's batching of compression and decompression. It is a step along the way towards using Intel IAA hardware acceleration for this zswap operation. - The series "kasan: migrate the last module test to kunit" from Sabyrzhan Tasbolatov completes the migration of the KASAN built-in tests over to the KUnit framework. - The series "implement lightweight guard pages" from Lorenzo Stoakes permits userapace to place fault-generating guard pages within a single VMA, rather than requiring that multiple VMAs be created for this. Improved efficiencies for userspace memory allocators are expected. - The series "memcg: tracepoint for flushing stats" from JP Kobryn uses tracepoints to provide increased visibility into memcg stats flushing activity. - The series "zram: IDLE flag handling fixes" from Sergey Senozhatsky fixes a zram buglet which potentially affected performance. - The series "mm: add more kernel parameters to control mTHP" from Maíra Canal enhances our ability to control/configuremultisize THP from the kernel boot command line. - The series "kasan: few improvements on kunit tests" from Sabyrzhan Tasbolatov has a couple of fixups for the KASAN KUnit tests. - The series "mm/list_lru: Split list_lru lock into per-cgroup scope" from Kairui Song optimizes list_lru memory utilization when lockdep is enabled. * tag 'mm-stable-2024-11-18-19-27' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (215 commits) cma: enforce non-zero pageblock_order during cma_init_reserved_mem() mm/kfence: add a new kunit test test_use_after_free_read_nofault() zram: fix NULL pointer in comp_algorithm_show() memcg/hugetlb: add hugeTLB counters to memcg vmstat: call fold_vm_zone_numa_events() before show per zone NUMA event mm: mmap_lock: check trace_mmap_lock_$type_enabled() instead of regcount zram: ZRAM_DEF_COMP should depend on ZRAM MAINTAINERS/MEMORY MANAGEMENT: add document files for mm Docs/mm/damon: recommend academic papers to read and/or cite mm: define general function pXd_init() kmemleak: iommu/iova: fix transient kmemleak false positive mm/list_lru: simplify the list_lru walk callback function mm/list_lru: split the lock to per-cgroup scope mm/list_lru: simplify reparenting and initial allocation mm/list_lru: code clean up for reparenting mm/list_lru: don't export list_lru_add mm/list_lru: don't pass unnecessary key parameters kasan: add kunit tests for kmalloc_track_caller, kmalloc_node_track_caller kasan: change kasan_atomics kunit test as KUNIT_CASE_SLOW kasan: use EXPORT_SYMBOL_IF_KUNIT to export symbols ...
2024-11-19powerpc/perf: Add per-task/process monitoring to vpa_pmu driverKajol Jain
Enhance the vpa_pmu driver with a feature to observe context switch latency event for both per-task (tid) and per-pid (pid) option. Couple of new helper functions are added to hide the abstraction of reading the context switch latency counter from kvm_vcpu_arch struct and these helper functions are defined in the "kvm/book3s_hv.c". "PERF_ATTACH_TASK" flag is used to decide whether to read the counter values from lppaca or kvm_vcpu_arch struct. Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Co-developed-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20241118114114.208964-4-kjain@linux.ibm.com
2024-11-19powerpc/perf: Add perf interface to expose vpa countersKajol Jain
To support performance measurement for KVM on PowerVM(KoP) feature, PowerVM hypervisor has added couple of new software counters in Virtual Process Area(VPA) of the partition. Commit e1f288d2f9c69 ("KVM: PPC: Book3S HV nestedv2: Add support for reading VPA counters for pseries guests") have updated the paca fields with corresponding changes. Proposed perf interface is to expose these new software counters for monitoring of context switch latencies and runtime aggregate. Perf interface driver is called "vpa_pmu" and it has dependency on KVM and perf, hence added new config called "VPA_PMU" which depends on "CONFIG_KVM_BOOK3S_64_HV" and "CONFIG_HV_PERF_CTRS". Since, kvm and kvm_host are currently compiled as built-in modules, this perf interface takes the same path and registered as a module. vpa_pmu perf interface needs access to some of the kvm functions and structures like kvmhv_get_l2_counters_status(), hence kvm_book3s_64.h and kvm_ppc.h are included. Below are the events added to monitor KoP: vpa_pmu/l1_to_l2_lat/ vpa_pmu/l2_to_l1_lat/ vpa_pmu/l2_runtime_agg/ and vpa_pmu driver supports only per-cpu monitoring with this patch. Example usage: [command]# perf stat -e vpa_pmu/l1_to_l2_lat/ -a -I 1000 1.001017682 727,200 vpa_pmu/l1_to_l2_lat/ 2.003540491 1,118,824 vpa_pmu/l1_to_l2_lat/ 3.005699458 1,919,726 vpa_pmu/l1_to_l2_lat/ 4.007827011 2,364,630 vpa_pmu/l1_to_l2_lat/ Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Co-developed-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://patch.msgid.link/20241118114114.208964-1-kjain@linux.ibm.com
2024-11-14perf/powerpc: Use perf_arch_instruction_pointer()Colton Lewis
Make sure PowerPC uses the arch-specific function now that those have been reorganized. Signed-off-by: Colton Lewis <coltonlewis@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Acked-by: Madhavan Srinivasan <maddy@linux.ibm.com> Link: https://lore.kernel.org/r/20241113190156.2145593-4-coltonlewis@google.com
2024-11-14perf/core: Hoist perf_instruction_pointer() and perf_misc_flags()Colton Lewis
For clarity, rename the arch-specific definitions of these functions to perf_arch_* to denote they are arch-specifc. Define the generic-named functions in one place where they can call the arch-specific ones as needed. Signed-off-by: Colton Lewis <coltonlewis@google.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Reviewed-by: Oliver Upton <oliver.upton@linux.dev> Acked-by: Thomas Richter <tmricht@linux.ibm.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Acked-by: Madhavan Srinivasan <maddy@linux.ibm.com> Acked-by: Kan Liang <kan.liang@linux.intel.com> Link: https://lore.kernel.org/r/20241113190156.2145593-3-coltonlewis@google.com
2024-11-07asm-generic: introduce text-patching.hMike Rapoport (Microsoft)
Several architectures support text patching, but they name the header files that declare patching functions differently. Make all such headers consistently named text-patching.h and add an empty header in asm-generic for architectures that do not support text patching. Link: https://lkml.kernel.org/r/20241023162711.2579610-4-rppt@kernel.org Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Christoph Hellwig <hch@lst.de> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> # m68k Acked-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Luis Chamberlain <mcgrof@kernel.org> Tested-by: kdevops <kdevops@lists.linux.dev> Cc: Andreas Larsson <andreas@gaisler.com> Cc: Andy Lutomirski <luto@kernel.org> Cc: Ard Biesheuvel <ardb@kernel.org> Cc: Borislav Petkov (AMD) <bp@alien8.de> Cc: Brian Cain <bcain@quicinc.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Christophe Leroy <christophe.leroy@csgroup.eu> Cc: Dave Hansen <dave.hansen@linux.intel.com> Cc: Dinh Nguyen <dinguyen@kernel.org> Cc: Guo Ren <guoren@kernel.org> Cc: Helge Deller <deller@gmx.de> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ingo Molnar <mingo@redhat.com> Cc: Johannes Berg <johannes@sipsolutions.net> Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> Cc: Kent Overstreet <kent.overstreet@linux.dev> Cc: Liam R. Howlett <Liam.Howlett@Oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu (Google) <mhiramat@kernel.org> Cc: Matt Turner <mattst88@gmail.com> Cc: Max Filippov <jcmvbkbc@gmail.com> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Michal Simek <monstr@monstr.eu> Cc: Oleg Nesterov <oleg@redhat.com> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Richard Weinberger <richard@nod.at> Cc: Russell King <linux@armlinux.org.uk> Cc: Song Liu <song@kernel.org> Cc: Stafford Horne <shorne@gmail.com> Cc: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Uladzislau Rezki (Sony) <urezki@gmail.com> Cc: Vineet Gupta <vgupta@kernel.org> Cc: Will Deacon <will@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-06-17powerpc/perf: Set cpumode flags using sample addressAnjali K
Currently in some cases, when the sampled instruction address register latches to a specific address during sampling, the privilege bits captured in the sampled event register are incorrect. For example, a snippet from the perf report on a power10 system is: Overhead Address Command Shared Object Symbol ........ .................. ............ ................. ....................... 2.41% 0x7fff9f94a02c null_syscall [unknown] [k] 0x00007fff9f94a02c 2.20% 0x7fff9f94a02c null_syscall libc.so.6 [.] syscall perf_get_misc_flags() function looks at the privilege bits to return the corresponding flags to be used for the address symbol and these privilege bit details are read from the sampled event register. In the above snippet, address "0x00007fff9f94a02c" is shown as "k" (kernel) due to the incorrect privilege bits captured in the sampled event register. To address this case check whether the sampled address is in the kernel area. Since this is specific to the latest platform, a new pmu flag is added called "PPMU_P10" and is used to contain the proposed fix. PPMU_P10_DD1 marked events are also included under PPMU_P10, hence remove the code specific to PPMU_P10_DD1 marked events. Signed-off-by: Anjali K <anjalik@linux.ibm.com> Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com <mailto:atrajeev@linux.vnet.ibm.com>> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240528040356.2722275-1-anjalik@linux.ibm.com
2024-05-04driver core: Add device_show_string() helper for sysfs attributesLukas Wunner
For drivers wishing to expose an unsigned long, int or bool at a static memory location in sysfs, the driver core provides ready-made helpers such as device_show_ulong() to be used as ->show() callback. Some drivers need to expose a string and so far they all provide their own ->show() implementation. arch/powerpc/perf/hv-24x7.c went so far as to create a device_show_string() helper but kept it private. Make it public for reuse by other drivers. The pattern seems to be sufficiently frequent to merit a public helper. Add a DEVICE_STRING_ATTR_RO() macro in line with the existing DEVICE_ULONG_ATTR() and similar macros to ease declaration of string attributes. Signed-off-by: Lukas Wunner <lukas@wunner.de> Acked-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/2e3eaaf2600bb55c0415c23ba301e809403a7aa2.1713608122.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-03-03powerpc/hv-gpci: Fix the H_GET_PERF_COUNTER_INFO hcall return value checksKajol Jain
Running event hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ in one of the system throws below error: ---Logs--- # perf list | grep hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=?/[Kernel PMU event] # perf stat -v -e hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ sleep 2 Using CPUID 00800200 Control descriptor is not initialized Warning: hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ event is not supported by the kernel. failed to read counter hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ Performance counter stats for 'system wide': <not supported> hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ 2.000700771 seconds time elapsed The above error is because of the hcall failure as required permission "Enable Performance Information Collection" is not set. Based on current code, single_gpci_request function did not check the error type incase hcall fails and by default returns EINVAL. But we can have other reasons for hcall failures like H_AUTHORITY/H_PARAMETER with detail_rc as GEN_BUF_TOO_SMALL, for which we need to act accordingly. Fix this issue by adding new checks in the single_gpci_request and h_gpci_event_init functions. Result after fix patch changes: # perf stat -e hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ sleep 2 Error: No permission to enable hv_gpci/dispatch_timebase_by_processor_processor_time_in_timebase_cycles,phys_processor_idx=0/ event. Fixes: 220a0c609ad1 ("powerpc/perf: Add support for the hv gpci (get performance counter info) interface") Reported-by: Akanksha J N <akanksha@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240229122847.101162-1-kjain@linux.ibm.com
2024-02-22powerpc: Use user_mode() macro when possibleChristophe Leroy
There is a nice macro to check user mode. Use it instead of open coding anding with MSR_PR to increase readability and avoid having to comment what that anding is for. Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/fbf74887dcf1f1ba9e1680fc3247cbb581b00662.1708078228.git.christophe.leroy@csgroup.eu
2024-02-21powerpc/perf: Power11 Performance Monitoring supportMadhavan Srinivasan
Base enablement patch to register performance monitoring hardware support for Power11. Most of fields are copied from power10_pmu struct for power11_pmu struct. Signed-off-by: Madhavan Srinivasan <maddy@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20240221044623.1598642-2-mpe@ellerman.id.au
2024-01-08Merge tag 'perf-core-2024-01-08' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: - Add branch stack counters ABI extension to better capture the growing amount of information the PMU exposes via branch stack sampling. There's matching tooling support. - Fix race when creating the nr_addr_filters sysfs file - Add Intel Sierra Forest and Grand Ridge intel/cstate PMU support - Add Intel Granite Rapids, Sierra Forest and Grand Ridge uncore PMU support - Misc cleanups & fixes * tag 'perf-core-2024-01-08' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: perf/x86/intel/uncore: Factor out topology_gidnid_map() perf/x86/intel/uncore: Fix NULL pointer dereference issue in upi_fill_topology() perf/x86/amd: Reject branch stack for IBS events perf/x86/intel/uncore: Support Sierra Forest and Grand Ridge perf/x86/intel/uncore: Support IIO free-running counters on GNR perf/x86/intel/uncore: Support Granite Rapids perf/x86/uncore: Use u64 to replace unsigned for the uncore offsets array perf/x86/intel/uncore: Generic uncore_get_uncores and MMIO format of SPR perf: Fix the nr_addr_filters fix perf/x86/intel/cstate: Add Grand Ridge support perf/x86/intel/cstate: Add Sierra Forest support x86/smp: Export symbol cpu_clustergroup_mask() perf/x86/intel/cstate: Cleanup duplicate attr_groups perf/core: Fix narrow startup race when creating the perf nr_addr_filters sysfs file perf/x86/intel: Support branch counters logging perf/x86/intel: Reorganize attrs and is_visible perf: Add branch_sample_call_stack perf/x86: Add PERF_X86_EVENT_NEEDS_BRANCH_STACK flag perf: Add branch stack counters
2023-12-13powerpc/imc-pmu: Add a null pointer check in update_events_in_group()Kunwu Chan
kasprintf() returns a pointer to dynamically allocated memory which can be NULL upon failure. Fixes: 885dcd709ba9 ("powerpc/perf: Add nest IMC PMU support") Signed-off-by: Kunwu Chan <chentao@kylinos.cn> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231126093719.1440305-1-chentao@kylinos.cn
2023-12-13powerpc/hv-gpci: Add return value check in ↵Kajol Jain
affinity_domain_via_partition_show function To access hv-gpci kernel interface files data, the "Enable Performance Information Collection" option has to be set in hmc. Incase that option is not set and user try to read the interface files, it should give error message as operation not permitted. Result of accessing added interface files with disabled performance collection option: [command]# cat processor_bus_topology cat: processor_bus_topology: Operation not permitted [command]# cat processor_config cat: processor_config: Operation not permitted [command]# cat affinity_domain_via_domain cat: affinity_domain_via_domain: Operation not permitted [command]# cat affinity_domain_via_virtual_processor cat: affinity_domain_via_virtual_processor: Operation not permitted [command]# cat affinity_domain_via_partition Based on above result there is no error message when reading affinity_domain_via_partition file because of missing check for failed hcall. Fix this issue by adding a check in the start of affinity_domain_via_partition_show function, to return error incase hcall fails, with error type other then H_PARAMETER. Fixes: a15e0d6a6929 ("powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity domain via partition information") Reported-by: Disha Goel <disgoel@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231116122033.160964-1-kjain@linux.ibm.com
2023-11-15Merge branch 'tip/perf/urgent'Peter Zijlstra
Avoid conflicts, base on fixes. Signed-off-by: Peter Zijlstra <peterz@infradead.org>
2023-10-27perf: Add branch stack countersKan Liang
Currently, the additional information of a branch entry is stored in a u64 space. With more and more information added, the space is running out. For example, the information of occurrences of events will be added for each branch. Two places were suggested to append the counters. https://lore.kernel.org/lkml/20230802215814.GH231007@hirez.programming.kicks-ass.net/ One place is right after the flags of each branch entry. It changes the existing struct perf_branch_entry. The later ARCH specific implementation has to be really careful to consistently pick the right struct. The other place is right after the entire struct perf_branch_stack. The disadvantage is that the pointer of the extra space has to be recorded. The common interface perf_sample_save_brstack() has to be updated. The latter is much straightforward, and should be easily understood and maintained. It is implemented in the patch. Add a new branch sample type, PERF_SAMPLE_BRANCH_COUNTERS, to indicate the event which is recorded in the branch info. The "u64 counters" may store the occurrences of several events. The information regarding the number of events/counters and the width of each counter should be exposed via sysfs as a reference for the perf tool. Define the branch_counter_nr and branch_counter_width ABI here. The support will be implemented later in the Intel-specific patch. Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20231025201626.3000228-1-kan.liang@linux.intel.com
2023-10-20powerpc/perf: Fix disabling BHRB and instruction samplingNicholas Piggin
When the PMU is disabled, MMCRA is not updated to disable BHRB and instruction sampling. This can lead to those features remaining enabled, which can slow down a real or emulated CPU. Fixes: 1cade527f6e9 ("powerpc/perf: BHRB control to disable BHRB logic when not used") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231018153423.298373-1-npiggin@gmail.com
2023-10-20powerpc/imc-pmu: Use the correct spinlock initializer.Sebastian Andrzej Siewior
The macro __SPIN_LOCK_INITIALIZER() is implementation specific. Users that desire to initialize a spinlock in a struct must use __SPIN_LOCK_UNLOCKED(). Use __SPIN_LOCK_UNLOCKED() for the spinlock_t in imc_global_refc. Fixes: 76d588dddc459 ("powerpc/imc-pmu: Fix use of mutex in IRQs disabled section") Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230309134831.Nz12nqsU@linutronix.de
2023-10-19powerpc/perf: Optimize find_alternatives_list() using binary searchKuan-Wei Chiu
This patch improves the performance of event alternative lookup by replacing the previous linear search with a more efficient binary search. This change reduces the time complexity for the search process from O(n) to O(log(n)). A pre-sorted table of event values and their corresponding indices has been introduced to expedite the search process. Signed-off-by: Kuan-Wei Chiu <visitorckw@gmail.com> [mpe: Call the array "presort*ed*_event_table", minor formatting] Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231013175714.2142775-1-visitorckw@gmail.com
2023-10-19powerpc: Annotate endianness of various variables and functionsBenjamin Gray
Sparse reports several endianness warnings on variables and functions that are consistently treated as big endian. There are no multi-endianness shenanigans going on here so fix these low hanging fruit up in one patch. All changes are just type annotations; no endianness switching operations are introduced by this patch. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231011053711.93427-7-bgray@linux.ibm.com
2023-10-19powerpc: Use NULL instead of 0 for null pointersBenjamin Gray
Sparse reports several uses of 0 for pointer arguments and comparisons. Replace with NULL to better convey the intent. Remove entirely if a comparison to follow the kernel style of implicit boolean conversions. Signed-off-by: Benjamin Gray <bgray@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20231011053711.93427-5-bgray@linux.ibm.com
2023-09-18powerpc/perf/hv-24x7: Update domain value checkKajol Jain
Valid domain value is in range 1 to HV_PERF_DOMAIN_MAX. Current code has check for domain value greater than or equal to HV_PERF_DOMAIN_MAX. But the check for domain value 0 is missing. Fix this issue by adding check for domain value 0. Before: # ./perf stat -v -e hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ sleep 1 Using CPUID 00800200 Control descriptor is not initialized Error: The sys_perf_event_open() syscall returned with 5 (Input/output error) for event (hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/). /bin/dmesg | grep -i perf may provide additional information. Result from dmesg: [ 37.819387] hv-24x7: hcall failed: [0 0x60040000 0x100 0] => ret 0xfffffffffffffffc (-4) detail=0x2000000 failing ix=0 After: # ./perf stat -v -e hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ sleep 1 Using CPUID 00800200 Control descriptor is not initialized Warning: hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ event is not supported by the kernel. failed to read counter hv_24x7/CPM_ADJUNCT_INST,domain=0,core=1/ Fixes: ebd4a5a3ebd9 ("powerpc/perf/hv-24x7: Minor improvements") Reported-by: Krishan Gopal Sarawast <krishang@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Disha Goel <disgoel@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230825055601.360083-1-kjain@linux.ibm.com
2023-08-18powerpc/perf: Convert fsl_emb notifier to state machine callbacksChristophe Leroy
CC arch/powerpc/perf/core-fsl-emb.o arch/powerpc/perf/core-fsl-emb.c:675:6: error: no previous prototype for 'hw_perf_event_setup' [-Werror=missing-prototypes] 675 | void hw_perf_event_setup(int cpu) | ^~~~~~~~~~~~~~~~~~~ Looks like fsl_emb was completely missed by commit 3f6da3905398 ("perf: Rework and fix the arch CPU-hotplug hooks") So, apply same changes as commit 3f6da3905398 ("perf: Rework and fix the arch CPU-hotplug hooks") then commit 57ecde42cc74 ("powerpc/perf: Convert book3s notifier to state machine callbacks") While at it, also fix following error: arch/powerpc/perf/core-fsl-emb.c: In function 'perf_event_interrupt': arch/powerpc/perf/core-fsl-emb.c:648:13: error: variable 'found' set but not used [-Werror=unused-but-set-variable] 648 | int found = 0; | ^~~~~ Fixes: 3f6da3905398 ("perf: Rework and fix the arch CPU-hotplug hooks") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/603e1facb32608f88f40b7d7b9094adc50e7b2dc.1692349125.git.christophe.leroy@csgroup.eu
2023-08-16powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity ↵Kajol Jain
domain via partition information The hcall H_GET_PERF_COUNTER_INFO with counter request value as AFFINITY_DOMAIN_INFORMATION_BY_PARTITION(0XB1), can be used to get the system affinity domain via partition information. To expose the system affinity domain via partition information, patch adds sysfs file called "affinity_domain_via_partition" to the "/sys/devices/hv_gpci/interface/" of hv_gpci pmu driver. Add new entry for AFFINITY_DOMAIN_VIA_PAR in sysinfo_counter_request array, which points to the counter request value "affinity_domain_via_partition" in hv-gpci.c file. Also add a new function called "affinity_domain_via_partition_result_parse" to parse the hcall result and store it in output buffer. The affinity_domain_via_partition sysfs file is only available for power10 and above platforms. Add a macro called INTERFACE_AFFINITY_DOMAIN_VIA_PAR_ATTR, which points to the index of NULL placeholder, for affinity_domain_via_partition attribute in interface_attrs array. Also updated the value of INTERFACE_NULL_ATTR macro in hv-gpci.c file. Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230729073455.7918-10-kjain@linux.ibm.com
2023-08-16powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity ↵Kajol Jain
domain via domain information The hcall H_GET_PERF_COUNTER_INFO with counter request value as AFFINITY_DOMAIN_INFORMATION_BY_DOMAIN(0XB0), can be used to get the system affinity domain via domain information. To expose the system affinity domain via domain information, patch adds sysfs file called "affinity_domain_via_domain" to the "/sys/devices/hv_gpci/interface/" of hv_gpci pmu driver. Add new entry for AFFINITY_DOMAIN_VIA_DOM in sysinfo_counter_request array, which points to the counter request value "affinity_domain_via_domain" in hv-gpci.c file. The affinity_domain_via_domain sysfs file is only available for power10 and above platforms. Add a macro called INTERFACE_AFFINITY_DOMAIN_VIA_DOM_ATTR, which points to the index of NULL placeholder, for affinity_domain_via_domain attribute in interface_attrs array. Also updated the value of INTERFACE_NULL_ATTR macro in hv-gpci.c file. Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230729073455.7918-8-kjain@linux.ibm.com
2023-08-16powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show affinity ↵Kajol Jain
domain via virtual processor information The hcall H_GET_PERF_COUNTER_INFO with counter request value as AFFINITY_DOMAIN_INFORMATION_BY_VIRTUAL_PROCESSOR(0XA0), can be used to get the system affinity domain via virtual processor information. To expose the system affinity domain via virtual processor information, patch adds sysfs file called "affinity_domain_via_virtual_processor" to the "/sys/devices/hv_gpci/interface/" of hv_gpci pmu driver. The affinity_domain_via_virtual_processor sysfs file is only available for power10 and above platforms. Add a macro called INTERFACE_AFFINITY_DOMAIN_VIA_VP_ATTR, which points to the index of NULL placeholder, for affinity_domain_via_virtual_processor attribute in interface_attrs array. Also updated the value of INTERFACE_NULL_ATTR macro in hv-gpci.c file. Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230729073455.7918-6-kjain@linux.ibm.com
2023-08-16powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show processor ↵Kajol Jain
config information The hcall H_GET_PERF_COUNTER_INFO with counter request value as PROCESSOR_CONFIG(0X90), can be used to get the system processor configuration information. To expose the system processor config information, patch adds sysfs file called "processor_config" to the "/sys/devices/hv_gpci/interface/" of hv_gpci pmu driver. Add enum and sysinfo_counter_request array to get required counter request value in hv-gpci.c file. Also add a new function called "sysinfo_device_attr_create", which will create and return required device attribute to the add_sysinfo_interface_files function. The processor_config sysfs file is only available for power10 and above platforms. Add a new macro called INTERFACE_PROCESSOR_CONFIG_ATTR, which points to the index of NULL placefolder, for processor_config attribute in the interface_attrs array. Also add macro INTERFACE_NULL_ATTR which points to index of NULL attribute in interface_attrs array. Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230729073455.7918-4-kjain@linux.ibm.com
2023-08-16powerpc/hv_gpci: Add sysfs file inside hv_gpci device to show processor bus ↵Kajol Jain
topology information The hcall H_GET_PERF_COUNTER_INFO with counter request value as PROCESSOR_BUS_TOPOLOGY(0XD0), can be used to get the system topology information. To expose the system topology information, patch adds sysfs file called "processor_bus_topology" to the "/sys/devices/hv_gpci/interface/" of hv_gpci pmu driver. Add macro for PROCESSOR_BUS_TOPOLOGY counter request value in hv-gpci.c file. Also add a new function called "systeminfo_gpci_request", to make the H_GET_PERF_COUNTER_INFO hcall with added macro and populates the output buffer. The processor_bus_topology sysfs file is only available for power10 and above platforms. Add a new function called "add_sysinfo_interface_files", which will add processor_bus_topology attribute in the interface_attrs array, only for power10 and above platforms. Also add macro INTERFACE_PROCESSOR_BUS_TOPOLOGY_ATTR in hv-gpci.c file, which points to the index of NULL placefolder, for processor_bus_topology attribute. Reviewed-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/20230729073455.7918-2-kjain@linux.ibm.com
2023-03-30powerpc/perf: Properly detect mpc7450 familyChristophe Leroy
Unlike PVR_POWER8, etc ...., PVR_7450 represents a full PVR value and not a family value. To avoid confusion, do like E500 family and define the relevant PVR_VER_xxxx values for the 7450 family: 0x8000 ==> 7450 0x8001 ==> 7455 0x8002 ==> 7447 0x8003 ==> 7447A 0x8004 ==> 7448 And use them to detect 7450 family for perf events. Reported-by: kernel test robot <lkp@intel.com> Reported-by: Dan Carpenter <error27@gmail.com> Link: https://lore.kernel.org/r/202302260657.7dM9Uwev-lkp@intel.com/ Fixes: ec3eb9d941a9 ("powerpc/perf: Use PVR rather than oprofile field to determine CPU version") Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://msgid.link/99ca1da2e5a6cf82a8abf4bc034918e500e31781.1677513277.git.christophe.leroy@csgroup.eu
2023-02-25Merge tag 'powerpc-6.3-1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux Pull powerpc updates from Michael Ellerman: - Support for configuring secure boot with user-defined keys on PowerVM LPARs - Simplify the replay of soft-masked IRQs by making it non-recursive - Add support for KCSAN on 64-bit Book3S - Improvements to the API & code which interacts with RTAS (pseries firmware) - Change 32-bit powermac to assign PCI bus numbers per domain by default - Some improvements to the 32-bit BPF JIT - Various other small features and fixes Thanks to Anders Roxell, Andrew Donnellan, Andrew Jeffery, Benjamin Gray, Christophe Leroy, Frederic Barrat, Ganesh Goudar, Geoff Levand, Greg Kroah-Hartman, Jan-Benedict Glaw, Josh Poimboeuf, Kajol Jain, Laurent Dufour, Mahesh Salgaonkar, Mathieu Desnoyers, Mimi Zohar, Murphy Zhou, Nathan Chancellor, Nathan Lynch, Nayna Jain, Nicholas Piggin, Pali Rohár, Petr Mladek, Rohan McLure, Russell Currey, Sachin Sant, Sathvika Vasireddy, Sourabh Jain, Stefan Berger, Stephen Rothwell, and Sudhakar Kuppusamy. * tag 'powerpc-6.3-1' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux: (114 commits) powerpc/pseries: Avoid hcall in plpks_is_available() on non-pseries powerpc: dts: turris1x.dts: Set lower priority for CPLD syscon-reboot powerpc/e500: Add missing prototype for 'relocate_init' powerpc/64: Fix unannotated intra-function call warning powerpc/epapr: Don't use wrteei on non booke powerpc: Pass correct CPU reference to assembler powerpc/mm: Rearrange if-else block to avoid clang warning powerpc/nohash: Fix build with llvm-as powerpc/nohash: Fix build error with binutils >= 2.38 powerpc/pseries: Fix endianness issue when parsing PLPKS secvar flags macintosh: windfarm: Use unsigned type for 1-bit bitfields powerpc/kexec_file: print error string on usable memory property update failure powerpc/machdep: warn when machine_is() used too early powerpc/64: Replace -mcpu=e500mc64 by -mcpu=e5500 powerpc/eeh: Set channel state after notifying the drivers selftests/powerpc: Fix incorrect kernel headers search path powerpc/rtas: arch-wide function token lookup conversions powerpc/rtas: introduce rtas_function_token() API powerpc/pseries/lpar: convert to papr_sysparm API powerpc/pseries/hv-24x7: convert to papr_sysparm API ...
2023-02-20Merge tag 'perf-core-2023-02-20' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull perf updates from Ingo Molnar: - Optimize perf_sample_data layout - Prepare sample data handling for BPF integration - Update the x86 PMU driver for Intel Meteor Lake - Restructure the x86 uncore code to fix a SPR (Sapphire Rapids) discovery breakage - Fix the x86 Zhaoxin PMU driver - Cleanups * tag 'perf-core-2023-02-20' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits) perf/x86/intel/uncore: Add Meteor Lake support x86/perf/zhaoxin: Add stepping check for ZXC perf/x86/intel/ds: Fix the conversion from TSC to perf time perf/x86/uncore: Don't WARN_ON_ONCE() for a broken discovery table perf/x86/uncore: Add a quirk for UPI on SPR perf/x86/uncore: Ignore broken units in discovery table perf/x86/uncore: Fix potential NULL pointer in uncore_get_alias_name perf/x86/uncore: Factor out uncore_device_to_die() perf/core: Call perf_prepare_sample() before running BPF perf/core: Introduce perf_prepare_header() perf/core: Do not pass header for sample ID init perf/core: Set data->sample_flags in perf_prepare_sample() perf/core: Add perf_sample_save_brstack() helper perf/core: Add perf_sample_save_raw_data() helper perf/core: Add perf_sample_save_callchain() helper perf/core: Save the dynamic parts of sample data size x86/kprobes: Use switch-case for 0xFF opcodes in prepare_emulation perf/core: Change the layout of perf_sample_data perf/x86/msr: Add Meteor Lake support perf/x86/cstate: Add Meteor Lake support ...
2023-02-13powerpc/pseries/hv-24x7: convert to papr_sysparm APINathan Lynch
The new papr_sysparm API handles the details of system parameter retrieval. Use that instead of open-coding the RTAS call, work area management, and retries. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-17-26929c8cce78@linux.ibm.com
2023-02-13powerpc/perf/hv-24x7: add missing RTAS retry status handlingNathan Lynch
The ibm,get-system-parameter RTAS function may return -2 or 990x, which indicate that the caller should try again. read_24x7_sys_info() ignores this, allowing transient failures in reporting processor module information. Move the RTAS call into a coventional rtas_busy_delay()-based loop, along with the parsing of results on success. Signed-off-by: Nathan Lynch <nathanl@linux.ibm.com> Fixes: 8ba214267382 ("powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor details") Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230125-b4-powerpc-rtas-queue-v3-2-26929c8cce78@linux.ibm.com
2023-02-12Merge branch 'fixes' into nextMichael Ellerman
Merge our fixes branch to bring in some changes that conflict with upcoming next content.
2023-02-10powerpc/hv-24x7: Fix pvr check when setting interface versionKajol Jain
Commit ec3eb9d941a9 ("powerpc/perf: Use PVR rather than oprofile field to determine CPU version") added usage of pvr value instead of oprofile field to determine the platform. In hv-24x7 pmu driver code, pvr check uses PVR_POWER8 when assigning the interface version for power8 platform. But power8 can also have other pvr values like PVR_POWER8E and PVR_POWER8NVL. Hence the interface version won't be set properly incase of PVR_POWER8E and PVR_POWER8NVL. Fix this issue by adding the checks for PVR_POWER8E and PVR_POWER8NVL as well. Fixes: ec3eb9d941a9 ("powerpc/perf: Use PVR rather than oprofile field to determine CPU version") Reported-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Kajol Jain <kjain@linux.ibm.com> Tested-by: Sachin Sant <sachinp@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230131184804.220756-1-kjain@linux.ibm.com
2023-01-31powerpc/imc-pmu: Revert nest_init_lock to being a mutexMichael Ellerman
The recent commit 76d588dddc45 ("powerpc/imc-pmu: Fix use of mutex in IRQs disabled section") fixed warnings (and possible deadlocks) in the IMC PMU driver by converting the locking to use spinlocks. It also converted the init-time nest_init_lock to a spinlock, even though it's not used at runtime in IRQ disabled sections or while holding other spinlocks. This leads to warnings such as: BUG: sleeping function called from invalid context at include/linux/percpu-rwsem.h:49 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0 preempt_count: 1, expected: 0 CPU: 7 PID: 1 Comm: swapper/0 Not tainted 6.2.0-rc2-14719-gf12cd06109f4-dirty #1 Hardware name: Mambo,Simulated-System POWER9 0x4e1203 opal:v6.6.6 PowerNV Call Trace: dump_stack_lvl+0x74/0xa8 (unreliable) __might_resched+0x178/0x1a0 __cpuhp_setup_state+0x64/0x1e0 init_imc_pmu+0xe48/0x1250 opal_imc_counters_probe+0x30c/0x6a0 platform_probe+0x78/0x110 really_probe+0x104/0x420 __driver_probe_device+0xb0/0x170 driver_probe_device+0x58/0x180 __driver_attach+0xd8/0x250 bus_for_each_dev+0xb4/0x140 driver_attach+0x34/0x50 bus_add_driver+0x1e8/0x2d0 driver_register+0xb4/0x1c0 __platform_driver_register+0x38/0x50 opal_imc_driver_init+0x2c/0x40 do_one_initcall+0x80/0x360 kernel_init_freeable+0x310/0x3b8 kernel_init+0x30/0x1a0 ret_from_kernel_thread+0x5c/0x64 Fix it by converting nest_init_lock back to a mutex, so that we can call sleeping functions while holding it. There is no interaction between nest_init_lock and the runtime spinlocks used by the actual PMU routines. Fixes: 76d588dddc45 ("powerpc/imc-pmu: Fix use of mutex in IRQs disabled section") Tested-by: Kajol Jain<kjain@linux.ibm.com> Reviewed-by: Kajol Jain<kjain@linux.ibm.com> Signed-off-by: Michael Ellerman <mpe@ellerman.id.au> Link: https://lore.kernel.org/r/20230130014401.540543-1-mpe@ellerman.id.au